EDUKG: a Heterogeneous Sustainable K-12
Educational Knowledge Graph
Bowen Zhao1*, Jiuding Sun2,3*, Bin Xu1,3 (✉), Xingyu Lu3, Yuchen Li3, Jifan
Yu3, Minghui Liu3, Tingjian Zhang3, Qiuyang Chen3, Hanming Li3, Lei Hou3,
and Juanzi Li3
1Global Innovation Exchange, Tsinghua University, Beijing, China
{zhaobw21,sjd22}@mails.tsinghua.edu.cn
2Khoury College of Computer Sciences, Northeastern University, Boston MA, USA
3Department of Computer Science and Technology,
Tsinghua University, Beijing, China
xubin@tsinghua.edu.cn
Abstract. Web and artificial intelligence technologies, especially the
semantic web and knowledge graphs (KGs), have recently attracted significant
attention in educational scenarios. Nevertheless, subject-specific KGs for
K-12 education still lack sufficiency and sustainability from knowledge
and data perspectives. To tackle these issues, we propose EDUKG, a
heterogeneous sustainable K-12 Educational Knowledge Graph. We first
design an interdisciplinary and fine-grained ontology for uniformly
modeling knowledge and resources in K-12 education, where we define 635
classes, 445 object properties, and 1314 datatype properties in total.
Guided by this ontology, we propose a flexible methodology for
interactively extracting factual knowledge from textbooks. Furthermore, we
establish a general mechanism based on our proposed generalized entity
linking system for EDUKG’s sustainable maintenance, which can
dynamically index numerous heterogeneous resources and data with knowledge
topics in EDUKG. We further evaluate EDUKG to illustrate its
sufficiency, richness, and variability. We publish EDUKG with more than
252 million entities and 3.86 billion triplets. Our code and data
repository is available at https://github.com/THU-KEG/EDUKG.
Keywords: Ontology · Knowledge Graph · K-12 Education
1 Introduction
The object of education is to prepare the young to educate themselves through-
out their lives, as said by Robert M. Hutchins. Education, especially for K-12
children, plays a significant role in everyone’s life. Intelligent education, which
aims to leverage the Web and artificial intelligence (AI) technologies to improve
students’ learning efficiency [13,19], has always been an essential topic for
researchers. In addition, the construction of educational knowledge graphs (KGs)
is a fundamental research topic with various downstream applications, including
educational data mining [22], learning management systems [1], question answering
platforms [30], and dialogue systems [18].
*Equal contributions.
arXiv:2210.12228v1 [cs.CL] 21 Oct 2022
Plentiful educational KGs have been proposed to help the development of
computer-aided educational technologies. KnowEdu [4] and K12Edukg [3] are
constructed by extracting concepts and prerequisite rules from subject-specific
textbooks. However, entities in these KGs are only course concepts without
other essential educational resources for students. CKGG [23] is built on
Chinese high-school-level geography education, yet it only integrates data
for location entities. Meanwhile, several educational KGs have been proposed based
on massive online open courses (MOOCs). For instance, MOOC-KG [6] and
HEKG [34] are built upon open course data, yet their ontology can only repre-
sent subject-specific knowledge at a shallow level. In particular, there are only 4
and 6 defined classes in MOOC-KG and HEKG, respectively. Furthermore, al-
though KGs built upon open courses consist of heterogeneous data, they cannot
dynamically develop with growing resources. Additionally, most KGs based on
MOOCs are designed for higher education instead of K-12 education.
Although several KGs have been proposed for educational use, they
suffer from the following limitations:
Insufficient Knowledge Modeling. Prior research pointed out that in-
terdisciplinary teaching is beneficial for developing students’ critical thinking,
creativity, communication, and essential academia [12]. In the meantime, fine
knowledge granularity is also beneficial for students’ learning process [25]. Nev-
ertheless, existing educational KGs only represent subject-specific knowledge on
a coarse-grained level, lacking interdisciplinary entity relations.
Sophisticated Data Curation. Education aims to teach students with
broad ability instead of just knowledge in textbooks [28]. Educational resources,
including examination questions and beyond, have proven beneficial for
fostering students’ abilities through learning by doing [2,24]. Also, existing data
repositories for education, such as MOOCCUBEX [32], leverage a concept graph
to organize heterogeneous data. However, existing educational KGs
still lack adequate resources due to data heterogeneity.
Neglected Information Growth. Information for education is ever-growing
from both knowledge and data perspectives. For knowledge, the educational
reform in China is continually changing the essential knowledge of education
over time. For data, there is an increasing amount of online material for students to
learn. Nonetheless, prior educational KGs lack maintenance sustainability, i.e.,
the ability to capture and infuse new knowledge and resources incrementally.
To address these issues, we conclude that educational KGs should be built
with an interdisciplinary schema that can represent not only knowledge but also
resources. Meanwhile, to remain sustainable, an educational KG
should be able to grow and adapt incrementally to changes in real-world
knowledge. Therefore, we propose EDUKG, a heterogeneous sustainable K-12
Educational Knowledge Graph for Chinese high-school-level education. We design
an interdisciplinary fine-grained ontology that uniformly models knowledge,
resources, and heterogeneous data. In total, we define 635 classes and 1759
properties (445 object properties and 1314 datatype properties) in the EDUKG
ontology without subject boundaries. Guided by this ontology,
we propose a semi-automated method for interactively acquiring knowledge from
textbooks. Fig. 1 compares data in EDUKG with Wikidata¹ and CKGG, indicating
that EDUKG contains the most sufficient information from both knowledge
and data perspectives. Additionally, to sustainably maintain EDUKG with
growing data, we propose a general mechanism to index heterogeneous online
data incrementally based on our proposed entity linking technique.
Fig. 1: EDUKG’s data sufficiency compared with other KGs. Blue, yellow, red,
and green points refer to knowledge and resources in interdisciplinary, geography,
politics, and history subjects, respectively.
Contributions. In general, our contributions are summarized as follows:
1. An interdisciplinary, fine-grained ontology that uniformly represents K-12
educational knowledge, resources, and heterogeneous data with 635 classes, 445
object properties, and 1314 datatype properties;
2. A large-scale, heterogeneous K-12 educational KG with more than 252 million
entities and 3.86 billion triplets, based on data from massive educational
and external resources;
3. A flexible and sustainable construction and maintenance mechanism that
empowers EDUKG to evolve dynamically: the guiding schema of the
construction methodology is designed to be hot-swappable, and we simultaneously
monitor 32 different data sources to incrementally infuse heterogeneous data.
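To make the idea of incremental infusion in contribution 3 concrete, here is a minimal sketch of indexing incoming resources under knowledge topics. It is illustrative only: the class `TopicIndex`, the topic identifiers, and the alias lists are hypothetical, and naive substring matching stands in for the generalized entity linking system, whose details are not given in this section.

```python
from collections import defaultdict


class TopicIndex:
    """Toy incremental index from knowledge topics to resource ids.

    Assumption: topics are matched by simple lowercase substring search.
    This is a stand-in, not the paper's entity linking system.
    """

    def __init__(self, topic_aliases):
        # topic_aliases maps a topic id to its surface forms (hypothetical data).
        self.alias_to_topic = {}
        for topic_id, aliases in topic_aliases.items():
            for alias in aliases:
                self.alias_to_topic[alias.lower()] = topic_id
        # topic id -> set of resource ids linked so far
        self.index = defaultdict(set)

    def link(self, text):
        """Return the set of topic ids whose alias occurs in the text."""
        lowered = text.lower()
        return {
            topic_id
            for alias, topic_id in self.alias_to_topic.items()
            if alias in lowered
        }

    def add_resource(self, resource_id, text):
        """Index one new resource without rebuilding the existing index."""
        topics = self.link(text)
        for topic_id in topics:
            self.index[topic_id].add(resource_id)
        return topics


# Hypothetical topics and resources, added one at a time as they arrive.
index = TopicIndex({
    "topic:photosynthesis": ["photosynthesis"],
    "topic:monsoon": ["monsoon", "monsoon climate"],
})
index.add_resource("exercise-1", "Explain how photosynthesis stores energy.")
index.add_resource("article-7", "The monsoon shapes agriculture in South Asia.")
```

Because each call to `add_resource` only touches the topics it links to, new data sources can be appended without reprocessing earlier resources, which is the property the sketch is meant to show.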
Outline. In the following sections, we first illustrate the ontology of EDUKG
in Sec. 2 and present the EDUKG construction and maintenance mechanisms
in Sec. 3. Afterward, in Sec. 4, we introduce essential characteristics of EDUKG
to demonstrate its quality. In Sec. 5, we present the impact and availability
of EDUKG with its data, code, and applications. Finally, related work is
introduced in Sec. 6, and we conclude the paper in Sec. 7.
2 Schema of EDUKG
In this section, we introduce the ontology of EDUKG, which uniformly represents
knowledge, resources, and heterogeneous data.
¹ https://www.wikidata.org
Fig. 2: An overview of EDUKG top-level ontology.
2.1 Overview of EDUKG Ontology
As shown in Fig. 2, we divide the EDUKG ontology into three main sections:
“Knowledge Topic”, “Educational Resource”, and “External Heterogeneous
Data”. We define these three top-level classes as follows:
“Knowledge Topic”: essential themes in specific subjects [10] and their
essential rhetorical roles.
“Educational Resource”: intra-curricular teaching and testing resources in
K-12 education, for example, textbooks and examination exercises.
“External Heterogeneous Data”: extra-curricular resources and data that give
students broad avenues for learning more comprehensive knowledge.
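The three top-level classes can be pictured as a handful of RDF-style triples. The following is a minimal sketch in plain Python, assuming RDFS-style class declarations. The `edukg:` prefix, the local names, and the `Textbook` subclass are hypothetical and do not reflect the project's actual identifiers. Only the three class labels come from the text.

```python
# Standard RDF/RDFS terms, written as prefixed strings for readability.
RDF_TYPE = "rdf:type"
RDFS_CLASS = "rdfs:Class"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"
RDFS_LABEL = "rdfs:label"

# Assumed prefix for illustration; not the real EDUKG namespace.
EDUKG = "edukg:"

# Local names are hypothetical; the labels are the class names from the text.
TOP_LEVEL_CLASSES = {
    "KnowledgeTopic": "Knowledge Topic",
    "EducationalResource": "Educational Resource",
    "ExternalHeterogeneousData": "External Heterogeneous Data",
}


def build_top_level_triples():
    """Return (subject, predicate, object) triples declaring the top-level classes."""
    triples = []
    for local_name, label in TOP_LEVEL_CLASSES.items():
        subject = EDUKG + local_name
        triples.append((subject, RDF_TYPE, RDFS_CLASS))
        triples.append((subject, RDFS_LABEL, label))
    return triples


def add_subclass(triples, child, parent):
    """Declare `child` as a subclass of an already declared `parent` class."""
    declared = {s for s, p, o in triples if p == RDF_TYPE and o == RDFS_CLASS}
    parent_iri = EDUKG + parent
    if parent_iri not in declared:
        raise ValueError(f"unknown parent class: {parent}")
    child_iri = EDUKG + child
    triples.append((child_iri, RDF_TYPE, RDFS_CLASS))
    triples.append((child_iri, RDFS_SUBCLASS_OF, parent_iri))
    return child_iri


triples = build_top_level_triples()
# Textbooks are named in the text as an example of an educational resource;
# modeling them as a subclass is an illustrative assumption.
add_subclass(triples, "Textbook", "EducationalResource")
```

Keeping the schema as data rather than hard-coded logic is one simple way to let finer-grained classes be attached under any of the three roots later.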
Since EDUKG contains educational knowledge and resources, we investi-
gate and follow multiple published knowledge and resource modeling standards.
For knowledge, we reuse vocabularies from the widely-adopted RDF and RDFS