2 B. Zhao et al.
is a fundamental research area with various downstream applications, including
educational data mining [22], learning management systems [1], question answering
platforms [30], and dialogue systems [18].
Many educational KGs have been proposed to support the development of
computer-aided educational technologies. KnowEdu [4] and K12Edukg [3] are
constructed by extracting concepts and prerequisite rules from subject-specific
textbooks. However, the entities in these KGs are limited to course concepts and
exclude other essential educational resources for students. CKGG [23] is built
for Chinese high-school-level geography education, yet it only integrates data
for location entities. Meanwhile, several educational KGs have been built
on massive open online courses (MOOCs). For instance, MOOC-KG [6] and
HEKG [34] are built upon open course data, yet their ontologies can only
represent subject-specific knowledge at a shallow level. In particular, only 4
and 6 classes are defined in MOOC-KG and HEKG, respectively. Furthermore,
although KGs built upon open courses consist of heterogeneous data, they cannot
grow dynamically as resources accumulate. Additionally, most KGs based on
MOOCs are designed for higher education rather than K-12 education.
Although several KGs have been proposed for educational use, they
suffer from the following limitations:
• Insufficient Knowledge Modeling. Prior research has pointed out that
interdisciplinary teaching is beneficial for developing students' critical thinking,
creativity, communication, and essential academic skills [12]. Meanwhile,
fine-grained knowledge also benefits students' learning process [25].
Nevertheless, existing educational KGs only represent subject-specific knowledge at
a coarse-grained level and lack interdisciplinary entity relations.
• Sophisticated Data Curation. Education aims to equip students with
broad abilities rather than only textbook knowledge [28]. Educational resources,
including examination questions and beyond, have been shown to be beneficial for
fostering students' abilities through learning by doing [2,24]. Also, existing data
repositories for education, such as MOOCCUBEX [32], leverage a concept graph
to organize heterogeneous data together. However, existing educational KGs
still lack adequate resources due to data heterogeneity.
• Neglected Information Growth. Information for education is ever-growing
from both the knowledge and the data perspectives. For knowledge, educational
reform in China is continually changing the essential knowledge of education
over time. For data, there is an increasing amount of online material for
students to learn from. Nonetheless, prior educational KGs lack maintenance
sustainability, i.e., the ability to capture and infuse new knowledge and
resources incrementally.
To address these issues, we argue that educational KGs should be built
with an interdisciplinary schema that can represent not only knowledge but also
resources. Meanwhile, to remain sustainable, an educational KG
should be able to grow and adapt incrementally to changes in real-world
knowledge. Therefore, we propose EDUKG, a heterogeneous sustainable K-12
Educational Knowledge Graph for Chinese high-school-level education. We
design an interdisciplinary fine-grained ontology that uniformly models knowledge,