A Decade of Knowledge Graphs in Natural Language Processing: A Survey
Phillip Schneider1, Tim Schopf1, Juraj Vladika1, Mikhail Galkin2,
Elena Simperl3 and Florian Matthes1
1Technical University of Munich, Department of Computer Science, Germany
2Mila Quebec AI Institute & McGill University, School of Computer Science, Canada
3King’s College London, Department of Informatics, United Kingdom
{phillip.schneider, tim.schopf, juraj.vladika, matthes}@tum.de
mikhail.galkin@mila.quebec
elena.simperl@kcl.ac.uk
Abstract
In pace with developments in the research field
of artificial intelligence, knowledge graphs
(KGs) have attracted a surge of interest from
both academia and industry. As a represen-
tation of semantic relations between entities,
KGs have proven to be particularly relevant for
natural language processing (NLP), experienc-
ing a rapid spread and wide adoption within
recent years. Given the increasing amount of
research work in this area, several KG-related
approaches have been surveyed in the NLP re-
search community. However, a comprehen-
sive study that categorizes established topics
and reviews the maturity of individual research
streams remains absent to this day. Contribut-
ing to closing this gap, we systematically ana-
lyzed 507 papers from the literature on KGs in
NLP. Our survey encompasses a multifaceted
review of tasks, research types, and contribu-
tions. As a result, we present a structured
overview of the research landscape, provide
a taxonomy of tasks, summarize our findings,
and highlight directions for future work.
1 Introduction
Knowledge acquisition and application are inher-
ent to natural language. Humans use language as a
means of communicating facts, arguing about decisions,
or questioning beliefs. It is therefore not surprising
that computational linguists began as early as the
1950s and 1960s to work out ideas on how to represent
knowledge as relations between concepts in semantic
networks (Richens, 1956; Quillian, 1963; Collins and
Quillian, 1969).
More recently, knowledge graphs (KGs) have
emerged as an approach for semantically repre-
senting knowledge about real-world entities in a
machine-readable format. They originated from
research on semantic networks, domain-specific
ontologies, as well as linked data, and are thus not
an entirely new concept (Hitzler, 2021). Despite
their growing popularity, there is still no general
understanding of what exactly a KG is or for which
tasks it is applicable. Although prior work has already
attempted to define KGs (Pujara et al., 2013;
Ehrlinger and Wöß, 2016; Paulheim, 2017; Färber
et al., 2018), the term is not yet used uniformly by
researchers. Most studies implicitly adopt a broad
definition of KGs, where they are understood as "a
graph of data intended to accumulate and convey
knowledge of the real world, whose nodes represent
entities of interest and whose edges represent relations
between these entities" (Hogan et al., 2022).
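To make this broad definition concrete, the following minimal Python sketch stores a KG as a list of (head, relation, tail) triples and indexes it by head entity. The example facts, the relation names, and the helper functions build_graph and neighbors are hypothetical and chosen only for illustration; they are not taken from any system covered in this survey.

```python
from collections import defaultdict

# Hypothetical toy facts, chosen only for illustration.
TRIPLES = [
    ("Munich", "located_in", "Germany"),
    ("Technical University of Munich", "located_in", "Munich"),
    ("Germany", "instance_of", "Country"),
]


def build_graph(triples):
    """Index (head, relation, tail) triples by their head entity."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        # Nodes are entities; each labeled edge is a relation between them.
        graph[head].append((relation, tail))
    return graph


def neighbors(graph, entity, relation=None):
    """Return tail entities linked from `entity`, optionally filtered by relation."""
    return [
        tail
        for rel, tail in graph.get(entity, [])
        if relation is None or rel == relation
    ]


graph = build_graph(TRIPLES)
print(neighbors(graph, "Munich", "located_in"))
```

Real KGs typically add entity identifiers, schemas, and provenance on top of such triples, so this sketch covers only the node-and-edge core of the definition.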
KGs have attracted considerable research attention
in both academia and industry since the introduction
of Google’s KG in 2012 (Singhal, 2012). Particularly
in natural language processing (NLP) research,
the adoption of KGs has become increasingly
popular over the past five years, and this trend
seems to be accelerating. The underlying paradigm
is that the combination of structured and unstructured
knowledge can benefit all kinds of NLP tasks.
For instance, structured knowledge from KGs can
be injected into the contextual knowledge found in
language models, which improves performance
in downstream tasks (Colon-Hernandez et al., 2021).
Furthermore, with the growing importance of KGs,
there are also expanding efforts to
construct new KGs from unstructured texts.
Ten years after Google coined the term knowl-
edge graph in 2012, a plethora of novel approaches
has been proposed by scholars. Therefore, it is im-
portant to assemble insights, consolidate existing
results, and provide a structured overview. However,
to our knowledge, there are no studies that
offer an overview of the whole research landscape
of KGs in the NLP field. Contributing to closing
this gap, we performed a comprehensive survey
to analyze all research performed in this area by
classifying established topics, identifying trends,
and outlining areas for future research. Our three
main contributions are as follows:
arXiv:2210.00105v1 [cs.CL] 30 Sep 2022