Transforming RDF-star to Property Graphs:
A Preliminary Analysis of Transformation
Approaches – extended version
Ghadeer Abuoda¹, Daniele Dell’Aglio¹, Arthur Keen², and Katja Hose¹
¹ Department of Computer Science, Aalborg University, Aalborg, Denmark
{gsmas,dade,khose}@cs.aau.dk
² ArangoDB, San Francisco, United States
arthur@arangodb.com
Abstract. RDF and property graph models have many similarities, such as using basic graph concepts like nodes and edges. However, the models differ in their modeling approach, expressivity, serialization, and the nature of their applications. RDF is the de facto standard model for knowledge graphs on the Semantic Web and is supported by a rich ecosystem for inference and processing. The property graph model, in contrast, provides advantages in scalable graph analytical tasks, such as graph matching, path analysis, and graph traversal. RDF-star extends RDF and allows capturing metadata as a first-class citizen. To tap into the advantages of alternative models, the literature proposes different ways of transforming knowledge graphs between property graphs and RDF. However, most of these approaches cannot provide complete transformations for RDF-star graphs. Hence, this paper provides a step towards transforming RDF-star graphs into property graphs. In particular, we identify different cases to evaluate transformation approaches from RDF-star to property graphs. Specifically, we categorize two classes of transformation approaches and analyze them based on the test cases. The obtained insights will form the foundation for building complete transformation approaches in the future.
1 Introduction
The most popular models for representing knowledge graphs are RDF¹ (Resource Description Framework) and property graphs [27] (PG). While RDF represents knowledge graphs as a set of subject-predicate-object triples, property graphs assign key-value style properties to nodes and edges. Recently, RDF-star [18] has been proposed as an extension of RDF that enriches RDF triples with metadata by embedding triples in the subjects or objects of other triples. This allows making statements about statements and somewhat resembles adding properties to edges in property graphs. RDF-star is supported by a rich ecosystem of data management systems and standards. This includes systems such as Stardog, OpenLink’s Virtuoso, Ontotext GraphDB, AllegroGraph, Apache Jena, and more recently also Oxigraph, as well as standards such as SPARQL² and its extension SPARQL-star³ for querying and RDF Schema, which allows describing classes of RDF resources and properties⁴. In contrast, many graph database systems, such as Neo4j, TigerGraph, JanusGraph, RedisGraph, and SAP HANA, are based on different variations of the property graph model [31] and different query languages [9,10]. Unfortunately, RDF-star graphs and property graphs are not entirely compatible with one another. Although they both describe data through graphs, their underlying models and semantics are different, leading to many data interoperability issues [3, 22]. Metadata or edge properties in RDF-star can be modeled as separate nodes or RDF-star triples. In contrast, edge properties can only be represented as literal key-value pairs in property graphs. In general, it is challenging to transform an RDF-star graph fully into a property graph because of the rich expressiveness of the former. The heterogeneity between the two models and their frameworks makes it necessary to study their interoperability, i.e., the ability to map one model to another for data exchange and sharing [3].
¹ RDF 1.1 Primer: https://www.w3.org/TR/rdf11-primer/
arXiv:2210.05781v1 [cs.DB] 11 Oct 2022
The mapping between the two models is crucial for data exchange, data integration, and the reusability of systems and tools across the frameworks. RDF-star, and specifically the underlying RDF model, is recognized as a web-native model that supports data exchange and sharing across different sources because of its formal semantics and the globally unique identification of resources through IRIs. RDF is a common and flexible model for knowledge representation, as exemplified by knowledge graphs that cover a broad set of domains, such as DBpedia [23], YAGO [30], and Wikidata [33]. In contrast, even with the wide adoption of property graph engines, property graphs lack many essential features, such as a schema language, a standard query language, and standard data serialization formats. Achieving interoperability and reliable transformations between the two frameworks will enable us to exploit the benefits of both models.
The transformation of property graphs to RDF-star has been explored recently [21], and basic transformation rules for property graphs to RDF-star were proposed [4,18]. However, the latter do not cover all RDF-star constructs and allow for multiple alternatives.
Listing 1.1: An example RDF-star graph in Turtle-star format
@prefix ex: <http://example.org/> .
<< ex:Alex ex:age 25 >> ex:certainty 0.5 .
² SPARQL 1.1 Query Language: https://www.w3.org/TR/sparql11-query/
³ SPARQL-star Query Language: https://w3c.github.io/rdf-star/cg-spec/editors_draft.html#sparql-star
⁴ RDF Schema 1.1: https://www.w3.org/TR/rdf-schema/
Fig. 1: Graphical Representation of Listing 1.1 as (a) RDF-star and (b) Property Graph
Consider, for instance, the example illustrated in Figure 1(a). If we start with the triple (ex:Alex, ex:age, 25), then we could represent the RDF element (Alex) as a node in a property graph, as shown in Figure 1(b). This node would then have a property (age, 25), and the RDF triple would be represented by a single node in a property graph. However, if we have a single node without an edge, we cannot represent the metadata about the original RDF triple (ex:certainty 0.5) in the property graph. Studying such cases, this paper makes the following contributions:
- We identify two alternative approaches of transformations: RDF-topology-preserving and Property-Graph transformation.
- We define a set of test cases capturing the diverse RDF-star constructs that have to be considered when transforming RDF-star to property graphs.
- Using the test cases, we systematically evaluate alternative mapping approaches and identify their shortcomings.
This paper is structured as follows: while Section 2 introduces preliminaries,
Section 3 discusses related work. Section 4 presents alternative transformation
approaches. Afterwards, Section 5 provides details on our test cases, which we use
in Section 6 to identify and discuss shortcomings of transformation approaches.
Section 7 concludes the work with an outlook for future work.
2 Preliminaries
In this section, we formally introduce RDF¹, RDF-star [18], and property graphs [27].
Resource Description Framework (RDF). RDF is a W3C standard data
model that represents information as a set of statements. Each statement denotes
a typed relation between two resources.
Definition 1 (RDF statement). Let $I$, $B$, and $L$ be the disjoint sets of Internationalized Resource Identifiers (IRIs), blank nodes, and literals. An RDF statement is a triple $(s, p, o) \in (I \cup B) \times I \times (I \cup B \cup L)$, and it indicates that $s$ and $o$ (subject and object, resp.) are in a relation $p$ (predicate).
In this paper, we consider two types of RDF statements that we distinguish based on whether the object is an IRI (or blank node) or a literal. Object property statements are RDF statements $(s, p, o) \in (I \cup B) \times I \times (I \cup B)$, while datatype property statements are RDF statements $(s, p, o) \in (I \cup B) \times I \times L$. An RDF graph containing three RDF statements is shown in Listing 1.2, serialized in Turtle⁵, and visually as a graph in Figure 2(a). The first two statements are object property statements. The first statement describes two resources, ex:Apple_Inc and ex:California, related by the predicate ex:located_in. The second statement indicates that ex:Apple_Inc has ex:Tim_Cook as its ex:CEO. The last statement is a datatype property statement, and it indicates that ex:Tim_Cook has the literal "2011" as the value of the ex:start_date predicate.
Fig. 2: Example in RDF, RDF-star, and Property Graphs: (a) RDF Graph, (b) RDF-star Graph, (c) Property Graph
Listing 1.2: An RDF graph in Turtle format
@prefix ex: <http://example.org/> .
ex:Apple_Inc ex:located_in ex:California .
ex:Apple_Inc ex:CEO ex:Tim_Cook .
ex:Tim_Cook ex:start_date 2011 .
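The distinction between object property and datatype property statements can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the classes `IRI`, `BlankNode`, and `Literal` and the helper names are hypothetical stand-ins for the sets $I$, $B$, and $L$ of Definition 1, not part of any RDF library.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the disjoint sets I, B, and L of Definition 1.
@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str


def is_rdf_statement(s, p, o) -> bool:
    """Check (s, p, o) in (I u B) x I x (I u B u L)."""
    return (
        isinstance(s, (IRI, BlankNode))
        and isinstance(p, IRI)
        and isinstance(o, (IRI, BlankNode, Literal))
    )


def statement_kind(s, p, o) -> str:
    """Classify a statement by the type of its object."""
    if not is_rdf_statement(s, p, o):
        raise ValueError("not an RDF statement")
    return "datatype property" if isinstance(o, Literal) else "object property"


EX = "http://example.org/"

# The three statements of Listing 1.2.
graph = [
    (IRI(EX + "Apple_Inc"), IRI(EX + "located_in"), IRI(EX + "California")),
    (IRI(EX + "Apple_Inc"), IRI(EX + "CEO"), IRI(EX + "Tim_Cook")),
    (IRI(EX + "Tim_Cook"), IRI(EX + "start_date"), Literal("2011")),
]

kinds = [statement_kind(*triple) for triple in graph]
print(kinds)
```

For the graph of Listing 1.2, the first two statements are classified as object property statements and the last one as a datatype property statement.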
RDF-star. Looking at the RDF graph in Figure 2(a), one can spot an imprecise data modelling choice: stating that Tim Cook started in 2011 is not entirely correct, as 2011 is the year in which he started his role as CEO of Apple. In other words, one should associate the starting date with the statement (ex:Apple_Inc, ex:CEO, ex:Tim_Cook), as depicted in Figure 2(b). There are several ways to implement this idea in RDF, such as RDF reification [18], singleton properties [25], and named graphs [7]. However, these mechanisms have significant shortcomings [11, 16, 19, 26].
Listing 1.3: An RDF-star graph in Turtle-star format
@prefix ex: <http://example.org/> .
ex:Apple_Inc ex:located_in ex:California .
<< ex:Apple_Inc ex:CEO ex:Tim_Cook >> ex:start_date 2011 .
⁵ RDF Turtle: https://www.w3.org/TR/turtle/
A solution to overcome such shortcomings was recently proposed by Hartig et al. with RDF-star [16,18]. RDF-star extends RDF by letting RDF statements be subjects or objects in other statements. Listing 1.3 shows an RDF-star document serialized in Turtle-star [18]. The first statement is a compliant RDF statement (it also appeared in Listing 1.2). The second statement indicates that ex:Apple_Inc appointed ex:Tim_Cook as ex:CEO in 2011. We formally define an RDF-star statement as follows.
Definition 2 (RDF-star statement). Let $s \in I \cup B$, $p \in I$, $o \in I \cup B \cup L$. An RDF-star statement is a triple defined recursively as:
- Any RDF statement $(s, p, o)$ is an RDF-star statement;
- Let $t$ and $\bar{t}$ be RDF-star statements. Then, $(t, p, o)$, $(s, p, t)$, and $(t, p, \bar{t})$ are RDF-star statements, also known as asserted statements. $t$ and $\bar{t}$ are called embedded or quoted statements.
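The recursive structure of Definition 2 maps naturally onto a recursive data type. The following Python sketch is an illustration under our own assumptions: `Term`, `Triple`, `is_rdf_star_statement`, and `quoted` are hypothetical names, and a term's kind is recorded as a plain string rather than through a dedicated type per set.

```python
from dataclasses import dataclass
from typing import Iterator, Union


@dataclass(frozen=True)
class Term:
    kind: str   # one of "iri", "bnode", "literal" (the sets I, B, L)
    value: str


@dataclass(frozen=True)
class Triple:
    s: Union[Term, "Triple"]   # subject: a term or an embedded statement
    p: Term                    # predicate: always an IRI
    o: Union[Term, "Triple"]   # object: a term or an embedded statement


def is_rdf_star_statement(t) -> bool:
    """Recursive well-formedness check following Definition 2."""
    if not isinstance(t, Triple):
        return False
    if not (isinstance(t.p, Term) and t.p.kind == "iri"):
        return False
    if isinstance(t.s, Triple):
        s_ok = is_rdf_star_statement(t.s)
    else:
        s_ok = isinstance(t.s, Term) and t.s.kind in ("iri", "bnode")
    if isinstance(t.o, Triple):
        o_ok = is_rdf_star_statement(t.o)
    else:
        o_ok = isinstance(t.o, Term) and t.o.kind in ("iri", "bnode", "literal")
    return s_ok and o_ok


def quoted(t: Triple) -> Iterator[Triple]:
    """Yield every statement embedded in t, at any nesting depth."""
    for part in (t.s, t.o):
        if isinstance(part, Triple):
            yield part
            yield from quoted(part)


def ex(name: str) -> Term:
    return Term("iri", "http://example.org/" + name)


# Second statement of Listing 1.3:
# << ex:Apple_Inc ex:CEO ex:Tim_Cook >> ex:start_date 2011 .
ceo = Triple(ex("Apple_Inc"), ex("CEO"), ex("Tim_Cook"))
asserted = Triple(ceo, ex("start_date"), Term("literal", "2011"))

print(is_rdf_star_statement(asserted))
print(list(quoted(asserted)) == [ceo])
```

Here `asserted` is the asserted statement and `ceo` is its single quoted statement; a plain RDF statement such as `ceo` yields no quoted statements at all.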
Property Graphs. A property graph (PG) is a graph where nodes and edges can have multiple properties, represented as key-value pairs. Figure 2(c) illustrates the graph described in the above section as a property graph. In this case, the starting date of Tim Cook as the CEO of Apple Inc is reported as a key-value property on the :CEO edge. PGs do not have a unique, standardized model; each PG engine proposes its own data model. A generic PG model definition is proposed by [31].
Definition 3 (Property Graph). Let $L$ be the set of labels, $PN$ be the set of property names, and $D$ be the set of property values. A property graph $G$ is an edge-labeled directed multi-graph such that $G = (N, E, edge, lbl, P, \sigma)$, where:
- $N$ is a set of nodes,
- $E$ is a set of edges between nodes, such that $N \cap E = \emptyset$,
- $edge : E \to (N \times N)$ is a total function that associates each edge in $E$ with a pair of nodes in $N$. If $edge(e_1) = (n_1, n_2)$, $n_1$ is the source node and $n_2$ is the target node,
- $lbl : (N \cup E) \to \mathcal{P}(L)$ is a function that associates each edge or node with a set of labels,
- $\sigma : (N \cup E) \to \mathcal{P}(P)$ is a function that associates a node or edge with a non-empty set of properties, where $P$ is defined as a set of key-value pairs $(k, v)$ with $k \in PN$ and $v \in D$.
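To make Definition 3 concrete, the following Python sketch stores the components $N$, $E$, $edge$, $lbl$, and $\sigma$ as plain sets and dictionaries and builds the graph of Figure 2(c). The class and method names, the node identifiers, and the edge identifiers `e1` and `e2` are assumptions made for illustration. As a simplification, the sketch also allows empty property sets, whereas the definition speaks of non-empty ones.

```python
from dataclasses import dataclass, field


@dataclass
class PropertyGraph:
    nodes: set = field(default_factory=set)    # N
    edges: set = field(default_factory=set)    # E, disjoint from N
    edge: dict = field(default_factory=dict)   # edge: E -> N x N
    lbl: dict = field(default_factory=dict)    # lbl: (N u E) -> set of labels
    sigma: dict = field(default_factory=dict)  # sigma: (N u E) -> set of (k, v)

    def add_node(self, n, labels=(), props=None):
        if n in self.edges:
            raise ValueError("N and E must be disjoint")
        self.nodes.add(n)
        self.lbl[n] = set(labels)
        self.sigma[n] = set((props or {}).items())

    def add_edge(self, e, source, target, labels=(), props=None):
        if e in self.nodes:
            raise ValueError("N and E must be disjoint")
        if source not in self.nodes or target not in self.nodes:
            raise ValueError("edge endpoints must be nodes in N")
        self.edges.add(e)
        self.edge[e] = (source, target)
        self.lbl[e] = set(labels)
        self.sigma[e] = set((props or {}).items())


# The property graph of Figure 2(c).
g = PropertyGraph()
for name in ("Apple_Inc", "California", "Tim_Cook"):
    g.add_node(name)
g.add_edge("e1", "Apple_Inc", "California", labels={"located_in"})
g.add_edge("e2", "Apple_Inc", "Tim_Cook", labels={"CEO"},
           props={"start_date": 2011})

print(g.edge["e2"], g.sigma["e2"])
```

The starting date ends up in $\sigma$ of the CEO edge `e2`, which mirrors how the quoted statement of Listing 1.3 carries the ex:start_date metadata.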
To ease the representation of the output of all approaches, we map any IRI to a distinct string representing a local name. Given $I$, the set of all IRIs, localName is a function that maps an IRI to a string that represents the local name of an RDF resource⁶. For example, for the RDF resource http://example.com/meets, localName("http://example.com/meets") is "meets". We will use this function in the output representation in Section 6.
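One possible reading of localName is sketched below in Python. The text does not specify how the local name is derived, so the rule used here (take the part after the last `#` or `/`) is an assumption, and `local_name` is a hypothetical helper. Note that this simple rule does not by itself guarantee distinct strings for IRIs from different namespaces.

```python
def local_name(iri: str) -> str:
    """Return the part of an IRI after its last '#' or '/'.

    Assumption: this splitting rule is one common convention and is
    not prescribed by the paper.
    """
    trimmed = iri.rstrip("/#")  # ignore a trailing separator
    cut = max(trimmed.rfind("#"), trimmed.rfind("/"))
    return trimmed[cut + 1:]


print(local_name("http://example.com/meets"))
print(local_name("http://www.w3.org/2000/01/rdf-schema#subClassOf"))
```

For the example from the text, the sketch returns "meets"; for the RDF Schema IRI it returns "subClassOf".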
3 Related Work
We can distinguish between related work on converting between (i) RDF and
PG and (ii) RDF-star and PG.
⁶ In Neo4j, the user can configure the local name of RDF terms such as subPropertyOf, subClassOf, Class, etc.