Citation Trajectory Prediction via Publication Influence Representation Using
Temporal Knowledge Graph
Chang Zong, Yueting Zhuang, Weiming Lu, Jian Shao, Siliang Tang
Abstract
Predicting the impact of publications in science and technol-
ogy has become an important research area, which is useful
in various real world scenarios such as technology invest-
ment, research direction selection, and technology policy-
making. Citation trajectory prediction is one of the most
popular tasks in this area. Existing approaches mainly rely
on mining temporal and graph data from academic articles.
Some recent methods are capable of handling cold-start pre-
diction by aggregating metadata features of new publica-
tions. However, the implicit factors causing citations and
the richer information from handling temporal and attribute
features still need to be explored. In this paper, we propose
CTPIR, a new citation trajectory prediction framework that
can represent the influence (the momentum of citation)
of either new or existing publications using the historical
information of all their attributes. Our framework is composed of
three modules: difference-preserved graph embedding, fine-
grained influence representation, and learning-based trajec-
tory calculation. To test the effectiveness of our framework
in more situations, we collect and construct a new tempo-
ral knowledge graph dataset from the real world, named
AIPatent, which stems from global patents in the field of
artificial intelligence. Experiments are conducted on both
the APS academic dataset and our contributed AIPatent
dataset. The results demonstrate the strengths of our ap-
proach in the citation trajectory prediction task.
1 Introduction
Distinguishing high-impact publications is crucial to
making decisions in business and research activities,
such as investment in technology fields, selection of re-
search topics, and development policymaking. Citations
of a publication are usually applied to evaluate its po-
tential impact. The question of how to predict citations
has attracted more attention in recent years. With the
development of knowledge graph technologies, the key
issue is how to utilize the information provided by a
knowledge graph to predict a publication's future citation
trend.
Zhejiang University, Hangzhou. zongchang@zju.edu.cn,
yzhuang@zju.edu.cn, luwm@zju.edu.cn, jshao@zju.edu.cn,
siliang@zju.edu.cn
Corresponding author.
Existing methods on this problem can be summa-
rized in three ways. The first approach [33, 27] tries
to make use of prior knowledge and network techniques
by assuming that citation trajectories obey the Power
Law or log-normal functions. Traditional statistical
methods are applied to make predictions. Another way
[14, 1] focuses on taking advantage of text features of
abstracts and reviews. Features are fed into recurrent
neural network (RNN) models for time-series predic-
tions. With the increasing popularity of graph neural
networks (GNNs), recent works attempt to apply var-
ious structural learning models to exploit information
from attributes of publications [6, 31, 9, 13, 8].
Figure 1: A diagram illustrating that citations of
publications can be affected by their attributes (authors,
keywords, institutes) and relation types at different levels.
However, citations are affected by many potential
factors, and there is a lot of implicit information that
must be considered in practice. For example, attributes
of a publication, such as authors and keywords, should
be given significant weight. The reputation of a scholar
and the popularity of a field can greatly affect future
citations of a publication. In addition, each attribute
contributes to a publication at different levels (Figure
1). The approaches in previous works simply apply
GNNs to aggregate attribute features, which leads to
a lack of fine-grained influence expression. On the basis
of the above knowledge, a more powerful framework
for predicting citation trajectories with the influence of
publications derived from a temporal knowledge graph
is needed to handle the problem: How to represent
and calculate the influence of a publication using
as much of its information as possible?
arXiv:2210.00450v1 [cs.AI] 2 Oct 2022
Current studies on temporal knowledge graphs try
to manage changes in two adjacent snapshots, assuming
that nodes should update smoothly or evolve dramati-
cally [7, 26]. However, these assumptions require one to
manually set a change rate to limit the evolution, which
is not flexible. Existing works still mainly focus on han-
dling structural and temporal features in separate steps,
which leads to a lack of expression to treat dynamic
graphs as a whole. Furthermore, accumulative citations
are usually modeled as log-normal or cumulative dis-
tribution functions [9, 2]. It is still worth trying some
alternatives to perform a further analysis. The poten-
tial enhancement mentioned above should be studied to
handle another problem: How can we improve the
expressiveness of the framework for prediction
tasks using temporal knowledge graphs?
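As a point of reference for the log-normal modeling mentioned above, the following is a minimal sketch of a cumulative citation curve built from a log-normal CDF. The function names and the parameter values (`total`, `mu`, `sigma`) are illustrative assumptions, not values or code taken from the cited works.

```python
import math


def lognormal_cdf(t, mu, sigma):
    """CDF of a log-normal distribution evaluated at time t (0 for t <= 0)."""
    if t <= 0:
        return 0.0
    z = (math.log(t) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def cumulative_citations(t, total, mu, sigma):
    """Cumulative citations at time t, assuming `total` citations in the limit."""
    return total * lognormal_cdf(t, mu, sigma)


# Hypothetical parameters: 100 eventual citations, evaluated for years 1..10.
trajectory = [
    cumulative_citations(t, total=100.0, mu=1.0, sigma=0.8)
    for t in range(1, 11)
]
```

Under this assumption the curve is fixed by three scalars per publication, which is the kind of rigid functional form the paragraph above suggests revisiting.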
With the observations above, we propose CTPIR
(Citation Trajectory Prediction via Influence
Representation), a new framework to predict citation
trajectories with influence representation using temporal
knowledge graphs. First, we optimize the R-GCN
mechanism [22] to automatically learn the gaps between
two adjacent snapshots. Second, we implement a fine-
grained influence (citation momentum) representation
module to make use of all historical information from a
publication’s attributes. Third, a learnable general lo-
gistic function is applied to fit the trajectories using the
influence representation from the previous module.
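To make the third step more concrete, here is a minimal sketch of a generalized logistic (Richards-type) curve used as a cumulative citation trajectory. The exact parameterization used by CTPIR is not given in this section, so the form below and the parameter names (`upper`, `growth`, `shift`, `nu`) are assumptions for illustration; in a learned setting these values would be produced from the influence representation rather than set by hand.

```python
import math


def generalized_logistic(t, upper, growth, shift, nu):
    """Generalized logistic curve with lower asymptote 0.

    upper  : upper asymptote (eventual citation count)
    growth : growth rate
    shift  : controls where the curve rises
    nu     : shape parameter (> 0); nu == 1 gives the standard logistic
    """
    return upper / (1.0 + shift * math.exp(-growth * t)) ** (1.0 / nu)


# Hypothetical hand-picked parameters, evaluated for years 0..10.
curve = [
    generalized_logistic(t, upper=50.0, growth=0.6, shift=20.0, nu=1.5)
    for t in range(0, 11)
]
```

Because every parameter enters the expression differentiably, such a curve can in principle be fitted by gradient descent together with the modules that produce its parameters.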
We evaluate our framework on two real-world
datasets. One is APS¹, a public dataset of academic
papers. The other, named AIPatent, is a new dataset
that we construct from global patents in the field of
artificial intelligence. The results show that CTPIR
outperforms the baseline methods in all cases.
Our key contributions are summarized as follows:
  • Novel framework: We propose a new framework,
    named CTPIR, which implements a fine-grained
    influence representation approach using a more
    expressive temporal graph learning process and
    optimizes existing methods to bring prediction results
    much closer to observations.
  • Improved evaluation: We construct a new temporal
    knowledge graph dataset named AIPatent for
    the task, which is also a strong supplement for the
    community to carry out various temporal graph
    studies. We also design and implement multiple
    subtasks to evaluate approaches from a more
    comprehensive view.
  • Multifaceted analysis: We analyze the experimental
    performance from multiple aspects. Explanations
    on how CTPIR performs better compared
    to other recent approaches are discussed. Some
    weaknesses and further efforts are also mentioned
    to guide future studies.
¹ https://journals.aps.org/datasets
The datasets used in this work, including our
contributed AIPatent dataset, and the code to
reproduce our results are available in our GitHub repository:
https://github.com/changzong/CTPIR
2 Related Work
2.1 Citation and Popularity Prediction. Mod-
ern approaches to citation count prediction (CCP) aim
to combine attribute information with temporal fea-
tures. GNNs are commonly used to capture topolog-
ical features of citation networks. The encoded nodes
are sent to RNNs or attention models for time-series
forecasting. A previous work [8] follows this simple
encoder-decoder architecture. Some previous studies
[30, 13, 25, 32] put emphasis on cascade graphs for pop-
ularity prediction, using a kernel method to estimate
structural similarities. These works are based on sim-
ply combining graph embedding with time-series meth-
ods. In contrast, we introduce a method to fully utilize
all past characteristics of a publication’s attributes. A
recent work called HINTS [9] adds an imputation mod-
ule to aggregate the information from each snapshot of
graphs. Another work proposes a heterogeneous dynam-
ical graph neural network (HDGNN) [29] to predict the
cumulative impact of articles and authors. The latest
work [24] uses an attention mechanism to represent the
sequence of content from citation relations. Although
these works can take advantage of richer information,
they lack a fine-grained design for representing the
influence of a publication, which limits their
prediction performance.
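The simple encoder-decoder pattern described above (a graph encoder per snapshot, followed by a sequence model over snapshots) can be sketched as follows. This is a toy stand-in under stated assumptions, not the architecture of any cited work: the encoder is plain mean aggregation instead of a trained GNN, the decoder is a fixed tanh recurrence instead of a trained RNN, and all names (`encode_snapshot`, `decode_sequence`, `decay`) are hypothetical.

```python
import math


def encode_snapshot(features, edges):
    """Average each node's own features with those of its in-neighbors.

    features : dict mapping node id -> list of floats
    edges    : list of (src, dst) pairs; information flows from src to dst.
               Both endpoints are assumed to be keys of `features`.
    """
    neighbors = {node: [node] for node in features}
    for src, dst in edges:
        neighbors[dst].append(src)
    dim = len(next(iter(features.values())))
    return {
        node: [sum(features[n][k] for n in nbrs) / len(nbrs) for k in range(dim)]
        for node, nbrs in neighbors.items()
    }


def decode_sequence(embeddings, decay=0.5):
    """Fold a node's per-snapshot embeddings into one state vector."""
    state = [0.0] * len(embeddings[0])
    for emb in embeddings:
        state = [math.tanh(decay * s + x) for s, x in zip(state, emb)]
    return state


# Two toy snapshots of a tiny graph: p2 links to p1 only in the second one.
features = {"p1": [1.0, 0.0], "p2": [0.0, 1.0]}
snapshots = [(features, []), (features, [("p2", "p1")])]
per_snapshot = [encode_snapshot(f, e)["p1"] for f, e in snapshots]
p1_state = decode_sequence(per_snapshot)
```

The sketch also shows the limitation noted above: every neighbor contributes equally, so the distinct influence of each attribute and relation type is not represented.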
2.2 Temporal Graph Embedding. We focus on
deep learning-based temporal graph embedding ap-
proaches. Several previous works implement a straight-
forward way to combine GCN and RNN models to ex-
tract structural and temporal features [18, 23, 5]. RNN
variants are applied as the temporal module to perform
downstream tasks such as anomaly detection. Mean-
while, temporal attention models can be a substitute
for GCN to extract topological features [28, 15, 17]. A
recent paper [3] tries to represent global structural in-