to-end way; the second is how to regularize the learned graph to be sparse, which filters out redundant and useless edges, thereby improving overall performance and making the model more valuable for real-world applications. To address these issues, we first introduce the Regularized Graph Generation (RGG) module to learn the implicit graph, which adopts the Gumbel Softmax trick to sparsify the dense similarity matrix computed from node embeddings. Second, we introduce the Laplacian Matrix Mixed-up Module (LM3) to incorporate the explicit relationship from domain knowledge with the implicit graph from RGG. Figure 1 shows the graph structure learned from only the explicit relationship in (a), from both implicit and explicit relationships without regularization in (b), and from our proposed RGSL in (c). We observe that RGSL discovers implicit time-series relationships ignored by the naive graph structure learning algorithm (shown in red boxes in Figure 1(a)). Besides, compared to Figure 1(b), the regularization module in RGSL automatically removes noisy and redundant edges, making the learned graph sparser and more effective than a dense graph.
To summarize, our work presents the following contributions.
• We propose a novel and efficient model named RGSL, which first exploits both explicit and implicit time-series relationships to assist graph structure learning; our proposed LM3 module effectively mixes the two kinds of Laplacian matrices.
• Besides, to regularize the learned matrix, we also propose an RGG module, which formulates the discrete graph structure as an independent matrix variable and exploits the Gumbel softmax trick to optimize the parameters of the probabilistic graph distribution.
• Extensive experiments show that the proposed RGSL consistently and significantly outperforms benchmarks on three datasets. Moreover, both the LM3 module and the RGG module can be easily generalized to different spatio-temporal graph models.
2 Methodology
In this section, we first introduce the problem definition and notations, and then describe the detailed implementation of the proposed RGSL. The overall pipeline is shown in Figure 2. RGSL consists of three major modules. The first is the regularized graph generation module, named RGG (Section 2.2), which learns the discrete graph structure from trainable node embeddings with the Gumbel softmax trick. The second is the Laplacian matrix mixed-up module, named LM3 (Section 2.3), which captures both explicit and implicit time-series correlations between nodes via a convex combination. Finally, in Section 2.4, we utilize a recurrent graph network to perform time-series forecasting, considering both spatial correlation and temporal dependency simultaneously.
2.1 Preliminary
Traffic series forecasting aims to predict future time series from historical traffic records. Denote the training data by $X_{0:T} = \{X_0, X_1, \ldots, X_t, \ldots, X_T\}$ and $X_t = \{X_t^0, X_t^1, \ldots, X_t^N\}$, where the superscript refers to the series and the subscript refers to time. There are $T$ timestamps in total for training and $\tau$ timestamps to forecast. We denote $G^{(0)}$ as the explicit graph constructed from prior time-series relationships and $G^{(l)}$ as the implicit graph learned from trainable node embeddings. Each vertex of graph $G^{(l)}$ represents a traffic series $X$, and $A \in \mathbb{R}^{N \times N}$ is the adjacency matrix of graph $G$ representing the similarity between time series. Thus, the time-series forecasting task with the explicit graph can be defined as:
$$\min_{W_\theta} \mathcal{L}\big(X_{T+1:T+\tau}, \hat{X}_{T+1:T+\tau};\, X_{0:T}, G^{(0)}, G^{(l)}\big) \tag{1}$$
where $W_\theta$ denotes all the learnable parameters, $\hat{X}_{T+1:T+\tau}$ denotes the ground-truth future values, and $\mathcal{L}$ is the loss function.
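To make the setup concrete, the following minimal sketch illustrates the tensor shapes and the objective in Equation 1. The node count, window length, horizon, and the naive stand-in forecaster are illustrative assumptions, not RGSL's actual configuration.

```python
# A minimal sketch of the data layout and the objective in Equation 1.
# num_nodes, window, horizon and the naive forecaster are illustrative
# assumptions, not RGSL's actual configuration.
import torch
import torch.nn.functional as F

num_nodes, window, horizon = 207, 12, 12      # N series, T history steps, tau future steps

x_hist = torch.randn(window, num_nodes)       # X_{0:T}: historical observations
y_future = torch.randn(horizon, num_nodes)    # X_{T+1:T+tau}: future values

def forecast(x: torch.Tensor) -> torch.Tensor:
    """Stand-in for the RGSL forecaster: repeat the last observation."""
    return x[-1:].repeat(horizon, 1)

y_hat = forecast(x_hist)                      # predicted future window
loss = F.l1_loss(y_hat, y_future)             # L in Equation 1, e.g. MAE
```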
2.2 Regularized Graph Generation
The regularization method Dropout [Srivastava et al., 2014] aims at preventing neural networks from overfitting by randomly dropping connections during training. However, traditional Dropout treats every connection equally and drops them with the same probability acquired from cross-validation, which does not consider the different significance of different edges. In our Regularized Graph Generation (RGG) module, inspired by [Shang et al., 2021] and works in reinforcement learning, we resolve the regularization problem by replacing Softmax with Gumbel Softmax, which is simple to employ, increases the explainability of predictions, and yields clear improvements. Another motivation for applying the Gumbel Softmax trick is to alleviate the density of the matrix learned by GNNs after training.
Let $E \in \mathbb{R}^{N \times d}$ be the learned node embedding matrix, where $d$ is the embedding dimension. Let $\theta$ be the probability matrix, where $\theta_{ij} \in \theta$ represents the probability of preserving the edge from time series $i$ to $j$, which is formulated as:
$$\theta = E E^{\top} \tag{2}$$
Let $\sigma$ be the activation function and $s$ the temperature variable; the sparse adjacency matrix $A^{(l)}$ is then defined as:
$$A^{(l)}_{ij} = \sigma\Big(\big(\log(\theta_{ij}/(1-\theta_{ij})) + (g^{1}_{ij} - g^{2}_{ij})\big)/s\Big), \quad \text{s.t.}\; g^{1}_{ij}, g^{2}_{ij} \sim \text{Gumbel}(0,1) \tag{3}$$
Equation 3 is the Gumbel softmax implementation for our task, where $A^{(l)}_{ij} = 1$ with probability $\theta_{ij}$ and 0 with the remaining probability. It can easily be shown that Gumbel Softmax shares the same probability distribution as the standard Softmax, which ensures that the graph forecasting network stays statistically consistent with the trainable probability matrix generation.
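For illustration, the sketch below shows one way Equations 2 and 3 could be implemented in PyTorch. Mapping $\theta$ into $(0,1)$ with a sigmoid, the straight-through rounding, and the temperature value are our assumptions, not details confirmed by the text.

```python
# A sketch of Equations 2-3: sampling a sparse adjacency matrix A^(l) from the
# node embedding matrix E with the Gumbel softmax trick. The sigmoid squashing,
# straight-through rounding, and temperature are illustrative assumptions.
import torch

def sample_adjacency(node_emb: torch.Tensor, s: float = 0.5) -> torch.Tensor:
    """node_emb: (N, d) embedding matrix E; returns an (N, N) sparse A^(l)."""
    eps = 1e-8
    theta = torch.sigmoid(node_emb @ node_emb.t())                    # Eq. 2, squashed to (0, 1)
    g1 = -torch.log(-torch.log(torch.rand_like(theta) + eps) + eps)   # g^1 ~ Gumbel(0, 1)
    g2 = -torch.log(-torch.log(torch.rand_like(theta) + eps) + eps)   # g^2 ~ Gumbel(0, 1)
    logits = torch.log(theta / (1.0 - theta) + eps)
    adj_soft = torch.sigmoid((logits + g1 - g2) / s)                  # Eq. 3 with sigma = sigmoid
    # Straight-through estimator: hard 0/1 edges in the forward pass,
    # gradients flow through the soft relaxation in the backward pass.
    adj_hard = (adj_soft > 0.5).float()
    return adj_hard + adj_soft - adj_soft.detach()

E = torch.nn.Parameter(torch.randn(10, 16))   # N = 10 nodes, d = 16
A_l = sample_adjacency(E)                     # re-sampled at every training iteration
```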
At each iteration, we calculate the probability matrix $\theta$ as Equation 2 suggests, and Gumbel-Max samples the adjacency matrix to determine which edges to preserve and which to discard, which is similar to Dropout. However, Dropout randomly selects edges or neurons with equal probability, whereas we drop useful edges with small likelihood and tend to get rid of the redundant ones. As shown in Figure 4(a), all the non-diagonal entries are non-zero, but a substantial fraction of them have small values and are regarded as useless or even noisy.
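The contrast with Dropout can be summarized in a few lines: Dropout keeps every edge with one shared probability, whereas RGG keeps edge $(i,j)$ with its learned probability $\theta_{ij}$. The hard Bernoulli draw below is a simplified, non-differentiable stand-in for the Gumbel-based sampling in Equation 3, used purely for illustration.

```python
# An illustrative contrast between Dropout-style masking and RGG's
# edge-specific sampling. The hard Bernoulli draw is a simplified,
# non-differentiable stand-in for the Gumbel-based sampling in Equation 3.
import torch

N = 10
theta = torch.sigmoid(torch.randn(N, N))        # learned, per-edge keep probabilities
p = 0.5                                         # a single keep probability shared by Dropout

dropout_mask = (torch.rand(N, N) < p).float()   # every edge kept with the same probability
rgg_mask = (torch.rand(N, N) < theta).float()   # low-theta (redundant) edges dropped more often
```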