II. RELATED WORK
This section reviews existing methods related to our
problem and the proposed method. We first introduce relevant
methods in sequential recommendation, as it matches our
task. We then discuss existing methods for incorporating
item relationships. Finally, we introduce related work on
knowledge graph recommendation, since these methods
also model additional item knowledge. A summary and
capability comparison of the models is presented in Table III.
A. Sequential Recommendation
Sequential Recommendation (SR) predicts the next preferred
item by modeling the chronologically sorted sequence of a
user's historical interactions. By modeling the order of the
user's interaction sequence, SR captures the dynamic
preferences latent in item-item transitions. One of the earliest
lines of work builds on Markov Chains, which learn item-item
transition probabilities; FPMC [1] is a representative example.
FPMC captures only first-order item transitions with low
model complexity, assuming the next preferred item depends
only on the previously interacted item. Fossil [19] extends
FPMC to learn higher-order item transitions and demonstrates
the necessity of high-order item transitions in SR.
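To make the first-order Markov-chain idea concrete, transition probabilities can be estimated in their simplest, non-personalized form by counting consecutive item pairs. This is only an illustrative sketch; FPMC itself factorizes a personalized user-item-item transition tensor rather than counting:

```python
from collections import defaultdict

def transition_probs(sequences):
    """Estimate first-order item-item transition probabilities
    by counting consecutive pairs across interaction sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    # normalize counts into P(next | prev)
    return {
        prev: {n: c / sum(nxts.values()) for n, c in nxts.items()}
        for prev, nxts in counts.items()
    }

# two toy interaction sequences
p = transition_probs([["a", "b", "c"], ["a", "b", "b"]])
# p["a"]["b"] == 1.0 and p["b"]["c"] == 0.5
```

The first-order assumption is visible in the data structure itself: the model conditions only on `prev`, never on earlier items, which is exactly the limitation Fossil addresses with higher-order transitions.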
The success of deep learning in sequence modeling has
inspired a range of sequential models for SR, including
Recurrent Neural Networks (RNNs) [2, 20, 21], Convolutional
Neural Networks (CNNs) [2, 22], and Transformers [4, 5, 6].
The representative RNN-based work is GRU4Rec [23], which
adopts the Gated Recurrent Unit (GRU) for session-based
recommendation. Another line of SR consists of CNN-based
methods such as Caser [22], which treats the interaction
sequence of item embeddings as an image and applies
convolution operators to learn local sub-sequence structures.
The recent success of the self-attention-based Transformer [3]
architecture opens further possibilities for SR because it
models all pair-wise relationships within the sequence, a
limitation of RNN- and CNN-based methods. SASRec [4] is
the first work to adopt the Transformer for SR and demonstrates
its superiority. BERT4Rec [5] extends SASRec to model
bi-directional relationships in Transformers, inspired by
BERT [24]. TiSASRec [25] further incorporates time-difference
information into SASRec. FISSA [26] explores latent item
similarities in SR. DT4Rec [27] and STOSA [6] model items
as distributions instead of vector embeddings and are state-of-
the-art SR methods with implicit feedback.
Despite the recent success of SR methods, they still fail to
incorporate heterogeneous item relationships into the modeling
of item-item transitions, especially high-order transitions.
In contrast, the proposed MT4SR models both item-item
purchase transitions and additional item relationships in a
unified framework that extends easily to varying numbers
of relationships.
B. Item Relationships-aware Recommendation
Some methods utilize extra item relationships [28, 29] to
enhance the representational capability of item embeddings.
For example, Chorus [30] specifically models substitute and
complementary relationships between items in the
continuous-time dynamic recommendation scenario. RCF [18]
models item relationships in a two-level hierarchy within a
graph learning framework. UGRec [31] extends the idea of
RCF and adopts a translational knowledge embedding approach
within the graph recommendation framework to model both
directed and undirected relationships for recommendation.
MoHR [7] is the work most relevant to this paper. MoHR
incorporates item relationships into first-order user-item
translation scoring and additionally optimizes next-relationship
prediction, which identifies the importance of each relationship
in the dynamic sequence.
Although these methods significantly improve recommendation,
they remain sub-optimal in both recommendation performance
and efficiency. Chorus can only handle substitute and
complementary relationships for sequential recommendation,
while more item relationships exist and identifying the
significance of each relationship is also crucial. RCF and
UGRec both rely on the graph modeling framework, which can
require a large amount of GPU memory due to the exponential
growth of the neighborhood. Furthermore, neither RCF nor
UGRec can handle dynamic user preferences. MoHR only
models first-order translations between users and items in the
relationship space [4].
C. Knowledge Graph Recommendation
Knowledge graph recommendation [8, 9, 32, 33, 34, 35]
originates from knowledge embedding learning, where the
knowledge graph consists of triplets describing entities and
their relationships. The classical line of knowledge graph
recommendation comprises embedding-based methods, which
adopt knowledge embedding techniques such as TransE [36]
and DistMult [37] to learn entity and relation embeddings.
The representative work is CKE [38], which utilizes TransE to
learn knowledge embeddings and uses them to regularize
matrix factorization. KTUP applies TransE to model both
knowledge triplets and user-item interactions. Another line
consists of path-based methods, of which RippleNet [39] is the
representative work. RippleNet starts paths from each user and
aggregates item embeddings along the paths. The current
state-of-the-art methods are based on collaborative knowledge
graphs, including KGAT [9] and KGIN [8]. Both KGAT and
KGIN combine the item knowledge graph and the user-item
interaction graph into a unified graph. KGAT applies TransE
scores as attention weights for node message aggregation, and
KGIN extends KGAT by modeling paths as intents.
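The TransE scoring shared by CKE, KTUP, and KGAT's attention mechanism treats a relation as a translation in embedding space, h + r ≈ t. A minimal sketch with hand-picked 2-d embeddings (toy values for illustration, not trained):

```python
import numpy as np

def transe_score(h, r, t):
    """TransE plausibility of a triplet (head, relation, tail):
    the closer t is to the translation h + r, the higher the score."""
    return -np.linalg.norm(h + r - t)

h = np.array([1.0, 0.0])       # head entity embedding
r = np.array([0.0, 1.0])       # relation embedding
t_good = np.array([1.0, 1.0])  # tail matching h + r exactly
t_bad = np.array([3.0, -2.0])  # unrelated tail
good, bad = transe_score(h, r, t_good), transe_score(h, r, t_bad)
# good == 0.0 and good > bad
```

In KGAT, such scores (after normalization) weight how strongly each neighboring entity contributes during message aggregation over the unified graph.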
III. PRELIMINARIES
A. Problem Definition
Given a set of users U and items V, and the associated
interactions, we first sort the interacted items of each user
u ∈ U chronologically into a sequence S^u = [v^u_1, v^u_2, ..., v^u_{|S^u|}],