Sequential Recommendation with Auxiliary Item
Relationships via Multi-Relational Transformer
Ziwei Fan, Zhiwei Liu†, Chen Wang, Peijie Huang‡, Hao Peng§, Philip S. Yu
Department of Computer Science, University of Illinois Chicago, Chicago, USA
{zfan20, cwang266, psyu}@uic.edu
†Salesforce AI Research, Palo Alto, USA; zhiweiliu@salesforce.com
‡South China Agricultural University, Guangzhou, China; pjhuang@scau.edu.cn
§School of Cyber Science and Technology, Beihang University, Beijing, China; penghao@act.buaa.edu.cn
Abstract—Sequential Recommendation (SR) models user dynamics and predicts the next preferred items based on the user history. Existing SR methods model the 'was interacted before' item-item transitions observed in sequences, which can be viewed as one type of item relationship. However, multiple auxiliary item relationships exist in real-world scenarios, e.g., items from similar brands or with similar content. Auxiliary item relationships describe item-item affinities under multiple distinct semantics and alleviate the long-standing cold start problem in recommendation. However, modeling auxiliary item relationships in SR remains a significant challenge.
To simultaneously model high-order item-item transitions in sequences and auxiliary item relationships, we propose a Multi-relational Transformer capable of modeling auxiliary item relationships for SR (MT4SR). Specifically, we first propose a novel self-attention module that incorporates arbitrary item relationships and weights each relationship accordingly. Second, we regularize intra-sequence item relationships with a novel regularization module to supervise attention computation. Third, for inter-sequence related item pairs, we introduce a novel inter-sequence related items modeling module. Finally, we conduct experiments on four benchmark datasets, demonstrating the effectiveness of MT4SR over state-of-the-art methods and its improvements on the cold start problem. The code is available at https://github.com/zfan20/MT4SR.
Index Terms—Sequential Recommendation, Self-Attention,
Item Relationships
I. INTRODUCTION
Sequential Recommendation (SR) draws increasing attention due to its superior dynamic user modeling and scalability. SR models the dynamics in a user's sequence and predicts the next preferred item. SR learns dynamic user interests by modeling the item-item transitions observed in sequences. These item-item transitions can be treated as a type of temporally ordered relationship between items, which we define as 'was interacted before.' Among existing SR advancements, including Markov Chain methods [1] and RNN-based methods [2], the Transformer architecture [3] achieves great success and inspires many contributions because of its capability of modeling high-order item-item transitions. Several Transformer-based methods [4, 5, 6] demonstrate its effectiveness for SR.
Modeling only item-item transitions in SR is insufficient for satisfactory item embedding learning because of the cold start problem.
[Fig. 1 illustration: two user sequences over time, connected by 'was interacted before' transitions (1st- to 3rd-order) and by the auxiliary relationships 'similar_brand' and 'similar_content'; see the caption below.]
Fig. 1: A toy example of two sequences with two item relationships, 'similar brand' and 'similar content'. We define the item-item transitions from sequences (black arrows) as the asymmetric relationship 'was interacted before'. The auxiliary item relationships include intra-sequence related item pairs, e.g., (i3, 'similar content', i2), which can also be observed as (i3, 'was interacted before', i2) in Su2, and inter-sequence related item pairs, e.g., (i1, 'similar brand', i6) crossing Su1 and Su2.
In real-world applications, there are multiple auxiliary item relationships, such as related items based on textual descriptions, search data, brands, and categorical connections. Auxiliary item relationships are a set of related item pairs under multiple relationships. It has been demonstrated that these auxiliary related item pairs benefit recommendation with great performance gains [7, 8, 9, 10]. Although Transformer-based SR methods have demonstrated their effectiveness, they cannot model auxiliary item relationships. In SR, higher-order item-item transitions can help, but with limited contributions, as shown in Table I: higher-order item transitions can still accurately predict testing item pairs, but the 2nd-order and 3rd-order hit ratios are worse than the 1st-order, which justifies the ideas behind Markov Chain models [1] and Transformer models [4, 5, 6]. From the bottom part of Table I, we can conclude that auxiliary item relationships have much higher hit ratios than pure item-item transitions and are thus potentially helpful for SR.
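To make the Table I statistic concrete, the following is a minimal sketch of how a k-th-order transition hit ratio could be computed; the data layout (sequences as lists of item IDs, a testing pair being a user's held-out next-to-last and last items) is our assumption, not the paper's released evaluation code.

```python
def transition_hit_ratio(train_seqs, test_pairs, order):
    """Fraction of held-out testing item pairs (prev, next) that also occur
    as an order-k transition (k positions apart) in some training sequence."""
    observed = set()
    for seq in train_seqs:
        for i in range(len(seq) - order):
            observed.add((seq[i], seq[i + order]))
    hits = sum(pair in observed for pair in test_pairs)
    return hits / len(test_pairs)

# Toy check: (i2, i6) is a 1st-order transition in the first training sequence.
train = [["i3", "i2", "i6"], ["i4", "i1", "i5"]]
test = [("i2", "i6"), ("i1", "i4")]
print(transition_hit_ratio(train, test, order=1))  # 0.5
```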
It is rather challenging to model auxiliary item relationships and high-order item transitions simultaneously. The challenges
come from several perspectives: (1) the compatibility of item transitions and item relationships in self-attention; (2) proper supervision of related item pairs within sequences; (3) the dominance of inter-sequence related item pairs.
First, the standard self-attention module [3, 4, 5, 6, 11, 12, 13, 14, 15] cannot handle auxiliary item relationships, as it only captures the single item relationship observed in item-item transitions. Specifically, modeling auxiliary item relationships requires representing them as attention values, which should be compatible with the scaled dot-product self-attention and demands theoretical support. Moreover, auxiliary item relationships are relationship-aware: each relationship contributes unequally to the next-item recommendation. For example, related item pairs observed from 'co-searched' data can better reflect users' intents than pairs that are merely 'similar in brand'.
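To illustrate the compatibility question, below is a minimal PyTorch sketch of one way relationship information could enter scaled dot-product attention as a learnable additive bias; the module name, the `rel_ids` encoding, and the additive-bias design are illustrative assumptions, not MT4SR's actual formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelationAwareAttention(nn.Module):
    """Scaled dot-product attention plus a learnable bias per item relationship.

    rel_ids[b, i, j] holds a relationship index for the item pair at
    positions (i, j) (0 = no relationship); each relationship gets its own
    scalar weight, so relationships can contribute unequally to attention.
    """
    def __init__(self, dim, num_relations):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.rel_bias = nn.Embedding(num_relations + 1, 1)  # +1 for "none"
        self.scale = dim ** -0.5

    def forward(self, x, rel_ids):
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = torch.matmul(q, k.transpose(-2, -1)) * self.scale
        scores = scores + self.rel_bias(rel_ids).squeeze(-1)  # inject relatedness
        return torch.matmul(F.softmax(scores, dim=-1), v)

# Toy usage: batch 1, sequence length 4, hidden size 8, two relationships.
x = torch.randn(1, 4, 8)
rel_ids = torch.zeros(1, 4, 4, dtype=torch.long)
rel_ids[0, 1, 3] = 1  # e.g., positions 1 and 3 hold 'similar_brand' items
out = RelationAwareAttention(8, num_relations=2)(x, rel_ids)
print(out.shape)  # torch.Size([1, 4, 8])
```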
Second, representing auxiliary item relationships via the scaled dot product in self-attention still lacks correct supervision and potentially misleads the attention calculation. For related item pairs observed in item-item transitions within sequences (intra-sequence), the dot-product attention scores need to match the relatedness signals well, so that they correctly guide attention computation. Without sufficient supervision, attention scores derived from intra-sequence item relatedness are merely random, freely learnable parameters.
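One way to picture such supervision is a regularization term that pushes attention logits toward a binary relatedness target for intra-sequence pairs. The binary cross-entropy form below is a hedged sketch of this idea, not necessarily the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def intra_sequence_reg(attn_scores, related_mask):
    """Encourage attention logits to agree with observed intra-sequence
    relatedness: related_mask[b, i, j] = 1 if the items at positions i and j
    are related under some auxiliary relationship, else 0."""
    return F.binary_cross_entropy_with_logits(attn_scores, related_mask.float())

# Toy usage: the regularizer would be added to the main recommendation loss.
scores = torch.randn(1, 4, 4)                     # pre-softmax attention logits
mask = torch.zeros(1, 4, 4); mask[0, 1, 3] = 1.0  # one related pair
print(intra_sequence_reg(scores, mask).item())
```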
Third, most related item pairs are not intra-sequence but inter-sequence [16], e.g., (i1, i6) in Fig. 1. Inter-sequence related items enrich collaborative signals by connecting sequences through related items. The proportion of intra-sequence related item pairs is small (below 10%), as shown in Table II, indicating that more than 90% of related item pairs are inter-sequence. As intra-sequence related item pairs overlap with item-item transitions, they only capture known information. Inter-sequence related item pairs, in contrast, can significantly benefit sequential recommendation. For example, in Fig. 1, given the history [i3, i2, i6] of user u2, we fail to observe sufficient item-item transitions for signaling the next item i5, because Su1 and Su2 have a small collaborative similarity based on their histories. Moreover, i3 and i6 are cold items. However, with the help of the inter-sequence related pair (i1, 'similar brand', i6), we can draw additional collaborative connections between u1 and u2, and correctly recommend i5. These additional connections incorporate more general collaborative signals than collaborative similarities based solely on interacted items.
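To make the intra- vs. inter-sequence distinction concrete, here is a small sketch of how observed related item pairs could be partitioned; the co-occurrence test and the data representation are our own framing.

```python
def split_related_pairs(related_pairs, sequences):
    """Partition related item pairs into intra-sequence pairs (both items
    co-occur in at least one user's sequence) and inter-sequence pairs."""
    item_sets = [set(seq) for seq in sequences]
    intra, inter = [], []
    for i, j in related_pairs:
        if any(i in s and j in s for s in item_sets):
            intra.append((i, j))
        else:
            inter.append((i, j))
    return intra, inter

# Toy usage with Fig. 1-style data: i1 and i6 never co-occur, so the
# pair (i1, i6) is inter-sequence.
pairs = [("i3", "i2"), ("i1", "i6")]
seqs = [["i4", "i1", "i5"], ["i3", "i2", "i6", "i5"]]
print(split_related_pairs(pairs, seqs))  # ([('i3', 'i2')], [('i1', 'i6')])
```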
In this paper, we develop a Multi-relational Transformer capable of processing auxiliary item relationships for SR (MT4SR). MT4SR includes three core modules: (1) a multi-relational self-attention module designed to seamlessly incorporate auxiliary item relationships into self-attention; (2) a novel intra-sequence regularization term that supervises the learning of self-attention scores for related item pairs; (3) an explorative inter-sequence related items regularization that models related item pairs unobserved in sequences and introduces additional collaborative signals connecting similar behaviors.
TABLE I: Hit Ratio (HR) of testing item pairs (next-to-last item, last item), e.g., the (i5, i4) pair of user u1 in Fig. 1. HR measures the percentage of testing item pairs captured by different orders of item transition pairs in training sequences, and by item relationship pairs (e.g., the (i1, 'similar brand', i6) pair in Fig. 1). We adopt the relationships 'also viewed,' 'also bought,' 'bought together,' and 'buy after viewing' as auxiliary item relationships in four categories of the Amazon dataset.
Dataset Beauty Toys Tools Office
1st-order transition HR 8.60% 8.06% 4.64% 11.74%
2nd-order transition HR 5.82% 4.12% 2.48% 8.95%
3rd-order transition HR 4.11% 2.22% 1.63% 6.44%
Total transition HR 18.54% 14.41% 8.76% 27.14%
related item pairs HR 22.94% 27.26% 16.93% 29.19%
TABLE II: Intra-Sequence Related Item Pairs Coverage, calculated as $\frac{1}{|\mathcal{U}|}\sum_{u\in\mathcal{U}}\frac{|\mathcal{I}\cap\{(v_i,v_j)\in S_u\times S_u\}|}{|S_u|\cdot|S_u|}$, where $\mathcal{I}$ refers to the set of related item pairs, $\times$ denotes the set outer (Cartesian) product, $\cap$ denotes set intersection, and $|\cdot|$ refers to the size of a set. The definitions of the symbols used can be found in Section III-A.
Dataset Beauty Toys Tools Office
Sparsity 4.58% 7.03% 3.37% 3.16%
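As a hedged illustration, the coverage statistic in the Table II caption could be computed as follows, assuming sequences are lists of item IDs and related pairs are ordered tuples:

```python
def intra_sequence_coverage(related_pairs, sequences):
    """Average, over users, of the fraction of ordered intra-sequence item
    pairs (v_i, v_j) in S_u x S_u that appear in the related-pair set I."""
    pairs = set(related_pairs)
    total = 0.0
    for seq in sequences:
        hits = sum((vi, vj) in pairs for vi in seq for vj in seq)
        total += hits / (len(seq) * len(seq))
    return total / len(sequences)

# Toy usage: one of nine ordered pairs in the first sequence is related.
print(intra_sequence_coverage([("i3", "i2")],
                              [["i3", "i2", "i6"], ["i4", "i1", "i5"]]))
```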
The contributions of this work are as follows:
• We propose a novel and general multi-relational self-attention Transformer framework to seamlessly incorporate auxiliary item relationships in SR.
• Inspired by the connection between self-attention and knowledge embeddings, we incorporate a novel item relatedness scoring function in self-attention.
• We introduce two novel regularization terms: one supervising intra-sequence related item pairs in the multi-relational self-attention, and one exploring inter-sequence related item pairs for additional collaborative signals across sequences.
• We demonstrate that MT4SR outperforms state-of-the-art recommendation methods, including static methods, sequential methods, and methods using item relationship information, with improvements from 3.56% to 21.87% in all metrics on four benchmark datasets.
TABLE III: Model comparison. ‘H-R’: Models auxiliary item relationships? ‘H-O’: Models high-order information?

Capability         Personalized  Sequential  H-O  H-R
BPRMF [1]          ✓             ✗           ✗    ✗
LightGCN [17]      ✓             ✗           ✗    ✓
SASRec [4]         ✓             ✓           ✓    ✗
KGAT [9]           ✓             ✗           ✓    ✓
KGIN [8]           ✓             ✗           ✓    ✓
RCF [18]           ✓             ✗           ✓    ✓
MoHR [7]           ✓             ✓           ✗    ✗
MT4SR (proposed)   ✓             ✓           ✓    ✓
II. RELATED WORK
This section discusses existing methods related to our problem and the proposed method. We first introduce relevant methods in sequential recommendation, as it matches our task. Then we discuss existing methods for incorporating item relationships. Finally, we introduce related work on knowledge graph recommendation, as these works also model additional item knowledge. A summary and capability comparison of the models is presented in Table III.
A. Sequential Recommendation
Sequential Recommendation (SR) predicts the next preferred item by modeling the chronologically sorted sequence of a user's historical interactions. Through sequential modeling of the user's interaction sequence, SR captures the dynamic preference latent in item-item transitions. One line of the earliest works originates from the idea of Markov Chains, which learn item-item transition probabilities, including FPMC [1]. FPMC [1] captures only first-order item transitions with low model complexity, assuming the next preferred item is correlated only with the previously interacted item. Fossil [19] extends FPMC to learn higher-order item transitions and demonstrates the necessity of high-order item transitions in SR.
The successful demonstration of sequential modeling in deep learning inspires research into sequential models for SR, including Recurrent Neural Networks (RNNs) [2, 20, 21], Convolutional Neural Networks (CNNs) [2, 22], and Transformers [4, 5, 6]. The representative RNN work for SR is GRU4Rec [23], which adopts the Gated Recurrent Unit (GRU) for session-based recommendation. Another line of SR comprises CNN-based methods, such as Caser [22]. Caser [22] treats the interaction sequence with item embeddings as an image and applies convolution operators to learn local sub-sequence structures. The recent success of the self-attention-based Transformer [3] architecture provides more possibilities in SR due to its capability of modeling all pairwise relationships within the sequence, which addresses a limitation of RNN-based and CNN-based methods. SASRec [4] is the first work adopting the Transformer for SR and demonstrates its superiority. BERT4Rec [5] extends SASRec to model bi-directional relationships in Transformers, inspired by BERT [24]. TiSASRec [25] further incorporates time difference information into SASRec. FISSA [26] explores latent item similarities in SR. DT4Rec [27] and STOSA [6] model items as distributions instead of vector embeddings and are state-of-the-art SR methods with implicit feedback.
Despite the recent success of SR methods, they still fail to incorporate heterogeneous item relationships into the modeling of item-item transitions, especially high-order transitions. In contrast, the proposed MT4SR models both item-item purchase transitions and additional item relationships in a unified framework, which can be easily extended to various numbers of relationships.
B. Item Relationships-aware Recommendation
Some methods utilize extra item relationships [28, 29] to enhance the representational capability of item embeddings. For example, Chorus [30] specifically models substitute and complementary relationships between items in the continuous-time dynamic recommendation scenario. RCF [18] models item relationships in a two-level hierarchy within a graph learning framework. UGRec [31] extends the idea of RCF and adopts the translational knowledge embedding approach within the graph recommendation framework to model both directed and undirected relationships. MoHR [7] is the work most relevant to this paper. MoHR incorporates item relationships into first-order user-item translation scoring and optimizes the next-relationship prediction, which can identify the importance of each relationship in the dynamic sequence.
Although these methods significantly improve recommendation, they still achieve sub-optimal recommendation performance and efficiency. Chorus can only handle substitute and complementary relationships for sequential recommendation, while more item relationships exist, and identifying the significance of each relationship is also crucial. RCF and UGRec both rely on the graph modeling framework, which can require a large amount of GPU memory due to the exponentially growing neighborhood. Furthermore, neither RCF nor UGRec can handle dynamic user preferences. MoHR only models the first-order translation between user and item in the relationship space [4].
C. Knowledge Graph Recommendation
Knowledge graph recommendation [8, 9, 32, 33, 34, 35] originates from knowledge embedding learning, where the knowledge graph consists of triplets describing entities and their relationships. The classical line of knowledge graph recommendation is embedding-based methods, which adopt knowledge embedding techniques, such as TransE [36] and DistMult [37], to learn entity and relation embeddings. The representative work is CKE [38], which utilizes TransE to learn knowledge embeddings and regularize matrix factorization. KTUP applies TransE to model both knowledge triplets and user-item interactions. Another line of work is path-based methods, of which RippleNet [39] is the representative work. RippleNet starts paths from each user and aggregates item embeddings along the paths. The current state-of-the-art methods are based on collaborative knowledge graphs, including KGAT [9] and KGIN [8]. Both KGAT and KGIN combine the item knowledge graph and the user-item interaction graph into a unified graph. KGAT applies TransE scores as attention weights for node message aggregation. KGIN extends KGAT by modeling paths as intents.
III. PRELIMINARIES
A. Problem Definition
Given a set of users $\mathcal{U}$ and items $\mathcal{V}$, and the associated interactions, we first sort the interacted items of each user $u \in \mathcal{U}$ chronologically into a sequence $S_u = [v^u_1, v^u_2, \dots, v^u_{|S_u|}]$.