II. RELATED WORK
This section reviews existing methods related to our
problem and the proposed method. We first introduce relevant
methods in sequential recommendation, as it matches our
task. We then discuss existing methods for incorporating
item relationships. Finally, we introduce related work on
knowledge graph recommendation, since these methods
also model additional item knowledge. A summary and
capability comparison of the models is presented in Table III.
A. Sequential Recommendation
Sequential Recommendation (SR) predicts the next preferred
item by modeling the chronologically sorted sequence of a
user's historical interactions. By modeling the order of the
user's interaction sequence, SR captures the dynamic
preferences latent in item-item transitions. One of the earliest
lines of work builds on Markov Chains, which learn item-item
transition probabilities; FPMC [1] is a representative example.
FPMC captures only first-order item transitions with low
model complexity, assuming the next preferred item depends
only on the previously interacted item. Fossil [19] extends
FPMC to learn higher-order item transitions and demonstrates
the necessity of high-order item transitions in SR.
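To make the first-order Markov-chain idea concrete, transition probabilities can be estimated in their simplest, non-personalized form by counting consecutive item pairs. This is only an illustrative sketch; FPMC itself factorizes a personalized user-item-item transition tensor rather than counting:

```python
from collections import defaultdict

def transition_probs(sequences):
    """Estimate first-order item-item transition probabilities
    by counting consecutive pairs across interaction sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    # normalize counts into P(next | prev)
    return {
        prev: {n: c / sum(nxts.values()) for n, c in nxts.items()}
        for prev, nxts in counts.items()
    }

# two toy interaction sequences
p = transition_probs([["a", "b", "c"], ["a", "b", "b"]])
# p["a"]["b"] == 1.0 and p["b"]["c"] == 0.5
```

The first-order assumption is visible in the data structure itself: the model conditions only on `prev`, never on earlier items, which is exactly the limitation Fossil addresses with higher-order transitions.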
The success of deep learning in sequence modeling has
inspired a range of sequential models for SR, including
Recurrent Neural Networks (RNNs) [2, 20, 21], Convolutional
Neural Networks (CNNs) [2, 22], and Transformers [4, 5, 6].
The representative RNN-based work is GRU4Rec [23], which
adopts the Gated Recurrent Unit (GRU) for session-based
recommendation. Another line of SR consists of CNN-based
methods such as Caser [22], which treats the interaction
sequence of item embeddings as an image and applies
convolution operators to learn local sub-sequence structures.
The recent success of the self-attention-based Transformer [3]
architecture opens further possibilities for SR because it
models all pair-wise relationships within the sequence, a
limitation of RNN- and CNN-based methods. SASRec [4] is
the first work to adopt the Transformer for SR and demonstrates
its superiority. BERT4Rec [5] extends SASRec to model
bi-directional relationships in Transformers, inspired by
BERT [24]. TiSASRec [25] further incorporates time-difference
information into SASRec. FISSA [26] explores latent item
similarities in SR. DT4Rec [27] and STOSA [6] model items
as distributions instead of vector embeddings and are state-of-
the-art SR methods with implicit feedback.
Despite the recent success of SR methods, they still fail to
incorporate heterogeneous item relationships into the modeling
of item-item transitions, especially high-order transitions.
In contrast, the proposed MT4SR models both item-item
purchase transitions and additional item relationships in a
unified framework that extends easily to varying numbers
of relationships.
B. Item Relationships-aware Recommendation
Some methods utilize extra item relationships [28, 29] to
enhance the representational capability of item embeddings.
For example, Chorus [30] specifically models substitute and
complementary relationships between items in the
continuous-time dynamic recommendation scenario. RCF [18]
models item relationships in a two-level hierarchy within a
graph learning framework. UGRec [31] extends the idea of
RCF and adopts a translational knowledge embedding approach
within the graph recommendation framework to model both
directed and undirected relationships for recommendation.
MoHR [7] is the work most relevant to this paper. MoHR
incorporates item relationships into first-order user-item
translation scoring and additionally optimizes next-relationship
prediction, which identifies the importance of each relationship
in the dynamic sequence.
Although these methods significantly improve recommendation,
they remain sub-optimal in both recommendation performance
and efficiency. Chorus can only handle substitute and
complementary relationships for sequential recommendation,
while more item relationships exist and identifying the
significance of each relationship is also crucial. RCF and
UGRec both rely on the graph modeling framework, which can
require a large amount of GPU memory due to the exponential
growth of the neighborhood. Furthermore, neither RCF nor
UGRec can handle dynamic user preferences. MoHR only
models first-order translations between users and items in the
relationship space [4].
C. Knowledge Graph Recommendation
Knowledge graph recommendation [8, 9, 32, 33, 34, 35]
originates from knowledge embedding learning, where the
knowledge graph consists of triplets describing entities and
their relationships. The classical line of knowledge graph
recommendation comprises embedding-based methods, which
adopt knowledge embedding techniques such as TransE [36]
and DistMult [37] to learn entity and relation embeddings.
The representative work is CKE [38], which utilizes TransE to
learn knowledge embeddings and uses them to regularize
matrix factorization. KTUP applies TransE to model both
knowledge triplets and user-item interactions. Another line
consists of path-based methods, of which RippleNet [39] is the
representative work. RippleNet starts paths from each user and
aggregates item embeddings along the paths. The current
state-of-the-art methods are based on collaborative knowledge
graphs, including KGAT [9] and KGIN [8]. Both KGAT and
KGIN combine the item knowledge graph and the user-item
interaction graph into a unified graph. KGAT applies TransE
scores as attention weights for node message aggregation, and
KGIN extends KGAT by modeling paths as intents.
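The TransE scoring shared by CKE, KTUP, and KGAT's attention mechanism treats a relation as a translation in embedding space, h + r ≈ t. A minimal sketch with hand-picked 2-d embeddings (toy values for illustration, not trained):

```python
import numpy as np

def transe_score(h, r, t):
    """TransE plausibility of a triplet (head, relation, tail):
    the closer t is to the translation h + r, the higher the score."""
    return -np.linalg.norm(h + r - t)

h = np.array([1.0, 0.0])       # head entity embedding
r = np.array([0.0, 1.0])       # relation embedding
t_good = np.array([1.0, 1.0])  # tail matching h + r exactly
t_bad = np.array([3.0, -2.0])  # unrelated tail
good, bad = transe_score(h, r, t_good), transe_score(h, r, t_bad)
# good == 0.0 and good > bad
```

In KGAT, such scores (after normalization) weight how strongly each neighboring entity contributes during message aggregation over the unified graph.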
III. PRELIMINARIES
A. Problem Definition
Given a set of users U and items V, and the associated
interactions, we first sort the interacted items of each user
u ∈ U chronologically into a sequence S^u = [v^u_1, v^u_2, ..., v^u_{|S^u|}],