KAST: Knowledge Aware Adaptive Session Multi-Topic Network for
Click-Through Rate Prediction
Dike Sun1, Kai Liu2, ShengKai Yang2
1University of Electronic Science and Technology of China, Chengdu, China
2Alibaba Group, Hangzhou, China
sundike@std.uestc.edu.cn, baiyang.lk@alibaba-inc.com, shengkai.ysk@alibaba-inc.com
Abstract
Capturing the evolving trends of user interest is important
for both recommendation systems and advertising systems,
and user behavior sequences have been successfully used in
Click-Through Rate (CTR) prediction problems. However, if
user interest is learned on the basis of item-level behaviors,
the performance may be affected by the following two
issues. Firstly, some casual outliers might be included in the
behavior sequences, as user behaviors are likely to be diverse.
Secondly, the time intervals between user behaviors are
random and irregular, so an RNN-based module borrowed
from NLP is not perfectly suited to them. To handle these
two issues, we propose the Knowledge aware Adaptive Session
multi-Topic network (KAST). It can adaptively segment
user sessions from the whole user behavior sequence and
maintain similar intents in the same session. Furthermore,
in order to improve the quality of session segmentation and
representation, a knowledge-aware module is introduced so
that the structural information from the user-item interaction
can be extracted in an end-to-end manner, and a margin-based
loss built on this information is merged into the major
loss. Through extensive experiments on public benchmarks,
we demonstrate that KAST outperforms state-of-the-art
methods for CTR prediction, and we also evaluate the key
modules and hyper-parameters.
1 Introduction
Many deep-learning-based methods have been proposed to
model user behaviors and predict click-through rate (CTR)
in the ranking stage of recommendation systems and
advertising systems, and they obtain better online results than
classic models such as Logistic Regression (LR) (Kleinbaum
et al. 2002) and Factorization Machines (FM) (Rendle 2010).
Wide&Deep (Cheng et al. 2016), DeepFM (Guo et al. 2017), and
Deep&Cross Net (DCN) (Wang et al. 2017) focus on the
crossing of categorical features and have produced outstanding
results on large-scale sparse datasets in industry. In
e-commerce systems, users generate a large amount of behavior
data every day, which can be utilized to greatly improve
prediction quality. Models such as DIN (Zhou et al.
2018) and DIEN (Zhou et al. 2019) learn the user interest
by integrating Natural Language Processing (NLP) techniques,
such as the attention mechanism (Bahdanau, Cho, and
Bengio 2014), LSTM (Hochreiter and Schmidhuber 1997),
and GRU (Chung et al. 2014). However, some casual outliers
might be included in the behavior sequences, as user behaviors
are likely to be diverse, which reduces the effectiveness of
user interest extraction.
(a) Divide time gap: 10 mins (b) Divide time gap: 30 mins
Figure 1: Percentage of incorrectly divided sessions, where
"Item" means that only the item is used to judge the division,
and "Cate" means that the category is incorporated together
with the item. The other labels can be interpreted similarly.
Copyright © 2021, Association for the Advancement of Artificial
Intelligence (www.aaai.org). All rights reserved.
A user behavior sequence consists of a series of sessions,
and DSIN (Feng et al. 2019) pays attention to session-wise
topic representation rather than fine-grained item-wise
representation. Meanwhile, the interest expressed within one
session usually focuses on one topic. For instance, when a
user wants to purchase a T-shirt, the topic of intent around
this time window may be relevant to sports gear or clothes.
In order to acquire high-quality interest topics, it is important
to guarantee high-quality session division. At present,
the common practice is to make a division whenever a time
gap of more than 30 minutes emerges (Grbovic
and Cheng 2018). Although this method is simple and efficient,
some items on the borders of adjacent sessions are
likely to be mis-divided. Figure 1 depicts the percentage of
sessions that are incorrectly divided in the Alibaba datasets¹
for the CIKM 2019 AnalytiCup, that is, divisions where the
adjacent items have the same values for the features (category),
(category, shop), and (category, shop, brand). The time
interval is set to 10 minutes and 30 minutes. It is shown that
more than 4.2% (6%) of the items are misclassified by this
method. Moreover, even if two items do not belong to the same
category, shop, or brand, they may still fall into the same
topic, which means that even more products are actually
misclassified. Additionally, it is difficult to determine the
interval between sessions, because both explicit and implicit
similarities are crucial to the problem, and the latter are
hard to fully recognize.
¹ http://www.cikm2019.net/challenge.html
arXiv:2210.03624v1 [cs.IR] 7 Oct 2022
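The fixed time-gap rule discussed above can be sketched in a few lines. This is only an illustration of the common practice, not code from the paper; the function name `split_sessions`, the `(item_id, timestamp)` tuple layout, and the toy click data are assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical record layout: (item_id, unix timestamp in seconds).
Behavior = Tuple[str, float]


def split_sessions(behaviors: List[Behavior],
                   gap_seconds: float = 30 * 60) -> List[List[Behavior]]:
    """Split a time-ordered behavior sequence into sessions.

    A new session starts whenever the gap to the previous behavior
    exceeds `gap_seconds` (30 minutes by default).
    """
    sessions: List[List[Behavior]] = []
    for behavior in behaviors:
        previous_time = sessions[-1][-1][1] if sessions else None
        if previous_time is not None and behavior[1] - previous_time <= gap_seconds:
            sessions[-1].append(behavior)   # still inside the current session
        else:
            sessions.append([behavior])     # gap too large: open a new session
    return sessions


# Toy data: the shorts are clicked 33 minutes after the T-shirts, so the rule
# places them with the phone even though they share a topic with the T-shirts.
clicks = [("tshirt_a", 0.0), ("tshirt_b", 120.0),
          ("shorts_c", 2100.0), ("phone_d", 2160.0)]
sessions = split_sessions(clicks)
```

In this toy sequence the border item `shorts_c` ends up in the second session purely because of timing, which is the kind of mis-division that Figure 1 quantifies.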
In this paper, we propose a method called Knowledge
aware Adaptive Session Multi-Topic Network (KAST),
which employs two key modules: the Adaptive Session
Segmentation (ASS) module and the Knowledge aware
Structure-information Extraction (KSE) module. Starting from
the original sessions, the ASS module adaptively segments
the user behavior sequence into sessions, so that the
learning of session-level topic evolution is enhanced
and interference from casual behaviors in the sequence
is precluded; the whole step is end-to-end. Moreover,
the performance of the ASS module depends on the quality
of the embedding matrix: it is expected that the
more similar an item pair is in reality, the closer it will be in
the latent space. Hence, we employ the KSE module, which is
incorporated into KAST to optimize the user and item embedding
matrices. The structural information between the users
and items on the graph is utilized in this process as novel
knowledge for embedding representation learning. In addition,
the margin-based loss is merged into the major loss
function. In this way, the KSE module further helps the ASS
module to accurately learn the topics of interest at the session
level.
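To make the idea of merging an auxiliary loss into the major loss concrete, the following is a minimal sketch. The exact form of the loss is not given in this section, so everything here is an assumption: a hinge-style margin loss over (user, positive item, negative item) embedding triples scored by dot product, a log loss as the major CTR loss, and the hypothetical hyper-parameters `margin` and `weight`.

```python
import numpy as np


def margin_loss(user, pos_item, neg_item, margin=1.0):
    """Hinge-style margin loss (assumed form, not taken from the paper).

    Encourages score(user, pos_item) to exceed score(user, neg_item)
    by at least `margin`; scores are plain dot products.
    """
    pos_score = np.sum(user * pos_item, axis=-1)
    neg_score = np.sum(user * neg_item, axis=-1)
    return float(np.mean(np.maximum(0.0, margin - pos_score + neg_score)))


def log_loss(labels, probs, eps=1e-7):
    """Binary cross-entropy, used here as a stand-in for the major CTR loss."""
    probs = np.clip(probs, eps, 1.0 - eps)
    return float(-np.mean(labels * np.log(probs)
                          + (1.0 - labels) * np.log(1.0 - probs)))


def total_loss(labels, probs, user, pos_item, neg_item, weight=0.1, margin=1.0):
    """Major loss plus the weighted auxiliary margin loss (weight is hypothetical)."""
    return log_loss(labels, probs) + weight * margin_loss(user, pos_item, neg_item, margin)


# Toy batch with a single (user, positive, negative) triple.
user = np.array([[1.0, 0.0]])
pos_item = np.array([[1.0, 0.0]])
neg_item = np.array([[0.5, 0.0]])
loss = total_loss(np.array([1.0, 0.0]), np.array([0.5, 0.5]), user, pos_item, neg_item)
```

Because both terms are differentiable almost everywhere with respect to the embeddings, a combined objective of this kind lets the embedding matrix be shaped by the structural signal while the CTR task is trained.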
It should be noted that user behaviors in the sequence,
such as clicks or collections, are not equally spaced, whereas
the input sequence of an RNN is assumed to be equally
spaced by default. Hence, once the optimized sessions have
been acquired, a pooling layer distills the behaviors in each
session into a session-level vector representation. In this way,
the unequal spacing between items within each session no
longer violates the assumption of the NLP-derived sequence
model, and the problem is alleviated at
the session level. Next, a GRU is utilized to capture the
evolution of the topics, after which an attention layer is used
to weigh all the sessions by evaluating the correlation
between the target item and each session. In the end, the final
representation is obtained.
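The pipeline described above (pooling within sessions, a GRU across sessions, attention against the target item) can be sketched with NumPy as follows. This is a sketch under stated assumptions rather than the authors' implementation: mean pooling, a standard GRU cell with randomly initialized weights, dot-product attention, and a hidden size equal to the embedding size are all choices made for the example.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def gru_step(h, x, W, U, b):
    """One step of a standard GRU cell with update (z), reset (r), candidate (n) gates."""
    z = sigmoid(x @ W["z"] + h @ U["z"] + b["z"])
    r = sigmoid(x @ W["r"] + h @ U["r"] + b["r"])
    n = np.tanh(x @ W["n"] + (r * h) @ U["n"] + b["n"])
    return (1.0 - z) * n + z * h


def session_interest(sessions, target, W, U, b):
    """Pool each session, run a GRU over session vectors, attend with the target item."""
    # 1) Mean pooling (assumed): (n_items, d) -> (d,) per session.
    pooled = np.stack([s.mean(axis=0) for s in sessions])
    # 2) GRU over the session-level sequence.
    h = np.zeros(pooled.shape[1])
    states = []
    for x in pooled:
        h = gru_step(h, x, W, U, b)
        states.append(h)
    states = np.stack(states)
    # 3) Dot-product attention (assumed) between the target item and each session state.
    scores = states @ target
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ states, weights


rng = np.random.default_rng(0)
d = 4  # embedding size; the hidden size is set equal to it for simplicity
W = {k: rng.normal(scale=0.1, size=(d, d)) for k in "zrn"}
U = {k: rng.normal(scale=0.1, size=(d, d)) for k in "zrn"}
b = {k: np.zeros(d) for k in "zrn"}
sessions = [rng.normal(size=(3, d)), rng.normal(size=(5, d)), rng.normal(size=(2, d))]
target = rng.normal(size=d)
final_repr, weights = session_interest(sessions, target, W, U, b)
```

The GRU only ever sees one vector per session, so the irregular gaps between individual clicks inside a session never reach the recurrent layer.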
To sum up, the main contributions of this paper are listed
as follows:
• We propose the KAST network architecture, which captures
the topics of interest from each adaptively divided
session and alleviates the problem of unequal spacing in
user behavior sequences.
• We design an ASS module to update sessions automatically.
It uses the dynamically updated embedding matrix
to make the session division more reasonable, and it also
reduces manual feature engineering.
• In order to enhance the reliability of the ASS module,
we further design the KSE module. It is able to learn
structural knowledge from the graph to improve the quality
of the embedding matrix.
• We conduct extensive experiments to compare the proposed
method with many typical methods on public
datasets, and we also evaluate the effect and robustness
of the ASS and KSE modules. It is shown that the proposed
method obtains state-of-the-art results on the CTR prediction
task.
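As a rough illustration of what adjusting a session border with item embeddings could look like, consider the sketch below. It is not the ASS module itself, whose details are not given in this section: the cosine-similarity criterion, the rule of only reconsidering the first item of each later session, and the name `refine_borders` are all assumptions, and in KAST the embeddings are updated during training rather than fixed as they are here.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity with a small constant to avoid division by zero."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


def refine_borders(sessions):
    """Move a border item to the previous session when it fits better there.

    `sessions` is a list of sessions, each a list of item embedding vectors.
    The first item of each later session is moved back if it is closer (by
    cosine similarity) to the mean of the previous session than to the mean
    of the remaining items in its own session.
    """
    sessions = [list(s) for s in sessions]  # copy so the input is not modified
    for i in range(1, len(sessions)):
        prev, cur = sessions[i - 1], sessions[i]
        if len(cur) < 2:
            continue  # nothing to compare against inside the current session
        border, rest = cur[0], cur[1:]
        if cosine(border, np.mean(prev, axis=0)) > cosine(border, np.mean(rest, axis=0)):
            prev.append(border)
            sessions[i] = rest
    return sessions


# Toy 2-d embeddings: the first item of the second session resembles the first session.
raw = [
    [np.array([1.0, 0.0]), np.array([0.9, 0.1])],
    [np.array([0.8, 0.2]), np.array([0.0, 1.0]), np.array([0.1, 0.9])],
]
refined = refine_borders(raw)
```

Here the border item `[0.8, 0.2]` is moved back to the first session, so the refined sessions contain three and two items respectively.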
2 Related Work
General Deep Models
Most deep networks are based on the embedding and
multi-layer perceptron (Emb&MLP) structure.
Wide&Deep (Cheng et al. 2016) combines the memorization
ability of the linear part and the generalization ability of
the DNN part to improve the overall performance, and it
forms the basis for most subsequent deep models.
Deep&Cross Net (DCN) (Wang et al. 2017) can explicitly
select feature sets to design higher-order feature crossing,
which avoids useless combined features. Its "cross"
network structure can effectively learn bounded-degree
combined features. Compared to Emb&MLP, PNN (Qu et al.
2016) adds a "product layer" after the embedding layer
to capture field-based second-order feature correlation.
AFM (Xiao et al. 2017) adds an attention mechanism on top
of FM, which evaluates the importance of feature
interactions and reduces the impact of feature noise. Similar
to Wide&Deep, DeepFM (Guo et al. 2017) jointly
trains a shallow part and a deep part. The major
difference is that LR is replaced by FM in the shallow
part, and FM is able to learn cross features automatically.
AutoInt (Song et al. 2019) uses a multi-head self-attention
mechanism to perform automatic feature-crossing learning,
and therefore improves the accuracy of CTR prediction
tasks.
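Since several of the models above build on FM, a short sketch of its second-order term may help. It uses the standard reformulation from Rendle (2010), which computes the sum of pairwise interactions in time linear in the number of features; the function names and the random test data are illustrative only.

```python
import numpy as np


def fm_second_order(x, V):
    """Second-order FM term: sum over i < j of <V[i], V[j]> * x[i] * x[j].

    x: (n,) feature vector, V: (n, k) latent factor matrix.
    Uses the identity
        0.5 * sum_f [ (sum_i V[i, f] x[i])**2 - sum_i V[i, f]**2 x[i]**2 ]
    which costs O(n * k) instead of O(n**2 * k).
    """
    sum_then_square = (x @ V) ** 2          # (k,)
    square_then_sum = (x ** 2) @ (V ** 2)   # (k,)
    return 0.5 * float(np.sum(sum_then_square - square_then_sum))


def fm_second_order_naive(x, V):
    """Direct double loop over feature pairs, kept only as a correctness reference."""
    total = 0.0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            total += float(V[i] @ V[j]) * x[i] * x[j]
    return total


rng = np.random.default_rng(0)
x = rng.normal(size=6)
V = rng.normal(size=(6, 3))
fast, slow = fm_second_order(x, V), fm_second_order_naive(x, V)
```

The two functions agree up to floating-point error, which is why FM can model all pairwise feature crosses without enumerating them.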
Sequence-based Deep Models
A user's behavior sequence contains rich information
that reflects the trend of the user's interest. Therefore, modeling
behavior sequences can improve the accuracy of CTR
prediction. FPMC (Rendle, Freudenthaler, and
Schmidt-Thieme 2010) introduces a personalized transition
matrix based on Markov chains, which captures both time
information and long-term user preference information.
YoutubeDNN (Covington, Adams, and Sargin 2016) uses
average pooling to encode user behavior sequences into a
fixed-length vector that is fed into an MLP. DIN (Zhou et al. 2018)
learns the representation of the user's historical behaviors through
an attention mechanism. DIEN (Zhou et al. 2019) further
characterizes user behavior sequences efficiently
by introducing an auxiliary loss, and then uses
AUGRU to capture the evolving trend of user interests.
However, it should be pointed out that user behaviors
are not evenly spaced in time, so RNN-based
techniques are not perfectly suitable for this problem.
SLi-Rec (Yu et al. 2019) improves the structure of LSTM and
introduces the time difference between adjacent items to
model unequally spaced behaviors, which significantly
improves the performance.
Session-based Deep Models
In CTR prediction tasks, there are not many session-based
deep methods. GRU4REC (Hidasi et al. 2015) is the first to use
an RNN for session-based recommendation: the
user's click sequence is compressed by embedding to form a
continuous low-dimensional vector that is fed to a GRU.
After that, neural attentive recommendation ma-