because both explicit and implicit similarities are crucial to
the problem, and the latter are hard to fully recognize.
In this paper, we propose a method called the Knowledge-aware
Adaptive Session Multi-Topic Network (KAST), which employs
two key modules: an Adaptive Session Segmentation (ASS)
module and a Knowledge-aware Structure-information
Extraction (KSE) module. Starting from the original session,
the ASS module adaptively segments the user behavior sequence
into sessions, which enhances the learning of session-level
topic evolution and precludes interference from casual
behaviors in the sequence; the whole step is end-to-end.
Moreover, the performance of the ASS module depends on the
quality of the embedding matrix: the more similar an item
pair is in reality, the closer the two items should be in the
latent space. Hence, we incorporate the KSE module into KAST
to optimize the user and item embedding matrices. In this
process, the structural information between users and items
on the graph is utilized as novel knowledge for embedding
representation learning. In addition, a margin-based loss is
merged into the main loss function. In this way, the KSE
module further assists the ASS module in accurately learning
the topics of interest at the session level.
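To make the two ideas concrete, the following is a minimal numpy sketch, not the paper's exact formulation: session boundaries are drawn where the cosine similarity between adjacent item embeddings drops below a threshold (the threshold value and function names are illustrative), and a hinge-style margin loss pulls structurally related pairs together in the latent space while pushing unrelated ones apart.

```python
import numpy as np

def segment_sessions(item_embs, threshold=0.5):
    """Split a behavior sequence into sessions wherever the cosine
    similarity between adjacent item embeddings falls below `threshold`.
    item_embs: (T, d) array of item embeddings in behavior order.
    Returns a list of sessions, each a list of position indices."""
    sessions, current = [], [0]
    for t in range(1, len(item_embs)):
        a, b = item_embs[t - 1], item_embs[t]
        sim = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)
        if sim < threshold:          # topic shift -> start a new session
            sessions.append(current)
            current = []
        current.append(t)
    sessions.append(current)
    return sessions

def margin_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style margin loss: the squared distance to a structurally
    related positive should be smaller than the distance to a negative
    by at least `margin`."""
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)
    return np.maximum(0.0, d_pos - d_neg + margin).mean()
```

In this sketch a low similarity between consecutive items is read as a topic shift; the actual KAST criterion is learned end-to-end rather than fixed by a hand-set threshold.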
It should be noted that user behaviors in the sequence, such
as clicks or collections, are not equally spaced in time,
whereas the input sequence of an RNN is assumed to be equally
spaced by default. Hence, having acquired the optimized
sessions, a pooling layer distills the behaviors of each
session into a session-level vector representation. In this
way, the unequal spacing between items within each session no
longer violates the assumptions of the sequence model, as the
problem is alleviated at the session level. Next, a GRU is
utilized to capture the evolution of the topics, after which
an attention layer weighs all the sessions by evaluating the
correlation between the target item and each session. In the
end, the final representation is obtained.
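The pooling, GRU, and attention steps described above can be sketched as follows. This is an illustrative numpy version with mean pooling, a standard GRU cell, and dot-product attention; the parameter names and the choice of mean pooling and dot-product scoring are assumptions, not the paper's exact design.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h, x, W, U, b):
    """One standard GRU step. W, U, b each stack the parameters for the
    update gate, reset gate, and candidate state, in that order."""
    Wz, Wr, Wh = W
    Uz, Ur, Uh = U
    bz, br, bh = b
    z = sigmoid(x @ Wz + h @ Uz + bz)            # update gate
    r = sigmoid(x @ Wr + h @ Ur + br)            # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh + bh)
    return (1 - z) * h + z * h_tilde

def session_interest(sessions, target, W, U, b):
    """Mean-pool each session's item embeddings into a session vector,
    run a GRU over the session vectors to track topic evolution, then
    attention-weight the GRU states by their affinity to the target
    item embedding. Returns the final user representation."""
    d = target.shape[0]
    pooled = [np.mean(s, axis=0) for s in sessions]  # session-level pooling
    h = np.zeros(d)
    states = []
    for v in pooled:
        h = gru_step(h, v, W, U, b)
        states.append(h)
    states = np.stack(states)                        # (num_sessions, d)
    scores = states @ target                         # dot-product attention
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ states
```

Because each session is first pooled into one vector, the GRU only sees one input per session, which is why the uneven spacing of the raw behaviors inside a session no longer matters at this stage.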
To sum up, the main contributions of this paper are listed
as follows:
• We propose the KAST network architecture, which captures
the topics of interest from each adaptively divided session
and alleviates the problem of unequal spacing in user
behavior sequences.
• We design the ASS module to update sessions automatically;
it uses the dynamically updated embedding matrix to make
session division more reasonable and reduces the need for
manual feature engineering.
• To enhance the reliability of the ASS module, we further
design the KSE module, which learns structural knowledge from
the graph to improve the quality of the embedding matrix.
• We conduct extensive experiments comparing the proposed
method with many typical methods on public datasets, and we
also evaluate the effectiveness and robustness of the ASS and
KSE modules. The results show that the proposed method
achieves state-of-the-art performance on the CTR prediction
task.
2 Related Work
General Deep Models
Most deep networks are based on the embedding and multi-layer
perceptron (Emb&MLP) structure. Wide&Deep (Cheng et al. 2016)
combines the memorization ability of its linear part with the
generalization ability of its DNN part to improve overall
performance, and forms the basis for most subsequent deep
models. Deep&Cross Network (DCN) (Wang et al. 2017)
explicitly selects feature sets to design higher-order
feature crossing, which avoids useless combined features; its
"cross" net structure can effectively learn bounded-degree
combined features. Compared to Emb&MLP, PNN (Qu et al. 2016)
adds a "product layer" after the embedding to capture
field-based second-order feature correlations. AFM (Xiao et
al. 2017) adds an attention mechanism on top of FM, which
evaluates the importance of feature interactions and reduces
the impact of feature noise. Similar to Wide&Deep, DeepFM
(Guo et al. 2017) is also jointly trained with a shallow part
and a deep part; the major difference is that LR is replaced
by FM in the shallow part, and FM is able to automatically
learn cross features. AutoInt (Song et al. 2019) uses a
multi-head self-attention mechanism to perform automatic
feature-crossing learning, thereby improving the accuracy of
CTR prediction tasks.
Sequence-based Deep Models
A user's behavior sequence contains rich information that
implies the user's interest trends; therefore, modeling
behavior sequences can improve the accuracy of CTR
prediction. FPMC (Rendle, Freudenthaler, and Schmidt-Thieme
2010) introduces a personalized transition matrix based on
Markov chains, which captures both temporal information and
long-term user preferences. YoutubeDNN (Covington, Adams, and
Sargin 2016) uses average pooling to encode user behavior
sequences into a fixed-length vector that is fed into an MLP.
DIN (Zhou et al. 2018) learns the user's historical behavior
representation through an attention mechanism. DIEN (Zhou et
al. 2019) further achieves an efficient characterization of
user behavior sequences by introducing an auxiliary loss, and
then uses AUGRU to capture the evolving trend of user
interests. However, it should be pointed out that the time
intervals of user behavior sequences are not evenly spaced,
so RNN-based techniques are not perfectly suited to this
problem. SLi-Rec (Yu et al. 2019) improves the LSTM structure
and introduces the time difference between adjacent items to
model unequally spaced behavior, which significantly improves
performance.
Session-based Deep Models
In CTR prediction tasks, there are not many session-based
deep methods. GRU4REC (Hidasi et al. 2015) is the first to
use an RNN for session-based recommendation: the user's click
sequence is compressed by an embedding layer into a
continuous low-dimensional vector that is input to a GRU.
After that, neural attentive recommendation ma-