Automatic Scene-based Topic Channel Construction System for
E-Commerce
Peng Lin, Yanyan Zou, Lingfei Wu, Mian Ma, Zhuoye Ding, Bo Long
JD.com, Beijing, China
{linpeng47,zouyanyan6,lingfei.wu,mamian,dingzhuoye,bo.long}@jd.com
Abstract
Scene marketing that well demonstrates user
interests within a certain scenario has proved
effective for offline shopping. To conduct
scene marketing for e-commerce platforms,
this work presents a novel product form, scene-
based topic channel which typically consists of
a list of diverse products belonging to the same
usage scenario and a topic title that describes
the scenario with marketing words. As manual
construction of channels is time-consuming
due to billions of products as well as dynamic
and diverse customers’ interests, it is necessary
to leverage AI techniques to automatically con-
struct channels for certain usage scenarios and
even discover novel topics. To be specific, we
first frame the channel construction task as a
two-step problem, i.e., scene-based topic gen-
eration and product clustering, and propose an
E-commerce Scene-based Topic Channel construction
system (i.e., ESTC) to achieve automated production,
consisting of a scene-based topic generation model
for the e-commerce domain, product clustering on the
basis of topic similarity, as well as quality control
based on automatic model filtering and human
screening. Extensive offline experiments and an online
A/B test validate the effectiveness of this novel
product form as well as the proposed system. In
addition, we share our experience of deploying the
proposed system on a real-world e-commerce
recommendation platform.
The first two authors made equal contributions. Correspondence to Yanyan Zou.
1 Introduction
Recently, e-commerce platforms have become an
indispensable part of people’s daily life. Unlike
brick-and-mortar stores, where salespersons can hold
face-to-face conversations to promote products and
even recommend more products related to customers’
interests, most recommendation systems of e-commerce
platforms, such as Taobao¹, mainly display individual
products in which users might be interested (Zhou et
al., 2018, 2019), as shown in Figure 1 (indicated as
Recommendation Flow Page). Recently, scene marketing
has become a new marketing mode for product pro-
motion where particular application scenarios (i.e.,
scene) are created to demonstrate product func-
tions and highlight features correspondingly (Zhao,
2020), which is also paramount for e-commerce
platforms to improve user experience during online
shopping (Kang et al., 2019; Fu et al., 2019). A
practical usage scenario of products can help users
better understand product functions and features,
and also allow the platform to exhibit more products
that match customers’ specific interests, so that the
user experience and click-through rate might be improved.
However, scenes do not always help. For exam-
ple, displaying all related products belonging to
the same scene in the recommendation flow page
might harm the user experience, since they tend to
be homogeneous.
To achieve scene marketing in e-commerce plat-
forms, this work presents a novel product form,
scene-based topic channel, which consists of a list
of diverse products belonging to the same scenario,
together with two short phrases (or sentences) as
the topic title summarizing the scene. As exemplified
in Figure 1, one primary product of a channel and
the associated scene topic title (highlighted with
a red box) are displayed in the recommendation flow
page. If a user is interested in the primary product
and clicks on it, the user is then redirected to
the topic channel page, where diverse products
belonging to the same usage scenario are displayed.
Existing approaches to constructing scene-based topic
channels mainly rely on the expert knowledge and past
experience of business operators in grouping products
into different functional categories with certain
scene topics (Mansell, 2002; Cooke and Leydesdorff,
2006; Fernandez-Lopez and Corcho, 2010).
¹ https://www.taobao.com/
arXiv:2210.02643v2 [cs.CL] 30 Oct 2022
[Figure 1 panels, left to right: Recommendation Flow Page; Scene-based Topic Channel; Translation]
Translation panel:
Top Left Product: Beijing Hot Wilderness Forest Holiday Camp
Top Right Product: Titanium BBQ Tongs Tweezer for Barbecue
Bottom Left Product: Folding Collapsible Wagon Utility Outdoor Camping Cart
Bottom Right Product: Full-Automatic 3-4 Person Outdoor Camping Tent
Topic Title: Road Trip; Solve Troubles In Trip
Figure 1: A screenshot of a scene-based topic channel on an e-commerce platform, showing only four products due to
limited space. Underlined text in the right-side “Translation” column connects the translated words
with the associated parts of the topic channel.
However, such methods are expensive, inefficient,
and even impractical, since there are billions of
products on e-commerce platforms. Therefore, in this
work, we propose an E-commerce Scene-based Topic
Channel construction system (i.e., ESTC) to
automatically construct such scene-based topic
channels, where the task is framed as a two-step
problem, i.e., scene-based topic generation and
product clustering. One intuitive solution to
obtaining scene topics is to make use of topic models
(Blei et al., 2003; Roberts et al., 2013;
Grootendorst, 2022) or techniques from extractive
summarization (Basave et al., 2014; Wan and Wang,
2016). These, however, are restricted to assigning
topics from a predefined, limited candidate set,
while new scenes often emerge in e-commerce. Thus,
like Alokaili et al. (2020), we propose to generate
scene-based topic titles for products, which allows
the creation of novel topics not featured in the
training set.
Nevertheless, in practice, the limited amount of
labeled training data (around 5000 instances) hinders
the generation quality of the model. In addition, we
observe that generated topic titles describing the
same scenario might differ slightly in wording.
Simply grouping products by exact string match of
generated topic titles results in channels with few
products. To address the above issues, we first
develop a pre-trained model for the e-commerce domain
to improve generation quality. Then, a semantic
similarity based clustering method is designed to
cluster products into channels. Finally, to ensure
the online user experience, we further design a
quality control module to strictly filter out
undesired channels, such as those with inconsistent
topic titles or irrelevant topic-product pairs. Our
contributions are summarized as follows:
• A topic generation model for the e-commerce domain
  is proposed to generate scene-based topic titles for
  products. It is flexible enough to produce topics
  for emerging products and allows the system to
  discover novel scene topics.
• A semantic similarity based clustering method is
  designed to aggregate products with similar topic
  titles and form scene-based channels, which improves
  product diversity.
• A quality control module is designed to ensure the
  quality of the automatically constructed channels
  before they are released online.
• We introduce the overall architecture of the
  deployed system, in which ESTC has been successfully
  deployed on a real-world e-commerce platform.
• To the best of our knowledge, this is the first work
  on automatically constructing scene-based topic
  channels for scene marketing on e-commerce platforms.
2 Proposed Method
The development of the proposed ESTC system
consists of three main parts, including scene-based
topic generation for each product, scene-based
product clustering to aggregate products with simi-
lar topic titles, as well as the quality control module
to ensure the quality of AI-generated channels. We
also include a simple data augmentation module
to discover weakly supervised data in order to im-
prove the diversity of generated topic titles.
2.1 Scene-based Topic Generation
In this work, we propose to generate scene-based
topic titles for each product. To be specific, given
input information $X = (x_1, x_2, \ldots, x_{|X|})$ of
a product $P$, including the product’s title $T$, a
set of attributes $A$, and side information $O$
obtained through optical character recognition
techniques, paired with a scene-based topic title
$Y = (y_1, y_2, \ldots, y_{|Y|})$, we aim to learn
model parameters $\theta$ and estimate the
conditional probability:
$$P(Y \mid X; \theta) = \prod_{t=1}^{|Y|} p(y_t \mid y_{<t}; X; \theta)$$
where $y_{<t}$ stands for all tokens in a scene title
before position $t$ (i.e., $y_{<t} = (y_1, y_2, \ldots, y_{t-1})$).
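As a small worked illustration of this factorization, the sketch below multiplies per-token conditional probabilities by summing their logarithms, which is how such a product is usually evaluated in practice. The function name and the three probability values are hypothetical and are not taken from the paper.

```python
import math


def sequence_log_prob(step_probs):
    """Return sum_t log p(y_t | y_<t; X), the log of the product in the formula."""
    if any(p <= 0.0 or p > 1.0 for p in step_probs):
        raise ValueError("each conditional probability must lie in (0, 1]")
    return sum(math.log(p) for p in step_probs)


# Hypothetical conditionals p(y_t | y_<t; X) for a three-token topic title.
step_probs = [0.5, 0.4, 0.25]
log_p = sequence_log_prob(step_probs)
seq_prob = math.exp(log_p)  # equals 0.5 * 0.4 * 0.25 up to rounding
```

Working in log space avoids numerical underflow for long titles, and the negated sum is the usual token-level training loss for a generation model of this kind.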
Pretraining with E-commerce Corpus
Pre-trained models (Radford et al., 2019; Devlin et
al., 2019; Lewis et al., 2020; Raffel et al., 2020;
Zou et al., 2020; Xue et al., 2021) have proved
effective in many downstream tasks; however, most of
them are developed on English corpora from general
domains, such as news articles, books, stories and
web text. In our scenario, we aim to produce topic
titles in Chinese that summarize certain usage
scenarios of products. Therefore, a model is required
to understand a product through its associated
information (such as title and semi-structured
attributes) and generate scene-based topic titles. We
argue that the model should learn knowledge from the
e-commerce domain and thus propose to further
pre-train models in domain (Gururangan et al., 2020).
Specifically, besides the product title, attribute
set, and side information, we also collect the
corresponding advertising copywriting of products
from e-commerce platforms for the second phase of
pre-training. We adopt UniLM (Dong et al., 2019) with
BERT initialization as the backbone structure.
Recall that the product attributes $A$ form a set
without a fixed order. We observe that inputs
containing the same attributes in different orders
might result in different outputs. On the other hand,
UniLM is an architecture with a shared encoder and
decoder. To reinforce both the understanding and
generation abilities on unordered input information,
in addition to the original pre-training objectives
of UniLM, we also propose two objectives to adapt to
the target domain:
• Consistency Classification: Given a product
  title-attributes pair, this task aims to classify
  whether the two refer to the same product. For a
  positive example, the attributes and the title
  describe the same product, and the attributes are
  concatenated in random order as a sequence to
  introduce ordering noise. For a negative example,
  we randomly select attributes from a different
  product.
• Sentence Reordering: We split the product
  copywriting into pieces at punctuation marks (such
  as commas and periods). The pieces are then shuffled
  and concatenated into a new text sequence. The model
  takes the shuffled sequence as input and learns to
  generate the original copywriting.
After the second phase of pre-training in the target
e-commerce domain, we fine-tune the pre-trained
model on the scene-based topic generation dataset.
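The two additional objectives can be pictured through how their training examples might be assembled. The following is a minimal sketch under stated assumptions: the function names, the space-joined attribute format, the punctuation set, and the toy product data are all hypothetical, and the paper does not specify its actual preprocessing.

```python
import random
import re


def split_pieces(text):
    """Split text after each comma or period, dropping empty pieces."""
    return [p for p in re.split(r"(?<=[,.])\s*", text) if p]


def consistency_example(title, own_attrs, other_attrs, positive, rng):
    """Build (title, attribute sequence, label) for consistency classification.

    Positive: the product's own attributes in random order (ordering noise).
    Negative: attributes taken from a different product.
    """
    attrs = list(own_attrs if positive else other_attrs)
    rng.shuffle(attrs)
    return title, " ".join(attrs), int(positive)


def reordering_example(copywriting, rng):
    """Build (shuffled input, original target) for sentence reordering."""
    pieces = split_pieces(copywriting)
    rng.shuffle(pieces)
    return " ".join(pieces), copywriting


# Toy data, invented for illustration only.
rng = random.Random(0)
title = "Titanium BBQ Tongs"
own = ["material:titanium", "use:barbecue", "length:30cm"]
other = ["capacity:3-4", "type:tent"]
pos = consistency_example(title, own, other, True, rng)
neg = consistency_example(title, own, other, False, rng)
src, tgt = reordering_example("Light, foldable, easy to carry.", rng)
```

In both cases the shuffle is the only source of noise, so the model is pushed to treat attribute order as irrelevant and to recover a natural order for copywriting fragments.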
2.2 Scene-based Product Clustering
One intuitive solution to constructing a scene-based
topic channel is to group products with exactly the
same generated topic titles. However, we observe that
this yields channels with similar topic titles, each
of which contains only a few products, whereas we
expect a channel to contain diverse products to
ensure a good user experience. Therefore, we design a
clustering module to aggregate products with
semantically similar topic titles.
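To make the contrast with exact string matching concrete, here is a minimal sketch of similarity-based grouping. It is not the paper's clustering algorithm: it assumes each topic title already has an embedding vector and uses a simple greedy pass with a cosine threshold. The function names, the threshold of 0.9, the titles, and the two-dimensional vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def group_by_title_similarity(items, threshold=0.9):
    """Greedy grouping: a product joins the first channel whose seed title
    embedding is similar enough, otherwise it starts a new channel."""
    channels = []
    for product, title, vec in items:
        for channel in channels:
            if cosine(vec, channel["seed"]) >= threshold:
                channel["products"].append(product)
                break
        else:
            channels.append({"seed": vec, "title": title, "products": [product]})
    return channels


# (product, generated topic title, toy title embedding)
items = [
    ("tent", "Road Trip", [0.9, 0.1]),
    ("wagon", "Road Trips", [0.88, 0.15]),
    ("tongs", "Self-Drive Tour", [0.85, 0.2]),
    ("lamp", "Cozy Bedroom", [0.1, 0.95]),
]
channels = group_by_title_similarity(items)
```

Exact string matching would place these four products in four single-product channels, whereas the similarity pass merges the three travel-related titles into one channel.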
Topic Encoding
To better learn scene-based topic representations and
distinguish different topic titles, we take all topic
titles from the training set as input and employ
SimCSE (Gao et al., 2021) to further fine-tune the
e-commerce pre-trained UniLM model in an unsupervised
fashion. The embeddings of the last layer are used as
the initialization for product clustering.
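For readers unfamiliar with unsupervised SimCSE, its objective encodes each sentence twice with different dropout masks and treats the two encodings as a positive pair, with the other sentences in the batch as negatives. The sketch below computes that in-batch contrastive loss on plain Python lists; the function names, the toy vectors, and the temperature of 0.05 are assumptions for illustration, and no encoder or training loop is shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def contrastive_loss(view_a, view_b, temperature=0.05):
    """Mean cross-entropy where view_a[i] should match view_b[i] among all view_b."""
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / n


# Two toy "dropout views" of the same two topic titles.
view_a = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]
swapped = [[0.1, 0.9], [0.9, 0.1]]
loss_aligned = contrastive_loss(view_a, aligned)
loss_swapped = contrastive_loss(view_a, swapped)
```

The loss is small when the two views of each title are closer to each other than to other titles, which is the property that makes the resulting embeddings usable for distinguishing topic titles.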
Product Clustering
This module aims to group
products with semantically similar topic titles into