2 Proposed Method
The development of the proposed ESTC system
consists of three main parts: scene-based topic
generation for each product, scene-based product
clustering to aggregate products with similar topic
titles, and a quality control module to ensure the
quality of AI-generated channels. We also include
a simple data augmentation module that discovers
weakly supervised data to improve the diversity of
generated topic titles.
2.1 Scene-based Topic Generation
In this work, we propose to generate scene-based
topic titles for each product. Specifically, given
the input information $X = (x_1, x_2, \dots, x_{|X|})$ of
a product $P$, including the product's title $T$, a set
of attributes $A$, and side information $O$ obtained
through optical character recognition techniques,
paired with a scene-based topic title
$Y = (y_1, y_2, \dots, y_{|Y|})$, we aim to learn model
parameters $\theta$ and estimate the conditional
probability:

$$P(Y \mid X; \theta) = \prod_{t=1}^{|Y|} p(y_t \mid y_{<t}, X; \theta)$$

where $y_{<t}$ stands for all tokens in a scene title
before position $t$, i.e., $y_{<t} = (y_1, y_2, \dots, y_{t-1})$.
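Under this factorization, the log-probability of a title is the sum of per-step token log-probabilities. A minimal sketch; the per-step distributions below are hypothetical stand-ins for a decoder's outputs under teacher forcing:

```python
import math

def sequence_log_prob(step_log_probs, target_ids):
    """Log P(Y | X) under the autoregressive factorization:
    sum over t of log p(y_t | y_<t, X)."""
    # step_log_probs[t] maps candidate token -> log-probability at step t,
    # already conditioned on X and the gold prefix y_<t (teacher forcing).
    return sum(step_log_probs[t][y] for t, y in enumerate(target_ids))

# Toy example: a 3-token title with made-up per-step distributions.
steps = [
    {"a": math.log(0.6), "b": math.log(0.4)},
    {"b": math.log(0.7), "a": math.log(0.3)},
    {"c": math.log(0.5), "d": math.log(0.5)},
]
lp = sequence_log_prob(steps, ["a", "b", "c"])
```

Training maximizes this quantity (equivalently, minimizes the per-token cross-entropy) over the paired product–title data.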
Pretraining with E-commerce Corpus
Pre-trained models (Radford et al., 2019; Devlin
et al., 2019; Lewis et al., 2020; Raffel et al., 2020;
Zou et al., 2020; Xue et al., 2021) have proved
effective in many downstream tasks; however, most
of them are developed on English corpora from
general domains, such as news articles, books,
stories, and web text. In our scenario, we aim to
produce topic titles in Chinese that summarize
certain usage scenarios of products. The model is
therefore required to understand a product through
its associated information (such as the title and
semi-structured attributes) and generate scene-based
topic titles. We argue that the model should learn
knowledge from the e-commerce field and thus
propose to further pre-train models in domain
(Gururangan et al., 2020). Specifically, besides the
product title, attribute set, and side information, we
also collect the corresponding advertising
copywriting of products from e-commerce platforms
for the second phase of pre-training. We adopt
UniLM (Dong et al., 2019) with BERT initialization
as the backbone structure.
Recall that the product attribute set $A$ has no
fixed order. We observe that inputs containing the
same attributes in different orders may result in
different outputs. Moreover, UniLM is an
architecture with a shared encoder and decoder. To
reinforce both the understanding and generation of
unordered input information, in addition to the
original pre-training objectives of UniLM, we
propose two objectives to adapt to the target
domain:
• Consistency Classification: Given a product
title–attributes pair, this task classifies whether
the two refer to the same product. For a positive
example, the attributes and the title describe the
same product, and the attributes are concatenated
in random order as a sequence to introduce
disorder noise. For a negative example, we
randomly select attributes from a different
product.
• Sentence Reordering: We split the product
copywriting into pieces at punctuation marks
(such as commas and periods). The pieces are
then shuffled and concatenated as a new text
sequence. The model takes the shuffled sequence
as input and learns to generate the original
copywriting.
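The example construction for both objectives can be sketched as follows; the product records, field names, and delimiter choices are illustrative assumptions, not the paper's exact preprocessing:

```python
import random
import re

def make_consistency_example(products, i, rng):
    """Title-attributes pair for the consistency task: the positive pair uses
    the product's own attributes in shuffled order (label 1); the negative
    pair borrows attributes from a different product (label 0)."""
    prod = products[i]
    attrs = list(prod["attributes"])
    rng.shuffle(attrs)  # disorder noise for the positive pair
    positive = (prod["title"], ";".join(attrs), 1)
    j = rng.randrange(len(products) - 1)
    if j >= i:
        j += 1  # ensure the negative attributes come from another product
    negative = (prod["title"], ";".join(products[j]["attributes"]), 0)
    return positive, negative

def make_reordering_example(copywriting, rng):
    """Sentence reordering: split at punctuation marks, shuffle the pieces,
    and pair the shuffled sequence (input) with the original (target)."""
    pieces = [p for p in re.split(r"[，。,.]", copywriting) if p]
    shuffled = pieces[:]
    rng.shuffle(shuffled)
    return ",".join(shuffled), copywriting

rng = random.Random(0)
products = [
    {"title": "Stainless steel thermos", "attributes": ["500ml", "silver", "vacuum"]},
    {"title": "Cotton T-shirt", "attributes": ["white", "size M"]},
]
pos, neg = make_consistency_example(products, 0, rng)
src, tgt = make_reordering_example(
    "Keeps drinks hot for 12 hours,fits car cup holders,easy to clean.", rng)
```

Both objectives are self-supervised: the labels and targets are derived entirely from the collected e-commerce corpus, requiring no extra annotation.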
After the second phase of pre-training in the target
e-commerce domain, we fine-tune the pre-trained
model on the scene-based topic generation dataset.
2.2 Scene-based Product Clustering
One intuitive solution to constructing a scene-based
topic channel is to group products with exactly the
same generated topic titles. However, we observe
that there exist channels with similar topic titles,
each of which contains only a few products,
whereas we expect each channel to offer diverse
products to ensure a good user experience.
Therefore, we design a clustering module to
aggregate products with semantically similar topic
titles.
Topic Encoding
To better learn scene-based topic representations
and distinguish different topic titles, we take all
topic titles from the training set as input and
employ SimCSE (Gao et al., 2021) to further
fine-tune the e-commerce pre-trained UniLM model
in an unsupervised fashion. The embeddings of the
last layer are used as the initialization for product
clustering.
Product Clustering
This module aims to group
products with semantically similar topic titles into