Automatic Generation of Product Concepts
from Positive Examples, with an Application to
Music Streaming
Kshitij Goyal1, Wannes Meert1, Hendrik Blockeel1, Elia Van Wolputte1, Koen
Vanderstraeten2, Wouter Pijpops2, and Kurt Jaspers2
1KU Leuven, Belgium
2Tunify, Belgium
Abstract. Internet-based businesses and products (e.g. e-commerce,
music streaming) are becoming more sophisticated every day, with a
strong focus on improving customer satisfaction. A core way they
achieve this is by giving customers easy access to their products:
structuring them in catalogues using navigation bars and providing
recommendations. We refer to these catalogues as product concepts,
e.g. product categories on e-commerce websites or public playlists on
music streaming platforms. These product concepts typically contain
products that are linked with each other through some common features
(e.g. a playlist of songs by the same artist). How they are defined in
the back-end of the system can differ between products. In this work,
we represent product concepts using database queries and tackle two
learning problems. First, given sets of products that all belong to the
same unknown product concept, we learn a database query that
represents this product concept. Second, we learn product concepts
and their corresponding queries when the given sets of products
are associated with multiple product concepts. To achieve these goals,
we propose two approaches that combine the concepts of PU learning
with decision trees and clustering. Our experiments demonstrate, via
a simulated setup for a music streaming service, that our approach is
effective in solving these problems.
Keywords: PU Learning · Machine Learning · Music Streaming
arXiv:2210.01515v3 [cs.LG] 15 Jan 2023
1 Introduction
Machine learning is used in a wide range of applications, and more and more
businesses are looking to it to improve their products. Recent advances in
online consumer-based businesses provide an opportunity to explore machine
learning solutions to challenging problems. One such problem is generating
dynamic product concepts that contain a selection of products linked with
each other through some common features. For example, in e-commerce a
product category that contains similar items (e.g., ‘cosmetic items’) is a
product concept; in music streaming services a public playlist is a product
concept. Typically these kinds of product concepts are available for users to
select from. Having such product concepts in a system has several benefits:
1. they provide a good way to structure the catalogue; 2. a user can use them
to navigate the website; 3. they make it easy for users to discover new items.
Because most services are not transparent about how these product concepts
are created, we cannot generalize their creation in the back-end of a system.
In our work, however, we assume that a product concept is associated with a
database query, which we term the concept query, that filters the whole
database of items (products, songs etc.) based on certain common features.
This makes these product concepts dynamic in nature: they are automatically
updated when new items are added to the database. This definition is
inspired by our use case of a music streaming company called Tunify3.
Tunify is a music streaming service that provides a predefined selection of
playlists to businesses. Tunify has a database of songs where each song is
represented by a fixed set of discrete-valued features: mood, popularity etc.
Tunify also maintains a set of database queries that define useful product
concepts (the product concept defined by a query is the set of all songs that
are returned when that query is run on the database). These product concepts
are useful for generating playlists for businesses. A business can select a
product concept based on a small description (e.g., 80’s Rock) and a playlist
based on the selected product concept is generated; the generated playlist is
a sample of all the songs associated with the product concept. The database
queries corresponding to product concepts are manually defined by music
experts that Tunify employs. This query creation process has an obvious
drawback: it requires a lot of time to fine-tune the exact feature values the
query should contain. This motivates our first problem: can we automatically
identify the database query corresponding to a product concept if we are
provided with a set of playlists that the target product concept should
generate? We argue that it is easier for an expert to manually create
playlists that the target concept should generate than to manually create a
database query.
Another interesting problem setting is when the provided playlists come from
multiple target product concepts. Can we identify the different concepts the
playlists come from and their corresponding database queries? This problem is
motivated by the fact that Tunify allows customers to create their own
playlists under one of its subscriptions. As these playlists come from
multiple customers, we expect that: 1. not all of them contain similar songs;
2. there are multiple playlists that contain similar songs. We want to
identify similar playlists and create product concepts based on them. There
are two possible outcomes: either the identified product concepts are missing
from Tunify’s system, or the identified product concepts are already in the
system but the customers did not use them for whatever reason. In the former
case, we improve the database with new concepts, and in the latter case,
Tunify could reach out to the customers that created their own playlists and
recommend the already existing concepts.
3https://www.tunify.com/nl-be/language/
Even though we motivate these problems with our use case of Tunify, they are
generally applicable to any business where such product concepts are used.
Consider the example of an e-commerce company; here the product concepts are
the product categories (e.g., cosmetics, menswear). In this case the customers
interact with the product concept differently from Tunify: in Tunify, a
playlist is generated when a customer selects a product concept, but in
e-commerce a customer can view all the products associated with a product
concept. In the context of our two learning problems, however, this difference
has no impact. For the first learning problem, instead of experts making a
playlist of songs, the experts create a set of items. For the second problem,
instead of using playlists of songs created by customers, we can use the items
the customers purchased together.
To summarize, we consider two learning problems: 1. learning the product
concept query given a collection of itemsets that are associated with it;
2. learning product concepts and their corresponding queries given a
collection of itemsets that may or may not be associated with a single product
concept. For the first problem, we combine PU learning (Positive and
Unlabelled Learning) [4] techniques with decision tree learning to learn the
product concept queries as the rules from the decision tree. In addition, we
study the effect of noise in the provided itemsets on the final query and
propose a way to deal with it. For the second problem, we combine the approach
of the first problem with clustering to identify new product concepts from
data generated by customers. We design and test our experiments on the dataset
provided to us by Tunify. With a simulated experimental setup, we demonstrate
that our approaches are able to learn good quality concept queries from a
small number of items, even when there is noise in the set, and that we are
able to effectively identify product concepts using the customer data. We
additionally show that our proposed algorithm for the first problem is robust
to noise in the provided set of items.
The paper is structured as follows: we first introduce some terminology and
explain the problem statements in section 2, then present our approach in
section 3 and the experimental results in section 4, followed by related work
and a discussion in sections 5 and 6 respectively. Section 7 concludes.
2 Framework
In this section, we first give an overview of some concepts from logic and
satisfiability which we use in our work before explaining our problem statement.
2.1 Propositional Logic
Propositional logic formulas are built from literals, which are Boolean
variables or their negations, and logical connectives, e.g.,
$(a \lor (b \land (\lnot c)))$. An assignment $x$ of the variables
$\{a, b, c\}$ satisfies a formula $\phi$ if $x$ makes $\phi$ true. Any logical
formula can be rewritten in a normal form such as Conjunctive Normal Form
(CNF) or Disjunctive Normal Form (DNF). A CNF formula is a conjunction of
disjunctions of literals and a DNF formula is a disjunction of conjunctions of
literals, where conjunction is the logical ‘AND’ ($\land$) operator and
disjunction is the logical ‘OR’ ($\lor$) operator.
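As a small illustration of what it means for an assignment to satisfy a formula, the sketch below enumerates every assignment of three Boolean variables and keeps those that make the formula true. It assumes the example formula reads "a OR (b AND NOT c)"; the function names are hypothetical and chosen only for this illustration.

```python
from itertools import product

def formula(a: bool, b: bool, c: bool) -> bool:
    # Assumed reading of the example formula: a OR (b AND NOT c).
    return a or (b and not c)

def satisfying_assignments(phi, variables):
    """Return every assignment (as a dict) that makes phi true."""
    models = []
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if phi(**assignment):
            models.append(assignment)
    return models

models = satisfying_assignments(formula, ["a", "b", "c"])
```

Under this reading, five of the eight possible assignments satisfy the formula: the four with a true, plus the one with a false, b true and c false.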
2.2 Problem Statement
We now formally define our learning problems. The dataset of instances is
represented by $D$. We assume that an instance is represented by a fixed set
of discrete-valued features $F$ and takes a single value for each feature. For
a feature $f \in F$, the discrete set of values $f$ can take is represented by
$V(f)$.
Definition 1. Product Concept. A product concept is a collection of instances.
A product concept is associated with a concept query that defines which instances
belong to it.
A product concept can be a union of multiple ‘sub-concepts’. For example,
in e-commerce, a category can have many different sub-categories; in Tunify,
there are a number of product concepts that combine the music from multiple
different product concepts to generate a playlist that contains songs from all
the combined product concepts (e.g., ‘Fitness Center’ product concept combines
product concepts ‘Dance Workout’ and ‘Rock Dynamic’). Keeping this in mind,
we formally define a concept query as follows:
Definition 2. Conjunctive Concept Query. Given a set of attributes
$F' \subseteq F$ and sets of values $V_f \subseteq V(f)$ for each $f \in F'$,
a conjunctive concept query $Q$, for an arbitrary input $x \in D$, is defined
as the following rule-based query in conjunctive normal form:
$$Q : \bigwedge_{f \in F'} \bigvee_{v \in V_f} (x_f = v)$$
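One way to picture a conjunctive concept query is as a mapping from each selected feature to its set of allowed values: an instance satisfies the query when, for every selected feature, its value is one of the allowed values. The following is a minimal sketch under that assumption. The feature names and values are invented for illustration and are not taken from Tunify's data.

```python
from typing import Dict, Set

Instance = Dict[str, str]                # feature name -> single discrete value
ConjunctiveQuery = Dict[str, Set[str]]   # selected feature f -> allowed values V_f

def satisfies(x: Instance, query: ConjunctiveQuery) -> bool:
    """AND over selected features of (OR over allowed values of x_f == v)."""
    return all(
        any(x[f] == v for v in allowed)
        for f, allowed in query.items()
    )

# Hypothetical query: mood in {happy, energetic} AND decade in {1980s}.
query = {"mood": {"happy", "energetic"}, "decade": {"1980s"}}

song_a = {"mood": "happy", "decade": "1980s", "popularity": "high"}
song_b = {"mood": "calm", "decade": "1980s", "popularity": "low"}

a_matches = satisfies(song_a, query)
b_matches = satisfies(song_b, query)
```

Features that the query does not mention (here, popularity) place no constraint on the instance, which mirrors the conjunction ranging only over the selected subset of features.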
Definition 3. Concept Query. A concept query is defined as:
1. A conjunctive concept query is a concept query.
2. A disjunction of two or more conjunctive concept queries is a concept query.
In the case where a concept query is a disjunction of two or more conjunctive
concept queries, each conjunctive concept query and all the items that make it
true are said to be associated with a sub-concept of the parent product concept
(where the parent product concept is the disjunctive combination of the sub-
concepts). Any item that is associated with any of the sub-concepts is said to
belong to the parent product concept. By definition, a sub-concept is also a
product concept. We will refer to the concept query corresponding to a product
concept $C$ as $Q_C$ in the text from now on. Also, for a given query $Q$, the
items from the database $D$ that are filtered by the query are denoted by $Q(D)$.
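To make the disjunctive case concrete, a concept query can be sketched as a list of conjunctive queries (one per sub-concept), with the filtered set of items computed by scanning the database. Because the query is re-evaluated against the database, a newly added item that matches shows up in the result without any change to the query. The concept names below echo the example in the text, but the features, values, and song records are hypothetical.

```python
from typing import Dict, List, Set

Instance = Dict[str, str]
ConjunctiveQuery = Dict[str, Set[str]]
ConceptQuery = List[ConjunctiveQuery]    # disjunction of conjunctive queries

def matches(x: Instance, conj: ConjunctiveQuery) -> bool:
    # Set membership is equivalent to OR over allowed values of (x_f == v).
    return all(x[f] in allowed for f, allowed in conj.items())

def apply_query(query: ConceptQuery, database: List[Instance]) -> List[Instance]:
    """Return the filtered items: those satisfying at least one sub-concept."""
    return [x for x in database if any(matches(x, conj) for conj in query)]

# Hypothetical parent concept built from two sub-concepts.
dance_workout = {"genre": {"dance"}, "energy": {"high"}}
rock_dynamic = {"genre": {"rock"}, "energy": {"high", "medium"}}
fitness_center: ConceptQuery = [dance_workout, rock_dynamic]

database = [
    {"title": "s1", "genre": "dance", "energy": "high"},
    {"title": "s2", "genre": "rock", "energy": "low"},
    {"title": "s3", "genre": "rock", "energy": "medium"},
]
before = [x["title"] for x in apply_query(fitness_center, database)]

# Adding a matching item changes the result without editing the query.
database.append({"title": "s4", "genre": "dance", "energy": "high"})
after = [x["title"] for x in apply_query(fitness_center, database)]
```

Representing the parent concept as a plain list keeps each sub-concept available on its own, so the same conjunctive query can serve both as a standalone product concept and as part of a larger one.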