
Fig. 1: A small portion of the dataset. Users and merchants are
represented as orange and blue nodes, respectively. Invitations
among users are represented with bold orange links, P2B
transactions correspond to blue links, P2P money transfers are
displayed in green.
and to exchange money with each other. We applied our
methodology to analyze the YAP platform’s community, with
the goal of supporting business decisions. Our results show
that the approach is a practical tool to support marketing
campaigns and, more generally, business decisions.
II. DATASET AND ITS GRAPH REPRESENTATION
To develop and evaluate our methodology, we take a data-
driven approach and use as a reference a dataset collected
from an operational payment platform. The dataset comes
from the Italian app YAP¹, a payment platform provided by
Nexi², one of the biggest European players in digital payments.
YAP is based on a mobile application linked to a prepaid
card (accepted by online and physical stores) that also allows
its customers to exchange money with friends and contacts
without fees. In this paper, we use data from the production
databases of YAP, which include a set of transactions for the
years 2019, 2020 and 2021, as well as metadata about users
and merchants.
The dataset can be naturally represented as a
heterogeneous graph, since it contains entities of different
types (users and merchants) that are related
to each other. In particular, we have identified three types of
relationships that reflect the three main types of interactions
between users and merchants.
¹ https://www.yap-app.it
² https://www.nexigroup.com/en/
1) Users are connected to merchants by “P2B” relation-
ships, representing monetary transactions characterized
by their date, amount and channel, which may be online
(i.e., e-shops) or offline (i.e., physical stores).
2) Users may transfer money to other users. This kind of
interaction is represented by “P2P” relationships among
users, which are characterized by their date and amount.
3) Finally, users may invite new users to join the platform.
This results in “Invite” relationships, whose tail and
head nodes correspond to users sending and accepting
the invitation, respectively. Note that we only model
invitations that resulted in the acquisition of a new user.
These relationships are characterized by a timestamp. Hence
we have a dynamic graph, with edges appearing and disap-
pearing over different time windows.
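
The three relationship types and the time-window view of the dynamic graph can be sketched in Python as follows. This is only a minimal illustration, not the platform's actual schema: the class name `Edge`, its field names, the helper `snapshot`, and all sample values are hypothetical. Only the relationship types and their attributes (date, amount, channel) are taken from the description above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Relationship types named in the text; everything else here is illustrative.
EDGE_TYPES = {"P2B", "P2P", "Invite"}


@dataclass(frozen=True)
class Edge:
    kind: str                       # "P2B", "P2P" or "Invite"
    tail: int                       # source node id (always a user)
    head: int                       # target node id (merchant for P2B, user otherwise)
    when: date                      # timestamp carried by every relationship
    amount: Optional[float] = None  # P2B and P2P only
    channel: Optional[str] = None   # P2B only: "online" or "offline"

    def __post_init__(self):
        # Reject attribute combinations that the three relationship types do not allow.
        if self.kind not in EDGE_TYPES:
            raise ValueError(f"unknown relationship type: {self.kind}")
        if self.kind == "Invite" and self.amount is not None:
            raise ValueError("Invite relationships carry no amount")
        if self.kind != "P2B" and self.channel is not None:
            raise ValueError("only P2B relationships have a channel")


def snapshot(edges, start, end):
    """Return the edges whose timestamp falls in the window [start, end)."""
    return [e for e in edges if start <= e.when < end]


# Made-up sample: user 1 invites user 2, pays merchant 100, sends money to user 2.
edges = [
    Edge("Invite", tail=1, head=2, when=date(2019, 3, 1)),
    Edge("P2B", tail=1, head=100, when=date(2020, 5, 20), amount=12.5, channel="online"),
    Edge("P2P", tail=1, head=2, when=date(2021, 1, 10), amount=30.0),
]

# Edges visible in the 2020 time window.
edges_2020 = snapshot(edges, date(2020, 1, 1), date(2021, 1, 1))
```

Filtering by a half-open window `[start, end)` is one simple way to obtain the graph as it appears in a given period. Consecutive windows then partition the edges without double counting.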
We sketch a small portion of this heterogeneous graph
in Figure 1, where users and merchants are connected with
three types of edges. For privacy reasons, we anonymize the
dataset by removing personally identifiable information. As a
result, users and merchants are identified by unique numeric
identifiers. Each user is associated with some personal details
(age, gender, place of residence, occupation), while merchants
are characterized by a category indicating the type of activity
and the province of their retail store.
We store our dataset in the graph database Neo4j³, which
provides a native representation of graph data, so we could
efficiently traverse the graph, query it for patterns and visualize
the resulting information. The dataset is quite large and
includes a number of nodes in the range (10⁶, 10⁷) and a
number of relationships in the range (10⁷, 10⁸).⁴
For our methodology, the “Invitation Network”, i.e., the network
of users connected by the “Invite” relationships, plays a key role.
Formally, we define it as the subgraph G = (V, E) of our dataset
comprising all users V and the invitation relationships among
them.⁵ The edges E then represent the “Invite” relationships
among pairs of nodes, (u, v) ∈ E ⊂ V × V. The Invitation Network
G is central to our methodology because it captures the
temporal evolution of the YAP network in terms of new users
acquired through accepted invitations. We therefore briefly
characterize its main topological features. First, we note that
the invitation graph has a special structure: G is a forest, i.e.,
each weakly connected component (WCC) of G is a directed
tree, since each user can send many invitations but can
accept only one. An example of a WCC from the dataset is
shown in Figure 2. The top user sent several invitations, 8 of
which were accepted. Some users in turn invited other users,
forming a WCC with a total of 34 users. The size of the WCCs
varies from small single-user or two-user components (none
or a single accepted invitation) to subgraphs with hundreds
of users. In Figure 3, we show the distribution of WCC size
in terms of a complementary cumulative distribution function
³ https://neo4j.com
⁴ We cannot disclose the exact numbers and ranges as they represent trade
secrets.
⁵ Merchants can invite neither users nor other merchants to the platform.