Bayesian Inference of Transition Matrices from Incomplete Graph Data with a Topological Prior Vincenzo Perri1 Luka V Petrovi c1 Ingo Scholtes21

2025-05-06 0 0 1.99MB 22 页 10玖币
侵权投诉
Bayesian Inference of Transition Matrices from Incomplete
Graph Data with a Topological Prior
Vincenzo Perri 1?, Luka V Petrovi´c 1?, Ingo Scholtes2,1
1 Data Analytics Group, Department of Informatics, University of Zurich, Switzerland
2 Chair of Machine Learning for Complex Networks, Center for Artificial Intelligence and Data Science,
Julius-Maximilians-Universit¨at W¨urzburg, Germany
?equal contribution
October 28, 2022
Abstract
Many network analysis and graph learning techniques are based on discrete- or continuous-
time models of random walks. To apply these methods, it is necessary to infer transition
matrices that formalize the underlying stochastic process in an observed graph. For weighted
graphs, where weighted edges capture observations of repeated interactions between nodes, it
is common to estimate the entries of such transition matrices based on the (relative) weights of
edges. However in real-world settings we are often confronted with incomplete data, which turns
the construction of the transition matrix based on a weighted graph into an inference problem.
Moreover, we often have access to additional information, which capture topological constraints
of the system, i.e. which edges in a weighted graph are (theoretically) possible and which are
not. Examples include transportation networks, where we may have access to a small sample of
passenger trajectories as well as the physical topology of connections, or a limited set of observed
social interactions with additional information on the underlying social structure. Combining
these two different sources of information to reliably infer transition matrices from incomplete
data on repeated interactions is an important open challenge, with severe implications for the
reliability of downstream network analysis tasks.
Addressing this issue, we show that including knowledge on such topological constraints can
considerably improve the inference of transition matrices, especially in situations where we only
have a small number of observed interactions. To this end, we derive an analytically tractable
Bayesian method that uses repeated interactions and a topological prior to perform data-efficient
inference of transition matrices. We compare our approach against commonly used frequentist
and Bayesian approaches both in synthetic data and in five real-world datasets, and we find
that our method recovers the transition probabilities with higher accuracy. Furthermore, we
demonstrate that the method is robust even in cases when the knowledge of the topological con-
straint is partial. Lastly, we show that this higher accuracy improves the results for downstream
network analysis tasks like cluster detection and node ranking, which highlights the practical
relevance of our method for interdisciplinary data-driven analyses of networked systems.
1 Introduction
Graph models of relational data have become a cornerstone in the analysis of complex systems [Boc-
caletti et al., 2006] and an important foundation for the application of machine learning to graph-
structured data from social, technical, and biological systems [Bronstein et al., 2017]. Many network
1
arXiv:2210.15410v1 [stat.ME] 27 Oct 2022
Available data:
Important nodes,
Relevant clusters,
...
Transition
matrix
Downstream
Analysis
Inference
Constraints
Observed
interactions
Figure 1: Steps in the analysis of network data. The first step is the inference of the transition matrix
from the network data. In the second step, the transition matrix is used to detect the important
nodes, relevant clusters and other interesting features of the system. The goals of our work are
to extend the network inference to use the topological constraints, to investigate the robustness of
the new method to partial knowledge of the constraints, and to investigate whether this improves
downstream analysis tasks.
analysis and graph learning techniques are based on discrete- or continuous-time models of ran-
dom walks [Masuda et al., 2017] such as, e.g., community detection algorithms like InfoMap [Ros-
vall and Bergstrom, 2008] or WalkTrap [Pons and Latapy, 2006], node ranking techniques like
PageRank [Page et al., 1999], neural graph embeddings like DeepWalk [Perozzi et al., 2014] or
node2vec [Grover and Leskovec, 2016], walk-based similarity scores that are the basis for link pre-
diction [Liben-Nowell and Kleinberg, 2007], or heat kernels for graphs used for community detection
[Kloster and Gleich, 2014] and node ranking [Chung, 2007]. To apply these methods, it is necessary
to obtain a transition matrix that formalizes the underlying stochastic process in the observed graph.
This is trivial when we have full information on repeated interactions in the graph, which enables us
to estimate transition probabilities between nodes based on relative frequencies of observed interac-
tions. However in real-world settings we are often confronted with incomplete data, which turns the
construction of the transition matrix into an inference problem that we need to address to obtain
reliable results.
In this work, we consider situations where we have access to a possibly incomplete set of observed
repeated interactions between a set of nodes. Such data can be represented as weighted graph, where
the weight of an edge corresponds to the number of observed interactions between a given node pair.
For such weighted graphs, it is common to define transition probabilities proportional to the edge
weights. From a statistical inference point of view, this method to infer the transition matrix based
on observed interactions corresponds to a frequentist approach that uses a maximum likelihood
estimation. When few observations are available, this simple approach suffers from overfitting:
On the one hand, unobserved interactions translate to zero transition probabilities even though
transitions may actually be possible in the underlying graph. On the other hand, those interactions
that were observed are likely to translate to overestimated transition probabilities. This generates a
large variance in inferred transition probabilities that can severely distort the results of downstream
network analysis and graph learning tasks.
However, repeated interactions are often not the only information that we have about the net-
worked systems that we study. We often have additional knowledge about topological constraints
that determine which of the interactions are theoretically possible and which others are not. For ex-
ample, consider a transportation system, where observed interactions represent passengers travelling
between connected stations, or a social network, where interactions represent messages transferred
between users, or a Web graph, where interactions represent users clicking on a hyperlink between
two Web pages. In the first case, the movement of passengers is constrained by the available trans-
2
portation infrastructure, in the second, the spreading of information is constrained by existing social
connections, while in the third case, available hyperlinks constrain the possible clickstreams of users.
Such constraints can limit the number of parameters of the model that we seek to infer. They can
thus help us to address the reliable inference of transition matrices and improve the results of down-
stream analyses. Therefore, in this paper, we address the following research questions:
Q1 How can we use information on topological constraints to improve the inference of transition
matrices from incomplete data on repeated interactions captured in weighted graphs?
Q2 How is the inference of transition matrices influenced by a partial knowledge of the underlying
topological constraints, which is often the case in real-world settings?
Q3 To what extent can our proposed approach to include topological constraints in the inference
of transition matrices improve the performance of downstream network analysis tasks like node
ranking and community detection?
The remainder of this article is structured as follows: In Section 2, we formally define the inference
problem that we address in our work and we introduce two methods that are commonly used to
infer transition matrices without leveraging topological constraints, namely maximum likelihood
estimation and a “naive” Bayesian approach. Addressing Q1, in Section 3 we introduce BaCon,
a Bayesian approach that uses a topological prior to improve the inference of transition matrices
in incomplete data on repeated interactions in a graph. In Section 4, we introduce the datasets
and the experimental setup that we use to evaluate our method. We next compare BaCon to
the methods introduced in Section 2 (i.e. frequentist and naive Bayes approach) that do not use
information on topological constraints. In Section 5, we evaluate the extent to which the inclusion of
the topological constraints improves the network inference, and address Q2, observing the effects of
partial knowledge of the constraint. In Section 6, we address Q3 and explore whether the effects of
the network inference carry over to the network analysis results. In Section 7, we show how inference
of diverse examples of real-world networks can be further improved with an appropriate choice of
prior parameters. In Section 8 we discuss how our work complements existing network analytic
methods in the field. We finally summarize our conclusions and outline future work in Section 9.
The results of our study show that (i) the inclusion of topological constraints considerably im-
proves the inference of transition matrices in network data, and (ii) that this improved inference
translates to an increased accuracy for downstream network analysis tasks. Our work highlights the
importance of treating the construction of network models from (partially observed) interactions as
an inference problem that can be addressed using Bayesian statistics. Our results further suggest
that constraints for the topology of interactions in complex systems should be given more attention
as their inclusion can significantly improve modelling accuracy, especially in the limit of small data.
Given the importance of network inference in partially observed or noisy data we expect our work to
be of interest for a broad interdisciplinary community. An implementation of our method is available
as an Open Source project [Zenodo].
2 Background
This section introduces the mathematical formalism for the inference of transition matrices from
observations on repeated interactions and outlines two existing strategies for addressing it. We
assume that we observe a multiset of pairwise interactions (i, j) between nodes i, j V, we denote
it with Eand indicate the number of times the interaction (i, j) occurred with nij . In addition, we
are given a directed graph G= (V, E) that represents the topological constraints, i.e., which edges
can be observed and which are impossibile. Given these sources of data, our goal is to infer the
transition matrix Tthat best reflects the actual transition probabilities in the system.
3
Figure 2: Illustration of methods to infer transition matrices from repeated interactions on a
downstream task of clustering. The first panel shows the ground truth clusters encoded in the
transition probabilities of a geometric random network with Euclidean metric in which we observe
small set of interactions. We use three different inference methods to construct a transition matrix
and detect communities using InfoMap. The other panels show the clusters detected using the
transition matrix inferred with a frequentist, naive Bayesian, and our approach - BaCon. BaCon
finds clusters that are closer to the ones detected from the ground truth matrix.
The simplest option is to assign the transition probabilities in a frequentist way, i.e. proportion-
ally to the observation counts nij and thus πij =nij /PkVnik. This corresponds to a maximum
likelihood estimation (MLE) of a multinomial distribution p(j|i). The advantages of this approach
are its simplicity and good performance when we have a data set that is sufficiently large considering
the space of possible interactions. In fact, this method is so common that it has become a standard
“preprocessing” step, which is rarely even mentioned as a method to “infer” a graph model from
observational data. However, this simple method can require a large number of observations to
assign non-zero probabilities to all edges that are possible based on the underlying topology of the
system. In a nutshell, when few observations are available, it is difficult to determine whether an
edge with probability zero is not possible or whether it has not been observed yet. From a machine
learning point of view, the erroneous interpretation of edges with zero probability as evidence of the
absence of the associated edge corresponds to an overfitting of the weighted graph model.
An alternative to a frequentist approach is to use a Bayesian one. The naive Bayesian approach
addresses the issue of overfitting by recording the distribution of parameters πij for a given data
set. For every node i, we organize the parameters πij in vectors ~πi= (πij )jV. A priori, we
would assign, e.g., a uniform prior over the space of transition probabilities: p(( ~πi)iV) = const. =
QiVDir(πi|~αi=~
1N),where Dir denotes the Dirichlet distribution, ~αi’s are its concentration param-
eters, and ~
1Nis the N-dimensional vector with all components equal to 1. This choice of parameters
corresponds to the uniform distribution. The naive Bayesian approach further uses Bayes’ rule to up-
date the prior distribution of transition probabilities: p((~πi)iV|E) = p(E|(~πi)iV)p((~πi)iV)/p(E).
An advantage of this method is that unobserved edges are still modeled with non-zero probabili-
ties. However, this also introduces problems: First, since all transitions are modelled with non-zero
probabilities, we cannot directly use sparse matrices, which complicates applications to large net-
works. Second, and more importantly, in the typical case of networks with sparse topologies, a
large amount of data is required to overcome the uniform prior on a fully connected graph. From a
machine learning point of view, this corresponds to an underfitting of the weighted graph model.
In the first three panels of Fig. 2, we use a toy example to illustrate the issues of frequentist
and naive Bayesian network inferences. In the toy example, we consider a ground-truth transition
matrix constrained to a random geometric graph topology (first panel). The network consists of
four clusters expressed in higher transition probabilities between nodes within the same cluster. We
draw a small random sample of repeated interactions from the distribution of edge probabilities in
the ground truth network. We apply the frequentist (second panel) and the naive Bayesian method
4
(third panel) to infer a transition matrix. We further visualize the result of the popular community
detection technique InfoMap on the resulting transition matrix. The frequentist approach detects
many spurious communities because of its overfitting issue. The Bayesian approach detects a single
community because the observed data is insufficient for overcoming the prior.
Neither the frequentist approach, nor the naive Bayesian, use the information on the constraints
G= (V, E). Addressing this issue, in the next section, we formally introduce a Bayesian method that
leverages topological constraints (BaCon) that are often known in real-world networked systems. In
the fourth panel of Fig. 2, we show a representative example that illustrates how inclusion of such
constraints in the network inference improves the detection of clusters.
3 BaCon: Bayesian Constrained Network Inference
In this section, we show how the knowledge of topological constraints can be used as a prior to infer
transition probabilities from observations of interactions (Q1). We formally introduce a Bayesian
method (BaCon) that leverages topological constraints that are often available for real-world net-
worked systems.
We assume that we are given a set of nodes Vof size N, and a set of possible edges EV×V
that represent a topological constraint for our inference task. I.e. we assume that an interaction
between node iand node jcannot occur if (i, j)/E. For each node i, we denote its possible
successors as S(i). We also assume that we are given a data set Eof interactions that were observed
in the system. Our goal is to infer the best directed weighted graph model from this data set. The
transition probabilities p(j|i) = πij from ito its successors jS(v) satisfy:
X
jS(i)
πij = 1,i, j : 0 < πij <1 (1)
We are interested in the probability density of parameters πij . We again organize them in vectors ~πi,
similarly to the naive Bayesian approach introduced in the previous section, but this time only over
the possible successors: ~πi= (πij )jS(i). We capture the information about impossible transitions
using a Dirac delta function as prior for the πij . Since those transitions can not occur in the data,
their probability distribution will not change. For the possible transitions from a node i, we assume
a uniform prior over the parameter space:
p(~πi|G) = Dir ~πi
~αi=α~
1|S(i)|(2)
where, for the uniform distribution, we make a choice of α= 1. The choice of α < 1 means that the
prior distribution of the probabilities is more skewed, and the choice of α > 1 means that the prior
distribution of the probabilities is more peaked around the value 1/|S(i)|. In the following we denote
all transition probabilities as π:= (~πi)iV. The likelihood of observing data E, which contains nij
observations of an edge (i, j), is given by the multinomial distribution:
p(E|π, α, G) = ZY
iY
jS(i)
πnij
ij (3)
where Zdenotes the number of permutations of observations. We use Bayes’ rule to update our a
priori distribution after observing data E:
p(π|E, α, G) = p(E|π, α, G)×p(π|α, G)
p(E|α, G)(4)
As the multinomial and the Dirichlet distributions are conjugate distributions, the posterior is also
a Dirichlet distribution with parameters ~αi=~α0
i+~ni,where ~α0
idenotes the a priori concentration
5
摘要:

BayesianInferenceofTransitionMatricesfromIncompleteGraphDatawithaTopologicalPriorVincenzoPerri1?,LukaVPetrovic1?,IngoScholtes2;11DataAnalyticsGroup,DepartmentofInformatics,UniversityofZurich,Switzerland2ChairofMachineLearningforComplexNetworks,CenterforArti cialIntelligenceandDataScience,Julius-Max...

展开>> 收起<<
Bayesian Inference of Transition Matrices from Incomplete Graph Data with a Topological Prior Vincenzo Perri1 Luka V Petrovi c1 Ingo Scholtes21.pdf

共22页,预览5页

还剩页未读, 继续阅读

声明:本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知玖贝云文库,我们立即给予删除!
分类:图书资源 价格:10玖币 属性:22 页 大小:1.99MB 格式:PDF 时间:2025-05-06

开通VIP享超值会员特权

  • 多端同步记录
  • 高速下载文档
  • 免费文档工具
  • 分享文档赚钱
  • 每日登录抽奖
  • 优质衍生服务
/ 22
客服
关注