The Network Structure of Unequal Diffusion
Eaman Jahani
Department of Statistics, University of California, Berkeley
Dean Eckles
MIT Sloan School of Management
Alex “Sandy” Pentland
MIT Institute for Data, Systems and Society
September 15, 2022
Abstract
Social networks affect the diffusion of information, and thus have the potential
to reduce or amplify inequality in access to opportunity. We show empirically that
social networks often exhibit a much larger potential for unequal diffusion across
groups along paths of length 2 and 3 than expected by our random graph models. We
argue that homophily alone cannot fully explain the extent of unequal diffusion
and attribute this mismatch to unequal distribution of cross-group links among the
nodes. Based on this insight, we develop a variant of the stochastic block model
that incorporates the heterogeneity in cross-group linking. The model provides an
unbiased and consistent estimate of assortativity or homophily on paths of length 2
and provides a more accurate estimate along paths of length 3 than existing models.
We characterize the null distribution of its log-likelihood ratio test and argue that the
goodness of fit test is valid only when the network is dense. Based on our empirical
observations and modeling results, we conclude that the impact of any departure
from equal distribution of links to source nodes in the diffusion process is not limited
to its first order effects as some nodes will have fewer direct links to the sources.
More importantly, this unequal distribution will also lead to second order effects as
the whole group will have fewer diffusion paths to the sources.
Keywords: Stochastic Block Model, Assortativity, Diffusion Paths, Brokerage, Heterogeneous
Edge Propensities
The authors would like to thank Matthew Jackson for his valuable comments and feedback and Tiago
Peixoto for his helpful insights and references.
arXiv:2210.11053v1 [stat.AP] 20 Oct 2022
1 Introduction
Diffusion of information in social networks determines who gets access to a valuable piece
of information, such as a new investment opportunity. The structure of the network plays
an important role in which individuals or groups receive the valuable information. Certain
network structures are more likely to keep a piece of information exclusive to one group, thus
leading to unequal diffusion. For example, if there are very few social links between people
of different races, the information about a new employment opportunity that is generated
among one race might never reach individuals of the other race (Calvó-Armengol and
Jackson, 2004). Many existing network models aim to explain the absence of diffusion from
one group to another through assortative mixing (Newman, 2003b). Assortative mixing,
or simply assortativity, captures the bias toward forming edges between nodes with similar
characteristics. It is also referred to as homophily, which simply means that attributes of nodes are correlated
across the edges. For example, in social networks individuals have a strong tendency to form
links with other people who are similar to them in terms of age, language, socioeconomic
status or race.
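For concreteness, one widely used summary of assortative mixing on a categorical attribute is Newman's assortativity coefficient. It is written out below as a reference point only; the notation ($e_{st}$, $a_s$, $b_s$, $r$) is introduced here for illustration, and the path-based assortativity measures used later in the paper may be defined differently.

```latex
% e_{st}: fraction of all edges that connect a node of group s to a node of group t
%         (in an undirected network e_{st} = e_{ts}, so a_s = b_s)
% a_s, b_s: row and column sums of the mixing matrix e
r = \frac{\sum_{s} e_{ss} - \sum_{s} a_s b_s}{1 - \sum_{s} a_s b_s},
\qquad a_s = \sum_{t} e_{st}, \qquad b_s = \sum_{t} e_{ts}
```

Here $r = 1$ corresponds to a network in which every edge joins two nodes of the same group, and $r = 0$ to a network in which edges fall within groups no more often than expected from the group marginals alone.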
The stochastic block model (SBM) — along with its variants such as degree-correction
(Karrer and Newman, 2011) — defines an important class of these models that explicitly
account for assortative mixing in networks. SBM is a generative random network model
for modeling blocks or groups in networks. It has been widely used in computer science
and social sciences to model community structure in networks (Rohe et al., 2011; Holland
et al., 1983a; Anderson et al., 1992; Faust and Wasserman, 1992; Wasserman and Faust,
1989; Wang and Wong, 1987). In its original form, each vertex in a network belongs to exactly
one of the $K$ groups (or blocks) in the network. Each pair of vertices forms an edge
independently of all other pairs. Edges between vertices of any two given groups are
independent and identically distributed, with a probability determined solely by the group
memberships of the pair of vertices. If $g_i \in \{1, 2, \ldots, K\}$ denotes the group of vertex $i$,
then a $K \times K$ matrix, $P$,
determines the edge formation probabilities between any pair of vertices. The probability
of an edge between vertices $i$ and $j$ is the $(g_i, g_j)$ element of the matrix, $P_{g_i, g_j}$.
Figure 1: Comparison of (a) a network with brokerage, in which a disproportionate fraction
of cross-type edges is held by a small number of nodes, versus (b) a similar network in
which cross-type edges are distributed more equally. Red and blue nodes correspond to two
different groups or blocks. Corresponding nodes have the same degree and the number of
cross-type edges is the same in both networks, but there are 10 cross-type paths of length
2 of the form red-blue-blue in network (b) while there are only 8 such paths in network (a).
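The generative process of the basic SBM described above can be sketched in a few lines of Python. This is a minimal illustration, assuming an undirected simple graph without self-loops and zero-based group labels; the function name `sample_sbm` and the example values of `P` are hypothetical and not taken from the paper.

```python
import numpy as np

def sample_sbm(groups, P, rng):
    """Sample an undirected simple graph from a basic SBM (illustrative sketch).

    groups: length-n integer array with groups[i] in {0, ..., K-1}
    P:      K x K symmetric matrix of edge probabilities
    """
    groups = np.asarray(groups)
    n = len(groups)
    # The edge probability for every pair (i, j) is P[g_i, g_j].
    probs = P[np.ix_(groups, groups)]
    # Draw each unordered pair once (strict upper triangle, so no self-loops),
    # then mirror it to obtain a symmetric adjacency matrix.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    return (upper | upper.T).astype(int)

# Example values (assumed): two groups of 50 nodes with dense within-group
# and sparse cross-group connections.
rng = np.random.default_rng(0)
groups = np.repeat([0, 1], 50)
P = np.array([[0.20, 0.02],
              [0.02, 0.20]])
A = sample_sbm(groups, P, rng)
```

Because every pair is drawn independently with a probability that depends only on the two group labels, all nodes of a group are statistically interchangeable under this model.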
This simple model can produce a variety of interesting network structures. For example,
an edge probability matrix in which diagonal entries are much larger than off-diagonal
entries produces networks with densely connected groups and sparse connections across
groups. The ability to model such community structure is the main reason SBM can
capture assortative mixing in a network, and it has made SBM one of
the main methods for community detection. SBM captures assortative mixing by generating
random networks that match the observed network in terms of the frequency of within-group and cross-group
edges. The fitted model matches the observed assortativity or homophily in expectation.
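To make the phrase "matches in expectation" concrete, the sketch below estimates the block probability matrix of a basic SBM from an observed network with known group labels, using the standard maximum-likelihood estimate (observed edges divided by possible pairs for each pair of blocks). The function name `fit_block_probabilities` and the four-node toy graph are assumptions made for illustration, and the sketch assumes every group contains at least two nodes.

```python
import numpy as np

def fit_block_probabilities(A, groups, K):
    """Estimate the K x K edge probability matrix of a basic SBM (sketch).

    A:      n x n symmetric 0/1 adjacency matrix with zero diagonal
    groups: length-n integer array with groups[i] in {0, ..., K-1}
    Assumes every group has at least two nodes.
    """
    A = np.asarray(A)
    groups = np.asarray(groups)
    Z = np.eye(K, dtype=int)[groups]        # n x K one-hot membership matrix
    sizes = Z.sum(axis=0)
    # M[r, s] counts edges between blocks r and s; the diagonal counts every
    # within-block edge twice (once per direction).
    M = Z.T @ A @ Z
    # Matching denominators: n_r * n_s for r != s, and n_r * (n_r - 1)
    # ordered pairs on the diagonal.
    pairs = np.outer(sizes, sizes)
    np.fill_diagonal(pairs, sizes * (sizes - 1))
    return M / pairs

# Toy graph (assumed): nodes 0, 1 in group 0 and nodes 2, 3 in group 1.
A = np.zeros((4, 4), dtype=int)
for i, j in [(0, 1), (2, 3), (1, 2)]:
    A[i, j] = A[j, i] = 1
groups = np.array([0, 0, 1, 1])
P_hat = fit_block_probabilities(A, groups, K=2)
```

Multiplying each entry of `P_hat` by the number of possible pairs for that block pair recovers the observed edge count, so networks sampled from the fitted model reproduce the within-group and cross-group edge frequencies on average.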
SBM and its degree-corrected version assume that within-group and cross-group edges
are distributed “uniformly” across all pairs: the probability of an edge between any
pair of nodes is identical to that of any other pair with the same group memberships. In the
case of degree-corrected SBM (DCSBM), any two nodes in the same group are, after conditioning
on degree, similar in terms of their cross-group edge formation.
In reality, many networks have heterogeneous propensities in edge formation
to various groups. In most cases, social networks exhibit a pattern of brokerage, which
means that cross-group edges are not distributed uniformly; instead, a small subgroup of nodes
holds a disproportionate share of cross-group edges. Simmel (1950) was the first to introduce
the concept of network brokerage in triadic relations. Burt (2009) later advanced our
understanding of brokerage by introducing the concept of “structural holes” between two
unconnected communities, across which brokers act as intermediaries. These broker nodes
play an important role in connecting otherwise disconnected communities, moving information
between them, and acting as intermediaries for resource exchanges. Due to their
unique position in the network, brokers benefit from various types of advantages, for example
access to diverse information or opportunities for arbitrage in exchanges. However,
these advantages to brokers might impose costs on other actors in the network or on the
network as a whole.
Figure 1a provides a visual example of a network with brokerage in which a small number
of broker nodes have a higher propensity to form links with brokers of the out-group, hence
holding the majority of cross-group edges. Figure 1b shows a similar network with less
brokerage, which has more cross-type paths of length 2 even though it has the
same degree distribution as the brokerage network in Figure 1a. While brokers play an integral role
in connecting otherwise disconnected communities, they can nevertheless act as a bottleneck
by reducing the number of possible paths between any two groups when compared to a
similar network with cross-group ties uniformly distributed across the network. Because
brokers hold a disproportionate number of cross-group ties, they can constrain diffusion
of information from one group to another. In this paper, we argue that one needs to
look not only at homophily or assortativity on paths of length 1, but also at the
assortativity of all possible diffusion paths of varying lengths to completely account for
unequal diffusion in networks. We then incorporate the heterogeneity in edge
propensities, and in particular brokerage, into the class of stochastic block models and show
that by doing so the model better explains unequal diffusion of information.
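The bottleneck effect of brokerage can be checked on a small toy example. The two eight-node networks below are hypothetical and are not the networks drawn in Figure 1; they are constructed so that corresponding nodes have the same degree and both networks have two cross-group edges, with the only difference being whether one blue node holds both cross-group edges. The helper names `adjacency` and `count_red_blue_blue` are likewise assumptions for illustration.

```python
import numpy as np

def adjacency(n, edges):
    """Build a symmetric 0/1 adjacency matrix from an undirected edge list."""
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        A[i, j] = A[j, i] = 1
    return A

def count_red_blue_blue(A, is_blue):
    """Count paths u - v - w with u red, v blue, w blue (w != v)."""
    red, blue = ~is_blue, is_blue
    A_rb = A[np.ix_(red, blue)]    # red-to-blue edges
    A_bb = A[np.ix_(blue, blue)]   # blue-to-blue edges (zero diagonal)
    # Entry (u, w) of the product counts the blue intermediaries v linking u to w.
    return int((A_rb @ A_bb).sum())

is_blue = np.array([False] * 4 + [True] * 4)   # nodes 0-3 red, nodes 4-7 blue
red_cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]

# Brokerage: blue node 4 holds both cross-group edges.
brokered = adjacency(8, red_cycle + [(4, 5), (5, 6), (5, 7), (6, 7), (0, 4), (1, 4)])
# Spread out: blue nodes 4 and 5 hold one cross-group edge each.
spread = adjacency(8, red_cycle + [(4, 5), (5, 6), (6, 7), (7, 4), (0, 4), (1, 5)])

paths_brokered = count_red_blue_blue(brokered, is_blue)   # 2
paths_spread = count_red_blue_blue(spread, is_blue)       # 4
```

In this toy case, concentrating the cross-group edges on a single blue node halves the number of red-blue-blue paths while leaving every node's degree and the number of cross-group edges unchanged, which is the same qualitative pattern the Figure 1 caption reports (8 versus 10 paths).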
We show that while SBM directly fits assortativity on paths of length 1, it fails to
accurately capture assortativity on longer paths in real-world networks. In the context of
random graphs, network brokerage occurs when a few nodes in the network have a higher
probability of connecting with an out-group than other in-group nodes. By incorporating
this heterogeneity into our random network models, and in particular SBM or degree-corrected
SBM, we show that the generative model can better match the assortativity along
longer paths and, more generally, cross-type diffusion in the observed network. In Section 2,
we discuss SBM and some variant models and show that they consistently underestimate
the observed assortativity on paths of length 2 in 56 school networks, even though these
models explicitly account for assortativity on paths of length 1. In Section 3, we discuss
a general framework for stochastic block models and develop variants that account for
node heterogeneity in brokerage and thereby match assortativity on paths of length 1
and 2 in expectation. In Section 4, we fit our model to the school networks
and show that it closely matches assortativity on paths of length 3, even though this
quantity is not explicitly modeled. In Section 5, we address the goodness of fit of this new
model versus one that does not account for brokerage. We characterize the distribution
of the log-likelihood ratio statistic and argue that the test is valid only if the network is
dense, which is often not the case for social networks.
In the remainder of this paper, we mostly focus on assortativity along paths of length 2 and 3
as opposed to longer paths. While diffusion as a general process can occur across paths of
any length, in many scenarios, especially those that involve access to a valuable
resource, diffusion mostly occurs along short paths. Therefore, while assortativities along
paths of length 2 and 3 do not provide an exact representation of diffusion assortativity,
we believe they provide a simple and interpretable model that is applicable in
most social contexts.