Dependence matters: Statistical models to identify the
drivers of tie formation in economic networks
Giacomo De Nicola1,*, Cornelius Fritz2, Marius Mehrl3, Göran Kauermann1
Department of Statistics, LMU Munich, Ludwigstr. 33, 80539 Munich, Germany1
Department of Statistics, Pennsylvania State University, 100 Thomas Building,
16802 State College, PA, USA2
School of Politics and International Studies, University of Leeds, LS2 9JT Leeds,
United Kingdom3
Abstract
Networks are ubiquitous in economic research on organizations, trade, and many
other areas. However, while economic theory extensively considers networks, no gen-
eral framework for their empirical modeling has yet emerged. We thus introduce two
different statistical models for this purpose – the Exponential Random Graph Model
(ERGM) and the Additive and Multiplicative Effects network model (AME). Both
model classes can account for network interdependencies between observations, but
differ in how they do so. The ERGM allows one to explicitly specify and test the
influence of particular network structures, making it a natural choice if one is sub-
stantively interested in estimating endogenous network effects. In contrast, AME
captures these effects by introducing actor-specific latent variables affecting their
propensity to form ties. This makes the latter a good choice if the researcher is
interested in capturing the effect of exogenous covariates on tie formation without
having a specific theory on the endogenous dependence structures at play. After
introducing the two model classes, we showcase them through real-world applica-
tions to networks stemming from international arms trade and foreign exchange
activity. We further provide full replication materials to facilitate the adoption of
these methods in empirical economic research.
Keywords – Inferential Network Analysis, Network Data, Endogeneity, Arms Trade,
Foreign Exchange Networks, Statistical Modeling
JEL classification – C20, C49, F14, F31, L14
Declaration of interest: none
1 Introduction
The study of networks has established itself as a central topic in economic research (Jack-
son, 2008). Within the broader context of the study of complex and interdependent sys-
tems (see e.g. Flaschel et al., 1997, 2007, 2018), networks can be defined as interconnected
structures which can naturally be represented through graphs. In the economic litera-
ture, networks have been extensively considered from a theoretical perspective, with the
primary goal of understanding how economic behavior is shaped by interaction patterns
(Jackson and Rogers, 2007). Indeed, the adequate modeling of such interactions has been
described as one of the main empirical challenges in economic network analysis (Jackson
et al., 2017). Research in this direction on, e.g., organizations as networks, diffusion in
networks, network experiments, or network games, is surveyed in Bramoullé et al. (2016),
Jackson (2014), and Jackson et al. (2017). These theoretical advances find application in
* Corresponding author: giacomo.denicola@stat.uni-muenchen.de
arXiv:2210.14860v3 [stat.ME] 7 Jul 2023
many different fields in which network structures naturally arise, such as national and in-
ternational trade, commercial agreements, firms’ organization, and collaboration activity.
However, such advances have not yet been accompanied by a corresponding shift in the
standard methods used to empirically validate them. Some recent contributions (see e.g.
Atalay et al., 2011; Chaney, 2014; Morales et al., 2019) develop estimators tailored specif-
ically to their network-based theoretical models, but more generally applicable modeling
frameworks for the analysis of real-world network data have not yet emerged. Statistical
methods specifically designed to empirically test theories where interdependencies arise
from network structures, such as the Exponential Random Graph Model (ERGM), exist
but are not yet widely used by economists. Jackson (2014), for instance, discusses ERGMs
but argues that they “suffer from proven computational problems” (2014, p.76). Jackson
et al. (2017) further explain that “it is practically impossible to estimate the likelihood of
a given network at even a moderately large scale”, concluding that with ERGMs, “there is
an important computational hurdle that must be overcome in working with data” (2017,
p.85).
Contrasting this assessment, we argue that recent work in the realm of empirical
network analysis provides robust and scalable methods with readily available
implementations in the R statistical software (R Core Team, 2021). Computational issues thus
no longer represent an insurmountable barrier to employing robust inferential network
methods. In this paper, we demonstrate the effectiveness and usability of some of
those methods by applying them to real economic data. We specifically focus on models
which aim to capture the mechanisms leading to network formation, i.e. to measure how
the probability of forming a tie is influenced by (a) nodal characteristics, (b) pairwise
covariates, and (c) the rest of the network. In particular, our focus is on Exponential
Random Graph Models (ERGM) (Robins et al., 2007) and Additive and Multiplicative
Effects (AME) network models (Hoff, 2021), respectively implemented in the R packages
statnet (Handcock et al., 2008) and amen (Hoff, 2015). We find these two model classes
to be among the most promising ones for applications in the economic sciences, as they
are well suited for answering two broad categories of research questions. The ERGM is
an ideal fit if, based on economic theory, the researcher envisages a particular dependence
structure for the existence of ties in the network at hand and wants to test whether their
theory is corroborated by empirical data. On the other hand, AME, and more generally
continuous latent variable models, are a good choice when the researcher is interested in
capturing the effect of exogenous variables on tie formation without having prior knowl-
edge on which endogenous network dependence mechanisms are at play. In this case,
AME offers the possibility to estimate the effect of both nodal and pairwise covariates
while simultaneously controlling for network effects, which may induce bias if ignored (see
Lee and Ogburn, 2021). In addition, the estimated latent structure can provide insight
into the underlying network mechanisms for which the model controls.
The principal aim of this paper is to showcase ERGM and AME by focusing on
their value for economic research. After introducing each model class, we demonstrate
their empirical usage by respectively applying them to two relevant economic questions
stemming from real-world networks. We first use the ERGM to model the international
trade of major conventional weapons, where a directed tie exists if one country transfers
arms to another. In line with Chaney (2014), network effects such as directed triadic
closure (e.g. the positive impact of an increase in the volume of trade between countries
A and B on the probability that country C, that already exports to A, starts exporting
to B) are of explicit theoretical interest in this application, and the ERGM allows for
their proper specification and testing. We then make use of the AME model to study a
historical network of global foreign exchange activity, where a directed edge is present if
one country’s national currency is actively traded within the other country. AME allows us
to estimate how relevant country features, such as per-capita GDP and the gold standard,
and pairwise covariates, such as the distance between two countries and their reciprocal
trade volume, influence tie formation, while controlling for network effects to provide
unbiased estimates. We further compare the two model classes, weighing pros and cons
of each approach and providing guidance on which tool is appropriate for applications to
different empirical settings and research questions. Finally, in addition to a step-by-step
analysis and interpretation of these application cases, we provide full replication code in
our GitHub repository¹, allowing for seamless reproducibility. We, therefore, demonstrate
the “off-the-shelf” applicability of these methods, and offer applied researchers a head-
start in employing them to study substantive economic problems.
Our contribution is related to various strands of the growing literature on economic
networks (e.g. Jackson and Rogers, 2007; Jackson, 2008; Bramoullé et al., 2016). Due
to its focus on economic questions, our work differs from surveys in physics (Newman,
2003), statistics (Goldenberg et al., 2010), or political science (Cranmer et al., 2017).
Several articles provide overviews and surveys of existing economic network models from
a theoretical perspective (Jackson, 2014; Graham, 2015; Jackson et al., 2017; De Paula,
2020). None of these articles concentrates on discussing broadly applicable statistical
modeling frameworks, such as ERGM and AME, from an empirical perspective. In this
sense, our paper is similar in spirit to van der Pol (2019), who, however, focuses only on
ERGM, without comparing alternative approaches. Indeed, one of the goals of this paper
is to shed light on the emerging AME model class (and, more generally, on latent variable
network models) for future applications in the economic literature.
The remainder of the paper is structured as follows. Section 2 discusses existing
literature and presents the mathematical and notational framework used to define and
discuss networks throughout the paper. Section 3 introduces the ERGM and applies it to
the international arms trade network. Section 4 is dedicated to AME and its application to
the global foreign exchange network. Section 5 concludes the paper with a brief discussion
of the two model classes, contrasting their different uses and highlighting pros and cons
of each approach.
2 Economic Networks
2.1 Related literature
Even though network structures naturally arise in many aspects of economics and are
the subject of prominent research in the field, much of the previous literature has ignored
the implied interdependencies, instead opting for regression models assuming ties to be
independent conditional on the covariates (e.g. Anderson and Van Wincoop, 2003; Rose,
2004; Lewer and Van den Berg, 2008). This assumption is often unreasonable in practice.
It would, for example, imply that Germany imposing economic sanctions on Russia is
independent of Italy imposing sanctions on Russia, and, in the directed case, even of
Russia imposing them on Germany itself. While no standard framework for the modeling
of empirical network data has emerged in economics so far, a number of contributions in
¹ https://github.com/gdenicola/statistical-network-analysis-in-economics
– or adjacent to – the field do make use of statistical network models. We briefly survey
these works here to show that the models we present are indeed suitable for the analysis of
economic data. Possibly the most obvious kind of economic network is the international
trade network (see Chaney, 2014) and many of these studies accordingly seek to model the
formation of trade ties. In this vein, two early studies (Ward and Hoff, 2007; Ward et al.,
2013) apply latent position models to show that trade exhibits a latent network structure
beyond what a standard gravity model can capture (see also Fagiolo, 2010; Dueñas and
Fagiolo, 2013). More recently, numerous contributions have used the ERGM to explicitly
theorize and understand network interdependence in general trade (Herman, 2022;
Liu et al., 2022; Smith and Sarabi, 2022) as well as the trade in arms (Thurner et al.,
2019; Lebacher et al., 2021), patents (He et al., 2019), and services (Feng et al., 2021).
That being said, empirical research on economic networks is not limited to trade.
Smith et al. (2019) use multilevel ERGMs to study a production network consisting of
ownership ties between firms at the micro-level and trade ties between countries at the
macro-level, while Mundt (2021) explores the European Union’s sector-level production
network via ERGMs as well as an alternative methodology, the stochastic actor-oriented
model (SAOM). The latter is another prominent tool in the realm of network analysis,
which is suitable for modeling longitudinal network data. As we, in the interest of brevity,
focus on models for static networks (i.e. networks that are observed only at one point
in time), we do not treat the SAOM, and instead refer to Snijders (1996, 2017) for an
introduction to the model class. Returning to empirical research on economic networks,
Fritz et al. (2023) deploy ERGMs to investigate patent collaboration
networks. Studies on foreign direct investments document network influences using latent
position models (Cao and Ward, 2014), or seek to model them via extensions of the ERGM
(Schoeneman et al., 2022). Finally, economists also study networks of interstate alliances
and armed conflict (see e.g. Jackson and Nei, 2015; König et al., 2017), both of which
have been modeled via ERGMs (Cranmer et al., 2012; Campbell et al., 2018) and AME
(Dorff et al., 2020; Minhas et al., 2022). This short survey indicates that both ERGM and
AME can be used to answer questions which are of substantive interest to economists.
2.2 Setup
Before introducing models for networks in which dependencies between ties are expected,
we briefly introduce the mathematical framework for networks, as well as the necessary
notation. Let $\mathbf{y} = (y_{ij})_{i,j=1,\dots,n}$ be the adjacency matrix representing
the observed binary network, comprising $n$ fixed and known agents (nodes). In this context,
$y_{ij} = 1$ indicates an edge from agent $i$ to agent $j$, while $y_{ij} = 0$ translates to
no edge between the two. Since self-loops are not admitted for most studied networks, the
diagonal of $\mathbf{y}$ is left unspecified or set to zero. Depending on the application,
the direction of an edge can carry additional information. If it does, we call the network
directed. In this article, we mainly focus on this type of network. Also note that all
matrix-valued objects are written in bold font for consistency. In addition to the network
connections, we often observe covariate information on the agents, which can be at the
level of single agents (e.g. the GDP of a country) or at the pairwise level (e.g. the
distance between two countries). We denote covariates by $\mathbf{x}_1, \dots, \mathbf{x}_p$,
and our goal is to specify a statistical model for $\mathbf{Y}$, that is, the random variable
corresponding to $\mathbf{y}$, conditional on $\mathbf{x}_1, \dots, \mathbf{x}_p$. A natural
way to do this is to specify a probability distribution over the space of all possible
networks, which we denote by the set $\mathcal{Y}$. Two main characteristics differentiate
our modeling endeavor from classical regression techniques, such as Probit or logistic
regression models. First, for most applications, we only observe one realization
$\mathbf{y}$ from $\mathbf{Y}$, rendering the estimation of the parameters to characterize
this distribution particularly challenging. Second, the entries of $\mathbf{Y}$ are generally
co-dependent; thus, most conditional independence assumptions inherent to common regression
models are violated. Generally, we term mechanisms that induce direct dependence between
edges endogenous, while all effects external to the modeled network, such as covariates,
are called exogenous.
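
The notation above can be made concrete with a small sketch. The following Python snippet is only an illustration under stated assumptions: the three-agent network, the covariate values, and all function names are hypothetical and are not taken from the paper or its replication materials. It builds a binary adjacency matrix with a zero diagonal, stores one nodal and one pairwise covariate, and counts the directed networks on n nodes, which indicates how quickly the space of possible networks grows.

```python
def make_adjacency(n, edges):
    """Build an n x n binary adjacency matrix from directed edges (i, j)."""
    y = [[0] * n for _ in range(n)]
    for i, j in edges:
        if i == j:
            raise ValueError("self-loops are not admitted")
        y[i][j] = 1  # y[i][j] = 1 encodes an edge from agent i to agent j
    return y


def is_asymmetric(y):
    """Return True if some y[i][j] differs from y[j][i], i.e. direction matters."""
    n = len(y)
    return any(y[i][j] != y[j][i] for i in range(n) for j in range(n) if i != j)


def count_possible_networks(n):
    """Number of directed binary networks on n nodes without self-loops."""
    return 2 ** (n * (n - 1))


# Hypothetical toy network with three agents: 0 -> 1, 1 -> 0 and 0 -> 2.
y = make_adjacency(3, [(0, 1), (1, 0), (0, 2)])

# Made-up covariates: one nodal (e.g. GDP) and one pairwise (e.g. distance).
gdp = [1.2, 0.8, 2.5]
distance = [[0.0, 1.0, 2.0],
            [1.0, 0.0, 3.0],
            [2.0, 3.0, 0.0]]

print(y)
print(is_asymmetric(y))
print(count_possible_networks(3))
```

Even for this toy case with three agents there are 64 possible directed networks, while only a single one of them is observed.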
3 The Exponential Random Graph Model
The ERGM is one of the most popular models for analyzing network data. First intro-
duced by Holland and Leinhardt (1981) as a model class that builds on the platform of
exponential families, it was later extended with respect to fitting algorithms and more
complex dependence structures (Lusher et al., 2012; Robins et al., 2007). We next intro-
duce the model step-by-step to highlight its ability to progressively generalize by building
on conditional dependence assumptions.
3.1 Accounting for dependence in networks
We begin with the simplest possible stochastic network model, the Erdös-Rényi-Gilbert
model (Erdös and Rényi, 1959; Gilbert, 1959), where all edges are assumed to be
independent and to have the same probability of being observed. In stochastic terms, each
observed tie is then a realization of a binomial random variable with success probability
$\pi$, which yields
$$P_\pi(\mathbf{Y} = \mathbf{y}) = \prod_{i=1}^{n} \prod_{j \neq i} \pi^{y_{ij}} (1 - \pi)^{1 - y_{ij}} \tag{1}$$
for the probability to observe $\mathbf{y}$. Evidently, model (1), which implies equal
probability for all possible ties, is too restrictive to be applied to real world problems.
In the next step, we, therefore, additionally incorporate covariates $x_{ij}$ by letting
$\pi$ vary depending on those covariates, leading to edge-specific probabilities $\pi_{ij}$.
Following the common practice in logistic regression, we parameterize the log-odds by
$\log\{\pi_{ij} / (1 - \pi_{ij})\} = \theta^\top x_{ij}$, where $x_{ij}$ is a vector of
exogenous statistics with the first entry set to 1 to incorporate an intercept, and get
$$P_\theta(\mathbf{Y} = \mathbf{y}) = \prod_{i=1}^{n} \prod_{j \neq i} \left( \frac{\exp\{\theta^\top x_{ij}\}}{1 + \exp\{\theta^\top x_{ij}\}} \right)^{y_{ij}} \left( \frac{1}{1 + \exp\{\theta^\top x_{ij}\}} \right)^{1 - y_{ij}}. \tag{2}$$
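
As a hedged illustration of models (1) and (2), the sketch below evaluates both log-probabilities for a made-up three-agent network. All names and values are hypothetical. It also checks numerically that model (2) reduces to model (1) when $x_{ij}$ contains only the intercept and $\theta$ equals the log-odds of $\pi$.

```python
import math


def log_prob_homogeneous(y, pi):
    """Log of equation (1): every ordered pair i != j has the same tie probability pi."""
    n = len(y)
    n_edges = sum(y[i][j] for i in range(n) for j in range(n) if i != j)
    n_dyads = n * (n - 1)
    return n_edges * math.log(pi) + (n_dyads - n_edges) * math.log(1.0 - pi)


def log_prob_dyad_independent(y, x, theta):
    """Log of equation (2): independent ties with log-odds theta^T x_ij."""
    n = len(y)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # the diagonal is not part of the model
            eta = sum(t * v for t, v in zip(theta, x[i][j]))  # theta^T x_ij
            # y_ij * log(pi_ij) + (1 - y_ij) * log(1 - pi_ij), written via eta
            total += y[i][j] * eta - math.log1p(math.exp(eta))
    return total


# Hypothetical toy network: edges 0 -> 1, 1 -> 0 and 0 -> 2.
y = [[0, 1, 1],
     [1, 0, 0],
     [0, 0, 0]]

# Intercept-only covariate vectors x_ij = (1,), so model (2) collapses to model (1).
x = [[(1.0,) for _ in range(3)] for _ in range(3)]
pi = 0.3
theta = [math.log(pi / (1.0 - pi))]

lp_model1 = log_prob_homogeneous(y, pi)
lp_model2 = log_prob_dyad_independent(y, x, theta)
print(lp_model1, lp_model2)
```

Working on the log scale avoids multiplying many small probabilities, which is a design choice of this sketch rather than something prescribed by the text.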
From (2), the analogy to standard logistic regression being a special case of generalized
linear models (Nelder and Wedderburn, 1972) becomes apparent. The joint distribution
of $\mathbf{Y}$ can be formulated in exponential family form, yielding
$$P_\theta(\mathbf{Y} = \mathbf{y} \mid x) = \frac{\exp\{\theta^\top s(\mathbf{y})\}}{\kappa(\theta)}, \tag{3}$$
where $s(\mathbf{y}) = (s_1(\mathbf{y}), \dots, s_p(\mathbf{y}))$,
$s_q(\mathbf{y}) = \sum_{i=1}^{n} \sum_{j \neq i} y_{ij} x_{ij,q}$ for all $q = 1, \dots, p$,
with $x_{ij,q}$ as the $q$th entry in $x_{ij}$, and
$\kappa(\theta) = \prod_{i=1}^{n} \prod_{j \neq i} (1 + \exp\{\theta^\top x_{ij}\})$.
In the jargon of exponential families, we term $s(\mathbf{y})$ sufficient statistics.
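
The step from (2) to (3) follows by collecting the numerators of (2) into $\exp\{\theta^\top s(\mathbf{y})\}$ and the denominators into $\kappa(\theta)$. The sketch below is a numerical check of this equivalence on made-up data, not code from the paper. The toy network, the distance covariate, the parameter values, and the function names are all assumptions for illustration.

```python
import math


def linear_predictor(theta, x_ij):
    """Compute theta^T x_ij for one ordered pair."""
    return sum(t * v for t, v in zip(theta, x_ij))


def sufficient_statistics(y, x, p):
    """s_q(y) = sum over i != j of y_ij * x_ij,q, for q = 1, ..., p."""
    n = len(y)
    return [sum(y[i][j] * x[i][j][q]
                for i in range(n) for j in range(n) if i != j)
            for q in range(p)]


def log_kappa(x, theta):
    """Log of the normalizing constant: sum over i != j of log(1 + exp(theta^T x_ij))."""
    n = len(x)
    return sum(math.log1p(math.exp(linear_predictor(theta, x[i][j])))
               for i in range(n) for j in range(n) if i != j)


def log_prob_exponential_family(y, x, theta):
    """Log of equation (3): theta^T s(y) - log kappa(theta)."""
    s = sufficient_statistics(y, x, len(theta))
    return sum(t * sq for t, sq in zip(theta, s)) - log_kappa(x, theta)


def log_prob_product_form(y, x, theta):
    """Log of equation (2), evaluated pair by pair."""
    n = len(y)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                eta = linear_predictor(theta, x[i][j])
                total += y[i][j] * eta - math.log1p(math.exp(eta))
    return total


# Hypothetical toy data: edges 0 -> 1, 1 -> 0, 0 -> 2 and a made-up distance covariate.
y = [[0, 1, 1],
     [1, 0, 0],
     [0, 0, 0]]
distance = [[0.0, 1.0, 2.0],
            [1.0, 0.0, 3.0],
            [2.0, 3.0, 0.0]]
# x_ij = (1, distance_ij): the first entry is the intercept.
x = [[(1.0, distance[i][j]) for j in range(3)] for i in range(3)]
theta = [-0.5, 0.3]  # arbitrary illustrative values, not estimates

s = sufficient_statistics(y, x, len(theta))
lp_expfam = log_prob_exponential_family(y, x, theta)
lp_product = log_prob_product_form(y, x, theta)
print(s, lp_expfam, lp_product)
```

In this example the first sufficient statistic is the number of edges and the second is the total distance summed over the observed edges, so the whole network enters the probability only through these two numbers.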