COMBINATORIAL AND ALGEBRAIC PERSPECTIVES ON THE MARGINAL INDEPENDENCE STRUCTURE OF BAYESIAN NETWORKS

DANAI DELIGEORGAKI, ALEX MARKHAM, PRATIK MISRA, AND LIAM SOLUS

Abstract. We consider the problem of estimating the marginal independence structure of a Bayesian network from observational data, learning an undirected graph we call the unconditional dependence graph. We show that unconditional dependence graphs of Bayesian networks correspond to the graphs having equal independence and intersection numbers. Using this observation, a Gröbner basis for a toric ideal associated to unconditional dependence graphs of Bayesian networks is given and then extended by additional binomial relations to connect the space of all such graphs. An MCMC method, called GrUES (Gröbner-based Unconditional Equivalence Search), is implemented based on the resulting moves and applied to synthetic Gaussian data. GrUES recovers the true marginal independence structure via a penalized maximum likelihood or MAP estimate at a higher rate than simple independence tests while also yielding an estimate of the posterior, for which the 20% HPD credible sets include the true structure at a high rate for data-generating graphs with density at least 0.5.

Department of Mathematics, KTH Royal Institute of Technology, Sweden
E-mail addresses: {danaide, markham, pratikm, solus}@kth.se.
2020 Mathematics Subject Classification. Primary 62R01; Secondary 62H22, 60J22, 13F65, 62D20, 05C75.
Key words and phrases. marginal independence, unconditional equivalence, Bayesian networks, causality, toric ideals, Gröbner bases, Markov chain Monte Carlo, intersection number, independence number, minimal covers.
1. Introduction
Directed acyclic graphs (DAGs) are used to model conditional independence and causal relations underlying complex systems of jointly distributed random variables. For a DAG $D = (V, E)$ with node set $V = \{v_1, \ldots, v_n\}$ and edge set $E$, the DAG model $\mathcal{M}(D)$ is the set of probability density functions $f(x_{v_1}, \ldots, x_{v_n})$ satisfying
$$X_{v_i} \perp\!\!\!\perp X_{\mathrm{nondes}_D(v_i) \setminus \mathrm{pa}_D(v_i)} \mid X_{\mathrm{pa}_D(v_i)} \quad \text{for all } i \in \{1, \ldots, n\}, \tag{1}$$
where $\mathrm{pa}_D(v_i) = \{v_j \in V : v_j \to v_i \in E\}$, and $\mathrm{nondes}_D(v_i)$ is the set of $v_j \in V$ for which there is no directed path from $v_i$ to $v_j$ in $D$. A density is Markov to $D$ if it lies in $\mathcal{M}(D)$. Identifying a DAG to which a data-generating distribution is Markov provides rudimentary causal information about the distribution by interpreting (1) as: $X_{v_i}$ is independent of all variables not affected by $X_{v_i}$, given its direct causes.
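To make the sets appearing in (1) concrete, here is a minimal Python sketch (our illustration, not code from the paper) that computes $\mathrm{pa}_D(v)$ and $\mathrm{nondes}_D(v)$ from an edge list, using a depth-first search for descendants:

```python
from collections import defaultdict

def parents_and_nondescendants(nodes, edges):
    """For each node v of a DAG, return the pair (pa(v), nondes(v)).

    nodes: iterable of hashable node labels.
    edges: iterable of (tail, head) pairs, meaning tail -> head.
    """
    children, pa = defaultdict(set), defaultdict(set)
    for v, w in edges:
        children[v].add(w)
        pa[w].add(v)

    def descendants(v):
        # Iterative DFS; by convention, v is a descendant of itself.
        seen, stack = {v}, [v]
        while stack:
            for c in children[stack.pop()] - seen:
                seen.add(c)
                stack.append(c)
        return seen

    nodes = set(nodes)
    return {v: (set(pa[v]), nodes - descendants(v)) for v in nodes}

# Example: the DAG 1 -> 2 -> 3 with the extra edge 1 -> 3.
print(parents_and_nondescendants({1, 2, 3}, [(1, 2), (2, 3), (1, 3)]))
```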
DAG models are fundamental in causal inference, where the aim is to infer causal effects in a complex system [26]. This process often begins with causal discovery, where one estimates a DAG to which the data-generating distribution is Markov. The model $\mathcal{M}(D)$ is characterized by the set of conditional independence relations encoded by the d-separations in $D$ [16]. Hence, DAGs with the same d-separations represent
the same model and form a Markov equivalence class (MEC), limiting identifiability.
With observational data alone, and no additional parametric assumptions on the
data-generating distribution, we can only estimate a DAG up to its MEC [26].
In applications, such as in medicine and biology [27, 31], one often uses additional data collected via interventional experiments (e.g., randomized controlled trials) to refine an MEC. Such experiments typically target a subset of variables in the system, and the choice of these targets affects which elements in the class can be rejected as candidates for the true causal system [8, 12, 23, 35, 38, 42]. To do this efficiently, it is desirable to have good methods for identifying targets. This problem is often addressed via budget constraints, where only a function of the causal graph is learned [1, 5, 24, 23, 35], or by active learning methods that identify optimal targets given the graph estimate from previous experiments [13, 14]. Such methods may be less desirable when a single experiment is time-consuming; for example, in large-scale knock-out experiments in gene regulatory networks [21].
An alternative approach is to identify a single set of targets for individual intervention by estimating a set of possible source nodes in the true underlying causal system. Since $X_v \perp\!\!\!\perp X_w$ in a distribution Markov to a DAG $D$ for any two source nodes $v, w$ of $D$, we can identify the collection of all marginally independent nodes in the system, i.e., all pairs $v, w$ for which $X_v \perp\!\!\!\perp X_w$. Furthermore, models based on such low-order conditional independence relations can still provide useful estimates of causal effects [39, 40] and even isolate relevant biological processes [17, 41]. This can be useful in large systems where estimating a DAG may be infeasible. Hence, estimating the marginal independence structure of the underlying DAG can provide useful information in causal inference.
In this paper, we develop the combinatorial and algebraic theory for modeling and estimating the marginal independence structure of a DAG model. There are several contributions, which we break down as follows:
The combinatorics of unconditional equivalence. In Section 2, we provide a framework for representing the marginal independence structure of a DAG using an (undirected) unconditional dependence graph (UDG). UDGs were previously studied in [34, 20, 39], in which characterizations of DAGs admitting the same marginal independence structure were derived. In Theorem 2.2.3, we add to this theory by providing four characterizations of the UDG of a DAG. We call the set of all DAGs that have the same UDG an unconditional equivalence class (UEC). A UDG is thus a representation of a UEC, which we call a UEC-representative.

Not all undirected graphs are UEC-representatives, but those that are possess several useful combinatorial properties. In Theorem 2.3.4, we show that UEC-representatives are exactly the undirected graphs whose independence number and intersection number are equal. We further observe that UEC-representatives possess a unique minimum edge clique cover that can be identified from any maximum independent set of nodes in the graph. As a corollary, we show that the generally NP-hard problems of computing the independence and intersection numbers (as well as the associated maximum independent sets and minimum edge clique covers) are solvable in polynomial time for UEC-representatives. This section is self-contained and accessible given a background in graph theory.
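The equality in Theorem 2.3.4 can be checked by brute force on small graphs. The following sketch is our illustration (exponential-time, unlike the polynomial-time computation the theorem enables for UEC-representatives), using networkx:

```python
import itertools
import networkx as nx

def independence_number(G):
    # alpha(G) equals the clique number of the complement graph.
    return max(len(c) for c in nx.find_cliques(nx.complement(G)))

def intersection_number(G):
    """Size of a minimum edge clique cover, found by brute force."""
    cliques = [c for c in nx.enumerate_all_cliques(G) if len(c) >= 2]
    edges = {frozenset(e) for e in G.edges()}
    for k in range(1, len(edges) + 1):
        for cover in itertools.combinations(cliques, k):
            covered = {frozenset(p) for c in cover
                       for p in itertools.combinations(c, 2)}
            if edges <= covered:
                return k
    return 0

# The star K_{1,3} has alpha = 3 (the leaves) and minimum edge clique
# cover of size 3 (its edges), so the two numbers agree; it is the UDG
# of the DAG 1 -> 4 <- 2 with 3 -> 4.
G = nx.star_graph(3)  # center 0, leaves 1, 2, 3
print(independence_number(G), intersection_number(G))  # prints: 3 3
```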
The algebra of unconditional equivalence. In Section 3.2, we use our characterization of UEC-representatives to define a toric ideal whose associated fibers contain all UEC-representatives with a specified set of "source nodes" and pairwise intersections and unions of their neighbors. A quadratic and square-free reduced Gröbner basis for this toric ideal is identified (Theorem 3.2.9). By the Fundamental Theorem of Markov Bases [6, 28], the Gröbner basis gives a set of moves for exploring the UEC-representatives within a fiber. In Section 4, we extend these moves via additional binomial operations to a set of moves that completely connects the space of all UEC-representatives on $n$ nodes (Theorem 4.3.1). The resulting connectivity theorem yields a method for exploring the space of UEC-representatives in the language of binomials, which is applied in Section 6.1 to estimate the marginal independence structure of a DAG model. This section uses classic results on Gröbner bases and toric ideals, together with the results derived in Section 2.
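To fix ideas, moves of this kind have the following general shape (illustrative only; the specific generators are those of Theorem 3.2.9, which we do not reproduce here). For the toric ideal $I_A$ of an integer matrix $A$, a quadratic square-free binomial encodes a fiber-preserving exchange:

```latex
% A binomial x_u x_v - x_w x_z lies in the toric ideal I_A exactly when
% the exponent vectors balance under A:
x_u x_v - x_w x_z \in I_A
\iff A(e_u + e_v) = A(e_w + e_z).
% Applying the move swaps the pair {u, v} for {w, z} while keeping the
% statistic A b fixed, which is why a chain built from such moves stays
% inside, and (by the Markov basis property) connects, each fiber.
```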
Complexity reduction. Using the algebraic methods developed in Sections 3 and 4, we obtain a search algorithm over the space of UEC-representatives on $n$ nodes in the language of polynomials. However, the polynomials used in this search are computationally inefficient when implemented directly. To reduce the complexity, in Section 5.1 we introduce the DAG-reduction of a UEC-representative and prove that the search algorithm can be rephrased in terms of DAG-reductions, making an implementation of the search method feasible. To do this, we show that there exist DAGs in a given UEC that are maximal in the UEC with respect to edge inclusion, and we observe that these maximal DAGs form an MEC contained within the UEC. It follows that every UEC can be identified with a unique MEC of DAGs. The completed partially directed acyclic graph (CPDAG) of this MEC is characterized and then used to produce the DAG-reduction of the UEC, which is more efficient than the UEC-representative in terms of both time and space complexity. The results in this section are accessible to readers with knowledge of graphical models.
MCMC estimation of the marginal independence structure of a DAG model. In Section 6, the DAG-reduction search of Section 5 is implemented in the form of a Markov chain Monte Carlo method called GrUES (Gröbner-based Unconditional Equivalence Search). GrUES can completely explore the space of UEC-representatives, thereby making possible the identification of an optimal UEC-representative for the data. GrUES also yields an estimate of the posterior distribution over UEC-representatives, allowing the user to quantify the uncertainty in the estimated marginal independence structure.
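In outline, such a search instantiates a standard Metropolis-Hastings loop. The sketch below shows the generic skeleton only (our illustration; `random_move` and `log_posterior` are placeholders, not the paper's actual Gröbner-based proposals or scoring):

```python
import math
import random

def mcmc_search(init_state, random_move, log_posterior, n_steps=10_000):
    """Generic Metropolis-Hastings over a discrete state space.

    random_move(state) -> (new_state, log_q_ratio), where log_q_ratio is
    log q(state | new_state) - log q(new_state | state) for the proposal.
    """
    state, log_p = init_state, log_posterior(init_state)
    samples = []
    for _ in range(n_steps):
        proposal, log_q_ratio = random_move(state)
        log_p_new = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio * proposal ratio).
        if math.log(random.random()) < log_p_new - log_p + log_q_ratio:
            state, log_p = proposal, log_p_new
        samples.append(state)
    return samples  # The empirical distribution estimates the posterior.
```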
In Section 6.2, we apply GrUES to synthetic data generated from random linear Gaussian DAG models to evaluate its performance empirically. It is benchmarked against pairwise marginal independence testing, with performance evaluated for varying numbers of nodes, graph sparsities, and choices of prior, including a noninformative prior as well as a prior that allows the user to incorporate beliefs about the number of source nodes in the data-generating causal system.
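The baseline can be pictured as follows: a sketch (ours; the paper's exact test and thresholds may differ) of pairwise marginal independence testing for Gaussian data via the Fisher z-transform:

```python
import numpy as np
from scipy import stats

def marginal_independence_graph(X, alpha=0.05):
    """Estimate a UDG by testing each pair for marginal independence.

    X: (num_samples, num_vars) data matrix, assumed jointly Gaussian.
    Returns the set of pairs {i, j} whose independence test rejects.
    """
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    edges = set()
    for i in range(p):
        for j in range(i + 1, p):
            # Fisher z-transform of the sample correlation; under
            # H0: rho_ij = 0 it is approximately N(0, 1/(n - 3)).
            z = np.arctanh(R[i, j])
            p_value = 2 * stats.norm.sf(abs(z) * np.sqrt(n - 3))
            if p_value < alpha:
                edges.add(frozenset((i, j)))
    return edges
```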
We observe that, for relatively sparse or relatively dense models, GrUES identifies the marginal independence structure of the data-generating model at a rate higher than that achieved via simple independence tests. Highest Posterior Density (HPD) credible sets are also estimated, giving relatively fine estimates of the true UDG. These results suggest that GrUES provides an effective method for estimating the marginal independence structure of a DAG model, while allowing for the flexibility of incorporating prior knowledge about the causal system.
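For a discrete posterior estimated by MCMC, an HPD-style credible set can be formed greedily from sample frequencies (a sketch under our assumptions; states must be hashable, e.g., graphs encoded as frozensets of edges, and the paper's exact construction may differ):

```python
from collections import Counter

def hpd_credible_set(samples, level=0.20):
    """Smallest set of states covering at least `level` of the estimated
    posterior mass, taking highest-frequency states first."""
    counts = Counter(samples)
    total, mass, credible = len(samples), 0.0, []
    for state, c in counts.most_common():
        credible.append(state)
        mass += c / total
        if mass >= level:
            break
    return credible

# Usage: hpd_credible_set(mcmc_search(...), level=0.20)
```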
2. Unconditional Dependence

In this section we describe graphical representations for the marginal independence structure of a data-generating distribution Markov to a DAG $D$. Our representative of choice will be an undirected graph called the unconditional dependence graph of the DAG. We begin with some necessary preliminaries.
2.1. Preliminaries. Given a positive integer $n$, we let $[n] := \{1, \ldots, n\}$. Let $D = (V_D, E_D)$ be a directed acyclic graph (DAG) with node set $V_D$ and edge set $E_D$. When it is clear from context, we write $V$ and $E$ for the nodes and edges of $D$, respectively. If $|V| = n$, then the $n \times n$ matrix $A_D = [a_{v,w}]$ in which
$$a_{v,w} = \begin{cases} 1 & \text{if } v \to w \in E, \\ 0 & \text{otherwise,} \end{cases}$$
is called the adjacency matrix of $D$. In an adjacency matrix, we identify $V$ with $[n]$ and order the rows (columns) in increasing order from top to bottom (left to right).
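For concreteness, a small sketch (ours) building $A_D$ with NumPy under these conventions:

```python
import numpy as np

def adjacency_matrix(n, edges):
    """Adjacency matrix of a DAG with nodes identified with 1, ..., n.

    edges: iterable of (v, w) pairs meaning v -> w, with 1-based labels.
    """
    A = np.zeros((n, n), dtype=int)
    for v, w in edges:
        A[v - 1, w - 1] = 1  # entry (v, w) is 1; all others stay 0
    return A

# Example: the DAG 1 -> 2 -> 3 with the extra edge 1 -> 3.
print(adjacency_matrix(3, [(1, 2), (2, 3), (1, 3)]))
```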
The skeleton of a DAG $D$ is the undirected graph given by forgetting the edge directions in $D$. For an undirected graph $U = (V, E)$ on $n$ nodes, the adjacency matrix of $U$ is $A_U = [a_{v,w}]$, where $a_{v,w} = a_{w,v} = 1$ if $vw \in E$ and $a_{v,w} = 0$ otherwise. Vertices $v, w \in V$ are called adjacent if $v \to w$, $w \to v$ or $vw$ is in $E$. Given an undirected graph $U$ and a vertex $v \in V_U$, we let $\mathrm{ne}_U(v)$ denote the open neighborhood of $v$, i.e., the set of nodes adjacent to $v$, and $\mathrm{ne}_U[v] := \mathrm{ne}(v) \cup \{v\}$ the closed neighborhood of $v$. We write $\mathrm{ne}(v)$ and $\mathrm{ne}[v]$ when the graph $U$ is understood. If $v \to w \in E$, then $v$ is a parent of $w$ and $w$ is a child of $v$. A walk is a sequence of nodes $(v_1, \ldots, v_m)$ such that $v_i$ and $v_{i+1}$ are adjacent for all $i \in [m-1]$. A walk in which all nodes are distinct is a path. A walk $(v_1, \ldots, v_m)$ in $D$ is directed (from $v_1$ to $v_m$) if $v_i \to v_{i+1} \in E$ for all $i \in [m-1]$. Since $D$ is acyclic, a directed walk in $D$ is a directed path. If there is a directed path from $v$ to $w$ in $D$, we say $v$ is an ancestor of $w$ and $w$ is a descendant of $v$. For $A \subseteq V$ we define the parents, children, ancestors, and descendants of $A$ to be the union over all parents, children, ancestors and descendants of all nodes in $A$, respectively. We let $\mathrm{pa}_D(A)$, $\mathrm{ch}_D(A)$, $\mathrm{de}_D(A)$, and $\mathrm{an}_D(A)$ denote the set of parents, children, descendants and ancestors of $A$ in $D$, respectively. Note that $v \in \mathrm{de}_D(v)$ and $v \in \mathrm{an}_D(v)$. When the DAG $D$ is understood, we drop the subscript $D$.

A collider is a pair of edges $t \to u$, $w \to u$ (also written $t \to u \leftarrow w$). If $t$ and $w$ are nonadjacent, then $t \to u \leftarrow w$ is further called a v-structure. If a path contains the edges $t \to u$ and $w \to u$, then the vertex $u$ is called a collider on the path. A path is called blocked if it contains a collider. A colliderless path that does not repeat any vertex is called a (simple) trek. (Note that a trek has a more general definition in which edges and vertices may be repeated; we refer to this more general notion as a colliderless walk.) Two subsets $A$ and $B$ of $V$ are d-connected given $\emptyset$ if and only if there is a trek between some $v \in A$ and $w \in B$. We let $A \not\perp_D B$ denote that $A$ and $B$ are d-connected given $\emptyset$ in $D$. We say that $A$ and $B$ are d-separated given $\emptyset$, denoted $A \perp_D B$, if they are not d-connected given $\emptyset$.

Here, we only define d-connection and d-separation given the empty set, as this will be sufficient for this paper. A more general definition, in which $A$ and $B$ are d-connected (d-separated) given a possibly nonempty set $C$, is used to describe the conditional independence relations associated to a DAG $D$. Given a DAG $D = (V, E)$, the DAG model $\mathcal{M}(D)$ is the collection of all distributions that are Markov to $D$ (according to equation (1)).
Theorem 2.1.1 (Lauritzen [16]). The distribution of $(X_1, \ldots, X_n)$ belongs to $\mathcal{M}(D)$ if and only if $X_A \perp\!\!\!\perp X_B \mid X_C$ whenever $A$ and $B$ are d-separated given $C$ in $D$.
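For example, in the v-structure DAG $v_1 \to v_3 \leftarrow v_2$, the only path between $v_1$ and $v_2$ contains the collider $v_3$, so $v_1 \perp_D v_2$, and Theorem 2.1.1 gives $X_{v_1} \perp\!\!\!\perp X_{v_2}$ for every distribution in $\mathcal{M}(D)$. By contrast, $v_1$ and $v_2$ are d-connected given $C = \{v_3\}$, so no independence with conditioning set $\{v_3\}$ is implied.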
An important observation to be made from the above theorem is that two different DAGs $D$ and $D'$ can satisfy $\mathcal{M}(D) = \mathcal{M}(D')$, since it is possible that $D$ and $D'$ have the same set of d-separation statements. Two such DAGs are called Markov equivalent and are said to belong to the same Markov equivalence class (MEC).
2.2. The unconditional dependence graph of a DAG. When considering jointly distributed random variables $X = (X_1, \ldots, X_n)$, the term unconditional independence or marginal independence refers to conditional independence statements of the form $X_A \perp\!\!\!\perp X_B \mid X_C$ where $C = \emptyset$, i.e., the independence relations $X_A \perp\!\!\!\perp X_B$ that hold in the joint distribution. If $X$ is Markov to a DAG $D$ in which $i$ and $j$ are distinct source nodes of $D$, then $X_i \perp\!\!\!\perp X_j$. Hence, learning the marginal independence structure of a model allows us to identify disjoint sets of nodes that contain candidate source nodes for the DAG model, which can be useful in the context of causal inference. This motivates the following definition.

Definition 2.2.1. The unconditional dependence graph of a DAG $D = (V, E)$ is the undirected graph $U_D = (V, \{\{v, w\} : v \not\perp_D w;\ v, w \in V\})$.
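Definition 2.2.1 can be computed directly: two distinct nodes are d-connected given $\emptyset$ exactly when they share a common ancestor (the top of a trek). The following sketch (ours, using the convention $v \in \mathrm{an}_D(v)$) builds $U_D$ from an edge list:

```python
from collections import defaultdict
from itertools import combinations

def unconditional_dependence_graph(nodes, edges):
    """Edges {v, w} of U_D: pairs of distinct nodes joined by a trek,
    i.e., pairs whose ancestor sets intersect (convention: v in an(v)).

    edges: iterable of (tail, head) pairs of the DAG, tail -> head.
    """
    parents = defaultdict(set)
    for v, w in edges:
        parents[w].add(v)

    def ancestors(v):
        # Iterative DFS upward through parents.
        seen, stack = {v}, [v]
        while stack:
            for p in parents[stack.pop()] - seen:
                seen.add(p)
                stack.append(p)
        return seen

    an = {v: ancestors(v) for v in nodes}
    return {frozenset((v, w)) for v, w in combinations(nodes, 2)
            if an[v] & an[w]}

# Example: three sources 1, 2, 3 all pointing at node 4 give a star UDG.
print(unconditional_dependence_graph([1, 2, 3, 4],
                                     [(1, 4), (2, 4), (3, 4)]))
# -> {frozenset({1, 4}), frozenset({2, 4}), frozenset({3, 4})}
```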
When the DAG $D$ is clear from context, we write $U$ for $U_D$. Similar to the case of Markov equivalence of DAGs, it is possible that two distinct DAGs $D$ and $D'$ encode the same set of unconditional d-separation statements $v \perp w$. Two DAGs $D = (V, E_D)$ and $D' = (V, E_{D'})$ are said to be unconditionally equivalent if whenever two nodes $i, j \in V$ are d-separated given $\emptyset$ in $D$, the nodes $i$ and $j$ are d-separated given $\emptyset$ in $D'$, and vice versa; i.e., $U_D = U_{D'}$. Markham et al. [19, Lemma 5] show that unconditional equivalence is indeed an equivalence relation over the family of ancestral graphs (see [29] for a definition) and consequently is also an equivalence relation over DAGs. The collection of all DAGs that are unconditionally equivalent to $D$ is called its unconditional equivalence class (UEC). We represent each unconditional equivalence class of DAGs by its unconditional dependence graph as
$$\{U\} := \{D : U_D = U\}.$$
This is a collection of DAGs that is possibly different from the MEC of $D$. Since UECs are defined in terms of a subset of the d-separations in a DAG, the partition of DAGs on $n$ nodes into MECs is a refinement of the partition of DAGs into UECs; i.e., each UEC can be written as a union of certain MECs. Hence, estimating the unconditional dependence graph $U$ of a DAG $D$ gives a representative of all MECs of DAGs that encode the same set of unconditional independence relations.
We now derive four characterizations of the unconditional dependence graph $U_D$ of a DAG $D$, to be used in methods for estimating the marginal independence structure of $D$ from data. These characterizations are presented in Theorem 2.2.3, whose statement requires the following definitions:
We say that an ordered pair $(v, w)$ (or an edge $v \to w$) is implied by transitivity in $D$ if $v \in \mathrm{an}_D(w) \setminus (\{w\} \cup \mathrm{pa}_D(w))$. The set of maximal ancestors of $A$ in $D$, denoted $\mathrm{ma}_D(A)$, is the set of all $v \in \mathrm{an}_D(A)$ for which $\mathrm{an}_D(v) = \{v\}$. A node $v \in V$ is called a source node of $D$ if $\mathrm{pa}_D(v) = \emptyset$. It follows that $\mathrm{ma}_D(V)$ is the collection of all source nodes in $D$. We say an ordered pair $(v, w)$ (or an edge $v \to w$) is partially weakly covered if $\mathrm{ma}_D(v) \subseteq \mathrm{ma}_D(w)$, $v \notin \mathrm{an}_D(w)$, $w \notin \mathrm{an}_D(v)$ and $\mathrm{pa}_D(v) \neq \emptyset$.
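These ancestor-based sets are cheap to compute. A sketch (ours) for source nodes and transitivity-implied pairs, reusing the upward ancestor search from the UDG sketch above:

```python
from collections import defaultdict

def ancestor_sets(nodes, edges):
    """an(v) for each node of a DAG, with the convention v in an(v)."""
    parents = defaultdict(set)
    for v, w in edges:
        parents[w].add(v)
    an = {}
    def visit(v):
        if v not in an:
            # Union of {v} with the ancestor sets of all parents of v.
            an[v] = {v}.union(*(visit(p) for p in parents[v]))
        return an[v]
    for v in nodes:
        visit(v)
    return an, parents

def sources_and_transitive_pairs(nodes, edges):
    an, parents = ancestor_sets(nodes, edges)
    sources = {v for v in nodes if not parents[v]}   # pa(v) = empty set
    implied = {(v, w) for w in nodes
               for v in an[w] - {w} - parents[w]}    # an(w) \ ({w} u pa(w))
    return sources, implied

# Example: in 1 -> 2 -> 3, node 1 is the unique source and the pair
# (1, 3) is implied by transitivity.
print(sources_and_transitive_pairs([1, 2, 3], [(1, 2), (2, 3)]))
```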