COMBINATORIAL AND ALGEBRAIC PERSPECTIVES ON THE MARGINAL INDEPENDENCE STRUCTURE OF BAYESIAN NETWORKS

DANAI DELIGEORGAKI, ALEX MARKHAM, PRATIK MISRA, AND LIAM SOLUS

Abstract. We consider the problem of estimating the marginal independence structure of a Bayesian network from observational data, learning an undirected graph we call the unconditional dependence graph. We show that unconditional dependence graphs of Bayesian networks correspond to the graphs having equal independence and intersection numbers. Using this observation, a Gröbner basis for a toric ideal associated to unconditional dependence graphs of Bayesian networks is given and then extended by additional binomial relations to connect the space of all such graphs. An MCMC method, called GrUES (Gröbner-based Unconditional Equivalence Search), is implemented based on the resulting moves and applied to synthetic Gaussian data. GrUES recovers the true marginal independence structure via a penalized maximum likelihood or MAP estimate at a higher rate than simple independence tests while also yielding an estimate of the posterior, for which the 20% HPD credible sets include the true structure at a high rate for data-generating graphs with density at least 0.5.

Department of Mathematics, KTH Royal Institute of Technology, Sweden
E-mail addresses: {danaide, markham, pratikm, solus}@kth.se.
2020 Mathematics Subject Classification. Primary 62R01; Secondary 62H22, 60J22, 13F65, 62D20, 05C75.
Key words and phrases. marginal independence, unconditional equivalence, Bayesian networks, causality, toric ideals, Gröbner bases, Markov chain Monte Carlo, intersection number, independence number, minimal covers.
1. Introduction
Directed acyclic graphs (DAGs) are used to model conditional independence and causal relations underlying complex systems of jointly distributed random variables. For a DAG $D = (V, E)$ with node set $V = \{v_1, \ldots, v_n\}$ and edge set $E$, the DAG model $\mathcal{M}(D)$ is the set of probability density functions $f(x_{v_1}, \ldots, x_{v_n})$ satisfying
$$X_{v_i} \perp\!\!\!\perp X_{\mathrm{nondes}_D(v_i) \setminus \mathrm{pa}_D(v_i)} \mid X_{\mathrm{pa}_D(v_i)} \quad \text{for all } i \in \{1, \ldots, n\}, \tag{1}$$
where $\mathrm{pa}_D(v_i) = \{v_j \in V : v_j \to v_i \in E\}$, and $\mathrm{nondes}_D(v_i)$ is the set of $v_j \in V$ for which there is no directed path from $v_i$ to $v_j$ in $D$. A density is Markov to $D$ if it lies in $\mathcal{M}(D)$. Identifying a DAG to which a data-generating distribution is Markov provides rudimentary causal information about the distribution by interpreting (1) as: $X_{v_i}$ is independent of all variables not affected by $X_{v_i}$, given its direct causes.
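To make the sets appearing in (1) concrete, here is a minimal Python sketch (our illustration, not code from the paper) that computes $\mathrm{pa}_D(v)$ and $\mathrm{nondes}_D(v)$ from an edge list, using a depth-first search for descendants:

```python
from collections import defaultdict

def parents_and_nondescendants(nodes, edges):
    """For each node v of a DAG, return the pair (pa(v), nondes(v)).

    nodes: iterable of hashable node labels.
    edges: iterable of (tail, head) pairs, meaning tail -> head.
    """
    children, pa = defaultdict(set), defaultdict(set)
    for v, w in edges:
        children[v].add(w)
        pa[w].add(v)

    def descendants(v):
        # Iterative DFS; by convention, v is a descendant of itself.
        seen, stack = {v}, [v]
        while stack:
            for c in children[stack.pop()] - seen:
                seen.add(c)
                stack.append(c)
        return seen

    nodes = set(nodes)
    return {v: (set(pa[v]), nodes - descendants(v)) for v in nodes}

# Example: the DAG 1 -> 2 -> 3 with the extra edge 1 -> 3.
print(parents_and_nondescendants({1, 2, 3}, [(1, 2), (2, 3), (1, 3)]))
```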
DAG models are fundamental in causal inference, where the aim is to infer causal effects in a complex system [26]. This process often begins with causal discovery, where one estimates a DAG to which the data-generating distribution is Markov. The model $\mathcal{M}(D)$ is characterized by the set of conditional independence relations encoded by the d-separations in $D$ [16]. Hence, DAGs with the same d-separations represent
the same model and form a Markov equivalence class (MEC), limiting identifiability.
With observational data alone, and no additional parametric assumptions on the
data-generating distribution, we can only estimate a DAG up to its MEC [26].
In applications, such as in medicine and biology [27, 31], one often uses additional data collected via interventional experiments (e.g., randomized controlled trials) to refine an MEC. Such experiments typically target a subset of variables in the system, and the choice of these targets affects which elements in the class can be rejected as candidates for the true causal system [8, 12, 23, 35, 38, 42]. To do this efficiently, it is desirable to have good methods for identifying targets. This problem is often addressed via budget constraints, where only a function of the causal graph is learned [1, 5, 24, 23, 35], or by active learning methods that identify optimal targets given the graph estimate from previous experiments [13, 14]. Such methods may be less desirable when a single experiment is time-consuming; for example, in large-scale knock-out experiments in gene regulatory networks [21].
An alternative approach is to identify a single set of targets for individual intervention by estimating a set of possible source nodes in the true underlying causal system. Since $X_v \perp\!\!\!\perp X_w$ in a distribution Markov to a DAG $D$ for any two source nodes $v, w$ of $D$, we can identify the collection of all marginally independent nodes in the system, i.e., all pairs $v, w$ for which $X_v \perp\!\!\!\perp X_w$. Furthermore, models based on such low-order conditional independence relations can still provide useful estimates of causal effects [39, 40] and even isolate relevant biological processes [17, 41]. This can be useful in large systems where estimating a DAG may be infeasible. Hence, estimating the marginal independence structure of the underlying DAG can provide useful information in causal inference.
In this paper, we develop the combinatorial and algebraic theory for modeling and estimating the marginal independence structure of a DAG model. There are several contributions, which we break down as follows:
The combinatorics of unconditional equivalence. In Section 2, we provide a framework for representing the marginal independence structure of a DAG using an (undirected) unconditional dependence graph (UDG). UDGs were previously studied in [34, 20, 39], in which characterizations of DAGs admitting the same marginal independence structure were derived. In Theorem 2.2.3, we add to this theory by providing four characterizations of the UDG of a DAG. We call the set of all DAGs that have the same UDG an unconditional equivalence class (UEC). A UDG is thus a representation of a UEC, which we call a UEC-representative.

Not all undirected graphs are UEC-representatives, but those that are possess several useful combinatorial properties. In Theorem 2.3.4, we show that UEC-representatives are exactly the undirected graphs whose independence number and intersection number are equal. We further observe that UEC-representatives possess a unique minimum edge clique cover that can be identified from any maximum independent set of nodes in the graph. As a corollary, we show that the generally NP-hard problems of computing the independence and intersection numbers (as well as the associated maximum independent sets and minimum edge clique covers) are solvable in polynomial time for UEC-representatives. This section is self-contained and accessible given a background in graph theory.
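The equality in Theorem 2.3.4 can be checked by brute force on small graphs. The following sketch is our illustration (exponential-time, unlike the polynomial-time computation the theorem enables for UEC-representatives), using networkx:

```python
import itertools
import networkx as nx

def independence_number(G):
    # alpha(G) equals the clique number of the complement graph.
    return max(len(c) for c in nx.find_cliques(nx.complement(G)))

def intersection_number(G):
    """Size of a minimum edge clique cover, found by brute force."""
    cliques = [c for c in nx.enumerate_all_cliques(G) if len(c) >= 2]
    edges = {frozenset(e) for e in G.edges()}
    for k in range(1, len(edges) + 1):
        for cover in itertools.combinations(cliques, k):
            covered = {frozenset(p) for c in cover
                       for p in itertools.combinations(c, 2)}
            if edges <= covered:
                return k
    return 0

# The star K_{1,3} has alpha = 3 (the leaves) and minimum edge clique
# cover of size 3 (its edges), so the two numbers agree; it is the UDG
# of the DAG 1 -> 4 <- 2 with 3 -> 4.
G = nx.star_graph(3)  # center 0, leaves 1, 2, 3
print(independence_number(G), intersection_number(G))  # prints: 3 3
```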
The algebra of unconditional equivalence. In Section 3.2, we use our characterization of UEC-representatives to define a toric ideal whose associated fibers contain all UEC-representatives with a specified set of "source nodes" and pairwise intersections and unions of their neighbors. A quadratic and square-free reduced Gröbner basis for this toric ideal is identified (Theorem 3.2.9). By the Fundamental Theorem of Markov Bases [6, 28], the Gröbner basis gives a set of moves for exploring the UEC-representatives within a fiber. In Section 4, we extend these moves via additional binomial operations to a set of moves that completely connects the space of all UEC-representatives on $n$ nodes (Theorem 4.3.1). The resulting connectivity theorem yields a method for exploring the space of UEC-representatives in the language of binomials, which is applied in Section 6.1 to estimate the marginal independence structure of a DAG model. This section uses classic results on Gröbner bases and toric ideals, together with the results derived in Section 2.
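To fix ideas, moves of this kind have the following general shape (illustrative only; the specific generators are those of Theorem 3.2.9, which we do not reproduce here). For the toric ideal $I_A$ of an integer matrix $A$, a quadratic square-free binomial encodes a fiber-preserving exchange:

```latex
% A binomial x_u x_v - x_w x_z lies in the toric ideal I_A exactly when
% the exponent vectors balance under A:
x_u x_v - x_w x_z \in I_A
\iff A(e_u + e_v) = A(e_w + e_z).
% Applying the move swaps the pair {u, v} for {w, z} while keeping the
% statistic A b fixed, which is why a chain built from such moves stays
% inside, and (by the Markov basis property) connects, each fiber.
```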
Complexity reduction. Using the algebraic methods developed in Sections 3 and 4, we obtain a search algorithm over the space of UEC-representatives on $n$ nodes in the language of polynomials. However, the polynomials used in this search are computationally inefficient when implemented directly. To reduce the complexity, in Section 5.1 we introduce the DAG-reduction of a UEC-representative and prove that the search algorithm can be rephrased in terms of DAG-reductions, making an implementation of the search method feasible. To do this, we show that there exist DAGs in a given UEC that are maximal in the UEC with respect to edge inclusion, and we observe that these maximal DAGs form an MEC contained within the UEC. It follows that every UEC can be identified with a unique MEC of DAGs. The completed partially directed acyclic graph (CPDAG) of this MEC is characterized and then used to produce the DAG-reduction of the UEC, which is more efficient than the UEC-representative in terms of both time and space complexity. The results in this section are accessible to readers with knowledge of graphical models.
MCMC estimation of the marginal independence structure of a DAG model. In Section 6, the DAG-reduction search of Section 5 is implemented in the form of a Markov chain Monte Carlo method called GrUES (Gröbner-based Unconditional Equivalence Search). GrUES can completely explore the space of UEC-representatives, thereby making possible the identification of an optimal UEC-representative for the data. GrUES also yields an estimate of the posterior distribution over UEC-representatives, allowing the user to quantify the uncertainty in the estimated marginal independence structure.
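In outline, such a search instantiates a standard Metropolis-Hastings loop. The sketch below shows the generic skeleton only (our illustration; `random_move` and `log_posterior` are placeholders, not the paper's actual Gröbner-based proposals or scoring):

```python
import math
import random

def mcmc_search(init_state, random_move, log_posterior, n_steps=10_000):
    """Generic Metropolis-Hastings over a discrete state space.

    random_move(state) -> (new_state, log_q_ratio), where log_q_ratio is
    log q(state | new_state) - log q(new_state | state) for the proposal.
    """
    state, log_p = init_state, log_posterior(init_state)
    samples = []
    for _ in range(n_steps):
        proposal, log_q_ratio = random_move(state)
        log_p_new = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio * proposal ratio).
        if math.log(random.random()) < log_p_new - log_p + log_q_ratio:
            state, log_p = proposal, log_p_new
        samples.append(state)
    return samples  # The empirical distribution estimates the posterior.
```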
In Section 6.2, we apply GrUES to synthetic data generated from random linear Gaussian DAG models to evaluate its performance empirically. It is benchmarked against pairwise marginal independence testing, with performance evaluated for varying numbers of nodes, graph sparsities, and choices of prior, including a noninformative prior as well as a prior that allows the user to incorporate beliefs about the number of source nodes in the data-generating causal system.
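The baseline can be pictured as follows: a sketch (ours; the paper's exact test and thresholds may differ) of pairwise marginal independence testing for Gaussian data via the Fisher z-transform:

```python
import numpy as np
from scipy import stats

def marginal_independence_graph(X, alpha=0.05):
    """Estimate a UDG by testing each pair for marginal independence.

    X: (num_samples, num_vars) data matrix, assumed jointly Gaussian.
    Returns the set of pairs {i, j} whose independence test rejects.
    """
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    edges = set()
    for i in range(p):
        for j in range(i + 1, p):
            # Fisher z-transform of the sample correlation; under
            # H0: rho_ij = 0 it is approximately N(0, 1/(n - 3)).
            z = np.arctanh(R[i, j])
            p_value = 2 * stats.norm.sf(abs(z) * np.sqrt(n - 3))
            if p_value < alpha:
                edges.add(frozenset((i, j)))
    return edges
```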
We observe that, for relatively sparse or relatively dense models, GrUES identifies the marginal independence structure of the data-generating model at a rate higher than that achieved via simple independence tests. Highest Posterior Density (HPD) credible sets are also estimated, giving relatively fine estimates of the true UDG. These results suggest that GrUES provides an effective method for estimating the marginal independence structure of a DAG model, while allowing for the flexibility of incorporating prior knowledge about the causal system.
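For a discrete posterior estimated by MCMC, an HPD-style credible set can be formed greedily from sample frequencies (a sketch under our assumptions; states must be hashable, e.g., graphs encoded as frozensets of edges, and the paper's exact construction may differ):

```python
from collections import Counter

def hpd_credible_set(samples, level=0.20):
    """Smallest set of states covering at least `level` of the estimated
    posterior mass, taking highest-frequency states first."""
    counts = Counter(samples)
    total, mass, credible = len(samples), 0.0, []
    for state, c in counts.most_common():
        credible.append(state)
        mass += c / total
        if mass >= level:
            break
    return credible

# Usage: hpd_credible_set(mcmc_search(...), level=0.20)
```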
2. Unconditional Dependence

In this section we describe graphical representations for the marginal independence structure of a data-generating distribution Markov to a DAG $D$. Our representative of choice will be an undirected graph called the unconditional dependence graph of the DAG. We begin with some necessary preliminaries.
2.1. Preliminaries. Given a positive integer $n$, we let $[n] := \{1, \ldots, n\}$. Let $D = (V_D, E_D)$ be a directed acyclic graph (DAG) with node set $V_D$ and edge set $E_D$. When it is clear from context, we write $V$ and $E$ for the nodes and edges of $D$, respectively. If $|V| = n$, then the $n \times n$ matrix $A_D = [a_{v,w}]$ in which
$$a_{v,w} = \begin{cases} 1 & \text{if } v \to w \in E, \\ 0 & \text{otherwise,} \end{cases}$$
is called the adjacency matrix of $D$. In an adjacency matrix, we identify $V$ with $[n]$ and order the rows (columns) in increasing order from top to bottom (left to right).
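For concreteness, a small sketch (ours) building $A_D$ with NumPy under these conventions:

```python
import numpy as np

def adjacency_matrix(n, edges):
    """Adjacency matrix of a DAG with nodes identified with 1, ..., n.

    edges: iterable of (v, w) pairs meaning v -> w, with 1-based labels.
    """
    A = np.zeros((n, n), dtype=int)
    for v, w in edges:
        A[v - 1, w - 1] = 1  # entry (v, w) is 1; all others stay 0
    return A

# Example: the DAG 1 -> 2 -> 3 with the extra edge 1 -> 3.
print(adjacency_matrix(3, [(1, 2), (2, 3), (1, 3)]))
```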
The skeleton of a DAG $D$ is the undirected graph given by forgetting the edge directions in $D$. For an undirected graph $U = (V, E)$ on $n$ nodes, the adjacency matrix of $U$ is $A_U = [a_{v,w}]$, where $a_{v,w} = a_{w,v} = 1$ if $vw \in E$ and $a_{v,w} = 0$ otherwise. Vertices $v, w \in V$ are called adjacent if $v \to w$, $w \to v$ or $vw$ is in $E$. Given an undirected graph $U$ and a vertex $v \in V_U$, we let $\mathrm{ne}_U(v)$ denote the open neighborhood of $v$, i.e., the set of nodes adjacent to $v$, and $\mathrm{ne}_U[v] := \mathrm{ne}(v) \cup \{v\}$ the closed neighborhood of $v$. We write $\mathrm{ne}(v)$ and $\mathrm{ne}[v]$ when the graph $U$ is understood. If $v \to w \in E$, then $v$ is a parent of $w$ and $w$ is a child of $v$. A walk is a sequence of nodes $(v_1, \ldots, v_m)$ such that $v_i$ and $v_{i+1}$ are adjacent for all $i \in [m-1]$. A walk in which all nodes are distinct is a path. A walk $(v_1, \ldots, v_m)$ in $D$ is directed (from $v_1$ to $v_m$) if $v_i \to v_{i+1} \in E$ for all $i \in [m-1]$. Since $D$ is acyclic, a directed walk in $D$ is a directed path. If there is a directed path from $v$ to $w$ in $D$, we say $v$ is an ancestor of $w$ and $w$ is a descendant of $v$. For $A \subseteq V$ we define the parents, children, ancestors, and descendants of $A$ to be the union over all parents, children, ancestors and descendants of all nodes in $A$, respectively. We let $\mathrm{pa}_D(A)$, $\mathrm{ch}_D(A)$, $\mathrm{de}_D(A)$, and $\mathrm{an}_D(A)$ denote the set of parents, children, descendants and ancestors of $A$ in $D$, respectively. Note that $v \in \mathrm{de}_D(v)$ and $v \in \mathrm{an}_D(v)$. When the DAG $D$ is understood, we drop the subscript $D$.

A collider is a pair of edges $t \to u$, $w \to u$ (also written $t \to u \leftarrow w$). If $t$ and $w$ are nonadjacent, then $t \to u \leftarrow w$ is further called a v-structure. If a path contains the edges $t \to u$ and $w \to u$, then the vertex $u$ is called a collider on the path. A path is called blocked if it contains a collider. A colliderless path that does not repeat any vertex is called a (simple) trek. (Note that a trek has a more general definition in which edges and vertices may be repeated; we refer to this more general notion as a colliderless walk.) Two subsets $A$ and $B$ of $V$ are d-connected given $\emptyset$ if and only if there is a trek between some $v \in A$ and $w \in B$. We let $A \not\perp_D B$ denote that $A$ and $B$ are d-connected given $\emptyset$ in $D$. We say that $A$ and $B$ are d-separated given $\emptyset$, denoted $A \perp_D B$, if they are not d-connected given $\emptyset$.

Here, we only define d-connection and d-separation given the empty set, as this will be sufficient for this paper. A more general definition, in which $A$ and $B$ are d-connected (d-separated) given a possibly nonempty set $C$, is used to describe the conditional independence relations associated to a DAG $D$. Given a DAG $D = (V, E)$, the DAG model $\mathcal{M}(D)$ is the collection of all distributions that are Markov to $D$ (according to equation (1)).
Theorem 2.1.1 (Lauritzen [16]). The distribution of $(X_1, \ldots, X_n)$ belongs to $\mathcal{M}(D)$ if and only if $X_A \perp\!\!\!\perp X_B \mid X_C$ whenever $A$ and $B$ are d-separated given $C$ in $D$.
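For example, in the v-structure DAG $v_1 \to v_3 \leftarrow v_2$, the only path between $v_1$ and $v_2$ contains the collider $v_3$, so $v_1 \perp_D v_2$, and Theorem 2.1.1 gives $X_{v_1} \perp\!\!\!\perp X_{v_2}$ for every distribution in $\mathcal{M}(D)$. By contrast, $v_1$ and $v_2$ are d-connected given $C = \{v_3\}$, so no independence with conditioning set $\{v_3\}$ is implied.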
An important observation to be made from the above theorem is that two different DAGs $D$ and $D'$ can satisfy $\mathcal{M}(D) = \mathcal{M}(D')$, since it is possible that $D$ and $D'$ have the same set of d-separation statements. Two such DAGs are called Markov equivalent and are said to belong to the same Markov equivalence class (MEC).
2.2. The unconditional dependence graph of a DAG. When considering jointly distributed random variables $X = (X_1, \ldots, X_n)$, the term unconditional independence or marginal independence refers to conditional independence statements of the form $X_A \perp\!\!\!\perp X_B \mid X_C$ where $C = \emptyset$, i.e., the independence relations $X_A \perp\!\!\!\perp X_B$ that hold in the joint distribution. If $X$ is Markov to a DAG $D$ in which $i$ and $j$ are distinct source nodes of $D$, then $X_i \perp\!\!\!\perp X_j$. Hence, learning the marginal independence structure of a model allows us to identify disjoint sets of nodes that contain candidate source nodes for the DAG model, which can be useful in the context of causal inference. This motivates the following definition.

Definition 2.2.1. The unconditional dependence graph of a DAG $D = (V, E)$ is the undirected graph $U_D = (V, \{\{v, w\} : v \not\perp_D w;\ v, w \in V\})$.
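Definition 2.2.1 can be computed directly: two distinct nodes are d-connected given $\emptyset$ exactly when they share a common ancestor (the top of a trek). The following sketch (ours, using the convention $v \in \mathrm{an}_D(v)$) builds $U_D$ from an edge list:

```python
from collections import defaultdict
from itertools import combinations

def unconditional_dependence_graph(nodes, edges):
    """Edges {v, w} of U_D: pairs of distinct nodes joined by a trek,
    i.e., pairs whose ancestor sets intersect (convention: v in an(v)).

    edges: iterable of (tail, head) pairs of the DAG, tail -> head.
    """
    parents = defaultdict(set)
    for v, w in edges:
        parents[w].add(v)

    def ancestors(v):
        # Iterative DFS upward through parents.
        seen, stack = {v}, [v]
        while stack:
            for p in parents[stack.pop()] - seen:
                seen.add(p)
                stack.append(p)
        return seen

    an = {v: ancestors(v) for v in nodes}
    return {frozenset((v, w)) for v, w in combinations(nodes, 2)
            if an[v] & an[w]}

# Example: three sources 1, 2, 3 all pointing at node 4 give a star UDG.
print(unconditional_dependence_graph([1, 2, 3, 4],
                                     [(1, 4), (2, 4), (3, 4)]))
# -> {frozenset({1, 4}), frozenset({2, 4}), frozenset({3, 4})}
```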
When the DAG $D$ is clear from context, we write $U$ for $U_D$. Similar to the case of Markov equivalence of DAGs, it is possible that two distinct DAGs $D$ and $D'$ encode the same set of unconditional d-separation statements $v \perp w$. Two DAGs $D = (V, E_D)$ and $D' = (V, E_{D'})$ are said to be unconditionally equivalent if whenever two nodes $i, j \in V$ are d-separated given $\emptyset$ in $D$, the nodes $i$ and $j$ are d-separated given $\emptyset$ in $D'$, and vice versa; i.e., $U_D = U_{D'}$. Markham et al. [19, Lemma 5] show that unconditional equivalence is indeed an equivalence relation over the family of ancestral graphs (see [29] for a definition) and consequently is also an equivalence relation over DAGs. The collection of all DAGs that are unconditionally equivalent to $D$ is called its unconditional equivalence class (UEC). We represent each unconditional equivalence class of DAGs by its unconditional dependence graph as
$$\{U\} := \{D : U_D = U\}.$$
This is a collection of DAGs that is possibly different from the MEC of $D$. Since UECs are defined in terms of a subset of the d-separations in a DAG, the partition of DAGs on $n$ nodes into MECs is a refinement of the partition of DAGs into UECs; i.e., each UEC can be written as a union of certain MECs. Hence, estimating the unconditional dependence graph $U$ of a DAG $D$ gives a representative of all MECs of DAGs that encode the same set of unconditional independence relations.
We now derive four characterizations of the unconditional dependence graph $U_D$ of a DAG $D$, to be used in methods for estimating the marginal independence structure of $D$ from data. These characterizations are presented in Theorem 2.2.3, whose statement requires the following definitions:
We say that an ordered pair $(v, w)$ (or an edge $v \to w$) is implied by transitivity in $D$ if $v \in \mathrm{an}_D(w) \setminus (\{w\} \cup \mathrm{pa}_D(w))$. The set of maximal ancestors of $A$ in $D$, denoted $\mathrm{ma}_D(A)$, is the set of all $v \in \mathrm{an}_D(A)$ for which $\mathrm{an}_D(v) = \{v\}$. A node $v \in V$ is called a source node of $D$ if $\mathrm{pa}_D(v) = \emptyset$. It follows that $\mathrm{ma}_D(V)$ is the collection of all source nodes in $D$. We say an ordered pair $(v, w)$ (or an edge $v \to w$) is partially weakly covered if $\mathrm{ma}_D(v) \subseteq \mathrm{ma}_D(w)$, $v \notin \mathrm{an}_D(w)$, $w \notin \mathrm{an}_D(v)$ and $\mathrm{pa}_D(v) \neq \emptyset$.
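These ancestor-based sets are cheap to compute. A sketch (ours) for source nodes and transitivity-implied pairs, reusing the upward ancestor search from the UDG sketch above:

```python
from collections import defaultdict

def ancestor_sets(nodes, edges):
    """an(v) for each node of a DAG, with the convention v in an(v)."""
    parents = defaultdict(set)
    for v, w in edges:
        parents[w].add(v)
    an = {}
    def visit(v):
        if v not in an:
            # Union of {v} with the ancestor sets of all parents of v.
            an[v] = {v}.union(*(visit(p) for p in parents[v]))
        return an[v]
    for v in nodes:
        visit(v)
    return an, parents

def sources_and_transitive_pairs(nodes, edges):
    an, parents = ancestor_sets(nodes, edges)
    sources = {v for v in nodes if not parents[v]}   # pa(v) = empty set
    implied = {(v, w) for w in nodes
               for v in an[w] - {w} - parents[w]}    # an(w) \ ({w} u pa(w))
    return sources, implied

# Example: in 1 -> 2 -> 3, node 1 is the unique source and the pair
# (1, 3) is implied by transitivity.
print(sources_and_transitive_pairs([1, 2, 3], [(1, 2), (2, 3)]))
```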