FUNCTIONAL CONNECTOME OF THE HUMAN BRAIN WITH
TOTAL CORRELATION
Qiang Li
Image Processing Laboratory
University of Valencia
Valencia, 46980
qiang.li@uv.es
Greg Ver Steeg
Information Sciences Institute
University of Southern California
Marina del Rey, CA 90292
gregv@isi.edu
Shujian Yu
Machine Learning Group
UiT - The Arctic University of Norway
9037 Tromsø, Norway
shujian.yu@uit.no
Jesus Malo
Image Processing Laboratory
University of Valencia
Valencia, 46980
ABSTRACT
Recent studies proposed the use of Total Correlation to describe functional connectivity among brain regions as a multivariate alternative to conventional pairwise measures such as correlation or mutual information. In this work we build on this idea to infer a large-scale (whole-brain) connectivity network based on Total Correlation and show the possibility of using this kind of network as a biomarker of brain alterations. In particular, this work uses Correlation Explanation (CorEx) to estimate Total Correlation. First, we show that CorEx estimates of total correlation and its clustering results are reliable compared to ground-truth values. Second, the inferred large-scale connectivity network extracted from extensive open fMRI datasets is consistent with existing neuroscience studies but, interestingly, can estimate additional relations beyond pairwise regions. Finally, we show how connectivity graphs based on Total Correlation can also be an effective tool to aid in the discovery of brain diseases.
Keywords: Total Correlation · CorEx · fMRI · Functional Connectivity · Large-Scale Connectome · Biomarkers
1 Introduction
The human brain is a complex system composed of interconnected functional units. Millions of neurons interact with each other at both a structural and a functional level to support efficient inference and processing. Furthermore, the functional connectivity among brain regions reveals how they interact with each other in specific cognitive tasks. Functional connectivity refers to the statistical dependency between activation patterns of various brain regions that emerges as a result of direct and indirect interactions [1, 2]. It is usually measured by how similar neural time series are to each other, i.e., how the time series statistically interact.
A variety of ways to analyze functional connectivity exist. A seed-wise analysis can be performed by selecting a seed-driven hypothesis and analyzing its statistical dependencies with all other voxels outside its limits. It is a common tool for studying how different parts of the brain are connected to one another. Connectivity is determined by calculating the correlation between the time series of each voxel in the brain and the time series of a single seed voxel. Another option is to perform a voxel- or region-of-interest (ROI)-wide analysis, where statistical dependencies on all voxels or ROIs are studied [3]. Structural connectivity refers to the anatomical organization of the brain by means of fiber tracts [4]. Communication between neurons in multiple regions is coordinated dynamically via changes in neural oscillation synchronization [5]. When it comes to the brain connectome, functional connectivity refers to how
different areas of the brain communicate with one another during task-related or resting-state activities [6].

arXiv:2210.03231v4 [q-bio.NC] 14 Nov 2022
November 15, 2022

The use of information-theoretic metrics can efficiently detect these interactions in dynamical brain networks, and such metrics are widely used in the field of neuroscience [7]: for instance, to quantify information encoding and decoding in the neural system [8–11], to measure visual information flow in biological neural networks [12, 13], to study color information processing in the neural cortex [14], and so on. However, although functional connectivity has already become a hot research topic in
neuroscience [15, 16], systematic studies on the information flow or the redundancy and synergy amongst brain regions remain limited. One extreme type of redundancy is full synchronization, where the state of one neural signal may be used to predict the state of any other neural signal; this concept of redundancy is thus viewed as an extension of the standard notion of correlation to more than two variables [17]. Synergy, on the other hand, is analogous to statistical correlations that govern the whole but not its constituent components [18]. High-order brain functions are assumed to require synergies, which provide simultaneous local independence and global cohesion, but synergies are less prominent under high-synchronization situations, such as epileptic seizures [19]. Most functional connectivity approaches until now have mainly concentrated on pairwise relationships between two regions. The conventional measures used to estimate indirect functional connectivity among brain regions are Pearson correlation (CC) [20] and mutual information (I) [8, 21–23]. However, real brain network relationships are often complex, involving more than two regions, and the pairwise dependencies measured by correlation or mutual information cannot reflect these multivariate dependencies. Therefore, recent studies in neuroscience focus on the development of information-theoretic measures that can handle more than two regions simultaneously, such as the Total Correlation [24, 25].
Total Correlation (TC) [26] (also known as multi-information [27–29]) describes the amount of dependence observed in the data and, by definition, can be applied to multiple multivariate variables. Its use to describe functional connectivity in the brain was first proposed as an empirical measure in [24], but in [25] the superiority of TC over mutual information was proved analytically. The consideration of low-level vision models allows one to derive analytical expressions for the TC as a function of the connectivity. These analytical results show that pairwise I cannot capture the effect of different intra-cortical inhibitory connections while the TC can. Similarly, in analytical models with feedback, synergy can be shown using TC, while it is not so obvious using mutual information [25]. Moreover, these analytical results allow one to calibrate computational estimators of TC.
In this work we build on these empirical and theoretical results [24, 25] to infer a larger-scale (whole-brain) network based on TC for the first time. As opposed to [24, 25], where the number of considered nodes was limited to the range of tens and focused on specialized subsystems, here we consider wider recordings [30, 31], so we use signals coming from hundreds of nodes across the whole brain. Additionally, we apply our analysis to data of the same scale for regular and altered brains¹. We also show the possibility of using this kind of wide-range network as a biomarker. From the technical point of view, here we use Correlation Explanation (CorEx) [32, 33] to estimate TC in these high-dimensional scenarios. Furthermore, graph theory and clustering [15, 16] are used here to represent the relationships between the considered regions.
The rest of this paper is organized as follows: Section 2 introduces the necessary information-theoretic concepts and explains CorEx. Sections 3 and 4 present two synthetic experiments showing that CorEx results are reliable. Section 5 estimates large-scale connectomes from fMRI datasets that involve more than 100 regions across the whole brain. Moreover, we show how the analysis of these large-scale networks based on TC may indicate brain alterations. Sections 6 and 7 give a general discussion and the conclusions of the paper, respectively.
2 Total Correlation as neural connectivity descriptor
2.1 Definitions and Preliminaries
Mutual Information: Given two multivariate random variables X_1 and X_2, the mutual information between them, I(X_1; X_2), can be calculated as the difference between the sum of the individual entropies, H(X_i), and the entropy of the variables considered jointly as a single system, H(X_1, X_2) [34]:

    I(X_1; X_2) = H(X_1) + H(X_2) − H(X_1, X_2)    (1)

where for each (multivariate) random variable v, the entropy is H(v) = ⟨− log_2 p(v)⟩ and the brackets represent expectation values over the random variable. Mutual information can also be seen as the information shared by the two variables, or as the reduction of uncertainty in one variable given information about the other [35].
Mutual information is better than linear correlation: For Gaussian sources, mutual information reduces to linear correlation because the entropy factors in Eq. 1 depend only on |⟨X_1 · X_2^⊤⟩|. However, for more general (non-Gaussian) sources, mutual information cannot be reduced to covariance and cross-covariance matrices. In these (more realistic) situations, I is better than linear correlation because it captures nonlinear relations that are ruled out by |⟨X_1 · X_2^⊤⟩|. For an illustration of the qualitative differences between I and linear correlation, see the examples in Section 2.2 of [24]. As a result, mutual information has been proposed as a good alternative to linear correlation for estimating functional connectivity [8, 21]. However, mutual information cannot capture dependencies beyond pairs of nodes, and this may be a limitation in complex networks [36].

1 http://fcon_1000.projects.nitrc.org/indi/ACPI/html/

Figure 1: Conceptual scheme of information-theoretic measures of neural information flow. The left circle areas represent amounts of information, and intersections represent shared information among the corresponding variables, X_0, X_1, X_2. Examples of entropy, H(X_0), H(X_1), H(X_2), and total correlation (red region), TC[X_0, X_1, X_2], are given. The middle panels show neural time series extracted from brain regions, which correspond to the nodes in the right panel. The right panels illustrate large-scale time series in the brain and how coupled information is transmitted among brain regions. The blue and green lines show linear correlation (CC) and mutual information (I), respectively, between different parts of the brain. The modules represent the lobes of the human brain; each module has specific brain regions, and each module works with the others.
Total Correlation: This magnitude describes the dependence among n variables, and it is a generalization of the mutual information concept from two parties to n parties. The Venn diagram in Fig. 1 qualitatively illustrates this for three variables. The definition of total correlation from Watanabe [26] is

    TC(X_1, …, X_n) ≜ Σ_{i=1}^{n} H(X_i) − H(X_1, …, X_n) = D_KL( p(X_1, …, X_n) ‖ ∏_{i=1}^{n} p(X_i) )    (2)

where X ≡ (X_1, …, X_n), and TC can also be expressed as the Kullback-Leibler divergence, D_KL, between the joint probability density and the product of the marginal densities. From this definition, if all variables are independent then TC is zero.
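For discrete data, Eq. 2 can be estimated directly by plugging empirical entropies into the definition. The following minimal sketch (a naive plug-in estimator, not the CorEx method used later in this paper; variable names are illustrative) also shows why TC sees structure that pairwise measures miss: for X_3 = X_1 XOR X_2, every pairwise mutual information is zero, yet TC is one bit.

```python
import numpy as np
from collections import Counter

def entropy(samples):
    """Empirical Shannon entropy (bits) of a sequence of hashable symbols."""
    counts = np.array(list(Counter(samples).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def total_correlation(columns):
    """TC(X1,...,Xn) = sum_i H(Xi) - H(X1,...,Xn), from discrete samples.

    `columns` is a list of equal-length 1-D sequences, one per variable.
    """
    joint = list(zip(*columns))
    return sum(entropy(c) for c in columns) - entropy(joint)

rng = np.random.default_rng(0)
x1 = rng.integers(0, 2, 100000)
x2 = rng.integers(0, 2, 100000)
x3 = x1 ^ x2  # fully determined by (x1, x2), independent of each alone

print(total_correlation([x1, x2]))      # ≈ 0: the pair is independent
print(total_correlation([x1, x2, x3]))  # ≈ 1 bit of purely triple-wise dependence
```

Plug-in estimates like this are only practical for a handful of discrete variables; the curse of dimensionality in estimating the joint entropy is precisely what motivates the CorEx approach of Section 2.2.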
The conditional total correlation is defined like the total correlation but with a condition appended to each term; equivalently, it can be defined through the Kullback-Leibler divergence between the corresponding conditional distributions. The estimation method used in this work (CorEx, presented in the next subsection) uses the TC after conditioning on some other variable Y, which can be defined as [34]:

    TC(X|Y) = Σ_i H(X_i|Y) − H(X|Y) = D_KL( p(x|y) ‖ ∏_{i=1}^{n} p(x_i|y) )    (3)
Total correlation is better than mutual information: This superiority is not only due to the obvious n-wise versus pairwise definitions in Eqs. 1 and 2; it also has to do with the different properties of these magnitudes. To illustrate this point, let us recall one of the analytical examples in [25]. Consider the following feedforward network:

    X_1 → X_2 → e −(f)→ X_3    (4)

where the nodes X_1, X_2, e, and X_3 can have any number of neurons, the first two transforms, X_1 → X_2 → e, are linear and affected by additive noise, and the last transform, f(·), is nonlinear but deterministic. Imagine that in this network one is interested in the connectivity between the neurons in the hidden layer, e, but the nonlinear function f(·) is unknown and one only has experimental access to the signals in the regions X_1, X_2 and X_3. In this situation one could think of measuring I(X_1, X_3) = I(X_1, f(e)) or I(X_2, X_3) = I(X_2, f(e)). However, the invariance of I under arbitrary nonlinear re-parametrization of the variables [35] implies that these measures are insensitive to f and the connectivity therein. On the contrary, as pointed out in [25], using the expression for the variation of TC under nonlinear transforms [13, 37], the variation of H under nonlinear transforms [34], and the definition in Eq. 2, one obtains TC(X_1, X_2, X_3) = [TC(X_1, X_2, e) − TC(e)] + TC(X_3), where the term in the bracket does not depend on f(·), but the last term definitely does, which proves the superiority of TC over I in describing connectivity.
In [25] the network in Eq. 4 specifically refers to the flow from the retina, X_1, to the LGN, X_2, and finally to the visual cortex, e and X_3. However, the result of the superiority of TC over I in describing the connectivity in the hidden layer is totally general for every network with the generic properties listed after Eq. 4.
2.2 Total Correlation estimated from CorEx
Straightforward application of the direct definition of TC is not feasible in high-dimensional scenarios, and alternatives are required [28, 29]. A practical approach to estimate total correlation is via latent factor modelling. A latent factor model is a statistical model that relates a set of observable variables to a set of latent variables. The idea is to explicitly construct latent factors, Y, that somehow capture the dependencies in the data. If we measure dependencies via total correlation, TC(X), then we say that the latent factors explain the dependencies if TC(X|Y) = 0. We can measure the extent to which Y explains the correlations in X by looking at how much the total correlation is reduced:

    TC(X) − TC(X|Y) = Σ_{i=1}^{n} I(X_i; Y) − I(X; Y)    (5)

The total correlation is always non-negative, and the decomposition on the right in terms of mutual information can be verified directly from the definitions. Any latent factor model can be used to lower-bound total correlation, and the terms on the right-hand side of Eq. 5 can be further lower-bounded with tractable estimators using variational methods; Variational Autoencoders (VAEs) are a popular example [38].
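The decomposition in Eq. 5 is an identity that can be verified numerically on a toy common-cause model. In the sketch below (plug-in entropy estimates; the three observed variables and the 10% flip noise are arbitrary illustrative choices), the X_i are independent noisy copies of a latent binary Y, so Y explains essentially all of the dependence and both sides of Eq. 5 coincide:

```python
import numpy as np
from collections import Counter

def H(*cols):
    """Empirical joint entropy (bits) of one or more discrete columns."""
    counts = np.array(list(Counter(zip(*cols)).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(1)
n = 20000
y = rng.integers(0, 2, n)                            # latent common cause
flip = lambda: rng.random(n) < 0.1                   # 10% bit-flip noise
xs = [np.where(flip(), 1 - y, y) for _ in range(3)]  # X_i independent given Y

# Left-hand side of Eq. 5: TC(X) - TC(X|Y), via H(A|Y) = H(A, Y) - H(Y).
tc_x = sum(H(x) for x in xs) - H(*xs)
tc_x_given_y = sum(H(x, y) - H(y) for x in xs) - (H(*xs, y) - H(y))
lhs = tc_x - tc_x_given_y

# Right-hand side of Eq. 5: sum_i I(X_i; Y) - I(X; Y).
mi = lambda a, b: H(a) + H(b) - H(a, b)
rhs = sum(mi(x, y) for x in xs) - (H(*xs) + H(y) - H(*xs, y))

print(lhs, rhs)  # identical up to floating-point error
```

Because the identity holds for any distribution, it holds exactly for the empirical one, so the two sides match to machine precision regardless of sample size.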
Although latent factor models do not give a direct total correlation estimate, as the Rotation-Based Iterative Gaussianization (RBIG) [28, 29] and the matrix-based Rényi entropy [39] do, the approach can be complementary because the construction of latent factors helps in dealing with the curse of dimensionality and in interpreting the dependencies in the data. Compared to CorEx, the main goal of RBIG² is to convert data with any non-Gaussian distribution into a Gaussian distribution through marginal Gaussianization and rotation in order to get TC. The matrix-based Rényi entropy³ estimates multivariate information using Rényi's α-order entropy, which generalizes Shannon's entropy [40]. With these goals in mind, we now describe a particular latent factor approach known as Total Cor-relation Ex-planation (CorEx⁴) [32].
CorEx constructs a factor model by reconstructing latent factors using a factorized probabilistic function of the input data, p(y|x) = ∏_{j=1}^{m} p(y_j|x), with m discrete latent factors, Y_j. This function is optimized to give the tightest possible lower bound for Eq. 5:

    TC(X) ≥ max_{p(y_j|x)} Σ_{i=1}^{n} I(X_i; Y) − I(X; Y) = Σ_{j=1}^{m} ( Σ_{i=1}^{n} α_{i,j} I(X_i; Y_j) − I(Y_j; X) )    (6)

The factorization of the latent factors leads to the term I(X; Y) = Σ_j I(Y_j; X), which can be directly calculated. The term I(X_i; Y) is still intractable and is lower-bounded using the chain rule as I(X_i; Y) ≥ Σ_j α_{i,j} I(X_i; Y_j). Each I(X_i; Y_j) can then be tractably estimated [32, 33]. The free parameters α_{i,j} must be updated while searching for latent factors and optimizing the objective. At t = 0, α_{i,j} is initialized and then updated according to:

    α^{t+1}_{i,j} = (1 − λ) α^{t}_{i,j} + λ α^{**}_{i,j}    (7)

where α^{**}_{i,j} = exp( γ ( I(X_i; Y_j) − max_j I(X_i; Y_j) ) ), and λ and γ are constant parameters. This decomposition allows us to quantify the contribution of each latent factor to the total correlation bound, which aids interpretability.
2 https://isp.uv.es/RBIG4IT.htm
3 http://www.cnel.ufl.edu/people/people.php?name=shujian
4 https://github.com/gregversteeg/CorEx
CorEx can be further extended into a hierarchy of latent factors [33], helping to reveal the hierarchical structure that we expect to play an important role in the brain. The latent factors at layer k explain the dependence of the variables in the layer below:

    TC(X) ≥ Σ_{k=1}^{r} ( Σ_{j=1}^{m} Σ_{i=1}^{n} α^k_{i,j} I(Y^{k−1}_i; Y^k_j) − Σ_{j=1}^{m} I(Y^k_j; Y^{k−1}) )    (8)

Here k gives the layer and Y⁰ ≡ X denotes the observed variables. Ultimately, we have a bound on TC that gets tighter as we add more latent factors and layers, and for which we can quantify the contribution of each factor to the bound. We exploit this decomposition for interpretability [41], as illustrated in Fig. 2. CorEx prefers to find modular or tree-like latent factor models, which are beneficial for dealing with the curse of dimensionality [42]. For neuroimaging, we expect this modular decomposition to be effective because functional specializations in the brain are often associated with spatially localized regions. We explore this hypothesis in the experiments.

Figure 2: CorEx learns a hierarchical latent factor model as illustrated above. Edge thickness indicates the strength of the relationship between factors, and node size indicates how much total correlation is explained by each latent factor.
3 Experiment 1: Total Correlation for independent mixtures
In this experiment, we estimate the total correlation of three independent variables X, Y and Z, each following a Gaussian distribution. For this setup, the ground truth should satisfy TC(X, Y, Z) = 0, and we generated samples of various lengths. The estimated total correlation values are shown in Fig. 3. Here, we compared CorEx with other total correlation estimators: RBIG [28, 29], the matrix-based Rényi entropy [39], Shannon discrete entropy⁵, and the ground truth. The left panel (2-dimensional) shows mutual information, and the middle (3-dimensional) and right (4-dimensional) panels show total correlation. As mentioned above, the simulated data are Gaussian and independent, so their dependency should be zero. We find that CorEx and RBIG both perform very well and are very stable; the matrix-based Rényi entropy improves with increased dimension, while Shannon discrete entropy becomes more accurate as the number of samples increases. All of this makes sense, and it supports the accuracy of total correlation estimation with CorEx. Compared to the other estimators, the main goal of CorEx is to cluster statistically dependent variables based on total correlation, whereas the other estimators mainly focus on directly computing the total correlation value and do not supply visualizations. CorEx gives us a useful connection with graph theory to visualize and show the functional relationships.
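For Gaussian data, the ground truth used in this experiment has a closed form, TC = ½ log( ∏_i Σ_ii / det Σ ), which follows from Eq. 2 and the Gaussian entropy formula. A minimal sanity-check sketch (this is the analytic baseline, not one of the estimators compared in Fig. 3; sample sizes and the 0.8 correlation are illustrative choices):

```python
import numpy as np

def gaussian_tc(samples):
    """Closed-form TC (nats) for approximately Gaussian data:
    TC = 0.5 * ( sum(log(diag(Sigma))) - log(det(Sigma)) )."""
    cov = np.cov(samples, rowvar=False)
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (np.sum(np.log(np.diag(cov))) - logdet)

rng = np.random.default_rng(42)

# Independent Gaussian variables, as in Experiment 1: ground truth TC = 0.
independent = rng.standard_normal((50000, 3))
print(gaussian_tc(independent))  # close to 0

# A correlated counterexample: TC grows as off-diagonal covariance grows.
cov = np.array([[1.0, 0.8, 0.0],
                [0.8, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
correlated = rng.multivariate_normal(np.zeros(3), cov, size=50000)
print(gaussian_tc(correlated))   # close to -0.5*ln(1 - 0.8**2) ≈ 0.51 nats
```

Because the formula only involves the covariance matrix, it captures exactly the linear-correlation part of the dependence; for non-Gaussian data it is only a lower-order summary, which is why the nonparametric estimators above are needed in general.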
4 Experiment 2: Clustering by Total Correlation for dependent and independent mixtures
In this experiment we evaluate the performance of CorEx in clustering tasks. The elements in group X include X_1, X_2, and X_3, which follow Gaussian distributions and are completely independent from each other and from group Y, and variables in
5 https://github.com/nmtimme/Neuroscience-Information-Theory-Toolbox