Measure-Theoretic Probability of Complex Co-occurrence and E-Integral

arXiv:2210.09913v1 [stat.ML] 18 Oct 2022
MEASURE-THEORETIC PROBABILITY OF COMPLEX CO-OCCURRENCE
AND E-INTEGRAL
BY JIAN-YONG WANG1 AND HAN YU2,*
1School of Mathematical Sciences, Xiamen University, Xiamen 361005, China. jywang@xmu.edu.cn
2Department of Applied Statistics and Research Methods, University of Northern Colorado, Greeley, CO 80639, USA.
*han.yu@unco.edu
Complex high-dimensional co-occurrence data increasingly arise from complex systems of interacting physical, biological and social processes, observed in discretely indexed modifiable areal units or at continuously indexed locations of a study region for landscape-based mechanisms. Modeling, predicting and interpreting complex co-occurrences are general and fundamental problems of statistical and machine learning in a broad variety of modern real-world applications. Probability and conditional probability of co-occurrence are defined in a general setting with set functions to develop a rigorous measure-theoretic foundation for data sparseness, a main challenge inherent to probabilistic modeling and reasoning about co-occurrence in statistical inference. The behavior of a class of natural integrals called E-integrals is investigated based on the defined conditional probability of co-occurrence, and results on the properties of the E-integral are presented. The paper offers a novel measure-theoretic framework in which the E-integral, as a basic measure-theoretic concept, can be the starting point for the expectation functional approach to the development of probability theory preferred by Whittle (1992) and Pollard (2001). The framework addresses the inherent challenge of co-occurrences emerging in modern high-dimensional co-occurrence data problems and opens the door to more sophisticated research in complex high-dimensional co-occurrence data science.
1. Introduction. In practice, a large number of events occur simultaneously in a complex system of interacting physical, biological and social processes, in discretely indexed modifiable areal units or at continuously indexed locations of a study region. Heterogeneous high-dimensional data are nowadays the rule in health, medicine, epidemiology, technology, econometrics, business, finance, sociology, and political science, to name just a few. In public health, multimorbidity, the co-existence of multiple chronic conditions in an individual in different time periods [25,38,37], has been identified as one of the major health system concerns of the twenty-first century [41,47]. Not only older adults but also a substantial number of young and middle-aged people have multimorbidity [6,2,21]. In computer vision and text mining, words and images are collected as co-occurrence data to match textual information hidden in words with visible information hidden in images, as defined by a network for interpretation [9,45,59]. In the social sciences, the semantic network of words appearing together in texts, such as newspaper articles, political speeches, novels and fiction, or transcripts of debates, has been emerging as a basis to test existing hypotheses, formulate theories, understand prevailing discourses, and tailor messages more effectively within society [56].
MSC2020 subject classifications: Primary 60A05, 60A10; secondary 60C05.
Keywords and phrases: High Dimensional Co-occurrence, Conditioning, E-integral, Expectation Approach,
Nonparametric Structural Equation Models, Kernel, Data Sparseness.
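[Figure 1: diagram relating the background probability space $(\Omega,\mathcal{F},P)$, the model $\mathcal{M} = \langle I, J, \mathcal{X}, \mathcal{E}, f, P_{\mathcal{E}} \rangle$ and the spaces $(\mathcal{X}_i,\mathcal{A}_i,P^{X_i})$, with the labels "inference", "explanation", "Model" and "Co-occurrence: $i \in I$".]
FIGURE 1. Mechanism-based explanation and interpretation of co-occurrences from a fixed probability space $(\Omega,\mathcal{F},P)$ that sits in the background.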
Modeling, interpreting and predicting co-occurrences of events is a general and fundamental problem of statistical and machine learning. It has a wide variety of modern real-world applications in information retrieval, natural language processing, computer vision, data mining, remote sensing data analysis, joint models of different types of distributions, measurement error models, space-time coregionalization models, and spatio-temporal dynamic models. Many of the research questions in these applications attempt to infer co-occurrence relationships in the context of probabilistic modeling and analysis of association and causation, with statistical evidence collected under conditions where fully randomized controlled experiments are not possible. The ultimate goal of statistical modeling is to interpret and explain the co-occurrence data with a probabilistic model, as illustrated in Figure 1. Furthermore, in the sciences one is often interested in inferring causal and counterfactual links between variables beyond association. The causal relationships between the variables are expressed as a configuration of deterministic, functional relationships, and probabilities are introduced through the assumption that certain variables are exogenous latent random variables. For mechanism-based interpretation and explanation of co-occurrence data [29], advanced spatio-temporal structural equation models (SEMs) can be considered fundamental representations of complex co-occurrence problems. To formulate such a question in general, we consider structural causal models (SCMs), also known as nonparametric structural equation models (NPSEMs). A structural causal model is a tuple
(1)    $\mathcal{M} = \langle I, J, \mathcal{X}, \mathcal{E}, f, P_{\mathcal{E}} \rangle,$
where $I$ is an index set of endogenous variables, $J$ is an index set of exogenous variables, $\mathcal{X} = \prod_{i \in I} \mathcal{X}_i$, or just $\mathcal{X}_I$ for short, is the product space of the endogenous variables with each $\mathcal{X}_i$ being a measurable space on which interventions are set, $\mathcal{E} = \prod_{j \in J} \mathcal{E}_j$, or just $\mathcal{E}_J$ for short, is the product space of the exogenous variables with each $\mathcal{E}_j$ being a measurable space, $P_{\mathcal{E}}$ is a probability measure on $\mathcal{E}$ for the exogenous unstructured disturbance noise, and $f : \mathcal{X} \times \mathcal{E} \to \mathcal{X}$ is a measurable function that specifies the causal mechanism encoded in the structural equations [12]. This modeling allows us to represent interventions unambiguously by changing the causal mechanisms that target specific endogenous variables, and to encode the structural properties of the functional relations in a graph with index sets. A pair of random variables $(X_I, E_J)$ is a solution of the SCM $\mathcal{M} = \langle I, J, \mathcal{X}, \mathcal{E}, f, P_{\mathcal{E}} \rangle$ if the distribution of the perturbation $E_J$ is equal to $P_{\mathcal{E}_J}$ (i.e., $P^{E_J} = P_{\mathcal{E}_J}$) and the structural equations are satisfied (i.e., $X_I = f(X_I, E_J)$ a.s.). The endogenous $X_I$ is observable, while the exogenous random variables $E_J$ are latent. For a solution $X_I$, we call the distribution $P^{X_I}$ the observational distribution of $\mathcal{M}$ associated to $X_I$. SCMs arise in genetics [66], econometrics [28], electrical engineering [39,40] and the social sciences [18,26]. SCMs are widely used for causal modeling [11,49,51,57], and corresponding statistical methods have been developed for causal inference [13,34,43,44,52].
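As a minimal sketch of what a solution of an SCM looks like, the Python snippet below simulates a hypothetical acyclic model with two endogenous and two exogenous variables. The structural equations, the coefficient 2.0, the Gaussian noise and the helper names (`sample_exogenous`, `f`, `solve`) are illustrative assumptions and are not taken from the paper or from [12].

```python
import random


def sample_exogenous(rng):
    """Draw (E1, E2) from P_E; independent standard normals are an assumption."""
    return rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)


def f(x, e):
    """Hypothetical causal mechanism f: X x E -> X with X1 = E1, X2 = 2*X1 + E2."""
    x1, _ = x
    e1, e2 = e
    return (e1, 2.0 * x1 + e2)


def solve(e, do_x1=None):
    """Solve X = f(X, E) by substitution in causal order.

    Passing do_x1 replaces the mechanism of X1 by a constant (an intervention).
    """
    x1 = e[0] if do_x1 is None else do_x1
    x2 = 2.0 * x1 + e[1]
    return (x1, x2)


rng = random.Random(0)          # fixed seed so the sketch is reproducible
e = sample_exogenous(rng)       # latent exogenous draw
x = solve(e)                    # observable endogenous solution
x_do = solve(e, do_x1=1.0)      # solution of the intervened model do(X1 = 1)

# The pair (x, e) satisfies the structural equations X = f(X, E).
assert x == f(x, e)
```

In this toy model the solution is unique for every noise draw because the graph is acyclic; cyclic models, which the general definition allows, would need a fixed-point solver instead of substitution.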
The research questions in these areas continue to increase in complexity with the advance of technology. As higher-order co-occurrences (e.g., co-occurrences in triples, quadruples, etc.) are observed, the intrinsic data sparseness problem of co-occurrence data becomes more urgent than for co-occurrence in pairs [31]. The complexity of encoding the model and describing the raw data conditioned on that model is encoded in a class of index sets. The class of index sets includes parents of variables and sufficient or admissible sets for adjustment in the causal inference context [49]. The class of index sets plays a critical role in the identifying assumptions underlying all causal inferences, the languages used in formulating the assumptions, the conditional nature of all causal and counterfactual claims, and the methods developed for the assessment of such claims [50]. Index sets are also critical in capturing the significant variations important to the process being modeled and in understanding what is measured and perceived [24]. A physical, biological or social process cannot be modeled and identified successfully to answer causal questions unless data are available at appropriate indices with appropriate structure. Many researchers have shown that different scales of the index set, and hence different aggregations, often lead to contradictory interpretations; such paradoxes are referred to as "ecological fallacies" [55,32] or are known as the modifiable areal unit problem (MAUP) in spatial analysis and geographical information science [48,3,59]. The measure-theoretic treatment of the sparseness problem provides insights into the problem of co-occurrence data on a unified foundation. In the culture of data science, we pursue the fundamental interpretation and understanding of scientific problems arising from co-occurrence events.
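The sparseness of higher-order co-occurrences mentioned above can be made concrete with a small count. The sketch below uses a hypothetical toy data set (five records over five items, invented for illustration) and reports, for each order, the fraction of all possible item tuples that are observed together at least once.

```python
from itertools import combinations
from math import comb

# Hypothetical toy data: each record is the set of items observed together
# (e.g. chronic conditions of one patient or words in one sentence).
records = [
    {"a", "b", "c"},
    {"a", "b"},
    {"b", "c", "d"},
    {"a", "d"},
    {"c", "e"},
]
vocabulary = sorted(set().union(*records))


def coverage(order):
    """Fraction of all `order`-tuples of items that co-occur in some record."""
    observed = set()
    for record in records:
        observed.update(combinations(sorted(record), order))
    return len(observed) / comb(len(vocabulary), order)


for order in (2, 3, 4):
    print(order, coverage(order))
```

With the data fixed, the coverage can only shrink as the order grows, since every observed higher-order tuple requires a single record containing all of its items.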
Let $(\Omega,\mathcal{F},P)$ be a given probability space. In practice the space might be geographic space, socio-economic space, or more generally network space as an abstraction of reality. We usually call an element $A \in \mathcal{F}$ an event, and $P(A)$ is the probability of occurrence of event $A$, abbreviated as the probability of event $A$. In many classical books on probability or measure (see, e.g., [8], [19], [10], [67], [15], [4]), the definitions of conditional probability and conditional expectation are well given. For example, given $A, B \in \mathcal{F}$ with $P(B) > 0$, the conditional probability of event $A$ given event $B$ is $P(A \cap B)/P(B)$. Note that, since $A \cap B \in \mathcal{F}$, $A \cap B$ is also an event, namely the event that $A$ and $B$ occur simultaneously. $P(A \cap B)$ is thus naturally called the probability of the event that events $A$ and $B$ occur simultaneously. We will extend $P(A \cap B)$ as the probability of co-occurrence of $A$ and $B$ in Section 2 to accommodate the complex problems of co-occurrence emerging in modern data science and proceed to the fundamental measure-theoretic treatment.
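The classical quantities $P(A \cap B)$ and $P(A \cap B)/P(B)$ recalled above can be checked on a finite probability space. The sketch below assumes a hypothetical example, two fair dice with the uniform measure, and uses exact rational arithmetic; the events `A` and `B` are chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered outcomes of two fair dice, with the uniform measure.
omega = list(product(range(1, 7), repeat=2))


def prob(event):
    """P(event) for an event given as a predicate on sample points."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))


def A(w):
    """Event A: the first die is even."""
    return w[0] % 2 == 0


def B(w):
    """Event B: the sum of the dice is at least 10."""
    return w[0] + w[1] >= 10


p_joint = prob(lambda w: A(w) and B(w))   # P(A ∩ B): A and B occur together
p_cond = p_joint / prob(B)                # P(A | B) = P(A ∩ B) / P(B)
```

Here `p_joint` counts the four outcomes (4,6), (6,4), (6,5), (6,6) out of 36, and dividing by `prob(B)` renormalizes to the six outcomes in `B`.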
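We deal with many σ-fields in modern applications, not just the single σ-field that is the concern of measure theory. Consider a random object $X(\omega)$ on a given probability space $(\Omega,\mathcal{F},P)$. There exists an objective measurable space $(\Omega_1,\mathcal{F}_1)$ such that $X : \Omega \to \Omega_1$ is $\mathcal{F}\backslash\mathcal{F}_1$-measurable [4]. As an illustration, take $X$ as the identity mapping $I$ on $\Omega$; then $I$ is the unique mapping from $\Omega$ to $\Omega$ such that $I(\omega) = \omega$, $\forall \omega \in \Omega$. Further, let $I$ be an identity random object on $\Omega$ equipped with $\mathcal{F}$; then there exists an objective measurable space $(\Omega,\mathcal{G})$, where $\mathcal{G}$ is a sub σ-field of $\mathcal{F}$, such that the identity mapping $I$ on $(\Omega,\mathcal{F},P)$ is $\mathcal{F}\backslash\mathcal{G}$-measurable. Whenever a clear distinction is needed, the identity random object from $(\Omega,\mathcal{F})$ to $(\Omega,\mathcal{F})$ is denoted by $I(\mathcal{F},\mathcal{F})$, just $I_{\mathcal{F}}$ for short, or $I_0$ when emphasized as a reference, and the one from $(\Omega,\mathcal{F})$ to $(\Omega,\mathcal{G})$ $(\mathcal{G} \subset \mathcal{F})$ is denoted by $I(\mathcal{F},\mathcal{G})$, or just $I_{\mathcal{G}}$ for short. In other words, the identity random object on $(\Omega,\mathcal{F},P)$ is not unique, since $I(\mathcal{F},\mathcal{G})$ is obviously different from $I(\mathcal{F},\mathcal{F})$. If $X(\omega)$ is a random object from $(\Omega,\mathcal{F})$ to $(\Omega,\mathcal{F})$, then for a sub σ-field $\mathcal{G}$ of $\mathcal{F}$, we denote by $X_{\mathcal{G}}$ the random object from $(\Omega,\mathcal{F})$ to $(\Omega,\mathcal{G})$ such that $X_{\mathcal{G}}(\omega) = X(\omega)$, $\forall \omega \in \Omega$. Then $X_{\mathcal{G}}$ is different from $X_{\mathcal{F}}$.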
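The point that the same pointwise map gives different random objects for different target σ-fields can be checked by brute force on a finite set. The sketch below is an illustration under assumptions: a hypothetical four-point $\Omega$, with $\mathcal{F}$ the power set and $\mathcal{G}$ the σ-field generated by the partition {1,2}, {3,4}. It tests measurability by checking that every preimage lies in the source σ-field.

```python
from itertools import chain, combinations


def power_set(points):
    """All subsets of a finite set, as frozensets."""
    pts = list(points)
    subsets = chain.from_iterable(
        combinations(pts, r) for r in range(len(pts) + 1)
    )
    return {frozenset(s) for s in subsets}


def is_measurable(mapping, domain, source_field, target_field):
    """True if the preimage of every target set belongs to the source field."""
    for target_set in target_field:
        preimage = frozenset(w for w in domain if mapping(w) in target_set)
        if preimage not in source_field:
            return False
    return True


omega = {1, 2, 3, 4}
F = power_set(omega)                                   # the full sigma-field
G = {frozenset(), frozenset({1, 2}),
     frozenset({3, 4}), frozenset(omega)}              # a sub sigma-field of F


def identity(w):
    """The identity mapping on omega."""
    return w


# Identity from (omega, F) to (omega, G): measurable because G is inside F.
coarse_target_ok = is_measurable(identity, omega, F, G)
# Identity from (omega, G) to (omega, F): fails, e.g. {1} is not in G.
fine_target_ok = is_measurable(identity, omega, G, F)
```

The two checks use the same function `identity`; only the pair of σ-fields changes, which is exactly the distinction the notation $I(\mathcal{F},\mathcal{G})$ versus $I(\mathcal{F},\mathcal{F})$ records.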
Contributions. Motivated by the important role of sets of indices in formulating the identifying assumptions underlying causal inferences, the languages used in articulating those assumptions, the conditional nature of all causal and counterfactual claims, and the methods developed for the assessment of such claims [49], we introduce, in a general setting, intuitive definitions of the probability of co-occurrence and the conditional probability of co-occurrence. These definitions use a science-friendly vocabulary and interpretation for explicating structural assumptions in practice. Technically speaking, the best treatment of complex co-occurrence starts with classes of indices for the conditional nature of inference about a fixed probability space $(\Omega,\mathcal{F},P)$, and can thus fill the nontrivial gap between the measure-theoretic foundation and both the structural causal model (1) proposed by Bongers et al. [12] and the hierarchical mixture models (HMMs) proposed by Hofmann and Puzicha [31] for co-occurrence data. In addition, many probabilistic ideas are greatly simplified by reformulation as properties of σ-fields [54]. It has been shown that the best treatment of independence starts with the concept of independent sub σ-fields of $\mathcal{F}$ for a fixed probability space $(\Omega,\mathcal{F},P)$. The corresponding results on conditional probability and conditional expectation can then be described in a unified semantic and mathematical framework in the presence of heterogeneous random objects. Taking high-dimensional data as an example, a target random object in the middle plate of Figure 1 can be heterogeneous, with a set of indices $I_1$ for visible information in images, a set of indices $I_2$ for audible information in speech, and a set of indices $I_3$ for textual information in words from a complex system; a target random object in the right plate of Figure 1 can be heterogeneous, with a set of indices $I_1$ for observed variables and a set of indices $I_2$ for counterfactual variables, for an augmented probability measure in the Neyman-Rubin potential outcome framework for causal inference; and a target random object in the left plate of Figure 1 can be heterogeneous, with a set of indices $I_1$ for endogenous variables and a set of indices $I_2$ for exogenous variables, for an augmented probability measure in the structural theory of causation. Furthermore, as an extension of conditional probability, a class of natural integrals called E-integrals is introduced, following the preferred expectation operator approach, to develop the theory of probability for a wider variety of important modern applications in optimization problems, quantum mechanics, information theory and statistical mechanics [64,65,54]. The properties of the E-integral are investigated. With the rigorous measure-theoretic approach, certain measure problems can be cleanly resolved regarding the complex problem of many heterogeneous co-occurrences emerging in modern data science in general, and regarding data spaces, loss functions, and statistical risk arising in machine learning in particular. The paper provides new insights into the inherent challenge of data sparseness underlying the modeling of complex co-occurrence data with higher-order co-occurrences and opens new research directions. As an extended conditionality principle, the paper provides a rigorous foundation, through its measure-theoretic results, for the active areas of research on modeling complex co-occurrence data with SCMs and HMMs.
Outline. The paper is structured as follows. In Section 2, we introduce the definition and properties of the probability of co-occurrence and the conditional probability of co-occurrence as set functions in the most elementary form, without imposing any parameterizations on them, to pursue the fundamental measure-theoretic treatment of complex co-occurrence problems. In Section 3, we examine the transformation of the conditional probability of co-occurrence defined in Definition 2.12. In Section 4, we provide the definition of the probability density of co-occurrence with respect to its product probability measure. In Section 5, we investigate integrals with respect to the measures for probability and conditional probability of co-occurrence. In Section 6, the E-integral is developed for the occurrence of many events to deal with the complexity of co-occurrence problems. Our goal is to provide a digestible narrative, and for this reason we postpone all proofs and most of the technical material to an appendix. The Supplementary Material [63] provides the complete proofs of all the theoretical results in the main text. Appendix A contains the lemmas and theorems that are used in several proofs. Appendix B contains the remaining proofs of the theoretical results in the main text.
2. Probability and Conditional Probability of Co-occurrence.
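2.1. Notation. Let $\mathbb{N}$ be the set of natural numbers, $\mathbb{R}$ the set of real numbers, $\mathbb{R}_+$ the set of nonnegative real numbers, and $\overline{\mathbb{R}} = [-\infty,+\infty]$. Let $\Lambda$ be a nonempty index set and $I_1, I_2 \subset \Lambda$; then $I_1 + I_2 := I_1 \cup I_2$ provided that $I_1 \cap I_2 = \emptyset$.
Let $(\Omega,\mathcal{F},P)$ be a given probability space, $(\Omega_i,\mathcal{F}_i)$, $i \in \Lambda$, be measurable spaces and $X_i : (\Omega,\mathcal{F}) \to (\Omega_i,\mathcal{F}_i)$, $i \in \Lambda$, be random objects. For any $I \subset \Lambda$, denote the product space $\Omega_I := \prod_{i \in I} \Omega_i$, where $\omega_I := (\omega_i : \omega_i \in \Omega_i, i \in I) \in \Omega_I$ in general and $\omega_I := (\omega_i, i \in I)$ provided that $I$ is a countable index set in particular, and $\mathcal{F}_I := \prod_{i \in I} \mathcal{F}_i$. Then the measurable product space is $(\Omega_I,\mathcal{F}_I) := (\prod_{i \in I} \Omega_i, \prod_{i \in I} \mathcal{F}_i)$. Denote $X_I(\omega) := (X_i(\omega) : i \in I)$ as a random object from $(\Omega,\mathcal{F})$ to $(\Omega_I,\mathcal{F}_I)$.
Let $(\Omega_i,\mathcal{F}_i)$, $i \in I$, be measurable spaces. If $I = \{1,2\}$, let $X : (\Omega,\mathcal{F}) \to (\Omega_1,\mathcal{F}_1)$ and $Y : (\Omega,\mathcal{F}) \to (\Omega_2,\mathcal{F}_2)$ be random objects, i.e., $X(\omega)$ and $Y(\omega)$ are $\mathcal{F}\backslash\mathcal{F}_1$-measurable and $\mathcal{F}\backslash\mathcal{F}_2$-measurable, respectively. The probability measures induced by $X$ on $(\Omega_1,\mathcal{F}_1)$ and $Y$ on $(\Omega_2,\mathcal{F}_2)$ are denoted as $P[X]$ and $P[Y]$, respectively. The functional relationships among $X_I$ hinge on the graph of $\Lambda$ that encodes a semantic network. Thus, in our presentation involving a set of $I$s, the bracket notation $P[\,\cdot\,]$ is a flexible alternative to writing the induced measure as $P^{X_I}$ for symbolic manipulation whenever $X_I$ involves a complex structure of $I$, and it will be used whenever an operation is needed on the high-dimensional random object $X$ with complex heterogeneous structure. In other words, a pair of square brackets serves to describe and manipulate the complexity of the index set $I$ inherent in a probability measure, while a pair of round brackets is kept for events in a σ-field. In addition, "almost everywhere with respect to the measure $\mu$" is abbreviated to "a.e.$\{\mu\}$".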
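To make the induced measures $P[X]$, $P[Y]$ and the joint object $X_I = (X, Y)$ with $I = \{1,2\}$ concrete, the following sketch computes them on a finite space. The setup is an assumption for illustration only: $\Omega$ is the outcome set of two fair dice, $X$ is the sum and $Y$ is the parity of the first die; the helper `induced_measure` is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Background probability space: two fair dice with the uniform measure P.
omega = list(product(range(1, 7), repeat=2))
P = {w: Fraction(1, 36) for w in omega}


def induced_measure(random_object):
    """Point masses of the measure B -> P(X^{-1}(B)) on the range of X."""
    masses = defaultdict(Fraction)      # Fraction() is an exact zero
    for w, p in P.items():
        masses[random_object(w)] += p
    return dict(masses)


def X(w):
    """Random object into Omega_1 = {2, ..., 12}: sum of the dice."""
    return w[0] + w[1]


def Y(w):
    """Random object into Omega_2 = {0, 1}: parity of the first die."""
    return w[0] % 2


P_X = induced_measure(X)                          # P[X] on Omega_1
P_Y = induced_measure(Y)                          # P[Y] on Omega_2
P_XY = induced_measure(lambda w: (X(w), Y(w)))    # law of X_I = (X, Y)
```

The joint law `P_XY` lives on the product space $\Omega_1 \times \Omega_2$ and is in general not the product of `P_X` and `P_Y`, since $X$ and $Y$ are defined on the same background space.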
2.2. Definitions and Theorems.
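DEFINITION 2.1. Let $(\Omega,\mathcal{F},P)$ be a given probability space and $I \subset \mathbb{N}$. Let $A_i \in \mathcal{F}$, $i \in I$; then $P(\bigcap_{i \in I} A_i)$ is called the probability of co-occurrence of events $A_i$, $i \in I$, and denoted by $P[A_i, i \in I]$, i.e.,
$$P[A_i, i \in I] = P\Big(\bigcap_{i \in I} A_i\Big).$$
Given $A_i \in \mathcal{F}$, $i \in I_1$ and $B_j \in \mathcal{F}$, $j \in I_2$, where $I_1, I_2 \subset \mathbb{N}$, then the probability of co-occurrence of all events $A_i$, $i \in I_1$ and $B_j$, $j \in I_2$ is denoted by $P[A_i, i \in I_1; B_j, j \in I_2]$,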