Statistical Learning Methods for Neuroimaging Data Analysis with Applications
Hongtu Zhu1, Tengfei Li2, and Bingxin Zhao3

1Departments of Biostatistics, Statistics, Genetics, and Computer Science and
Biomedical Research Imaging Center, University of North Carolina, Chapel Hill
2Departments of Radiology and Biomedical Research Imaging Center, University
of North Carolina, Chapel Hill
3Department of Statistics and Data Science, University of Pennsylvania
arXiv:2210.09217v1 [stat.AP] 17 Oct 2022
Abstract
The aim of this paper is to provide a comprehensive review of statistical challenges in
neuroimaging data analysis from neuroimaging techniques to large-scale neuroimaging studies
to statistical learning methods. We briefly review eight popular neuroimaging techniques and
their potential applications in neuroscience research and clinical translation. We delineate
the four common themes of neuroimaging data and review major image processing analysis
methods for processing neuroimaging data at the individual level. We briefly review four
large-scale neuroimaging-related studies and a consortium on imaging genomics and discuss
four common themes of neuroimaging data analysis at the population level. We review nine
major population-based statistical analysis methods and their associated statistical challenges
and present recent progress in statistical methodology to address these challenges.
Keywords: causal pathway, heterogeneity, image processing analysis, neuroimaging
techniques, population-based statistical analysis, study design.
1 Introduction
Neuroimaging refers to the process of producing images of the structure, function, or pharmacology
of the central nervous system (CNS). It has been a dynamic and evolving field with (A1) the
development of new acquisition techniques, (A2) the collection of various neuroimaging data in
clinical settings and medical research, and (A3) the development of statistical learning (SL)
methods. For (A1), popular neuroimaging techniques include structural magnetic resonance imaging
(sMRI), functional magnetic resonance imaging (fMRI), diffusion weighted imaging (DWI),
computerized tomography (CT), positron emission tomography (PET), electroencephalography (EEG),
magnetoencephalography (MEG), and functional near-infrared spectroscopy (fNIRS). These
techniques were developed to measure specific tracers in the CNS that are directly or indirectly
associated with brain structure and function. For instance, PET delineates how an injected
radioactive tracer (e.g., fluorodeoxyglucose (FDG)) moves and accumulates in the brain, whereas
fMRI measures an indirect tracer, the concentration of deoxyhaemoglobin, in the flow downstream of
the activated neurons caused by the brain’s activity. The development of SL methods for individual
neuroimaging data poses serious challenges for existing statistical methods due to four common
themes: (CT1) complex brain objects, (CT2) complex spatio-temporal structures,
(CT3) extremely high dimensionality, and (CT4) heterogeneity within subjects and across groups.
For (A2), in recent years, huge amounts of neuroimaging data have been collected in health
care, biomedical research studies, and clinical trials. First, neuroimaging has the potential to
improve clinical care for diagnosis and prognosis in various brain-related diseases, such as
dementia, sleep disorders, and schizophrenia. Some typical uses of neuroimaging include identifying
the effects of brain-related diseases (e.g., stroke or glioblastoma), locating cysts and tumors, and
finding swelling and bleeding, among others. Second, many large-scale biomedical studies have
collected or are collecting massive neuroimaging data (e.g., sMRI, DWI, and fMRI) with high
spatial and/or temporal resolution as well as other complex information (e.g., genomics and health
factors) in order to map the human brain connectome for understanding the pathophysiology of
brain-related disorders, the progress of neuropsychiatric and neurodegenerative disorders,
normal brain development, and the diagnosis of brain cancer, among others. In the last two decades,
there have been at least three pioneering neuroimaging-related studies, including the Alzheimer’s
Disease Neuroimaging Initiative (ADNI) (http://www.adni-info.org/) (Weiner et al., 2010), the Human
Connectome Project (HCP) (http://humanconnectome.org/consortia/) (Van Essen et al., 2013), and
the UK Biobank (UKB) study (https://www.ukbiobank.ac.uk/) (Miller et al., 2016). They
represent major advances and innovations in acquisition protocols, analysis pipelines, data
management, experimental design, and sample size. The left panel of Figure 1 shows multi-view data
across different domains (e.g., imaging, genetics, or environmental factors) in some large-scale
biomedical studies. Third, neuroimaging biomarkers have many uses in clinical trials for drug
development in neurological and psychiatric disorders (Schwarz, 2021). These uses include a
screening tool for selecting trial participants, a tool to establish biodistribution, target
engagement, and pharmacodynamic activity, a means for monitoring safety, and an evidence measure
of disease modification.
The development of SL methods for clinical translation and large-scale neuroimaging-related
studies poses serious challenges to existing statistical methods due to four additional themes:
(CT5) sampling bias, (CT6) complex missing patterns, (CT7) complex data objects, and (CT8)
complicated causal pathways in brain disorders.
For (A3), there is a large literature on the development of SL methods for neuroimaging data
analysis (NDA) in order to correlate multi-type data from different domains across multiple
studies, eventually establishing a dynamic causal pathway (e.g., the causal genetic-imaging-clinical
(CGIC) pathway in the right panel of Figure 1) linking genetics to brain (or neuroimaging)
phenotypes and then to clinical outcomes, confounded by health factors. These SL methods fall
into two categories: image processing analysis (IPA) at the individual level and
population-based statistical analysis (PSA) for a sample of subjects. We further group various IPA
methods into deconvolution and structure learning (Sotiras et al., 2013; Li et al., 2019; Zhou et
al., 2021; Shen et al., 2017; Park et al., 2003; Yi et al., 2019). Deconvolution methods primarily
include image reconstruction and image enhancement. Structure learning methods mainly include image
segmentation and image registration. Due to (CT1)-(CT4) and the lack of high-quality annotation
datasets, it is very challenging to develop ‘good’ IPA pipelines to extract a relatively small number
of image phenotypes (IPs) with high repeatability and reproducibility for both individual health
care and PSA. We also group various PSA methods into nine main categories, including study
design, statistical parametric mapping, object-oriented data analysis, dimension reduction methods,
data integration, imputation methods, predictive models, imaging genetics, and causal discovery
(Ombao et al., 2016; Nathoo et al., 2019; Shen and Thompson, 2019; Smith and Nichols, 2018;
Nichols et al., 2017; Rathore et al., 2017). Due to (CT1)-(CT8), each category has its own statistical
challenges, requiring specific statistical methodology to address them. However, the development
of scalable PSA methods has fallen seriously behind the technological advances in neuroimaging
techniques, causing difficulty in translating research findings to clinical practice.
[Figure 1 image not reproduced. Recoverable labels: Environment (E), Genomics (G), Imaging (I), Cognition, Disease, Behavior, Confounder (O), Unseen Confounder (U); data domains Genetics, Environment, Lifestyle, SES, Imaging, Confounders (Observed, Unobserved), Clinical (Disease, Cognition, Behavior); sample sizes UKB 500000, ABCD 10000+, HCP Lifespan 4000+, ADNI 2000+.]
Figure 1: Left panel: Major data types from different domains in several representative large-scale
biomedical studies; Right panel: A dynamic causal model for delineating causal
genetic-imaging-clinical (CGIC) pathway confounded with environmental factors and unobserved
confounders.
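To make the role of unobserved confounders in the right panel of Figure 1 concrete, the following is a minimal toy simulation, not a method taken from this review. It assumes a purely linear pathway G -> I -> Y with a single unobserved confounder U acting on both the imaging phenotype and the clinical outcome. All names, effect sizes, and the allele frequency are hypothetical choices made for illustration. The Wald-ratio instrumental-variable estimator is swapped in here as one simple way to use G to recover the I -> Y effect, under the assumption that G influences Y only through I.

```python
import numpy as np


def simulate_cgic(n=20000, beta_gi=0.5, beta_iy=0.8, seed=0):
    """Simulate a toy linear G -> I -> Y pathway with an unobserved confounder U.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=n).astype(float)  # genotype coded as allele count 0/1/2
    u = rng.normal(size=n)                          # unobserved confounder of I and Y
    img = beta_gi * g + u + rng.normal(size=n)      # imaging phenotype
    y = beta_iy * img + u + rng.normal(size=n)      # clinical outcome
    return g, img, y


def ols_slope(x, y):
    """Slope of the least-squares regression of y on x (with intercept)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def wald_ratio(g, img, y):
    """Instrumental-variable (Wald ratio) estimate of the I -> Y effect, using G as the instrument."""
    return ols_slope(g, y) / ols_slope(g, img)


g, img, y = simulate_cgic()
naive = ols_slope(img, y)   # ignores U, so it is biased away from the true effect
iv = wald_ratio(g, img, y)  # consistent here because G affects Y only through I
print(f"true effect 0.80, naive OLS {naive:.2f}, IV estimate {iv:.2f}")
```

In this toy setting the naive regression of the outcome on the imaging phenotype absorbs the confounder's contribution, whereas the ratio estimator stays close to the simulated effect. Real CGIC analyses involve high-dimensional imaging phenotypes, many genetic variants, and possible violations of the instrument assumptions, none of which this sketch addresses.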
2 A Review of Neuroimaging Techniques and Uses
We briefly review eight neuroimaging techniques below and summarize them in supplementary
Table 1. For each image modality, we describe its tracer, data dimension, features, main uses,
and several key software packages (Smith and Webb, 2010). Figure 2 also presents different
neuroimaging modalities and the different types of features they extract.
Structural magnetic resonance imaging (sMRI) measures the fluid characteristics of different
tissues (gray and white matter), creating high-resolution (0.5 mm to 1 mm) images with a strong
gray/white matter contrast and many anatomical details. It allows us to qualitatively and
quantitatively measure the development and change of cortical and subcortical structures in
terms of both size and shape in the brain. Some sMRI-derived measurements include cortical
thickness, cortical folding, sulcal depth, voxel-based morphometry, and regional volumes
and shape. sMRI has been widely used for diagnosis, staging, and follow-up of disease in
clinics and for studying brain development in research.
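As an illustration of one of the sMRI-derived measurements listed above, regional volume, the sketch below counts voxels per region in a label map and multiplies by the volume of a single voxel. It assumes that a segmentation has already produced an integer label array in which 0 denotes background. The function name, the toy label map, and the voxel sizes are hypothetical and are not taken from any specific software package.

```python
import numpy as np


def regional_volumes(labels, voxel_size_mm):
    """Return {region label: volume in mm^3} for every nonzero label in a 3D label map."""
    voxel_volume = float(np.prod(voxel_size_mm))  # volume of one voxel in mm^3
    ids, counts = np.unique(labels[labels > 0], return_counts=True)
    return {int(k): float(c) * voxel_volume for k, c in zip(ids, counts)}


# Toy label map with two hypothetical regions on a 10 x 10 x 10 grid.
labels = np.zeros((10, 10, 10), dtype=int)
labels[2:5, 2:5, 2:5] = 1  # region 1: 3 * 3 * 3 = 27 voxels
labels[6:8, 6:8, 6:9] = 2  # region 2: 2 * 2 * 3 = 12 voxels

vols_1mm = regional_volumes(labels, voxel_size_mm=(1.0, 1.0, 1.0))
vols_half_mm = regional_volumes(labels, voxel_size_mm=(0.5, 0.5, 0.5))
print(vols_1mm, vols_half_mm)
```

The same voxel count corresponds to an eight-fold smaller physical volume at 0.5 mm isotropic resolution than at 1 mm, which is why voxel dimensions must accompany any label map when volumes are compared across scans.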