Inference and Prediction Using Functional Principal Components Analysis: Application to
Diabetic Kidney Disease Progression in the Chronic Renal Insufficiency Cohort (CRIC)
Study
Brian Kwan1,2,a, Wei Yang3, Daniel Montemayor4,5, Jing Zhang2, Tobias Fuhrer6,b, Amanda H.
Anderson7, Cheryl A.M. Anderson8, Jing Chen9, Ana C. Ricardo10, Sylvia E. Rosas11, Loki
Natarajan1,2, and the CRIC Study Investigators*
1Division of Biostatistics and Bioinformatics, Herbert Wertheim School of Public Health, University of California,
San Diego, La Jolla, CA, USA;
2Moores Cancer Center, University of California, San Diego, La Jolla, CA, USA;
3Department of Biostatistics, Epidemiology and Informatics, Center for Clinical Epidemiology and Biostatistics,
Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA;
4Division of Nephrology, Department of Medicine, University of Texas Health San Antonio, San Antonio, TX,
USA;
5Center for Renal Precision Medicine, University of Texas Health San Antonio, San Antonio, TX, USA;
6Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland;
7Department of Epidemiology, Tulane School of Public Health and Tropical Medicine, New Orleans, LA;
8Herbert Wertheim School of Public Health, University of California, San Diego, La Jolla, CA, USA;
9Department of Medicine, Tulane School of Medicine, New Orleans, LA;
10Department of Medicine, University of Illinois, Chicago, IL, USA;
11Joslin Diabetes Center and Harvard Medical School, Boston, MA, USA;
*A list of the CRIC Study Investigators appears in the Acknowledgements.
Correspondence: Loki Natarajan, Division of Biostatistics, Herbert Wertheim School of Public Health, University
of California San Diego, 3855 Health Sciences Dr #0901, La Jolla, CA 92093, USA. Email: lnatarajan@ucsd.edu.
Moved new affiliations:
a Department of Biostatistics, Fielding School of Public Health, University of California, Los Angeles, Los Angeles,
CA, USA
b Biostarks, Geneva, Switzerland
Abstract
Repeated longitudinal measurements are commonly used to model long-term disease
progression, but the timing and number of assessments per patient may vary, leading to irregularly
spaced and sparse data. Longitudinal trajectories may also exhibit curvilinear patterns, in which
case linear mixed-effects regression methods may fail to capture true trends in the data. We applied
functional principal components analysis to model kidney disease progression via estimated
glomerular filtration rate (eGFR) trajectories. In a cohort of 2641 participants with diabetes and
up to 15 years of annual follow-up from the Chronic Renal Insufficiency Cohort (CRIC) study,
we detected novel dominant modes of variation and patterns of diabetic kidney disease (DKD)
progression among subgroups defined by the presence of albuminuria. We conducted inferential
permutation tests to assess differences in longitudinal eGFR patterns between groups. To
determine whether fitting a full cohort model or separate group-specific models is preferable
for modeling long-term trajectories, we evaluated model fit, using our goodness-of-fit procedure,
and future prediction accuracy. Our findings indicated advantages for both modeling approaches
in accomplishing different objectives. Beyond DKD, the methods described are applicable to
other settings with longitudinally assessed biomarkers as indicators of disease progression.
Supplementary materials for this article are available online.
Keywords: functional data analysis, longitudinal study, model fit, permutation tests, renal
disease, sparse data
1 Introduction
Diabetes mellitus is the leading cause of chronic kidney disease (CKD) (Bailey et al.
2014; Centers for Disease Control and Prevention 2017; Koro et al. 2009; Koye et al. 2018;
United States Renal Data System 2018). Patients with CKD are at increased risk of kidney
failure, potentially requiring treatment by kidney transplant or dialysis. Yet, there is substantial
heterogeneity in the development of kidney disease for patients with diabetes (Gheith et al.
2016).
Diabetic kidney disease (DKD) progression is typically characterized by the estimated
glomerular filtration rate (eGFR) trajectory over time (de Boer et al. 2009; Robinson-Cohen et al.
2014). Linear mixed-effects models are often used to estimate change in eGFR; however,
characterizing the trajectory is complicated by sparse or irregularly spaced time series
data that may exhibit nonlinear trends, as depicted in Supplementary Figure 1 of the
Supplementary Materials. Thus, implementing flexible statistical methods that characterize
eGFR trajectories in key subgroups could offer insights into DKD progression and treatment.
We propose the functional principal components analysis (FPCA) approach to model
long-term trajectories while accounting for complexity in curve estimation, i.e., nonlinearity,
sparsity, and irregularity. Functional data analysis, and the development of FPCA
in particular, is well studied (Ramsay and Silverman 2005; Wang et al. 2016), with applications of
FPCA to sparse functional data dating back several decades (James et al. 2000; Paul and Peng
2009; Peng and Paul 2009; Rice and Wu 2001; Shi et al. 1996; Staniswalis and Lee 1998; Yao et
al. 2005; Yao and Lee 2006). FPCA has also been applied in studies of disease
progression, e.g., lung disease and HIV (Szczesniak et al. 2017; Xie et al. 2017; Yan et al. 2017).
A salient feature of FPCA is that it projects infinite-dimensional curves
onto finite-dimensional score vectors and detects the major modes of curve variation, which
elucidate the leading patterns of disease (here, DKD) progression. Important for our context,
extending the FPCA framework to characterize and predict DKD progression within clinically
meaningful subgroups will further elucidate the heterogeneous trends in the development of
kidney disease across patients with diabetes. In this work, we consider subgroups based on
albuminuria, an established clinical biomarker for kidney disease that reflects excess
albumin in the urine and is often used in conjunction with eGFR to classify patient CKD stage (Kidney
Disease: Improving Global Outcomes (KDIGO) CKD Work Group 2013), although our
approach could readily be applied to any other subgroups of interest. A key question in model
development is whether a single full cohort FPCA model, trained using data from all patients
with diabetes irrespective of albuminuria groups, is sufficient for characterizing and accurately
predicting the long-term eGFR trajectory within specific albuminuria groups. If not, we consider
multiple albuminuria group-specific models, each fitted using data from only patients of a
specific albuminuria group, to prospectively predict eGFR trajectories for new subjects of the
same group. To decide whether this group-level approach is needed, we examine differences in
longitudinal eGFR patterns, i.e., mean and correlation functions, between albuminuria subgroups
as well as model fit, via our proposed goodness-of-fit procedure, and future prediction accuracy.
As a whole, our work applies and extends FPCA methodology to improve inference and
prediction of trajectories of kidney function decline in heterogeneous subpopulations of patients
with diabetes, and the methods generalize to research applications beyond DKD.
The structure of this paper is as follows. Section 2 presents our methods: the FPCA
approach, inferential permutation tests for differences in longitudinal eGFR
patterns between groups, comparison of model fits and accuracy, our CRIC study cohort,
assignment of albuminuria groups, and computational tools. Section 3 describes our statistical
analysis results in detail. Lastly, Section 4 discusses the findings, limitations, and future
directions of our work.
2 Methods
2.1 Functional Principal Components Analysis (FPCA)
Let    be the observed outcome at time , where is the
measurement-error free outcome for subject at time  and  are measurement errors assumed
to be identically and independently distributed normal with mean zero and variance such that
  and   .
We model the individual trajectories as a smooth random function  with unknown
mean function  and covariance function , where   and is a bounded and
closed time interval. Let be the outcome trajectory for the th individual and be years of
follow-up. Under the Karhunen-Loeve expansion (Karhunen 1946; Loève 1946), the th
individual’s trajectory can be expressed as 

where is the th functional principal component (FPC) and  is the associated th FPC
score for the th individual. The individual scores  are uncorrelated random variables with
mean zero and variance , where    and  
. The covariance function
 can be defined  

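To make the expansion and the covariance decomposition concrete, the following is a minimal numerical sketch, not the PACE procedure used in the paper. It assumes densely observed, noise-free curves on a common grid, whereas PACE handles sparse, irregular, noisy observations through smoothing. The grid, the mean function, the two sine eigenfunctions, and the eigenvalues (4 and 1) are hypothetical values chosen only for illustration. The sketch simulates curves from a two-component expansion, eigendecomposes the sample covariance, and reports the cumulative share of variance carried by the leading components.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dense grid on T = [0, 1] (e.g., follow-up time rescaled to unit length).
t = np.linspace(0.0, 1.0, 101)
dt = t[1] - t[0]

# Assumed ingredients of the expansion (illustrative, not taken from the paper):
mu = 60.0 - 20.0 * t                                  # mean function mu(t)
phi = np.vstack([np.sqrt(2) * np.sin(np.pi * t),      # phi_1(t)
                 np.sqrt(2) * np.sin(2 * np.pi * t)]) # phi_2(t), orthonormal on [0, 1]
lam = np.array([4.0, 1.0])                            # eigenvalues lambda_1 >= lambda_2

# Simulate n curves X_i(t) = mu(t) + sum_k xi_ik * phi_k(t), with Var(xi_ik) = lambda_k.
n = 2000
xi = rng.normal(size=(n, 2)) * np.sqrt(lam)
X = mu + xi @ phi

# Sample covariance surface G_hat(s, t) on the grid.
G_hat = np.cov(X, rowvar=False)

# Discretized eigenproblem: multiplying by dt approximates the integral operator.
evals, evecs = np.linalg.eigh(G_hat * dt)
order = np.argsort(evals)[::-1]                       # sort eigenvalues in decreasing order
lam_hat = evals[order]
phi_hat = evecs[:, order].T / np.sqrt(dt)             # rescale so the integral of phi^2 is about 1

# Cumulative fraction of variance explained, and the smallest K reaching 95%.
fve = np.cumsum(lam_hat.clip(min=0.0)) / lam_hat.clip(min=0.0).sum()
K = int(np.searchsorted(fve, 0.95) + 1)

# Rebuild the covariance from the K leading components: sum_k lambda_k phi_k(s) phi_k(t).
G_rec = (phi_hat[:K].T * lam_hat[:K]) @ phi_hat[:K]

print("estimated eigenvalues:", np.round(lam_hat[:3], 3))
print("K for 95% variance:", K)
```

Because the simulated curves vary along only two directions, the leading two estimated eigenvalues should land near the assumed values and the truncated sum should reproduce the sample covariance up to floating-point error. With real sparse data the covariance surface would first have to be estimated by pooling and smoothing, which this sketch deliberately omits.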
The workflow and software implementation of the PACE algorithm of Yao et al. (2005), used to
estimate these model components, are briefly recapitulated in the Supplementary Materials. Since the
outcome trajectory is often well approximated by the top $K$ FPCs and their associated scores, we
select $K$ as the number of FPCs that explained at least 95% of the total variance in the outcome