focused on the partial functional regression model in ultra-high dimensions
with a diverging number of scalar predictors. All three of the above methods consist
of three steps: representing the functional predictors by their leading functional
principal components (FPCs), reducing the PFLM to a standard high dimensional linear
regression model, and selecting important features through the smoothly clipped
absolute deviation (SCAD) penalty (Fan and Li, 2001). Therefore, existing approaches
rely heavily on the success of the FPCA approach (Wang et al., 2016).
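To make the first two steps of such FPCA-based pipelines concrete, the following Python sketch computes FPC scores for curves observed on a common grid and stacks them with the scalar predictors. It is an illustration only, not code from any of the cited papers: the function name `fpc_scores`, the equally spaced grid, the Riemann-sum quadrature weight `w`, and the simulated data are all assumptions, and the SCAD selection step is omitted.

```python
import numpy as np

def fpc_scores(Z, k, w):
    """Return the first k FPC scores and discretized eigenfunctions.

    Z : (n, m) array of curves on a common, equally spaced grid (assumed).
    w : quadrature weight, so that an integral is approximated by w * sum.
    """
    Zc = Z - Z.mean(axis=0)                      # center the curves
    _, _, Vt = np.linalg.svd(Zc, full_matrices=False)
    phi = Vt[:k] / np.sqrt(w)                    # scale so that w * sum(phi_j**2) = 1
    scores = w * Zc @ phi.T                      # approximates the integral of Zc_i * phi_j
    return scores, phi

# Simulated data (hypothetical): n curves built from five sine functions.
rng = np.random.default_rng(0)
n, p, m = 60, 8, 40
t = np.linspace(0.0, 1.0, m)
w = 1.0 / m
basis = np.array([np.sin((j + 1) * np.pi * t) for j in range(5)])
Z = rng.standard_normal((n, 5)) @ basis
X = rng.standard_normal((n, p))                  # scalar predictors

scores3, phi3 = fpc_scores(Z, 3, w)
scores2, _ = fpc_scores(Z, 2, w)

# Reduced problem: an ordinary linear regression design with k + p columns.
design = np.hstack([scores3, X])
```

Moving the truncation level from 2 to 3 adds one whole column to the design, so model complexity can only change in integer steps.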
In this paper, we focus on the high dimensional PFLM (1), develop a
method for model selection and estimation, investigate theoretical properties of both
the functional and scalar estimators, and apply the proposed method to analyze the
ADNI dataset. We use the RKHS framework (Yuan and Cai, 2010; Cai and Yuan,
2012; Li and Zhu, 2020) and impose the roughness penalty on the functional coeffi-
cient. The success of the existing FPCA-based methods relies on the availability of
a good estimate of the functional principal components for the functional parameter,
and may not be appropriate if the functional parameter cannot be represented effectively
by the leading principal components of the functional covariates (Yuan and Cai, 2010). Moreover,
the truncation parameter in FPCA changes in a discrete manner,
which may yield imprecise control of the model complexity, as pointed out in
Ramsay and Silverman (2005). Furthermore, we impose the $\ell_0$ penalty on the scalar
predictors because the $\ell_0$ penalty is usually a desirable choice
among penalty functions: it directly penalizes the cardinality of a model and
seeks the most parsimonious model explaining the data. However, it is nonconvex,
and solving an exact $\ell_0$-penalized nonconvex optimization problem involves
exhaustive combinatorial best subset search, which is NP-hard and computationally
challenging (Zhao et al., 2019). We modify the computational algorithm in Huang
et al. (2018) to deal with the above difficulty and to accommodate the functional
predictor. Specifically, we proceed in three steps: (i) profiling out the functional
part by using the Representer theorem; (ii) simultaneously identifying the important
features and obtaining scalar estimates; and (iii) plugging the scalar estimates into
the loss function to derive the functional estimate. Meanwhile, we adapt the test
statistic in Li and Zhu (2020) to test the significance of the functional variable. The
implementation R code with its documentation is available as an online supplement.
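The three steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' R implementation. The model is taken to be a scalar response equal to a sparse linear term plus the integral of a curve against a coefficient function. Curves are observed on an equally spaced grid and integrals are Riemann sums with weight `w`. A Gaussian kernel stands in for the reproducing kernel, and the whole RKHS norm is penalized, so there is no unpenalized null space. Step (ii) uses a plain support-detection-and-refitting iteration in the spirit of Huang et al. (2018) with a fixed model size `T`. All function names, variable names, and tuning values are hypothetical.

```python
import numpy as np

def profile_matrix(Z, K, lam, w):
    """Step (i): profile out f for fixed beta.

    By the representer theorem, f_hat(t) = w * sum_i c_i * (K @ Z_i)(t).
    Minimizing over c leaves the quadratic form r' P r / n in r = y - X beta.
    """
    n = Z.shape[0]
    Sigma = w**2 * Z @ K @ Z.T                     # Gram matrix of the curves
    M = np.linalg.inv(Sigma + n * lam * np.eye(n))
    return n * lam * M, M                          # P = I - Sigma @ M, and M

def l0_support_refit(X, y, P, T, max_iter=50):
    """Step (ii): keep the T largest entries of beta + d, refit, repeat."""
    n, p = X.shape
    beta, support = np.zeros(p), None
    d = X.T @ P @ y / n
    for _ in range(max_iter):
        new_support = np.sort(np.argsort(-np.abs(beta + d))[:T])
        if support is not None and np.array_equal(new_support, support):
            break                                  # support has stabilized
        support = new_support
        XA = X[:, support]
        beta = np.zeros(p)
        # Weighted least squares on the current support.
        beta[support] = np.linalg.solve(XA.T @ P @ XA, XA.T @ P @ y)
        d = X.T @ P @ (y - X @ beta) / n
        d[support] = 0.0
    return beta, support

# Simulated data (hypothetical settings).
rng = np.random.default_rng(1)
n, p, m, T = 100, 30, 50, 3
t = np.linspace(0.0, 1.0, m)
w = 1.0 / m
basis = np.array([np.sin((j + 1) * np.pi * t) for j in range(5)])
Z = rng.standard_normal((n, 5)) @ basis
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [3.0, -2.0, 1.5]
f_true = np.sin(2 * np.pi * t)
y = X @ beta_true + w * Z @ f_true + 0.1 * rng.standard_normal(n)

K = np.exp(-(t[:, None] - t[None, :]) ** 2 / (2 * 0.1**2))  # Gaussian kernel (assumed)
P, M = profile_matrix(Z, K, lam=1e-3, w=w)
beta_hat, support = l0_support_refit(X, y, P, T)

# Step (iii): plug beta_hat back in to recover the functional estimate.
c = M @ (y - X @ beta_hat)
f_hat = w * K @ Z.T @ c
```

The model size `T` and the smoothing level `lam` are fixed arbitrarily here; in an actual analysis both would need to be chosen from the data.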
Numerically, the proposed method is tested carefully on simulated data. We
also provide theoretical properties of the estimators, including their error bounds,
the asymptotic normality of the estimates of the nonzero scalar coefficients, and the
null limit distribution of the test statistic designed to test the nullity of the functional
variable. We apply the PFLM to the ADNI dataset and carry out a thorough association
analysis between genetics, hippocampus and cognitive deficit. Unlike
existing analyses that target one or several cognitive measures, the proposed method
examines the joint effects of genetics and hippocampus on 13 cognitive variables observed
at 12 months after baseline measurements, which measure different aspects of
cognitive function, and explores the shared and distinct heritability patterns of the
13 cognitive scores. We also investigate the effect of the baseline diagnosis information
on future cognitive outcome, denoted by the yellow arrow in Fig 1. Analysis results
suggest that both the hippocampal and genetic data have heterogeneous effects on