Eigenvalue tests for the number of latent factors in short panels

Alain-Philippe Fortin, Patrick Gagliardini, Olivier Scaillet
First Version: June 2022. This Version: October 2022
Abstract
This paper studies new tests for the number of latent factors in a large cross-sectional
factor model with small time dimension. These tests are based on the eigenvalues of
variance-covariance matrices of (possibly weighted) asset returns, and rely on either an
assumption of spherical errors, or instrumental variables for factor betas. We establish
the asymptotic distributional results using expansion theorems based on perturbation
theory for symmetric matrices. Our framework accommodates semi-strong factors in
the systematic components. We propose a novel statistical test for weak factors against
strong or semi-strong factors. We provide an empirical application to US equity data.
Evidence for a different number of latent factors according to market downturns and
market upturns is statistically ambiguous in the considered subperiods. In particular,
our results contradict the common wisdom of a single factor model in bear markets.
This paper underlies the Halbert White Jr. Memorial JFEC invited lecture given by Patrick Gagliardini
at the Annual Society for Financial Econometrics Conference on June 25th 2022 at the University of
Cambridge. We thank the JFEC Editors Allan Timmermann and Fabio Trojani for the invitation, the discussants
Alexei Onatski and Markus Pelger for very insightful and constructive comments, as well as G. Genoni, L.
Mancini and participants at the Annual SoFiE conference 2022 and at seminars at the Universities of Geneva
and Warwick for helpful remarks.
Alain-Philippe Fortin: University of Geneva and Swiss Finance Institute.
Patrick Gagliardini: Università della Svizzera italiana (USI, Lugano) and Swiss Finance Institute. E-mail address:
patrick.gagliardini@usi.ch.
Olivier Scaillet: University of Geneva and Swiss Finance Institute.
arXiv:2210.16042v1 [econ.EM] 28 Oct 2022
1 Introduction
A central and practical issue in applied work with unobservable (i.e. latent) factors is
to determine the number of factors. For models with latent factors only, Connor and
Korajczyk (1993) are the first to develop a statistical test for the number of factors
for large balanced panels of individual stock returns in time-invariant models under
covariance stationarity and homoskedasticity. Unobservable factors are estimated by
the method of asymptotic principal components developed by Connor and Korajczyk
(1986) (see also Stock and Watson (2002)). For heteroskedastic settings, the recent
literature on large balanced panels with static factors has extended the toolkit available
to researchers. A first strand of that literature focuses on consistent estimation
procedures for the number of factors. Bai and Ng (2002) introduce a penalized least-squares
strategy to estimate the number of factors when there is at least one. Ando and Bai (2015) extend
that approach when explanatory variables are present in the linear specification (see Bai
(2009) for homogeneous regression coefficients). Onatski (2010) looks at the behavior
of the adjacent eigenvalues to determine the number of factors when the cross-sectional
dimension (n) and the time-series dimension (T) are both large and comparable. Ahn
and Horenstein (2013) opt for the same strategy and cover the possibility of zero
factors through specifying a mock eigenvalue whose functional form vanishes too. Caner
and Han (2014) propose an estimator with a group bridge penalization to determine
the number of unobservable factors. Based on the framework of Gagliardini, Ossola
and Scaillet (2016), Gagliardini, Ossola and Scaillet (2019) build a simple diagnostic
criterion for approximate factor structures in large panel datasets. Given observable
factors, the criterion checks whether the errors are weakly cross-sectionally correlated
or share at least one unobservable common factor (interactive effects). A general
version also allows one to determine the number of omitted common factors for time-varying
structures (see Gagliardini, Ossola and Scaillet (2020) for a survey of estimation of large
dimensional conditional factor models in finance). A second strand of that literature
develops inference procedures for hypotheses on the number of latent factors. Onatski
(2009) deploys a characterization of the largest eigenvalues of a Wishart-distributed
covariance matrix with large dimensions in terms of the Tracy-Widom Law. To get a
Wishart distribution, Onatski (2009) assumes either Gaussian errors or T much larger
than n. Kapetanios (2010) uses subsampling to estimate the limit distribution of the
adjacent eigenvalues.
This paper aims at complementing the above literature by considering a large cross-
sectional dimension but a fixed time series dimension, i.e., a short panel. We develop
new tests for the number of latent factors with statistics based on the eigenvalues,
and spacings thereof, of variance-covariance matrices. The key idea is that, under
assumptions on the error terms detailed in the paper, the eigenvalues of some finite-
dimensional variance-covariance matrices constructed from returns feature a flat pattern
(possibly equal to zero) for orders larger than k when ranked in decreasing order,
where k is the number of latent factors. By establishing the asymptotic distributions
of the small eigenvalues of estimated variance-covariance matrices, we develop testing
procedures on the number of latent factors k.
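The flat eigenvalue pattern can be illustrated with a minimal simulation sketch. It assumes the static model y_i = F beta_i + eps_i with spherical Gaussian errors, Gaussian loadings, and a factor path normalized so that F'F = T I_k. These design choices, and the function names `simulate_short_panel` and `ranked_eigenvalues`, are hypothetical and serve only as illustration; they are not the paper's test procedure.

```python
import numpy as np

def simulate_short_panel(n, T, k, sigma=1.0, seed=0):
    """Simulate an n x T panel y_i = F beta_i + eps_i with spherical errors (illustrative)."""
    rng = np.random.default_rng(seed)
    # Hypothetical factor path, normalized so that F'F = T * I_k (an assumption for clarity).
    Q, _ = np.linalg.qr(rng.normal(size=(T, k)))
    F = np.sqrt(T) * Q
    beta = rng.normal(size=(n, k))         # loadings with (1/n) sum beta_i beta_i' close to I_k
    eps = sigma * rng.normal(size=(n, T))  # spherical errors: variance sigma^2 * I_T
    return beta @ F.T + eps

def ranked_eigenvalues(Y):
    """Eigenvalues of the T x T matrix (1/n) sum_i y_i y_i', ranked in decreasing order."""
    n = Y.shape[0]
    V_hat = Y.T @ Y / n
    return np.linalg.eigvalsh(V_hat)[::-1]  # eigvalsh returns ascending order

k, sigma = 2, 1.0
Y = simulate_short_panel(n=50_000, T=6, k=k, sigma=sigma)
eig = ranked_eigenvalues(Y)
print(np.round(eig, 3))
```

In this design, the first k = 2 eigenvalues are well separated from the rest, while the remaining T - k = 4 eigenvalues cluster around sigma^2, which is the flat pattern the tests exploit.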
In a short panel setting, Zaffaroni (2019) considers a methodology for inference on
conditional asset pricing models linear in latent risk factors, valid when the number of
assets diverges but the time series dimension is fixed, possibly very small. He shows
that the no-arbitrage condition permits identification of the risk premia as the expectation
of the latent risk factors. This result paves the way to an inferential procedure for
the factor risk premia and for the stochastic discount factor, spanned by the latent
risk factors. Raponi, Robotti and Zaffaroni (2020) have recently developed tests of
beta-pricing models and a two-pass methodology to estimate the ex-post risk premia
(Shanken (1992)) associated with observable factors. Kim and Skoulakis (2018) deal with
the error-in-variable problem of the two-pass methodology with small T by
regression-calibration. The small T perspective yields an effective approach to capture general
forms of time-variation in factor betas, risk premia and number of factors by performing
the factor analysis in short subperiods (either non-overlapping, or rolling windows) of
the sample of interest.
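Splitting a sample into short subperiods can be sketched as follows. The helper `subperiod_slices`, the window length of 6 dates, and the random placeholder panel are all hypothetical choices made for illustration; the text does not prescribe a window length here.

```python
import numpy as np

def subperiod_slices(T_total, window, step=None):
    """Return slices of length `window` over dates 0..T_total-1.

    step=None (equivalently step=window) gives non-overlapping subperiods;
    step=1 gives rolling windows.
    """
    if step is None:
        step = window
    return [slice(s, s + window) for s in range(0, T_total - window + 1, step)]

# Placeholder panel: 1000 assets observed over 24 dates (random numbers, not real data).
rng = np.random.default_rng(0)
Y = rng.normal(size=(1000, 24))

non_overlapping = subperiod_slices(Y.shape[1], window=6)
rolling = subperiod_slices(Y.shape[1], window=6, step=1)

# Each short subpanel has a fixed T = 6 and a large n, and can be analysed separately.
subpanels = [Y[:, s] for s in non_overlapping]
print(len(non_overlapping), len(rolling), subpanels[0].shape)
```

Running a fixed-T factor analysis on each subpanel separately lets the betas, the risk premia, and the number of factors differ across subperiods without modelling their dynamics.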
The recent literature has extended asymptotic principal component methods to
accommodate more general factor models. Fan, Liao and Wang (2016) extend the
characteristic-based modeling in Connor and Linton (2007) and Connor, Hagmann and
Linton (2012) by allowing the betas to include unknown asset-specific additive
constants. They propose a so-called Projected Principal Component Analysis to estimate
this specification with time-invariant loadings, and show that their factor estimates are
consistent even if T is finite. Pelger and Xiong (2022) instead let the factor loadings be
functions of an observable state variable. Their estimation under large n and T relies
on minimizing a local version of the least-squares criterion underlying PCA, where
localization is implemented by kernel smoothing. Gu, Kelly and Xiu (2021) consider the
setting where the loadings are a nonparametric function of a large dimensional vector
of characteristics, and use an autoencoder to estimate this relationship. Among the
parametric approaches, Kelly, Pruitt and Su (2017, 2019) model the coefficients as
linear functions of characteristics plus some noise term, while Chen, Roussanov and Wang
(2022) opt for semi-nonparametric nonlinear modeling of nonlinear betas. Gagliardini
and Ma (2019) study the problem of conducting inference on the conditional factor
space, including its dimension. The adopted nonparametric framework is general
regarding the beta dynamics and encompasses the aforementioned linear and nonlinear
beta specifications. Finally, let us mention that there is also work on inference for large
dimensional models with unobservable factors with high frequency data (Ait-Sahalia
and Xiu (2017), Pelger (2019, 2020), Cheng, Liao and Yang (2021)). None of these
papers considers the problem of testing for the number of latent factors.
The outline of the paper is as follows. In Section 2, we present the static factor model
for asset (excess) returns, and discuss identification either via instrumental variables,
or a sphericity assumption for the variance-covariance matrix of returns. We study the
(in)consistency of the PCA factor estimator, as well as its interpretation in terms of
Error-in-Variable and in terms of incidental parameters. Section 3 develops the eigenvalue
test statistics based on instrumental variables and based on eigenvalues of the return
variance-covariance matrix. Section 4 characterizes the asymptotic distributions of the test
statistics. To do so under large n and fixed T, we establish a new second-order uniform
asymptotic expansion of the small eigenvalues of a symmetric matrix via perturbation
theory. We indicate how to achieve feasible statistics by providing adequate estimators
of the characteristics of the asymptotic distribution. We dedicate Section 5 to extending
our analysis to cover inference within a more general framework including weak factors.
We analyze testing for (semi-)strong factors vs vanishing factors, power under local
alternative hypotheses, and testing for weak factors. In Section 6, we provide the
results of Monte Carlo experiments to investigate the finite-sample properties of the
considered test statistics. Section 7 presents the findings of our empirical analysis in
short subpanels of stock returns in the US market. The concluding remarks are given
in Section 8.
2 An eigenvalue testing problem
We develop our inferential theory for the number of latent factors under a static model:
$$ y_{i,t} = \beta_i' f_t + \varepsilon_{i,t}, \tag{1} $$
where $i = 1, \dots, n$ is the index for “individuals” (e.g., assets) and $t = 1, \dots, T$ for time
periods (e.g., months), $f_t$ is a k-dimensional vector of unobservable factors and $\varepsilon_{i,t}$ is
the idiosyncratic error term. We introduce below some high-level conditions on latent
factors and error terms underlying our analysis, while we refrain from detailing the
specific regularity conditions.¹ In asset pricing applications, variables $y_{i,t}$ denote asset
(excess) returns and the components of vector $f_t$ represent pervasive risk factors in
the economy. We assume that the time series dimension T is fixed, i.e., we face short
panels, while the cross-sectional dimension n tends to infinity in our asymptotics. We
rewrite the model in matrix notation as:
$$ y_i = F \beta_i + \varepsilon_i, \tag{2} $$
where $y_i$ and $\varepsilon_i$ are $T \times 1$ vectors and $F$ is a $T \times k$ matrix. We work conditionally on a
given realization of the factor path, i.e., we treat $F$ as an unknown matrix parameter.
Our focus is on inference on the number of latent factors k.
In this section, we develop the framework with strong factors, namely matrix
$\Sigma_\beta := \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} \beta_i \beta_i'$ is positive definite. We consider the setting with semi-strong and
weak factors in a later section. Next, we present two approaches for identification of
the unknown number k of factors.
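To see how the strong-factor condition ties the number of factors to an eigenvalue pattern, consider the following sketch. It relies on assumptions added here for illustration and not stated in this section: F has full column rank k, the cross-products between loadings and errors average out as n grows, and the errors are spherical with common variance $\sigma^2$. The symbols $V_y$, $\sigma^2$ and $\delta_j(\cdot)$ (the j-th largest eigenvalue) are notation introduced only for this sketch.

```latex
\frac{1}{n}\sum_{i=1}^{n} y_i y_i'
  = F\Big(\frac{1}{n}\sum_{i=1}^{n}\beta_i\beta_i'\Big)F'
  + \frac{1}{n}\sum_{i=1}^{n}\big(F\beta_i\varepsilon_i' + \varepsilon_i\beta_i'F'\big)
  + \frac{1}{n}\sum_{i=1}^{n}\varepsilon_i\varepsilon_i'
  \;\longrightarrow\; V_y := F\,\Sigma_\beta\,F' + \sigma^2 I_T ,
\qquad
\delta_j(V_y) =
\begin{cases}
  \delta_j(F\Sigma_\beta F') + \sigma^2 > \sigma^2, & j = 1,\dots,k,\\[2pt]
  \sigma^2, & j = k+1,\dots,T.
\end{cases}
```

Because $\Sigma_\beta$ is positive definite and F has rank k under these assumptions, $F\Sigma_\beta F'$ has exactly k positive eigenvalues, so the $T-k$ smallest eigenvalues of $V_y$ are all equal to $\sigma^2$.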
2.1 Identification by instrumental variables
We start by assuming the existence of an overidentified vector of instrumental variables.
In a large T framework, identification with instrumental variables is considered by
¹ Those are given in Fortin, Gagliardini and Scaillet (2022a) for the approach based on factor analysis, which
generalizes the approach based on the PCA of the variance-covariance matrix of returns considered in this paper.