Eigenvalue tests for the number of latent factors in short panels∗

Alain-Philippe Fortin†, Patrick Gagliardini‡, Olivier Scaillet§

First Version: June 2022. This Version: October 2022

Abstract

This paper studies new tests for the number of latent factors in a large cross-sectional factor model with small time dimension. These tests are based on the eigenvalues of variance-covariance matrices of (possibly weighted) asset returns, and rely on either an assumption of spherical errors, or instrumental variables for factor betas. We establish the asymptotic distributional results using expansion theorems based on perturbation theory for symmetric matrices. Our framework accommodates semi-strong factors in the systematic components. We propose a novel statistical test for weak factors against strong or semi-strong factors. We provide an empirical application to US equity data. Evidence for a different number of latent factors according to market downturns and market upturns is statistically ambiguous in the considered subperiods. In particular, our results contradict the common wisdom of a single factor model in bear markets.

∗This paper underlies the Halbert White Jr. Memorial JFEC invited lecture given by Patrick Gagliardini at the Annual Society for Financial Econometrics Conference on June 25th 2022 at the University of Cambridge. We thank the JFEC Editors Allan Timmermann and Fabio Trojani for the invitation, the discussants Alexei Onatski and Markus Pelger for very insightful and constructive comments, as well as G. Genoni, L. Mancini and participants at the Annual SoFiE conference 2022 and at seminars at the Universities of Geneva and Warwick for helpful remarks.

†University of Geneva and Swiss Finance Institute.

‡Università della Svizzera italiana (USI, Lugano) and Swiss Finance Institute. E-mail address: patrick.gagliardini@usi.ch.

§University of Geneva and Swiss Finance Institute.

arXiv:2210.16042v1 [econ.EM] 28 Oct 2022

1 Introduction

A central and practical issue in applied work with unobservable (i.e., latent) factors is to determine the number of factors. For models with latent factors only, Connor and Korajczyk (1993) are the first to develop a statistical test for the number of factors for large balanced panels of individual stock returns in time-invariant models under covariance stationarity and homoskedasticity. Unobservable factors are estimated by the method of asymptotic principal components developed by Connor and Korajczyk (1986) (see also Stock and Watson (2002)). For heteroskedastic settings, the recent literature on large balanced panels with static factors has extended the toolkit available to researchers. A first strand of that literature focuses on consistent estimation procedures for the number of factors. Bai and Ng (2002) introduce a penalized least-squares strategy to estimate the number of factors when there is at least one factor. Ando and Bai (2015) extend that approach when explanatory variables are present in the linear specification (see Bai (2009) for homogeneous regression coefficients). Onatski (2010) looks at the behavior of the adjacent eigenvalues to determine the number of factors when the cross-sectional dimension (n) and the time-series dimension (T) are both large and comparable. Ahn and Horenstein (2013) opt for the same strategy and cover the possibility of zero factors through specifying a mock eigenvalue whose functional form vanishes too. Caner and Han (2014) propose an estimator with a group bridge penalization to determine the number of unobservable factors. Based on the framework of Gagliardini, Ossola and Scaillet (2016), Gagliardini, Ossola and Scaillet (2019) build a simple diagnostic criterion for approximate factor structures in large panel datasets. Given observable factors, the criterion checks whether the errors are weakly cross-sectionally correlated or share at least one unobservable common factor (interactive effects). A general version allows one to determine the number of omitted common factors also for time-varying structures (see Gagliardini, Ossola and Scaillet (2020) for a survey of estimation of large dimensional conditional factor models in finance). A second strand of that literature develops inference procedures for hypotheses on the number of latent factors. Onatski (2009) deploys a characterization of the largest eigenvalues of a Wishart-distributed covariance matrix with large dimensions in terms of the Tracy-Widom law. To get a Wishart distribution, Onatski (2009) assumes either Gaussian errors or T much larger than n. Kapetanios (2010) uses subsampling to estimate the limit distribution of the adjacent eigenvalues.

This paper aims at complementing the above literature by considering a large cross-sectional dimension but a fixed time series dimension, i.e., a short panel. We develop new tests for the number of latent factors with statistics based on the eigenvalues, and spacings thereof, of variance-covariance matrices. The key idea is that, under assumptions on the error terms detailed in the paper, the eigenvalues of some finite-dimensional variance-covariance matrices constructed from returns feature a flat pattern (possibly equal to zero) for orders larger than k when ranked in decreasing order, where k is the number of latent factors. By establishing the asymptotic distributions of the small eigenvalues of estimated variance-covariance matrices we develop testing procedures on the number of latent factors k.
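The flat eigenvalue pattern can be illustrated with a small simulation. The sketch below is only an illustration under simplifying assumptions that are not taken from the paper: Gaussian loadings and errors, spherical errors with variance `sigma**2`, and a factor path normalized so that F'F/T is the identity. The function names `simulate_short_panel` and `second_moment_eigenvalues` are hypothetical.

```python
import numpy as np

def simulate_short_panel(n, T, k, sigma=1.0, seed=0):
    """Draw an n x T panel y_i = F beta_i + eps_i (illustrative design only)."""
    rng = np.random.default_rng(seed)
    # Factor path with orthonormalized columns, scaled so that F'F / T = I_k
    Q, _ = np.linalg.qr(rng.normal(size=(T, k)))
    F = np.sqrt(T) * Q
    B = rng.normal(size=(n, k))          # loadings beta_i stacked by row
    E = sigma * rng.normal(size=(n, T))  # spherical idiosyncratic errors
    return B @ F.T + E

def second_moment_eigenvalues(Y):
    """Eigenvalues, in decreasing order, of the T x T matrix (1/n) sum_i y_i y_i'."""
    n = Y.shape[0]
    V = Y.T @ Y / n
    return np.linalg.eigvalsh(V)[::-1]   # eigvalsh returns ascending order

Y = simulate_short_panel(n=50_000, T=6, k=2, sigma=1.0, seed=0)
eig = second_moment_eigenvalues(Y)
print(np.round(eig, 3))
```

In this design the first k = 2 eigenvalues should sit well above `sigma**2`, while the remaining T - k = 4 should cluster near `sigma**2`, which is the flat pattern the tests exploit.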

In a short panel setting, Zaffaroni (2019) considers a methodology for inference on conditional asset pricing models linear in latent risk factors, valid when the number of assets diverges but the time series dimension is fixed, possibly very small. He shows that the no-arbitrage condition permits identification of the risk premia as the expectation of the latent risk factors. This result paves the way to an inferential procedure for the factor risk premia and for the stochastic discount factor, spanned by the latent risk factors. Raponi, Robotti and Zaffaroni (2020) have recently developed tests of beta-pricing models and a two-pass methodology to estimate the ex-post risk premia (Shanken (1992)) associated with observable factors. Kim and Skoulakis (2018) deal with the error-in-variable problem of the two-pass methodology with small T by regression-calibration. The small T perspective yields an effective approach to capture general forms of time-variation in factor betas, risk premia and number of factors by performing the factor analysis in short subperiods (either non-overlapping, or rolling windows) of the sample of interest.
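As a purely illustrative sketch of the subperiod idea, the helpers below enumerate non-overlapping and rolling windows of length T over a longer sample. The function names, the `step` parameter and the toy panel are hypothetical and do not come from the paper.

```python
import numpy as np

def non_overlapping_windows(n_periods, T):
    """(start, end) index pairs, end exclusive, for consecutive blocks of length T.
    A trailing block shorter than T is dropped."""
    return [(s, s + T) for s in range(0, n_periods - T + 1, T)]

def rolling_windows(n_periods, T, step=1):
    """(start, end) index pairs, end exclusive, for windows of length T moved by `step`."""
    return [(s, s + T) for s in range(0, n_periods - T + 1, step)]

# Toy panel: 3 assets observed over 10 dates (rows = assets, columns = dates)
returns = np.arange(30, dtype=float).reshape(3, 10)

for start, end in non_overlapping_windows(returns.shape[1], T=4):
    short_panel = returns[:, start:end]   # n x T block to be analyzed on its own
    print(start, end, short_panel.shape)
```

Each block can then be treated as a separate short panel, so the number of factors is allowed to differ from one subperiod to the next.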

The recent literature has extended asymptotic principal component methods to accommodate more general factor models. Fan, Liao and Wang (2016) extend the characteristic-based modeling in Connor and Linton (2007) and Connor, Hagmann and Linton (2012) by allowing the betas to include unknown asset-specific additive constants. They propose a so-called Projected Principal Component Analysis to estimate this specification with time-invariant loadings, and show that their factor estimates are consistent even if T is finite. Pelger and Xiong (2022) instead let the factor loadings be functions of an observable state variable. Their estimation under large n and T relies on minimizing a local version of the least-squares criterion underlying PCA, where localization is implemented by kernel smoothing. Gu, Kelly and Xiu (2021) consider the setting where the loadings are a nonparametric function of a large dimensional vector of characteristics, and use an autoencoder to estimate this relationship. Among the parametric approaches, Kelly, Pruitt and Su (2017, 2019) model the coefficients as linear functions of characteristics plus some noise term, while Chen, Roussanov and Wang (2022) opt for semi-nonparametric modeling of nonlinear betas. Gagliardini and Ma (2019) study the problem of conducting inference on the conditional factor space, including its dimension. The adopted nonparametric framework is general regarding the beta dynamics and encompasses the aforementioned linear and nonlinear beta specifications. Finally, let us mention that there is also work on inference for large dimensional models with unobservable factors with high frequency data (Aït-Sahalia and Xiu (2017), Pelger (2019, 2020), Cheng, Liao and Yang (2021)). None of these papers considers the problem of testing for the number of latent factors.

The outline of the paper is as follows. In Section 2, we present the static factor model for asset (excess) returns, and discuss identification either via instrumental variables, or a sphericity assumption for the variance-covariance matrix of returns. We study the (in)consistency of the PCA factor estimator, as well as its interpretation in terms of Error-in-Variable and in terms of incidental parameters. Section 3 develops the eigenvalue test statistics based on instrumental variables and based on eigenvalues of the return variance-covariance matrix. Section 4 characterizes the asymptotic distributions of the test statistics. To do so under large n and fixed T, we establish a new second-order uniform asymptotic expansion of the small eigenvalues of a symmetric matrix via perturbation theory. We indicate how to achieve feasible statistics by providing adequate estimators of the characteristics of the asymptotic distribution. We dedicate Section 5 to extending our analysis to cover inference within a more general framework including weak factors. We analyze testing for (semi-)strong factors vs. vanishing factors, power under local alternative hypotheses, and testing for weak factors. In Section 6, we provide the results of Monte Carlo experiments to investigate the finite-sample properties of the considered test statistics. Section 7 presents the findings of our empirical analysis in short subpanels of stock returns in the US market. The concluding remarks are given in Section 8.

2 An eigenvalue testing problem

We develop our inferential theory for the number of latent factors under a static model:

$$y_{i,t} = \beta_i' f_t + \varepsilon_{i,t}, \qquad (1)$$

where $i = 1, ..., n$ is the index for "individuals" (e.g., assets) and $t = 1, ..., T$ for time periods (e.g., months), $f_t$ is a $k$-dimensional vector of unobservable factors and $\varepsilon_{i,t}$ is the idiosyncratic error term. We introduce below some high-level conditions on latent factors and error terms underlying our analysis, while we refrain from detailing the specific regularity conditions.¹ In asset pricing applications, variables $y_{i,t}$ denote asset (excess) returns and the components of vector $f_t$ represent pervasive risk factors in the economy. We assume that the time series dimension $T$ is fixed, i.e., we face short panels, while the cross-sectional dimension $n$ tends to infinity in our asymptotics. We rewrite the model in matrix notation as:

$$y_i = F \beta_i + \varepsilon_i, \qquad (2)$$

where $y_i$ and $\varepsilon_i$ are $T \times 1$ vectors and $F$ is a $T \times k$ matrix. We work conditionally on a given realization of the factor path, i.e., we treat $F$ as an unknown matrix parameter. Our focus is on inference on the number of latent factors $k$.

In this section, we develop the framework with strong factors, namely matrix $\Sigma_\beta := \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} \beta_i \beta_i'$ is positive definite. We consider the setting with semi-strong and weak factors in a later section. Next, we present two approaches for identification of the unknown number $k$ of factors.
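To see why the strong-factor condition separates k large eigenvalues from the rest, the following worked equations sketch the limit of the cross-sectional second-moment matrix implied by model (2). This is a simplified illustration under assumptions introduced here and not stated in this form in the text: the cross terms between loadings and errors average out, the errors are spherical with $\frac{1}{n}\sum_{i}\varepsilon_i\varepsilon_i' \to \sigma^2 I_T$, and $F$ has full column rank $k < T$. The symbols $\sigma^2$, $\mu_j$ and $\delta_j$ are notation chosen for this sketch; $\mu_1 \geq \dots \geq \mu_k > 0$ denote the nonzero eigenvalues of $F \Sigma_\beta F'$ and $\delta_1 \geq \dots \geq \delta_T$ the eigenvalues of the limit matrix.

```latex
\begin{align*}
\frac{1}{n}\sum_{i=1}^{n} y_i y_i'
  &= F\Big(\frac{1}{n}\sum_{i=1}^{n}\beta_i\beta_i'\Big)F'
   + \frac{1}{n}\sum_{i=1}^{n}\big(F\beta_i\varepsilon_i' + \varepsilon_i\beta_i'F'\big)
   + \frac{1}{n}\sum_{i=1}^{n}\varepsilon_i\varepsilon_i' \\
  &\longrightarrow F\,\Sigma_\beta\,F' + \sigma^2 I_T \qquad (n\to\infty), \\
\delta_j &= \mu_j + \sigma^2 > \sigma^2, \qquad j = 1,\dots,k, \\
\delta_j &= \sigma^2, \qquad\qquad\quad\;\; j = k+1,\dots,T.
\end{align*}
```

Under these assumptions, positive definiteness of $\Sigma_\beta$ guarantees that exactly $k$ eigenvalues exceed $\sigma^2$, while the remaining $T - k$ are equal, so the number of factors can be read off from where the spectrum becomes flat.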

2.1 Identification by instrumental variables

We start by assuming the existence of an overidentified vector of instrumental variables. In a large $T$ framework, identification with instrumental variables is considered by

¹ Those are given in Fortin, Gagliardini and Scaillet (2022a) for the approach based on factor analysis, which generalizes the approach based on the PCA of the variance-covariance matrix of returns considered in this paper.
