R-NL: Covariance Matrix Estimation for Elliptical
Distributions based on Nonlinear Shrinkage
Simon Hediger^a, Jeffrey Näf^b, Michael Wolf^a
aDepartment of Economics, University of Zurich, Switzerland
bPreMeDICaL, Inria-Inserm, Montpellier, France
May 4, 2023
Abstract
We combine Tyler’s robust estimator of the dispersion matrix with nonlinear
shrinkage. This approach delivers a simple and fast estimator of the dispersion matrix
in elliptical models that is robust against both heavy tails and high dimensions. We
prove convergence of the iterative part of our algorithm and demonstrate the favorable
performance of the estimator in a wide range of simulation scenarios. Finally, an
empirical application demonstrates its state-of-the-art performance on real data.
Keywords: Heavy Tails, Nonlinear Shrinkage, Portfolio Optimization
Corresponding author at: PreMeDICaL, Inria-Inserm, Montpellier, France.
E-mail address: jeffrey.naf@inria.fr.
arXiv:2210.14854v4 [stat.ME] 3 May 2023
1 Introduction
Many statistical applications rely on covariance matrix estimation. Two common challenges are (1) the presence of heavy tails and (2) the high-dimensional nature of the data. Both problems lead to suboptimal performance or even inconsistency of the usual sample covariance estimator Ŝ. Consequently, there is a vast literature on addressing these problems.
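To get a feel for problem (2): even when the population covariance matrix is the identity, the eigenvalues of the sample covariance matrix spread out substantially once p is of the same order as n. A minimal sketch of this effect (our own illustration, assuming numpy; the dimensions are chosen for speed):

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 200, 300  # high-dimensional regime: p/n is not small

# Population covariance is the identity, so all true eigenvalues equal 1.
X = rng.standard_normal((n, p))
S = X.T @ X / n  # sample covariance matrix (mean known to be zero)

lam = np.linalg.eigvalsh(S)
# The sample eigenvalues spread far away from 1 (Marchenko-Pastur effect),
# roughly over [(1 - sqrt(p/n))^2, (1 + sqrt(p/n))^2].
print(round(lam.min(), 3), round(lam.max(), 3))
```

Shrinkage estimators counteract exactly this dispersion of the sample eigenvalues.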
Two prominent ways to address (1) are (Maronna’s) M-estimators of scatter (Kent and
Tyler (1991)), as well as truncation of the sample covariance matrix; for example, see Ke
et al. (2019). There also appear to be two main approaches to solving problem (2). The
first is to assume a specific structure on the covariance matrix to reduce the number of
parameters. One example of this is the “spiked covariance model”, as explored, e.g., in Johnstone (2001); Johnstone and Lu (2009); Donoho et al. (2018); a second is to assume
(approximate) sparsity and to use thresholding estimators (Bickel and Levina (2008a,b);
Rothman et al. (2009); Cai and Liu (2011)). We also refer to Ke et al. (2019), who present a range of general estimators under heavy tails and extend to the case n > p by assuming specific structures on the covariance matrix. If one is not willing to assume such structure,
a second approach is to leave the eigenvectors of the sample covariance matrix unchanged
and to only adapt the eigenvalues. This leads to the class of estimators of Stein (1975,
1986). Linear shrinkage (Ledoit and Wolf (2004)) as well as nonlinear shrinkage developed
in Ledoit and Wolf (2012, 2015, 2020, 2022b) are part of this class.
One promising line of research to address both problems at once is to extend (Maronna’s)
M-estimators of scatter (Kent and Tyler, 1991) with a form of shrinkage for high dimensions. This approach is in particular popular with a specific example of M-estimators called
“Tyler’s estimator” (Tyler (1987a)), which is derived in the context of elliptical distribu-
tions. Several papers have studied this approach, using a convex combination of the base
estimator and a target matrix, usually the (scaled) identity matrix. We generally refer
to such approaches as robust linear shrinkage estimators. For instance, Ollila and Tyler
(2014); Auguin et al. (2016); Ollila et al. (2021); Ashurbekova et al. (2021) combine the
linear shrinkage with Maronna’s M-estimators, whereas Abramovich and Spencer (2007);
Chen et al. (2011); Yang et al. (2014); Zhang and Wiesel (2016) do so with Tyler’s esti-
mator. Since this approach of combining linear shrinkage with a robust estimator entails
choosing a hyperparameter determining the amount of shrinkage, the second step often
consists of deriving some (asymptotically) optimal parameter that then can be estimated
from data. The approach results in estimation methods that are generally computation-
ally inexpensive and it also enables strong theoretical results on the convergence of the
underlying iterative algorithms.
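As a concrete illustration, one round of this idea can be sketched as a fixed-point iteration in the spirit of Chen et al. (2011): the Tyler-type update is combined convexly with the identity matrix, controlled by a shrinkage intensity ρ (the hyperparameter discussed above). The function name, defaults, and stopping rule below are ours, not taken from any of the cited papers:

```python
import numpy as np

def tyler_linear_shrinkage(X, rho, n_iter=100, tol=1e-8):
    """Tyler-type estimator with linear shrinkage toward the identity.

    X is an (n, p) data matrix, assumed centered; rho in (0, 1] is the
    shrinkage intensity, a tuning parameter in this class of methods.
    """
    n, p = X.shape
    Sigma = np.eye(p)
    for _ in range(n_iter):
        # Down-weight observations with large Mahalanobis-type norms.
        w = np.einsum('ij,jk,ik->i', X, np.linalg.inv(Sigma), X)
        M = (p / n) * (X / w[:, None]).T @ X
        Sigma_new = (1 - rho) * M + rho * np.eye(p)
        # Normalize the trace to fix the (arbitrary) scale of Tyler's estimator.
        Sigma_new *= p / np.trace(Sigma_new)
        if np.linalg.norm(Sigma_new - Sigma, 'fro') < tol:
            Sigma = Sigma_new
            break
        Sigma = Sigma_new
    return Sigma
```

With ρ close to 0 this approaches (trace-normalized) Tyler's estimator when p < n; the papers cited above differ mainly in how the intensity ρ is chosen from the data.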
Despite these advantages, several problems remain. First, the performance of these
robust methods sometimes does not exceed the performance of the basic linear shrinkage
estimator of Ledoit and Wolf (2004) in heavy-tailed models, except for small sample sizes n (say n < 100). In fact, the theoretical analysis of Couillet and McKay (2014); Auguin
et al. (2016) shows that robust M-estimators using linear shrinkage are asymptotically
equivalent to scaled versions of the linear shrinkage estimator of Ledoit and Wolf (2004).
Depending on how the data-adaptive hyperparameter is chosen, the performance can even
deteriorate quickly as the tails get lighter, as we demonstrate in our simulation study in
Section 4. Second, some robust methods cannot handle the case when the dimension p is larger than the sample size n, such as Ollila et al. (2021). Third, some methods propose a
choice of hyperparameter(s) through cross-validation, such as Yu et al. (2017); Yi and Tyler
(2021), which can be computationally expensive. In this paper, we address these problems
by developing a simple algorithm based on nonlinear shrinkage (Ledoit and Wolf (2012, 2015, 2020, 2022b)), inspired by the above robust approaches and the work of Hediger and Näf (2022). In essence, the algorithm applies the quadratic inverse shrinkage (QIS) method
of Ledoit and Wolf (2022b) to appropriately standardized data, thereby greatly increasing
its finite-sample performance in heavy-tailed models. Thus, we refer to the new method
as “Robust Nonlinear Shrinkage” (R-NL); in particular, we extend the proposal of Hediger and Näf (2022) from a parametric model to general elliptical distributions. This approach
includes an iteration over the space of orthogonal matrices, which we prove converges to a
stationary point. We motivate our approach using properties of elliptical distributions along
the lines of Chen et al. (2011); Zhang and Wiesel (2016); Ashurbekova et al. (2021) and
demonstrate the favorable performance of our method in a wide range of settings. Notably,
our approach (i) greatly improves the performance of (standard) nonlinear shrinkage in heavy-tailed settings; (ii) does not deteriorate when moving from heavy to Gaussian tails; (iii) can handle the case p > n; and (iv) does not require the choice of a tuning parameter.
The remainder of the article is organized as follows. Section 1.1 lists our contributions. Section 2 presents an example to motivate our methodology. Section 3 describes the proposed new methodology and provides results concerning the convergence of the new algorithm. Section 4 showcases the performance of our method in a simulation study using various settings for both p < n and p > n. Section 5 applies our method to financial data, illustrating the performance of the method on real data.
1.1 Contributions
To the best of our knowledge, no paper has so far attempted to combine nonlinear shrinkage
of Ledoit and Wolf (2012, 2015, 2020, 2022b) with Tyler’s method. As such, our approach
differs markedly from previous ones. It is partly based on an M-estimator interpretation,
but also adds the nonparametric nonlinear shrinkage approach. A downside of this approach
is that theoretical convergence results are harder to come by. Nonetheless, we are able to
show that the iterative part of our algorithm converges to a stationary point, a crucial
result for the practical usefulness of the algorithm.
Perhaps the closest paper to our method is Breloy et al. (2019), where the eigenvalues of Tyler’s estimator are iteratively shrunken towards predetermined target eigenvalues, with a parameter α determining the shrinkage strength. Through different objectives, they arrive at an algorithm from which the iterative part of our Algorithm 2 can be recovered when setting α = ∞. Additionally, using the eigenvalues from nonlinear shrinkage as the target eigenvalues, their method presents an alternative way of combining Tyler’s estimator with nonlinear shrinkage. Though they did not originally propose this, it was suggested by an anonymous reviewer. However, while the two algorithms overlap in the corner case of α = ∞, they arrive at their Algorithm 1 from a different angle than we do. Consequently, their theoretical results cannot be applied in our analysis. Moreover, they do not suggest how to choose the tuning parameter α. In Appendix A, simulations indicate that when the target eigenvalues are obtained from nonlinear shrinkage, setting α = ∞, and thus maximally shrinking towards the nonlinear shrinkage eigenvalues, is usually beneficial.
Table 1: Notation

Symbol          Description
n               Sample size
p               Dimensionality
Σ := Var(Y)     The covariance matrix of the random vector Y
Tr(A)           Trace of a square matrix A
‖A‖_F           Frobenius norm √Tr(AᵀA) of a square matrix A
H               Dispersion matrix
O               The orthogonal group
O_0             Equivalence class in O
U               Arbitrary element of O
V               Eigenvectors of H = VΛVᵀ
V^[ℓ]           ℓth iteration of the algorithm
V̂               Critical point/solution/estimate
𝒱               Subset of critical points of O
Λ               True ordered eigenvalues of H, up to scaling
Λ_0             Initial (shrunken) estimate of Λ
Λ_R             Final R-NL (shrunken) estimate of Λ
Λ̂               Eigenvalues of F(V̂)
Λ̂^[ℓ+1]         Eigenvalues of F(V^[ℓ])
diag(·)         Transforms a vector a ∈ ℝᵖ into a p×p diagonal matrix diag(a)
In addition, these simulations show that the updating of the eigenvalues that we propose after the iterations have converged can lead to an additional boost in performance over their method.
Whereas many of the aforementioned robust linear shrinkage papers have important
theoretical results, the empirical examination of their estimators in simulations and real
data applications is often limited. We attempt to give a more comprehensive empirical
overview in this paper. Contrary to most of the previous papers, we also consider a comparatively large sample size of n = 300 in our simulation study. Compared to six competing methods, our new approach displays superior performance over a wide range of scenarios.
We also provide a Matlab implementation of our method, as well as the code to replicate
all simulations on https://github.com/hedigers/RNL_Code.
2 Motivational Example
For a collection of n independent and identically distributed (i.i.d.) random vectors with values in ℝᵖ, let V̂ = (v̂₁, …, v̂ₚ) be the matrix of eigenvectors of the sample covariance matrix Ŝ. Nonlinear shrinkage, just as the linear shrinkage of Ledoit and Wolf (2004), only changes the eigenvalues of the sample covariance matrix, while keeping the eigenvectors V̂. That is, nonlinear shrinkage is also in the class of estimators of the form V̂∆V̂ᵀ, with ∆ diagonal, a class that goes back to Stein (1975, 1986). It is well known that

    argmin_{∆ diagonal} ‖Σ − V̂∆V̂ᵀ‖_F = diag((δ₁, …, δ_N)ᵀ),  with δⱼ := v̂ⱼᵀ Σ v̂ⱼ;
for example, see (Ledoit and Wolf, 2022a, Section 3.1). Nonlinear shrinkage takes the sample covariance matrix Ŝ as an input and outputs a shrunken estimate of Σ of the form V̂Λ₀V̂ᵀ, where Λ₀ = diag(δ̂₁, …, δ̂_N) is a diagonal matrix. Although there are different schemes to come up with the estimates {δ̂ⱼ}, each scheme uses as its only inputs p, n, and the set of eigenvalues of Ŝ. In this paper, we derive a new estimator that is not in the class of Stein (1975, 1986) but applies nonlinear shrinkage to a transformation of the data. It thereby implicitly uses more information than just the sample covariance matrix (together with p and n). Since we focus in the following on the class of elliptical distributions, we will differentiate between the dispersion matrix H and the covariance matrix Σ. The former will be defined in Section 3, but the main difference between the two population quantities is that Σ might not exist. If it does exist, Σ is simply given by cH, with c > 0 depending on the underlying distribution.
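The Frobenius-optimality of the entries δⱼ = v̂ⱼᵀ Σ v̂ⱼ can be checked numerically. The following sketch (our own, assuming numpy; all names are illustrative) confirms that perturbing the optimal diagonal away from these values only increases the Frobenius loss:

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 5, 50

# A population covariance matrix Sigma and a Gaussian sample from it.
A = rng.standard_normal((p, p))
Sigma = A @ A.T + np.eye(p)
X = rng.standard_normal((n, p)) @ np.linalg.cholesky(Sigma).T
S_hat = X.T @ X / n  # sample covariance matrix

# Eigenvectors of the sample covariance matrix.
_, V = np.linalg.eigh(S_hat)

# Frobenius-optimal diagonal given these eigenvectors: delta_j = v_j' Sigma v_j.
delta = np.array([V[:, j] @ Sigma @ V[:, j] for j in range(p)])

def loss(d):
    """Frobenius distance between Sigma and the eigenvector-constrained estimate."""
    return np.linalg.norm(Sigma - V @ np.diag(d) @ V.T, 'fro')

# Random perturbations of the optimal diagonal can only increase the loss.
for _ in range(100):
    assert loss(delta) <= loss(delta + 0.1 * rng.standard_normal(p))
```

The check works because, with V orthogonal, the loss equals ‖VᵀΣV − diag(d)‖_F, which is minimized entry-wise by dⱼ = (VᵀΣV)ⱼⱼ.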
To illustrate the advantage of our method, we now present a motivational toy example before moving on to the general methodology. We first consider a multivariate Gaussian distribution in dimension p = 200 with mean µ = 0 and covariance matrix Σ = H, where the (i, j) element of H is 0.7^|i−j|, as in Chen et al. (2011). We simulate n = 300 i.i.d. observations from this distribution. For j = 1, …, p, the left panel of Figure 1 displays the theoretical optimum δⱼ, the nonlinear shrinkage estimate v̂ⱼᵀ V̂Λ₀V̂ᵀ v̂ⱼ =: δ̂ⱼ, as well as v̂ⱼᵀ Ĥ v̂ⱼ, where Ĥ is the proposed R-NL estimator. Importantly, the estimated values are very close to the theoretical optimum δⱼ, j = 1, …, p, for both nonlinear shrinkage and our proposed method.
We next consider the same setting, but instead simulate from a multivariate t distribution with 4 degrees of freedom and dispersion matrix H, such that the covariance matrix Σ is 4/(4−2) · H. In particular, the v̂ⱼᵀ Ĥ v̂ⱼ are multiplied by c = 2 in this case to obtain an estimate of v̂ⱼᵀ Σ v̂ⱼ. (The value c = 2 would not be known in practice, but it is ‘fair’ to use it in this toy example, since doing so does not favor one estimation method over the other.) The right panel of Figure 1 displays the results. It can be seen that nonlinear shrinkage overestimates large values of δⱼ (by a lot) and underestimates small values of δⱼ; on the other hand, our new method does not have this problem and its performance (almost) matches the one from the Gaussian case.
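The multivariate t distribution used above is one member of the elliptical family treated in the next section: it can be simulated by drawing a direction ξ uniformly on the unit sphere and an independent radius R, and scaling by a square root of H. A small sketch (our own, assuming numpy; the matrix H, the choice ν = 10 so that the empirical check is stable, and the sample size are all illustrative) that also checks the relation Σ = ν/(ν−2) · H:

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, nu = 3, 200_000, 10

# An illustrative positive-definite dispersion matrix H.
H = np.array([[1.0, 0.5, 0.25],
              [0.5, 1.0, 0.5],
              [0.25, 0.5, 1.0]])
L = np.linalg.cholesky(H)  # one valid square root H^{1/2}

# Elliptical construction: direction xi uniform on the unit sphere, with an
# independent radius R; for a multivariate t_nu, R = sqrt(chi2_p * nu / chi2_nu).
Z = rng.standard_normal((n, p))
xi = Z / np.linalg.norm(Z, axis=1, keepdims=True)
R = np.sqrt(rng.chisquare(p, n) * nu / rng.chisquare(nu, n))
Y = R[:, None] * (xi @ L.T)

# The covariance matrix is c * H with c = nu / (nu - 2).
c = nu / (nu - 2)
S = Y.T @ Y / n
print(np.max(np.abs(S / c - H)))  # close to zero
```

For ν = 4, as in the example above, the same construction gives c = 2, although the empirical check then converges more slowly because fourth moments no longer exist.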
3 Methodology
We assume to observe an i.i.d. sample Y := {Y₁, …, Yₙ} from a p-dimensional elliptical distribution. If Y has an elliptical distribution, it can be represented as

    Y =_d µ + R H^{1/2} ξ,    (1)

where R is a positive random variable, ξ is uniformly distributed on the p-dimensional unit sphere, independently of R, and =_d denotes equality in distribution (Cambanis et al., 1981). The dispersion matrix H is assumed to be symmetric positive-definite (pd), with