Robust Estimation and Inference in Panels with Interactive Fixed Effects∗

Timothy B. Armstrong‡, Martin Weidner§, Andrei Zeleneev¶

December 2024

Abstract

We consider estimation and inference for a regression coefficient in panels with interactive fixed effects (i.e., with a factor structure). We demonstrate that existing estimators and confidence intervals (CIs) can be heavily biased and size-distorted when some of the factors are weak. We propose estimators with improved rates of convergence and bias-aware CIs that remain valid uniformly, regardless of factor strength. Our approach applies the theory of minimax linear estimation to form a debiased estimate, using a nuclear norm bound on the error of an initial estimate of the interactive fixed effects. Our resulting bias-aware CIs take into account the remaining bias caused by weak factors. Monte Carlo experiments show substantial improvements over conventional methods when factors are weak, with minimal costs to estimation accuracy when factors are strong.

∗ We thank the participants of the numerous seminars and conferences for helpful comments and suggestions. We also thank Riccardo D’Adamo and Chen-Wei Hsiang for their excellent research assistance. Any remaining errors are our own. Armstrong gratefully acknowledges support by the National Science Foundation Grant SES-2049765. Weidner gratefully acknowledges support through the European Research Council grant ERC-2018-CoG-819086-PANEDA. Zeleneev gratefully acknowledges the generous funding from the UK Research and Innovation (UKRI) under the UK government’s Horizon Europe funding guarantee (Grant Ref: EP/X02931X/1).

‡University of Southern California. Email: timothy.armstrong@usc.edu

§University of Oxford. Email: martin.weidner@economics.ox.ac.uk

¶University College London. Email: a.zeleneev@ucl.ac.uk

arXiv:2210.06639v3 [econ.EM] 11 Dec 2024

1 Introduction

In this paper, we consider a linear panel regression model of the form

$$Y_{it} = X_{it}\beta + \sum_{k=1}^{K} Z_{k,it}\delta_k + \Gamma_{it} + U_{it}, \tag{1}$$

where $Y_{it}, X_{it}, Z_{1,it}, \ldots, Z_{K,it} \in \mathbb{R}$ are the observed outcome variable and covariates for units $i = 1, \ldots, N$ and time periods $t = 1, \ldots, T$. The error components $\Gamma_{it} \in \mathbb{R}$ and $U_{it} \in \mathbb{R}$ are unobserved, and the regression coefficients $\beta, \delta_1, \ldots, \delta_K \in \mathbb{R}$ are unknown. The parameter of interest is $\beta \in \mathbb{R}$, the coefficient on $X_{it}$. We are interested in “large panels”, where both $N$ and $T$ are relatively large.

The error component $U_{it}$ is modelled as a mean-zero random shock that is uncorrelated with the regressors $X_{it}$ and $Z_{k,it}$ and that is at most weakly autocorrelated across $i$ and over $t$. By contrast, the error component $\Gamma_{it}$ can be correlated with $X_{it}$ and $Z_{k,it}$ and can also be strongly autocorrelated across $i$ and over $t$. Of course, further restrictions on $\Gamma_{it}$ are required to allow estimation and inference on $\beta$. For example, the additive fixed effect model imposes that $\Gamma_{it} = \alpha_i + \gamma_t$, where $\alpha_i$ accounts for any omitted variable that is constant over time, and $\gamma_t$ for any omitted variable that is constant across units. Instead of this additive fixed effect model we consider the so-called interactive fixed effect model, where

$$\Gamma_{it} = \sum_{r=1}^{R} \lambda_{ir} f_{tr}. \tag{2}$$

Here, the $\lambda_{ir}$ and $f_{tr}$ can either be interpreted as unknown parameters or as unobserved shocks. This model for $\Gamma_{it}$ is also known as a factor model, with factor loadings $\lambda_{ir}$ and factors $f_{tr}$. We will use the terms factor and interactive fixed effect interchangeably. The number of factors $R$ is unknown, but is assumed to be small relative to $N$ and $T$. The interactive fixed effect model is attractive because it introduces enough restrictions to allow estimation and inference on $\beta$ while still incorporating or approximating a large class of data generating processes (DGPs) for $\Gamma_{it}$.
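To make the structure of model (1)-(2) concrete, the following minimal sketch draws one panel from it. The Gaussian distributions, the unit coefficients `delta`, the way `X` is made to depend on `Gamma`, and the `strength` scaling of the factors are all illustrative assumptions; they are not the design of the Monte Carlo study in Section 5.

```python
import numpy as np

def simulate_panel(N=100, T=50, R=1, K=2, beta=0.0, strength=1.0, seed=0):
    """Draw one panel from model (1)-(2); every distribution here is an assumption."""
    rng = np.random.default_rng(seed)
    lam = rng.normal(size=(N, R))             # factor loadings lambda_ir
    f = strength * rng.normal(size=(T, R))    # factors f_tr, scaled by a hypothetical strength knob
    Gamma = lam @ f.T                         # interactive fixed effects, rank at most R
    # X is correlated with Gamma (so ignoring Gamma biases OLS) but is not low rank itself.
    X = Gamma + rng.normal(size=(N, T))
    Z = rng.normal(size=(K, N, T))            # additional covariates Z_k
    delta = np.ones(K)                        # hypothetical coefficients delta_k
    U = rng.normal(size=(N, T))               # idiosyncratic shock U_it
    Y = beta * X + np.tensordot(delta, Z, axes=1) + Gamma + U
    return Y, X, Z, Gamma

Y, X, Z, Gamma = simulate_panel()
print(Y.shape, np.linalg.matrix_rank(Gamma))
```

Setting `strength` close to zero gives a factor that is present but hard to distinguish from the noise in `U`, which is the weak factor situation discussed below.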

The existing econometrics literature on panel regressions with interactive fixed effects is quite large. Since the seminal work of Pesaran (2006) and Bai (2009), developing tools for estimation and inference on $\beta$ in model (1)-(2) under large $N$ and large $T$ asymptotics has been a primary focus of this literature. Specifically, Pesaran (2006) introduces the common correlated effects (CCE) estimator, which uses cross-sectional averages of the observed variables as proxies for the unobserved factors. Bai (2009) derives the large $N$, $T$ properties of the least-squares (LS) estimator that jointly minimizes the sum of squared residuals over the regression coefficients, factors, and factor loadings.¹
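The joint minimization behind the LS estimator can be sketched with a simple alternating scheme: given $\beta$, the best rank-$R$ fit to $Y - X\beta$ is its truncated SVD, and given that fit, $\beta$ solves an OLS problem. The code below is a minimal sketch under simplifying assumptions (no $Z_k$ covariates, a fixed number of iterations, a strong factor, and an illustrative Gaussian design); the function names are hypothetical and this is not presented as the implementation used in the literature.

```python
import numpy as np

def low_rank_part(M, R):
    """Best rank-R approximation of M via truncated SVD."""
    u, s, vt = np.linalg.svd(M, full_matrices=False)
    return (u[:, :R] * s[:R]) @ vt[:R]

def ls_interactive(Y, X, R, n_iter=200):
    """Alternate between Gamma given beta and beta given Gamma.

    Each step weakly lowers the sum of squared residuals, so this is
    coordinate descent on the LS objective (it may stop at a local minimum).
    """
    beta = np.sum(X * Y) / np.sum(X * X)             # pooled OLS starting value
    for _ in range(n_iter):
        Gamma_hat = low_rank_part(Y - beta * X, R)   # factor step
        beta = np.sum(X * (Y - Gamma_hat)) / np.sum(X * X)  # regression step
    return beta, Gamma_hat

# Illustrative design with one strong factor that is correlated with X.
rng = np.random.default_rng(0)
N, T, R, beta_true = 100, 50, 1, 1.0
Gamma = rng.normal(size=(N, R)) @ rng.normal(size=(R, T))
X = Gamma + rng.normal(size=(N, T))
Y = beta_true * X + Gamma + rng.normal(size=(N, T))

beta_ols = np.sum(X * Y) / np.sum(X * X)
beta_ls, _ = ls_interactive(Y, X, R)
print(f"pooled OLS: {beta_ols:.3f}, LS with factors: {beta_ls:.3f}")
```

In this strong factor design the pooled OLS estimate suffers from omitted variables bias, while the alternating scheme removes most of it.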

¹ This estimator was first introduced by Kiefer (1980).

Bai (2009) shows that, under appropriate assumptions, the LS estimator for the regression coefficients is $\sqrt{NT}$-consistent and asymptotically normally distributed as both $N$ and $T$ grow to infinity. One of the key assumptions imposed for this result is the so-called “strong factor assumption”, which requires all the factor loadings $\lambda_{ir}$ and factors $f_{tr}$ to have sufficient variation across $i$ and over $t$, respectively. If the strong factor assumption is violated, then the LS estimator for $\lambda_{ir}$ and $f_{tr}$ may be unable to pick up the true loadings and factors correctly, because the “weak factors”² in $\Gamma_{it}$ cannot be distinguished from the noise in $U_{it}$. This can lead to substantial bias and misleading inference, due to omitted variables bias from $\Gamma_{it}$ that is not picked up by the estimator.

[Figure 1: Finite sample distributions of the LS and the debiased estimators, $N = 100$, $T = 50$, $R = 1$. Panels: (a) No ID ($\kappa = 0.00$); (b) Weak ID ($\kappa = 0.10$); (c); (d) Strong ID ($\kappa = 1.00$).]

To illustrate how this can lead to problems with conventional estimates and CIs for $\beta$, Figure 1 presents a subset of the results of our Monte Carlo study.³ When the factors are nonexistent (panel a) or strongly identified (panel d), the distribution of the LS estimator (in blue) is centered at the true parameter value $\beta$ (equal to 0 in this case). However, when the factors are present but weak enough that they are difficult to estimate (panels b and c), the LS estimator is heavily biased and non-normally distributed. In our Monte Carlo study in Section 5, we show that this indeed leads to severe coverage distortion, with conventional CIs based on the LS estimator having almost zero coverage.

² See, for example, Onatski (2010, 2012) for a discussion and formalization of the notion of weak factors.

³ A detailed description of the numerical experiment is provided in Section 5.1.

In this paper, we address this issue by developing new tools for estimation and inference on $\beta$ in the model (1). We develop a debiased estimator along with a bound on the remaining bias, which we use to construct a bias-aware confidence interval. As illustrated in Figure 1, our debiased estimator (shown in red) substantially decreases the bias of the LS estimator when factors are weak, leading to a large improvement in overall estimation error. In addition, this improved performance under weak factors does not come at a substantial cost to performance when factors are strong or nonexistent: our debiased estimator performs similarly to the LS estimator in these cases. Importantly, our CI requires only an upper bound on the number of factors: we show that it is valid uniformly over a large class of DGPs that allows for weak, strong or nonexistent factors up to a specified upper bound on the number of factors. We derive rates of convergence that hold uniformly over this class of DGPs, and we show that our estimator achieves a faster uniform rate of convergence than existing approaches when weak factors are allowed. In the case where $N$ and $T$ grow at the same rate, our estimator achieves the parametric $\sqrt{NT}$ rate.

Our debiasing approach uses a preliminary estimate $\hat{\Gamma}_{\mathrm{pre}}$ of the individual effect matrix $\Gamma$ along with a bound $\hat{C}$ on the nuclear norm $\|\Gamma - \hat{\Gamma}_{\mathrm{pre}}\|_*$ of its estimation error. Letting $\tilde{\Gamma} := \Gamma - \hat{\Gamma}_{\mathrm{pre}}$, we then consider the augmented outcomes

$$\tilde{Y}_{it} := Y_{it} - \hat{\Gamma}_{\mathrm{pre},it} = X_{it}\beta + \sum_{k=1}^{K} Z_{k,it}\delta_k + \tilde{\Gamma}_{it} + U_{it}.$$

Treating $\tilde{\Gamma}_{it}$ as nuisance parameters satisfying a convex constraint $\|\tilde{\Gamma}\|_* \le \hat{C}$, we derive linear weights $A_{it}$ such that the estimator $\sum_{i=1}^{N} \sum_{t=1}^{T} A_{it} \tilde{Y}_{it}$ for $\beta$ optimally uses this constraint, using the theory of minimax linear estimators (see Ibragimov and Khas’minskii, 1985; Donoho, 1994; Armstrong and Kolesár, 2018). In particular, the resulting weights $A_{it}$ control the remaining omitted variables bias $\sum_{i=1}^{N} \sum_{t=1}^{T} A_{it} \tilde{\Gamma}_{it}$ due to possible weak factors in $\tilde{\Gamma} = \Gamma - \hat{\Gamma}_{\mathrm{pre}}$ not picked up by the initial estimate $\hat{\Gamma}_{\mathrm{pre}}$.
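The role of the nuclear norm constraint can be illustrated with a standard duality fact: for fixed weights $A$, the largest value of $|\sum_{i,t} A_{it}\tilde{\Gamma}_{it}|$ over all $\tilde{\Gamma}$ with $\|\tilde{\Gamma}\|_* \le \hat{C}$ equals $\hat{C}$ times the spectral norm (largest singular value) of $A$. The sketch below checks this numerically. The weights `A` and the value `C_hat` are arbitrary placeholders chosen for illustration; they are not the optimal weights or the bound constructed in the paper.

```python
import numpy as np

def nuclear_norm(M):
    """Sum of singular values."""
    return np.linalg.svd(M, compute_uv=False).sum()

def spectral_norm(M):
    """Largest singular value."""
    return np.linalg.svd(M, compute_uv=False)[0]

rng = np.random.default_rng(0)
N, T = 30, 20
A = rng.normal(size=(N, T)) / (N * T)   # hypothetical weights, not the paper's
C_hat = 5.0                              # hypothetical nuclear norm bound

# Worst-case bias implied by the constraint, by nuclear/spectral norm duality.
worst_case_bias = C_hat * spectral_norm(A)

# The bound is attained by a rank-one matrix aligned with A's leading singular vectors.
u, s, vt = np.linalg.svd(A, full_matrices=False)
Gamma_worst = C_hat * np.outer(u[:, 0], vt[0])
attained = np.sum(A * Gamma_worst)

# Any other matrix on the boundary of the constraint set gives a smaller or equal bias.
G = rng.normal(size=(N, T))
G *= C_hat / nuclear_norm(G)
print(worst_case_bias, attained, abs(np.sum(A * G)))
```

This is why weights with a small spectral norm limit the damage that an undetected low-rank component can do to the estimate of $\beta$.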

A key step in deriving our CI is the construction of the preliminary estimator $\hat{\Gamma}_{\mathrm{pre}}$ and bound $\hat{C}$ on the nuclear norm of its estimation error. Our CI is bias-aware: it uses the bound $\hat{C}$ to explicitly take into account any remaining bias in the debiased estimator. Our bound is feasible once an upper bound on the number of factors is specified. In our Monte Carlo study, we find that, while our CIs are often conservative, they are about as wide as an “oracle” CI that uses an infeasible critical value to correct the coverage of a CI based on the standard LS estimator.
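One common way to turn a bias bound into a bias-aware interval, following the fixed-length construction in Armstrong and Kolesár (2018), is to replace the usual normal critical value with the $1-\alpha$ quantile of $|Z + b|$, where $Z \sim N(0,1)$ and $b$ is the ratio of the worst-case bias to the standard error. The sketch below implements that generic construction; whether it matches the exact interval derived later in this paper is an assumption, and the function names and numerical inputs are hypothetical.

```python
from statistics import NormalDist

def bias_aware_cv(b, alpha=0.05):
    """1 - alpha quantile of |Z + b| with Z ~ N(0, 1), found by bisection."""
    b = abs(b)
    nd = NormalDist()

    def coverage(c):
        # P(|Z + b| <= c) = Phi(c - b) - Phi(-c - b)
        return nd.cdf(c - b) - nd.cdf(-c - b)

    lo, hi = 0.0, b + 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if coverage(mid) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return hi

def bias_aware_ci(estimate, se, max_bias, alpha=0.05):
    """Interval estimate +/- cv * se, where cv accounts for bias up to max_bias."""
    cv = bias_aware_cv(max_bias / se, alpha)
    return estimate - cv * se, estimate + cv * se

# Hypothetical numbers: point estimate 0.02, standard error 0.01, bias bound 0.005.
lower, upper = bias_aware_ci(0.02, 0.01, 0.005)
print(lower, upper)
```

With no bias the critical value reduces to the familiar two-sided normal one, and it grows with the bias-to-standard-error ratio, which is the sense in which such intervals can be conservative.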

While our results allow for arbitrary sequences of weak factors, our conditions on other aspects of the model are similar to Bai (2009) and Moon and Weidner (2015). An important condition is that the covariate of interest $X_{it}$ must not itself be entirely explained by a low-dimensional factor model. This rules out settings where $X_{it}$ is an indicator variable for a policy change that affects a subset of the units and occurs only during a single time period: in this case, $X_{it} = \lambda_i \cdot f_t$, where $\lambda_i$ is an indicator variable for unit $i$ undergoing the policy change and $f_t$ is an indicator variable for periods after the policy change. For example, in a panel where $X_{it}$ is the minimum hourly wage in state $i$ and year $t$, we would require that states change their minimum wage laws in different years, and that this is done sufficiently often to generate variation in $X_{it}$ that cannot be explained by a small number of factors $f_t$. See Section 4 for formal conditions and further discussion.
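The rank condition on $X_{it}$ can be seen in a toy example. A single policy change that hits a fixed set of units at one date produces a rank-one indicator matrix, whereas staggered adoption at different dates produces a matrix of higher rank. The panel dimensions and adoption dates below are hypothetical and chosen only for illustration.

```python
import numpy as np

N, T = 6, 8

# One policy change: units 0-2 are treated from period 4 onward.
treated = np.array([1, 1, 1, 0, 0, 0])          # lambda_i
post = (np.arange(T) >= 4).astype(int)          # f_t
X_single = np.outer(treated, post)              # X_it = lambda_i * f_t

# Staggered adoption: each unit switches on at its own (hypothetical) date.
adopt = np.array([2, 3, 4, 5, 6, 7])
X_staggered = (np.arange(T)[None, :] >= adopt[:, None]).astype(int)

print(np.linalg.matrix_rank(X_single), np.linalg.matrix_rank(X_staggered))
```

In the first design $X$ is itself a one-factor model and cannot be separated from $\Gamma$; in the second, the variation in adoption dates gives $X$ a component that a small number of factors cannot absorb.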

A special case of the factor model is the grouped unobserved heterogeneity model considered by Bonhomme and Manresa (2015). In this model, $\Gamma_{it} = \alpha_{g(i),t}$, where $g(\cdot)$ is an unknown function mapping individuals $i$ to a group index $g(i) \in \{1, \ldots, R\}$. This takes the form of the factor model (2) with $\lambda_{ir} = 1$ if $g(i) = r$ and 0 otherwise, and with $f_{tr} = \alpha_{rt}$. The strong factor assumption corresponds to the strong group separation assumption imposed in this literature (e.g., Assumption 2(b) in Bonhomme and Manresa, 2015), which imposes that the group means $\alpha_{r,\cdot} = (\alpha_{r1}, \ldots, \alpha_{rT})'$ are sufficiently far apart for different groups $r$. Our results apply in this setting and allow for this assumption to be relaxed. An interesting question for future research is whether it is possible to modify our approach to take advantage of the additional structure in the grouped unobserved heterogeneity model.
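The mapping from the grouped model to the factor model (2) can be verified directly: indicator loadings pick out the row of group-time effects that belongs to each unit. The sketch below uses randomly drawn group assignments and effects, which are assumptions made only for illustration, and zero-based group labels.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, R = 12, 5, 3
g = rng.integers(0, R, size=N)       # group index g(i), zero-based here
alpha = rng.normal(size=(R, T))      # group-time effects alpha_{rt}

# Grouped heterogeneity: Gamma_it = alpha_{g(i), t}.
Gamma_grouped = alpha[g]

# Same object written as a factor model with indicator loadings.
lam = np.eye(R)[g]                   # lambda_ir = 1 if g(i) = r, else 0
f = alpha.T                          # f_tr = alpha_{rt}
Gamma_factor = lam @ f.T

print(np.allclose(Gamma_grouped, Gamma_factor))
```

When two rows of `alpha` are close to each other, the corresponding groups are poorly separated, which is the grouped-model analogue of a weak factor.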

Related literature

The papers by Pesaran (2006) and Bai (2009) mentioned previously have motivated a large follow-up literature on large $N$ and $T$ analysis of panel models with interactive effects. Bai and Wang (2016) provide a review with further references. Another literature has proposed alternative estimation methods along with asymptotic analysis in the regime with $T$ fixed and $N$ increasing. This includes the quasi-difference approach of Holtz-Eakin, Newey and Rosen (1988) and generalized method of moments approaches of Ahn, Lee and Schmidt (2001, 2013). More recent papers analyzing the fixed $T$, large $N$ regime include Robertson and Sarafidis (2015), Juodis and Sarafidis (2018), Westerlund, Petrova and Norkute (2019), Higgins (2021), and Juodis and Sarafidis (2022). None of these papers provide inference methods that remain valid when factors are weak or rank-deficient (e.g., $f = 0$). Chamberlain and Moreira (2009) derive estimators that satisfy a Bayes-minimax property over a certain class of priors in a finite sample setting that includes a version of the model (2). This Bayes-minimax property does not, however, translate to a guarantee on coverage or estimation error under weak factors.

A special case of the violation of the strong factor assumption is when some factors are equal to zero, while all other factors are strong; the inference results of Bai (2009) are usually robust towards this specific violation of the strong factor assumption (Moon and Weidner, 2015). This robustness, however, does not carry over to more general weak factors in the DGP of $\Gamma_{it}$, as illustrated by Figure 1.

The problem of weak factors is related to the problem of omitted variable bias of LASSO
