Robust Estimation and Inference in Panels with Interactive Fixed Effects. Timothy B. Armstrong, Martin Weidner, Andrei Zeleneev

Robust Estimation and Inference
in Panels with Interactive Fixed Effects∗
Timothy B. Armstrong‡ Martin Weidner§ Andrei Zeleneev¶
December 2024
Abstract
We consider estimation and inference for a regression coefficient in panels with interactive
fixed effects (i.e., with a factor structure). We demonstrate that existing estimators and
confidence intervals (CIs) can be heavily biased and size-distorted when some of the factors
are weak. We propose estimators with improved rates of convergence and bias-aware CIs
that remain valid uniformly, regardless of factor strength. Our approach applies the theory
of minimax linear estimation to form a debiased estimate, using a nuclear norm bound on
the error of an initial estimate of the interactive fixed effects. Our resulting bias-aware CIs
take into account the remaining bias caused by weak factors. Monte Carlo experiments
show substantial improvements over conventional methods when factors are weak, with
minimal costs to estimation accuracy when factors are strong.
∗We thank the participants of the numerous seminars and conferences for helpful comments and
suggestions. We also thank Riccardo D’Adamo and Chen-Wei Hsiang for their excellent research
assistance. Any remaining errors are our own. Armstrong gratefully acknowledges support by the
National Science Foundation Grant SES-2049765. Weidner gratefully acknowledges support through the
European Research Council grant ERC-2018-CoG-819086-PANEDA. Zeleneev gratefully acknowledges the
generous funding from the UK Research and Innovation (UKRI) under the UK government’s Horizon
Europe funding guarantee (Grant Ref: EP/X02931X/1).
‡University of Southern California. Email: timothy.armstrong@usc.edu
§University of Oxford. Email: martin.weidner@economics.ox.ac.uk
¶University College London. Email: a.zeleneev@ucl.ac.uk
arXiv:2210.06639v3 [econ.EM] 11 Dec 2024
1 Introduction
In this paper, we consider a linear panel regression model of the form
$$
Y_{it} = X_{it}\beta + \sum_{k=1}^{K} Z_{k,it}\delta_k + \Gamma_{it} + U_{it}, \tag{1}
$$
where $Y_{it}, X_{it}, Z_{1,it}, \ldots, Z_{K,it} \in \mathbb{R}$ are the observed outcome variable and covariates for units
$i = 1, \ldots, N$ and time periods $t = 1, \ldots, T$. The error components $\Gamma_{it} \in \mathbb{R}$ and $U_{it} \in \mathbb{R}$ are
unobserved, and the regression coefficients $\beta, \delta_1, \ldots, \delta_K \in \mathbb{R}$ are unknown. The parameter of
interest is $\beta \in \mathbb{R}$, the coefficient on $X_{it}$. We are interested in “large panels”, where both $N$
and $T$ are relatively large.
The error component $U_{it}$ is modelled as a mean-zero random shock that is uncorrelated
with the regressors $X_{it}$ and $Z_{k,it}$ and that is at most weakly autocorrelated across $i$ and over
$t$. By contrast, the error component $\Gamma_{it}$ can be correlated with $X_{it}$ and $Z_{k,it}$ and can also be
strongly autocorrelated across $i$ and over $t$. Of course, further restrictions on $\Gamma_{it}$ are required
to allow estimation and inference on $\beta$. For example, the additive fixed effect model imposes
that $\Gamma_{it} = \alpha_i + \gamma_t$, where $\alpha_i$ accounts for any omitted variable that is constant over time, and
$\gamma_t$ for any omitted variable that is constant across units. Instead of this additive fixed effect
model, we consider the so-called interactive fixed effect model, where
$$
\Gamma_{it} = \sum_{r=1}^{R} \lambda_{ir} f_{tr}. \tag{2}
$$
Here, the $\lambda_{ir}$ and $f_{tr}$ can either be interpreted as unknown parameters or as unobserved
shocks. This model for $\Gamma_{it}$ is also known as a factor model, with factor loadings $\lambda_{ir}$ and
factors $f_{tr}$. We will use the terms factor and interactive fixed effect interchangeably. The
number of factors $R$ is unknown, but is assumed to be small relative to $N$ and $T$. The
interactive fixed effect model is attractive because it introduces enough restrictions to allow
estimation and inference on $\beta$ while still incorporating or approximating a large class of data
generating processes (DGPs) for $\Gamma_{it}$.
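As a minimal sketch of what models (1) and (2) describe, the following Python snippet simulates a panel with a single factor ($R = 1$) and no additional covariates ($K = 0$). The function names, the Gaussian draws, the way $X_{it}$ is made to depend on $\Gamma_{it}$, and the `strength` scaling of the loadings are all assumptions made for illustration; in particular, `strength` is not the parameter $\kappa$ used in the paper's Monte Carlo design. The pooled OLS helper ignores $\Gamma_{it}$ entirely, so it only illustrates omitted variable bias, not the behaviour of the LS estimator.

```python
import random

def simulate_panel(N, T, beta, strength, seed=0):
    """Draw (Y, X, Gamma) from Y_it = X_it*beta + lambda_i*f_t + U_it (R = 1, K = 0)."""
    rng = random.Random(seed)
    lam = [strength * rng.gauss(0, 1) for _ in range(N)]  # hypothetical loadings lambda_i
    f = [rng.gauss(0, 1) for _ in range(T)]               # hypothetical factors f_t
    Gamma = [[lam[i] * f[t] for t in range(T)] for i in range(N)]
    # Assumption: X_it = Gamma_it + noise, so X is correlated with Gamma.
    X = [[Gamma[i][t] + rng.gauss(0, 1) for t in range(T)] for i in range(N)]
    Y = [[X[i][t] * beta + Gamma[i][t] + rng.gauss(0, 1) for t in range(T)]
         for i in range(N)]
    return Y, X, Gamma

def pooled_ols(Y, X):
    """Regress Y on X without intercept, ignoring Gamma altogether."""
    num = sum(x * y for row_x, row_y in zip(X, Y) for x, y in zip(row_x, row_y))
    den = sum(x * x for row_x in X for x in row_x)
    return num / den

if __name__ == "__main__":
    for s in (0.0, 0.2, 1.0):
        Y, X, _ = simulate_panel(100, 50, beta=0.0, strength=s, seed=1)
        print(s, pooled_ols(Y, X))
```

With `strength = 0.0` the factor term vanishes and pooled OLS is centered near the true $\beta$; as `strength` grows, the omitted $\Gamma_{it}$ pushes the naive estimate away from $\beta$.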
The existing econometrics literature on panel regressions with interactive fixed effects is
quite large. Since the seminal work of Pesaran (2006) and Bai (2009), developing tools for
estimation and inference on $\beta$ in model (1)-(2) under large $N$ and large $T$ asymptotics has
been a primary focus of this literature. Specifically, Pesaran (2006) introduces the common
correlated effects (CCE) estimator, which uses cross-sectional averages of the observed
variables as proxies for the unobserved factors. Bai (2009) derives the large $N$, $T$ properties of
the least-squares (LS) estimator that jointly minimizes the sum of squared residuals over the
regression coefficients, factors, and factor loadings.¹
¹ This estimator was first introduced by Kiefer (1980).
Bai (2009) shows that, under appropriate assumptions, the LS estimator for the regression
coefficients is $\sqrt{NT}$-consistent and asymptotically normally distributed as both $N$ and $T$
grow to infinity. One of the key assumptions imposed for this result is the so-called “strong
factor assumption”, which requires all the factor loadings $\lambda_{ir}$ and factors $f_{tr}$ to have sufficient
variation across $i$ and over $t$, respectively. If the strong factor assumption is violated, then the
LS estimator for $\lambda_{ir}$ and $f_{tr}$ may be unable to pick up the true loadings and factors correctly,
because the “weak factors”² in $\Gamma_{it}$ cannot be distinguished from the noise in $U_{it}$. This can
lead to substantial bias and misleading inference, due to omitted variables bias from $\Gamma_{it}$ that
is not picked up by the estimator.
[Figure 1 has four panels, each showing the distributions of the LS and Debiased estimators:
(a) No ID ($\kappa = 0.00$), (b) Weak ID ($\kappa = 0.10$), (c) Weak ID ($\kappa = 0.20$), (d) Strong ID ($\kappa = 1.00$).]
Figure 1: Finite sample distributions of the LS and the debiased estimators, $N = 100$, $T = 50$, $R = 1$
To illustrate how this can lead to problems with conventional estimates and CIs for $\beta$,
Figure 1 presents a subset of the results of our Monte Carlo study.³ When the factors are
nonexistent (panel a) or strongly identified (panel d), the distribution of the LS estimator (in
blue) is centered at the true parameter value $\beta$ (equal to 0 in this case). However, when the
factors are present but weak enough that they are difficult to estimate (panels b and c), the
LS estimator is heavily biased and non-normally distributed. In our Monte Carlo study in
Section 5, we show that this indeed leads to severe coverage distortion, with conventional CIs
based on the LS estimator having almost zero coverage.
² See, for example, Onatski (2010, 2012) for a discussion and formalization of the notion of weak factors.
³ A detailed description of the numerical experiment is provided in Section 5.1.
In this paper, we address this issue by developing new tools for estimation and inference
on $\beta$ in the model (1). We develop a debiased estimator along with a bound on the remaining
bias, which we use to construct a bias-aware confidence interval. As illustrated in Figure 1, our
debiased estimator (shown in red) substantially decreases the bias of the LS estimator when
factors are weak, leading to a large improvement in overall estimation error. In addition, this
improved performance under weak factors does not come at a substantial cost to performance
when factors are strong or nonexistent: our debiased estimator performs similarly to the LS
estimator in these cases. Importantly, our CI requires only an upper bound on the number
of factors: we show that it is valid uniformly over a large class of DGPs that allows for weak,
strong or nonexistent factors up to a specified upper bound on the number of factors. We
derive rates of convergence that hold uniformly over this class of DGPs, and we show that our
estimator achieves a faster uniform rate of convergence than existing approaches when weak
factors are allowed. In the case where $N$ and $T$ grow at the same rate, our estimator achieves
the parametric $\sqrt{NT}$ rate.
Our debiasing approach uses a preliminary estimate $\hat\Gamma_{\rm pre}$ of the individual effect matrix
$\Gamma$ along with a bound $\hat C$ on the nuclear norm $\|\Gamma - \hat\Gamma_{\rm pre}\|_*$ of its estimation error. Letting
$\tilde\Gamma := \Gamma - \hat\Gamma_{\rm pre}$, we then consider the augmented outcomes
$$
\tilde Y_{it} := Y_{it} - \hat\Gamma_{{\rm pre},it} = X_{it}\beta + \sum_{k=1}^{K} Z_{k,it}\delta_k + \tilde\Gamma_{it} + U_{it}.
$$
Treating $\tilde\Gamma_{it}$ as nuisance parameters satisfying a convex constraint $\|\tilde\Gamma\|_* \le \hat C$, we derive linear
weights $A_{it}$ such that the estimator $\sum_{i=1}^{N}\sum_{t=1}^{T} A_{it}\tilde Y_{it}$ for $\beta$ optimally uses this constraint,
using the theory of minimax linear estimators (see Ibragimov and Khas’minskii, 1985; Donoho,
1994; Armstrong and Kolesár, 2018). In particular, the resulting weights $A_{it}$ control the
remaining omitted variables bias $\sum_{i=1}^{N}\sum_{t=1}^{T} A_{it}\tilde\Gamma_{it}$ due to possible weak factors in $\tilde\Gamma = \Gamma - \hat\Gamma_{\rm pre}$
not picked up by the initial estimate $\hat\Gamma_{\rm pre}$.
A key step in deriving our CI is the construction of the preliminary estimator $\hat\Gamma_{\rm pre}$ and
bound $\hat C$ on the nuclear norm of its estimation error. Our CI is bias-aware: it uses the bound
$\hat C$ to explicitly take into account any remaining bias in the debiased estimator. Our bound is
feasible once an upper bound on the number of factors is specified. In our Monte Carlo study,
we find that, while our CIs are often conservative, they are about as wide as an “oracle” CI
that uses an infeasible critical value to correct the coverage of a CI based on the standard LS
estimator.
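The idea of a bias-aware CI can be sketched in a few lines of Python. The snippet below follows the generic construction in the bias-aware inference literature (e.g., Armstrong and Kolesár, 2018): the usual normal critical value is replaced by the $1 - \alpha$ quantile of $|Z + b|$ with $Z \sim N(0, 1)$, where $b$ is the ratio of the worst-case bias to the standard error. The exact CI formula used in this paper is not given in this section, so the function names and the inputs `estimate`, `se` and `max_bias` are hypothetical placeholders; `max_bias` stands in for a worst-case bias bound of the kind implied by $\hat C$.

```python
from statistics import NormalDist

def bias_aware_cv(b, alpha=0.05):
    """Return the 1 - alpha quantile of |Z + b|, Z ~ N(0, 1), by bisection."""
    nd = NormalDist()

    def coverage(c):
        # P(|Z + b| <= c) = Phi(c - b) - Phi(-c - b)
        return nd.cdf(c - b) - nd.cdf(-c - b)

    lo, hi = 0.0, abs(b) + 10.0  # coverage(hi) is essentially 1
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if coverage(mid) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return hi

def bias_aware_ci(estimate, se, max_bias, alpha=0.05):
    """Symmetric CI that widens with the ratio of worst-case bias to standard error."""
    cv = bias_aware_cv(max_bias / se, alpha)
    return estimate - cv * se, estimate + cv * se

if __name__ == "__main__":
    print(bias_aware_cv(0.0))              # reduces to the usual two-sided normal value
    print(bias_aware_ci(0.1, 0.02, 0.03))  # hypothetical inputs
```

When the bias bound is zero, the critical value reduces to the conventional two-sided normal critical value; when the bias bound is large relative to the standard error, it approaches $b$ plus the one-sided normal critical value.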
While our results allow for arbitrary sequences of weak factors, our conditions on other
aspects of the model are similar to Bai (2009) and Moon and Weidner (2015). An important
condition is that the covariate of interest $X_{it}$ must not itself be entirely explained by a
low-dimensional factor model. This rules out settings where $X_{it}$ is an indicator variable for a
policy change that affects a subset of the units and occurs only during a single time
period: in this case, $X_{it} = \lambda_i \cdot f_t$, where $\lambda_i$ is an indicator variable for unit $i$ undergoing the
policy change and $f_t$ is an indicator variable for periods after the policy change. For example,
in a panel where $X_{it}$ is the minimum hourly wage in state $i$ and year $t$, we would require that
states change their minimum wage laws in different years, and that this is done sufficiently
often to generate variation in $X_{it}$ that cannot be explained by a small number of factors $f_t$.
See Section 4 for formal conditions and further discussion.
A special case of the factor model is the grouped unobserved heterogeneity model
considered by Bonhomme and Manresa (2015). In this model, $\Gamma_{it} = \alpha_{g(i),t}$, where $g(\cdot)$ is an unknown
function mapping individuals $i$ to a group index $g(i) \in \{1, \ldots, R\}$. This takes the form of the
factor model (2) with $\lambda_{ir} = 1$ if $g(i) = r$ and $0$ otherwise, and with $f_{tr} = \alpha_{rt}$. The strong
factor assumption corresponds to the strong group separation assumption imposed in this
literature (e.g., Assumption 2(b) in Bonhomme and Manresa, 2015), which requires that the
group means $\alpha_{r,\cdot} = (\alpha_{r1}, \ldots, \alpha_{rT})$ are sufficiently far apart for different groups $r$. Our results
apply in this setting and allow for this assumption to be relaxed. An interesting question
for future research is whether it is possible to modify our approach to take advantage of the
additional structure in the grouped unobserved heterogeneity model.
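The mapping from the grouped model to the factor model (2) can be checked directly. The following Python sketch builds the loadings $\lambda_{ir} = 1\{g(i) = r\}$ and factors $f_{tr} = \alpha_{rt}$ and verifies that $\sum_r \lambda_{ir} f_{tr}$ reproduces $\alpha_{g(i),t}$. The function names and the example numbers are hypothetical, and groups are indexed from 0 rather than from 1 as in the text.

```python
def grouped_to_factor(groups, alpha):
    """groups[i] in {0, ..., R-1}; alpha[r][t] is the effect of group r at time t.

    Returns loadings lam (N x R) and factors f (T x R).
    """
    R, T = len(alpha), len(alpha[0])
    lam = [[1.0 if g == r else 0.0 for r in range(R)] for g in groups]
    f = [[alpha[r][t] for r in range(R)] for t in range(T)]
    return lam, f

def gamma_from_factors(lam, f):
    """Gamma_it = sum_r lam[i][r] * f[t][r], as in model (2)."""
    return [[sum(l * x for l, x in zip(lam_i, f_t)) for f_t in f] for lam_i in lam]

if __name__ == "__main__":
    groups = [0, 1, 1, 0]                       # hypothetical group assignment g(i)
    alpha = [[1.0, 2.0, 3.0], [-1.0, 0.5, 4.0]]  # hypothetical group-time effects
    lam, f = grouped_to_factor(groups, alpha)
    print(gamma_from_factors(lam, f))
```

Each row of the resulting matrix equals the time path of the group that the unit belongs to, so the number of groups plays the role of the number of factors $R$.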
Related literature
The papers by Pesaran (2006) and Bai (2009) mentioned previously have motivated a large
follow-up literature on large $N$ and $T$ analysis of panel models with interactive effects. Bai
and Wang (2016) provide a review with further references. Another literature has proposed
alternative estimation methods along with asymptotic analysis in the regime with $T$ fixed and
$N$ increasing. This includes the quasi-difference approach of Holtz-Eakin, Newey and Rosen
(1988) and generalized method of moments approaches of Ahn, Lee and Schmidt (2001, 2013).
More recent papers analyzing the fixed $T$, large $N$ regime include Robertson and Sarafidis
(2015), Juodis and Sarafidis (2018), Westerlund, Petrova and Norkute (2019), Higgins (2021),
and Juodis and Sarafidis (2022). None of these papers provide inference methods that remain valid
when factors are weak or rank-deficient (e.g., $f = 0$). Chamberlain and Moreira (2009) derive
estimators that satisfy a Bayes-minimax property over a certain class of priors in a finite
sample setting that includes a version of the model (2). This Bayes-minimax property does
not, however, translate to a guarantee on coverage or estimation error under weak factors.
A special case of the violation of the strong factor assumption is when some factors are
equal to zero, while all other factors are strong; the inference results of Bai (2009) are usually
robust towards this specific violation of the strong factor assumption (Moon and Weidner,
2015). This robustness, however, does not carry over to more general weak factors in the
DGP of $\Gamma_{it}$, as illustrated by Figure 1.
The problem of weak factors is related to the problem of omitted variable bias of LASSO