Doubly Robust Proximal Synthetic Controls
Hongxiang Qiu1, Xu Shi2, Wang Miao3, Edgar Dobriban4, and Eric
Tchetgen Tchetgen4
1Department of Epidemiology and Biostatistics, Michigan State
University
2Department of Biostatistics, University of Michigan
3Department of Probability and Statistics, Peking University
4Department of Statistics and Data Science, University of Pennsylvania
Abstract
To infer the treatment effect for a single treated unit using panel data, synthetic
control methods construct a linear combination of control units’ outcomes that mimics
the treated unit’s pre-treatment outcome trajectory. This linear combination is
subsequently used to impute the counterfactual outcomes of the treated unit had it not
been treated in the post-treatment period, and used to estimate the treatment effect.
Existing synthetic control methods rely on correctly modeling certain aspects of the
counterfactual outcome generating mechanism and may require near-perfect matching of
the pre-treatment trajectory. Inspired by proximal causal inference, we obtain two
novel nonparametric identifying formulas for the average treatment effect for the
treated unit: one is based on weighting, and the other combines models for the
counterfactual outcome and the weighting function. We introduce the concept of
covariate shift to synthetic controls to obtain these identification results
conditional on the treatment assignment. We also develop two treatment effect
estimators based on these two formulas and the generalized method of moments. One new
estimator is doubly robust: it is consistent and asymptotically normal if at least one
of the outcome and weighting models is correctly specified. We demonstrate the
performance of the methods via simulations and apply them to evaluate the effectiveness
of a pneumococcal conjugate vaccine on the risk of all-cause pneumonia in Brazil.
Author e-mail addresses: qiuhongx@msu.edu, shixu@umich.edu, mwfy@pku.edu.cn,
dobriban@wharton.upenn.edu, ett@wharton.upenn.edu
arXiv:2210.02014v7 [stat.ME] 6 May 2024
1 Introduction
1.1 Background
Interventions such as policies are often implemented in a single unit such as a state, a
city, or a school. Causal inference in these cases is challenging due to the small number
of treated units, and due to the lack of randomization and independence. In various
fields including economics, public health, and biometry, synthetic control (SC) methods
[Abadie and Gardeazabal, 2003, Abadie et al., 2010, 2015, Doudchenko and Imbens, 2016]
are a common tool to estimate the intervention (or treatment) effect for the treated unit
in time series from a single treated unit and multiple untreated units in both pre- and
post-treatment periods. For example, SC methods have been used to estimate the effects
of terrorist conflicts on GDP [Abadie and Gardeazabal, 2003], tobacco control program
on tobacco consumption [Abadie et al., 2010], Kansas’s tax cut on GDP [Ben-Michael
et al., 2021b, Rickman and Wang, 2018], Florida’s “stand your ground” law on homicide
rates [Bonander et al., 2021], and pneumococcal conjugate vaccines on pneumonia [Bruhn
et al., 2017].
Classical SCs are linear combinations of control units that mimic the treated unit
before the treatment. Outcome differences between the treated unit and the SC in the
post-treatment period are used to make inferences about the treatment effect for the
treated unit. In Abadie and Gardeazabal [2003] and Abadie et al. [2010], an SC is a
weighted average of a pool of control units, called the donors. The weights are obtained
by minimizing a distance between the SC and the treated unit in the pre-treatment period,
under the constraint that the weights are non-negative and sum to unity. Many extensions
have been proposed. For example, Abadie and L’Hour [2021] proposed methods for
multiple treated units, and Ben-Michael et al. [2021a] further considered the case where
these treated units initiate treatment at different time points; Doudchenko and Imbens
[2016] and Ben-Michael et al. [2021a,b] introduced penalization to improve performance;
Athey et al. [2021] and Bai and Ng [2021] used techniques from matrix completion;
Li [2020] studied statistical inference for SC methods; Chernozhukov et al. [2021] and
Cattaneo et al. [2021] considered prediction intervals for treatment effects. Among these
extensions, some also incorporate the idea that, similarly to the control units’ outcomes
in the post-treatment period, the treated unit’s outcomes in the pre-treatment period
can be used to impute the counterfactual outcome had it not been treated [Ben-Michael
et al., 2021b, Arkhangelsky et al., 2021].
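
The classical weighting step described above can be sketched in a few lines. This is a minimal illustration under simplifying assumptions, not the procedure of any cited paper: the distance is taken to be the squared Euclidean distance between pre-treatment outcome trajectories, the function name `classical_sc_weights` is hypothetical, and the toy data (random-walk donors, a constant effect of 2) are invented purely for demonstration.

```python
import numpy as np
from scipy.optimize import minimize


def classical_sc_weights(y_pre, donors_pre):
    """Least-squares SC weights constrained to be non-negative and sum to one.

    y_pre:      shape (T0,), treated unit's pre-treatment outcomes.
    donors_pre: shape (T0, J), donors' pre-treatment outcomes.
    """
    n_donors = donors_pre.shape[1]

    def loss(w):
        # Squared Euclidean distance between treated unit and candidate SC.
        resid = y_pre - donors_pre @ w
        return resid @ resid

    w0 = np.full(n_donors, 1.0 / n_donors)  # start from uniform weights
    result = minimize(
        loss,
        w0,
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n_donors,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
    )
    return result.x


# Hypothetical toy data: three random-walk donors, treated unit is a noisy
# convex combination of them before treatment (array index 0..T0-1 is pre).
rng = np.random.default_rng(0)
T0, T, J = 40, 60, 3
donors = rng.normal(size=(T, J)).cumsum(axis=0)
true_w = np.array([0.6, 0.4, 0.0])
effect = 2.0
y0 = donors @ true_w + rng.normal(scale=0.05, size=T)
y = y0 + effect * (np.arange(T) >= T0)

weights = classical_sc_weights(y[:T0], donors[:T0])
synthetic = donors @ weights
# Average post-treatment gap between the treated unit and its SC.
att_estimate = np.mean(y[T0:] - synthetic[T0:])
```

In this toy setting the linear model holds by construction, so the simplex-constrained fit is appropriate; if a donor were recorded on a different scale, no weights on the simplex could reproduce the treated unit's trajectory.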
Existing methods often rely on assuming linear models and on the existence of near-
perfectly matching weights in the observed data. Under such assumptions, valid SCs are
linear combinations, often weighted averages, of donors. However, if such assumptions do
not hold, these methods may not produce a valid SC. This may happen if the outcomes
in the donors have a different measuring scale from the treated unit, or if the treated
unit’s and the donors’ outcomes have a nonlinear relationship.
To relax these assumptions, Shi et al. [2023] viewed SCs from the proximal causal
inference perspective. For independent and identically distributed (i.i.d.) observations,
Miao et al. [2018], Deaner [2018, 2021], Cui et al. [2020], Tchetgen Tchetgen et al. [2020]
derived nonparametric identification using proxies, variables capturing the effect of the
unmeasured confounders. Shi et al. [2023] viewed control units’ outcomes as proxies and
obtained nonparametric identification results for the potential outcome of the treated
unit had it not been treated as well as the treatment effect in a general setting, beyond
the common linear factor model [e.g., Abadie et al., 2010]. They assumed the existence
of a function of these proxies, termed confounding bridge function, that captures the
(possibly nonlinear) effects of unobserved confounders. With this function, they imputed
the expected counterfactual outcomes for the treated unit. Estimation of, and inference
about, the average treatment effect for the treated unit (ATT) followed from this
identification result. Instrumental variables have also been used. For example,
Holtz-Eakin et al. [1988] considered a linear model with interactive fixed effects and
showed how to identify it using appropriate instruments. The solution to this problem
relies on a particular differencing strategy, which may be viewed as an application of
a confounding bridge function. Cunha et al. [2010], Freyberger [2018] and references
therein considered general nonparametric models with interactive effects, showing how
to identify them using appropriate instruments. While treatment confounding proxies in
proximal causal inference are sometimes described as instruments, it is crucial to note
that they are more general than instrumental variables (IVs), in the sense that valid
IVs are valid treatment confounding proxies, but invalid IVs dependent on hidden
confounders are also valid treatment confounding proxies [Tchetgen Tchetgen et al.,
2020]. In addition, while IVs require a form of homogeneity condition for nonparametric
identification (e.g., separable errors or monotonicity), proxies do not require such a
condition.
1.2 Our contribution
Existing methods rely on correctly specifying an outcome model, based on which one can
impute the counterfactual outcome trajectory of the treated unit, had it not been treated,
after treatment. This outcome bridge function model may be difficult to specify correctly,
or may not exist. In this paper, we relax this requirement by leveraging the proximal
causal inference framework as in Shi et al. [2023]. We develop two novel methods to
estimate the ATT. One method relies on weighting and is a building block to a second
method which we rigorously prove is doubly robust [Bang and Robins, 2005, Scharfstein
et al., 1999]. It is consistent and asymptotically normal if either the outcome model
or the weighting function is correctly specified, without requiring that both are. An
advantage of the doubly robust method compared to existing methods is that it allows
for misspecifying one of the two models, without the user necessarily knowing which might
be misspecified.
We observed that our estimand of interest, the ATT, is closely related to the average
treatment effect on the treated for i.i.d. data [e.g., Hahn, 1998, Imbens, 2004, Chen
et al., 2008, Shu and Tan, 2018]. The method in Shi et al. [2023] corresponds to using
an outcome confounding bridge function [Miao et al., 2018], which is the proximal causal
inference counterpart of G-computation, or an outcome regression-based approach in
causal inference under unconfoundedness [Robins, 1986]. Our proposed methods are
motivated by the existing identification results in proximal causal inference in the i.i.d.
setting [Cui et al., 2020]: one result is based on weighting and the other is based on the
influence function.
Despite these similarities, it remains challenging to adapt these ideas from the i.i.d.
setting to panel data. Since treatment assignment is often viewed as fixed in SC
problems, a key concept from the i.i.d. setting, the propensity score [Rosenbaum and
Rubin, 1983], is undefined. Thus, existing results for the i.i.d. setting cannot be
directly applied to SC problems. We leverage the notion of covariate shift [e.g.,
Quiñonero-Candela et al., 2009] to circumvent this issue. We also find a relaxed
version of the i.i.d. assumption to allow for serial correlation, while still obtaining
identification via weighting. We illustrate our proposed methods in simulations and
three empirical examples: two examples
concern public health outcomes, one studying the effect of the PCV10 vaccine in Brazil
on pneumonia [Bruhn et al., 2017], and the other studying the effect of Florida’s “stand
your ground” law on homicide rates [Bonander et al., 2021]; the third example concerns
economic outcomes, studying the effect of Kansas’s tax cut on GDP [Rickman and Wang,
2018].
Both our doubly robust method and the method in Ben-Michael et al. [2021b] take
the form of augmented weighted moment equations, but the known robustness properties
of these methods differ. In Ben-Michael et al. [2021b], the treated unit’s counterfactual
outcome had it not been treated after treatment can be imputed with two approaches.
One is a weighted average of control units’ outcomes, identical to classical SC methods
[Abadie et al., 2010]; the other is a prediction model obtained using the treated
unit’s outcomes before treatment. By combining these two approaches, Ben-Michael et al.
[2021b] developed an SC method with improved performance. Arkhangelsky et al. [2021]
proposed a method combining two imputation approaches based on similar ideas.
Nevertheless, to date, neither method has formally been shown to be doubly robust. In
contrast, we formally establish double robustness, inherited from the influence
function of the ATT in i.i.d. cases.
2 Problem setup
We observe data over $T$ time periods. The first $T_0$ time periods are the
pre-treatment periods, and the last $T - T_0$ time periods are the post-treatment
periods. For the treated unit, at each time period $t = 1, \ldots, T$, let
$Y_t(0), Y_t(1) \in \mathbb{R}$ be the counterfactual outcome corresponding to no
treatment and treatment, respectively, and
$$Y_t = \mathbf{1}(t \le T_0)\, Y_t(0) + \mathbf{1}(t > T_0)\, Y_t(1)$$
be the observed outcome. At each time period $t$, other variables such as other control
units’ outcomes are observed. We provide more details about these variables below. We
treat $T$, $T_0$ and the treated unit as deterministic, and treat other variables such
as the treated unit’s potential outcomes $Y_t(1)$ and $Y_t(0)$ as random. In other
words, our proposed methods are conditional on the study design. We study the ATT
causal estimand, that is,
$$\phi(t) = E\{Y_t(1) - Y_t(0)\}$$
in a post-treatment period $t > T_0$. We treat times, namely $T$ and $T_0$, and all
units as deterministic, and treat the potential outcomes as stochastic [Greenland,
1987, Robins and Greenland, 1989, 2000, VanderWeele and Robins, 2012]; that is,
$Y_t(0)$ and $Y_t(1)$ are both stochastic processes indexed by $t$ that are randomly
generated over time, rather than fixed unknown scalar sequences. In the frequentist
interpretation, under repeated sampling, the times and units are all fixed and hence
identical for all samples, but the outcomes are randomly generated from a fixed unknown
joint distribution and hence may differ across samples. Stochastic counterfactuals are
commonly assumed in the SC literature, implicitly in the random noise or residuals of a
linear latent factor model or an autoregressive model [e.g., Abadie et al., 2010,
Abadie and L’Hour, 2021, Ben-Michael et al., 2021b,a, Athey et al., 2021]. This notion
of stochastic counterfactuals $Y_t(0)$ and $Y_t(1)$ as time series is required in our
paper because the expectation in $\phi(t)$ is taken over the joint distribution of
$(Y_t(0), Y_t(1))$.
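
To make the setup concrete, the sketch below simulates stochastic counterfactuals with the times $T$ and $T_0$ held fixed, forms the observed outcome from them, and approximates the estimand by averaging $Y_t(1) - Y_t(0)$ over repeated samples. Everything about the data-generating process is an assumption made for illustration (an AR(1) latent factor, additive noise, a constant effect of 1.5), and the function names are hypothetical; none of it comes from the paper.

```python
import numpy as np


def simulate_potential_outcomes(rng, T, effect=1.5):
    """Draw one sample path of (Y_t(0), Y_t(1)) for t = 1, ..., T.

    Hypothetical process: a latent AR(1) factor U_t drives Y_t(0), and
    Y_t(1) equals Y_t(0) plus a constant effect plus independent noise.
    """
    u = np.zeros(T)
    for t in range(1, T):
        u[t] = 0.5 * u[t - 1] + rng.normal()
    y0 = u + rng.normal(scale=0.5, size=T)
    y1 = y0 + effect + rng.normal(scale=0.5, size=T)
    return y0, y1


def observed_outcome(y0, y1, T0):
    """Y_t = 1(t <= T0) Y_t(0) + 1(t > T0) Y_t(1), with t = 1, ..., T."""
    t = np.arange(1, len(y0) + 1)
    return np.where(t <= T0, y0, y1)


T, T0 = 30, 20  # fixed design: same times in every repeated sample
rng = np.random.default_rng(1)
n_rep = 2000

diffs = np.empty((n_rep, T))
for r in range(n_rep):
    y0, y1 = simulate_potential_outcomes(rng, T)
    diffs[r] = y1 - y0

# Observed series for the last simulated sample path.
y_obs = observed_outcome(y0, y1, T0)

# Monte Carlo approximation of phi(t) = E{Y_t(1) - Y_t(0)} for t > T0.
phi_post = diffs.mean(axis=0)[T0:]
```

In real data only `y_obs` is available for the treated unit, so $Y_t(0)$ in post-treatment periods must be imputed; the simulation has access to both counterfactuals only because it generates them.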
In the main text, we focus on the case without covariates, and discuss using covariates
in Web Appendix S5. We assume that all unmeasured confounding is captured by a
latent factor $U_t$, with assumptions stated in later sections. We use $t_-$ and $t_+$
to denote generic times before and after treatment, respectively; that is,
$t_- \le T_0$ and $t_+ > T_0$.
When stating asymptotic results, we consider the asymptotic regime where $T \to \infty$
with $T_0/T \to \gamma \in (0,1)$. This asymptotic regime may be interpreted as the number of
observations in both pre- and post-treatment periods growing to infinity, collecting more
data before and after the treatment time. Beyond this asymptotic regime, finite-sample
results concerning the error in pre-treatment fitting or treatment effect estimation have
been established in previous works [e.g., Abadie and L’Hour, 2021, Athey et al., 2021,
Ben-Michael et al., 2021a,b].
3 Review of identification via outcome modeling
In classical SC methods, a weighted average of a pool of control units called donors forms
the SC [Abadie et al., 2010]. The motivation for using control units’ outcomes to learn
about $Y_{t_+}(0)$, despite the presence of potential unmeasured confounder $U_t$, is that
these control units may be affected by, and thus contain information about, $U_t$. This
characteristic resembles that of proxies in proximal causal inference. In an i.i.d.
setting, proxies