Estimating heterogeneous treatment effects
versus building individualized treatment rules:
Connection and disconnection
Zhongyuan Chen and Jun Xie
Department of Statistics, Purdue University
150 N. University Street, West Lafayette, IN 47907
junxie@purdue.edu
October 5, 2022
Abstract
Estimating heterogeneous treatment effects is a well-studied topic in the statistics literature.
More recently, it has regained attention due to an increasing need for precision
medicine as well as the increased use of state-of-the-art machine learning methods in the estimation.
Furthermore, estimating heterogeneous treatment effects is directly related to building
an individualized treatment rule, which is a decision rule of treatment according to patient
characteristics. This paper examines the connection and disconnection between these two
research problems. Notably, a better estimation of the heterogeneous treatment effects may
or may not lead to a better individualized treatment rule. We provide theoretical frame-
works to explain the connection and disconnection and demonstrate two different scenarios
through simulations. Our conclusion offers practical guidance: under certain circumstances,
there is no need to improve estimation of the treatment effects, because doing so does not
alter the treatment decision.
Keywords: Heterogeneous treatment effects, Individualized treatment rules, Mean squared error,
classification error
arXiv:2210.01342v1 [stat.ME] 4 Oct 2022
1 Introduction
Estimation of heterogeneous treatment effects (HTE) is a research problem commonly raised in
many fields, including politics, economics, education, and healthcare. Instead of an overall treat-
ment effect, heterogeneity exists for subgroups or individuals within a population. We represent
HTE by the conditional average treatment effect (CATE) function given a set of covariates. There
is a rich literature on estimating the CATE, mostly through regression methods, as well as more
recently developed machine learning algorithms [Kunzel 2019]. In theory, the accuracy of a CATE
estimator is measured by the expected mean squared error (EMSE), which is based on the quadratic
loss function. The convergence rate of the EMSE can be used to compare different CATE estima-
tors.
Estimating heterogeneous treatment effects is critical for making medical decisions, such as what
treatment to recommend. Building personalized treatment, also referred to as an individualized
treatment rule (ITR) [Qian and Murphy 2011], is usually done through estimating the CATE first
and then defining the optimal ITR as the sign function of the CATE (for two treatment options
denoted as −1 and 1). [Chen et al. 2022] provided a good introduction to building ITRs and
called the aforementioned method an indirect approach. Although related, estimating the CATE
and building optimal ITRs are separately studied in the literature. In this paper, we elaborate upon
the connection and disconnection between the two research areas with mathematical frameworks.
In Section 2, we review recent developments on the estimation of the CATE, including the use of
a machine learning method named X-learner [Kunzel 2019], which can improve estimation accuracy
with a faster rate of convergence than other conventional estimators. In Section 3, we connect
estimating the CATE to building an ITR and show that the optimal ITR is indeed the sign function
of the true CATE function. With this relationship, it is natural to expect that better estimators,
in terms of smaller EMSE, lead to improved ITRs. This, however, does not always happen due
to the mismatch between the loss functions of these two research problems. More specifically, we
use the quadratic loss function for estimation of the CATE but the 0-1 loss function for comparing
ITRs. In Section 4, we examine the disconnection through a mathematical framework. That is,
in many cases, improving the CATE estimator does not change the corresponding ITR. Our
conclusion offers practical guidance: in certain situations, there is no need to improve
estimation of the CATE, because doing so does not alter the treatment decision. Section 5 provides simulation
examples to display the connection and disconnection.
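The mismatch between the quadratic loss and the 0-1 loss can be seen in a small toy calculation. The sketch below is illustrative only and is not taken from the paper: the true CATE `tau`, the two estimates `tau_hat_a` and `tau_hat_b`, and the evaluation grid are all hypothetical choices. One estimate has a large squared error but always the correct sign; the other is close in squared error but flips the sign near zero.

```python
# Toy illustration (assumed setup, not the authors' simulation).

def sign_rule(effect):
    """ITR induced by an effect estimate: treat (1) if positive, else control (-1)."""
    return 1 if effect > 0 else -1

def tau(x):
    """Assumed true CATE."""
    return x

def tau_hat_a(x):
    """Badly scaled estimate: large squared error, but the sign always matches tau."""
    return 3.0 * x

def tau_hat_b(x):
    """Slightly shifted estimate: small squared error, wrong sign for small positive x."""
    return x - 0.2

# Evaluation grid of covariate values on [-1, 1].
xs = [i / 100 for i in range(-100, 101)]

def mean_squared_error(estimate):
    """Average quadratic loss of an estimate against the true CATE on the grid."""
    return sum((estimate(x) - tau(x)) ** 2 for x in xs) / len(xs)

def rule_error(estimate):
    """Average 0-1 loss: fraction of grid points where the induced ITR is wrong."""
    return sum(sign_rule(estimate(x)) != sign_rule(tau(x)) for x in xs) / len(xs)

mse_a, mse_b = mean_squared_error(tau_hat_a), mean_squared_error(tau_hat_b)
err_a, err_b = rule_error(tau_hat_a), rule_error(tau_hat_b)

print("estimate A: MSE =", mse_a, " ITR error =", err_a)
print("estimate B: MSE =", mse_b, " ITR error =", err_b)
```

In this toy setting the estimate with the smaller squared error yields the worse treatment rule, which is the kind of disconnection the paper examines.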
2 Recent development on estimation of HTE
Suppose we have data from a two-arm clinical trial with $(X, A, Y)$, where $Y \in \mathbb{R}$ denotes a treatment
response variable (the larger value the better), $X \in \mathcal{X} \subset \mathbb{R}^p$ is a set of covariates, and
$A \in \mathcal{A} = \{-1, 1\}$ denotes the treatment index corresponding to the control or treatment arm. We
assume $(X, A, Y) \sim \mathcal{P}$, where $\mathcal{P}$ is the distribution from a specific family. Denote the conditional
mean $E(Y \mid X, A)$. Define
$$\mu_0(x) = E(Y \mid X = x, A = -1) \quad \text{and} \quad \mu_1(x) = E(Y \mid X = x, A = 1).$$
Then, the CATE function is $\tau(x) = \mu_1(x) - \mu_0(x)$. Let $\hat{\tau}(x)$ be an estimator of the CATE from
a set of independent random data from $\mathcal{P}$. We are interested in estimators with a small expected
mean squared error (EMSE):
$$\mathrm{EMSE}(\mathcal{P}, \hat{\tau}) = E[(\hat{\tau}(X) - \tau(X))^2],$$
where the expectation is taken over $\hat{\tau}$ and $X$, which are assumed independent of each other; e.g.,
$\hat{\tau}$ is estimated from a training data set and $X$ denotes a new data point.
Various methods are available to estimate the CATE. The most commonly used one is to fit regression
models for $\hat{\mu}_1(x)$ and $\hat{\mu}_0(x)$, and then $\hat{\tau}(x) = \hat{\mu}_1(x) - \hat{\mu}_0(x)$. As an addition to the rich literature,
[Chen et al. 2022] considered very general regression models and applied dimension reduction
for high-dimensional covariates. Moreover, supervised learning algorithms are used to estimate $\hat{\mu}_1$
and $\hat{\mu}_0$ by the machine learning community [Hu et al. 2021], including Bayesian Additive Regression
Trees (BART) [Chipman et al. 2010] and Random Forest (RF) [Wager and Athey 2018]. One
of the most recent developments is an algorithm called X-learner [Kunzel 2019], which specifically
exploits structural properties of the CATE function for an improved estimator.
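As a minimal sketch of the most common approach described above (fit one regression per arm, then take the difference), the following simulates a randomized two-arm trial and estimates the CATE with two simple linear fits. Everything here is an assumption made for illustration: the single covariate, the linear forms of `mu0` and `mu1`, the noise level, the sample sizes, and the helper `fit_line`. It is not the authors' simulation design, and BART, Random Forest, or the X-learner would replace the linear fits in practice.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

random.seed(0)

# Assumed true conditional means; the implied CATE is tau(x) = 0.5 + x.
def mu0(x):
    return 1.0 + 0.5 * x

def mu1(x):
    return 1.5 + 1.5 * x

def tau(x):
    return mu1(x) - mu0(x)

# Simulate (X, A, Y) with A in {-1, 1} assigned at random.
data = []
for _ in range(2000):
    x = random.uniform(-1.0, 1.0)
    a = random.choice([-1, 1])
    y = (mu1(x) if a == 1 else mu0(x)) + random.gauss(0.0, 0.5)
    data.append((x, a, y))

treated = [(x, y) for x, a, y in data if a == 1]
control = [(x, y) for x, a, y in data if a == -1]

# Fit a separate regression in each arm.
a1, b1 = fit_line([x for x, _ in treated], [y for _, y in treated])
a0, b0 = fit_line([x for x, _ in control], [y for _, y in control])

def tau_hat(x):
    """Estimated CATE: difference of the two fitted regression functions."""
    return (a1 + b1 * x) - (a0 + b0 * x)

# Squared error on fresh covariate values, independent of the training data.
test_xs = [random.uniform(-1.0, 1.0) for _ in range(1000)]
mse = sum((tau_hat(x) - tau(x)) ** 2 for x in test_xs) / len(test_xs)
print("test-set mean squared error of tau_hat:", mse)
```

Averaging this test-set error over repeated training samples would approximate the EMSE for this particular estimator and data-generating process.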
Following the results of [Kunzel 2019], we use the convergence rates of the EMSE to compare
different CATE estimators. Suppose we observe independent and identically distributed data
$(X_i, A_i, Y_i) \sim \mathcal{P}$, $i = 1, \ldots, N$, with $m$ control units and $n$ treated units, $N = m + n$. Let $n$ denote
the smaller sample size of the two treatment arms and assume $m$ and $n$ have a similar scale. Most
of the estimation methods have a convergence rate that depends on the estimators of $\hat{\mu}_1$ and $\hat{\mu}_0$.