A NOTE ON COHEN’S D FROM A PARTITIONED LINEAR REGRESSION MODEL
JÜRGEN GROSS
Institute for Mathematics and Applied Informatics, University of Hildesheim, Germany
ANNETTE MÖLLER
Faculty of Business Administration and Economics, Bielefeld University, Germany
Abstract. In this note we introduce a generalized formula for Cohen’s $d$ under the presence of
additional independent variables, providing a measure for the size of a possible effect concerning
the location difference of a variable in two groups. This is done by employing the so-called
Frisch-Waugh-Lovell theorem in a partitioned linear regression model. The generalization is
motivated by demonstrating the relationship to appropriate $t$ and $F$ statistics. Our discussion
is further illustrated by inference from a publicly available data set.
1. Introduction
When applying statistical testing of hypotheses to data it is often recommended not only to
report the corresponding p-value, but in addition to provide a measure for the effect associated
with a possible rejection of the null hypothesis, see e.g. Wilkinson (1999). Such a measure
may be useful when sample sizes are to be fixed during the planning phase of a study, or when
it is desired to assess the relevance of an actual rejection when given sample sizes are large.
Effect size measures are strongly related to power analysis as carried out in the seminal book
by Cohen (1988).
E-mail addresses: juergen.gross@uni-hildesheim.de, annette.moeller@uni-bielefeld.de.
2010 Mathematics Subject Classification. 62J20, 62F03, 91C99.
Key words and phrases. Hypothesis testing, effect size, Cohen’s d, partitioned linear regression, Frisch-Waugh-Lovell theorem, multivariate normal distribution.
Support of the second author by the Helmholtz Association’s pilot project “Uncertainty Quantification” is gratefully acknowledged.
arXiv:2210.13048v1 [stat.AP] 24 Oct 2022
A widely used measure is the so-called Cohen’s $d$, see also Hedges (1981); Kraemer (1983),
which is an effect size measure for the two-sample $t$ test with equal variances. Consider
independent samples of sizes $n_1$ and $n_2$ of a statistical variable $y$ in two groups such that $y$ follows
a normal distribution with expectation $\mu_1$ and variance $\sigma^2$ in group 1 and expectation $\mu_2$ and
the same variance $\sigma^2$ in group 2. Let $t$ denote the usual two-sample test statistic for the null
hypothesis $H_0: \mu_1 = \mu_2$ versus the alternative $H_1: \mu_1 \neq \mu_2$. As a measure for the size of an
effect, Cohen (1988, p. 66ff) considers the absolute value of
$$d = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\dfrac{s_1^2 + s_2^2}{n_1 + n_2 - 2}}}, \tag{1}$$
where $\bar{y}_j$ is the sample mean in group $j$, $j = 1, 2$, and $s_j^2 = \sum_i (y_i - \bar{y}_j)^2$, where summation is
carried out with respect to all observations from group $j$. The effect size $d$ is related to the test
statistic $t$ by the formula
$$d = t \sqrt{\frac{n_1 + n_2}{n_1 n_2}}, \tag{2}$$
see (2.5.3) in Cohen (1988). According to Cohen, values $|d| = 0.2$, $|d| = 0.5$ and $|d| = 0.8$
indicate a small, medium and large effect, respectively.
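As a quick numerical check of formulas (1) and (2), the following minimal sketch computes Cohen's d from two samples and compares it with the value obtained from the equal-variance two-sample t statistic. The function names `cohens_d` and `two_sample_t` and the toy data are illustrative assumptions, not part of the original note.

```python
import math
import numpy as np

def cohens_d(y1, y2):
    """Cohen's d as in (1): mean difference divided by the pooled standard deviation."""
    y1, y2 = np.asarray(y1, dtype=float), np.asarray(y2, dtype=float)
    n1, n2 = y1.size, y2.size
    ss1 = np.sum((y1 - y1.mean()) ** 2)  # s_1^2: sum of squared deviations in group 1
    ss2 = np.sum((y2 - y2.mean()) ** 2)  # s_2^2: sum of squared deviations in group 2
    return (y1.mean() - y2.mean()) / math.sqrt((ss1 + ss2) / (n1 + n2 - 2))

def two_sample_t(y1, y2):
    """Two-sample t statistic under the equal-variance assumption."""
    y1, y2 = np.asarray(y1, dtype=float), np.asarray(y2, dtype=float)
    n1, n2 = y1.size, y2.size
    ss1 = np.sum((y1 - y1.mean()) ** 2)
    ss2 = np.sum((y2 - y2.mean()) ** 2)
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
    return (y1.mean() - y2.mean()) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

# Hypothetical toy data, chosen only for illustration
group1 = [5.1, 4.9, 6.2, 5.8, 5.5]
group2 = [4.2, 4.8, 5.0, 4.1, 4.6, 4.4]
n1, n2 = len(group1), len(group2)

d = cohens_d(group1, group2)
t = two_sample_t(group1, group2)
d_from_t = t * math.sqrt((n1 + n2) / (n1 * n2))  # formula (2)

print("d from (1):", d)
print("d from (2):", d_from_t)
```

Both printed values should agree up to floating-point rounding, since (2) is an algebraic rearrangement of (1).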
It may also be of interest to have a corresponding measure when the variable $y$ depends on
further independent variables. In his Chapter 9, Cohen (1988) deals with such a multiple
regression situation and discusses the effect size measure $f^2$ at length, as will further be explicated
in our Section 4.
However, a measure analogous to $d$ is rarely found; see Wilson (2016, Sect. 3.14) and Lipsey
and Wilson (2001) for such a proposal. Nonetheless, it may be of particular interest to have
comparable measures of an effect size for the very same grouped variable $y$ when it additionally
depends on different sets of independent variables. This is illustrated in our
Section 5. In the following we introduce such a measure as a generalization of $d$ by considering
a linear regression model
$$y = \beta_0 + \beta_1 z + \beta_2 x_1 + \cdots + \beta_{w+1} x_w + \varepsilon, \tag{3}$$
where $z$ takes the value $z_i = 0$ if the corresponding observation $y_i$ of the dependent variable
$y$ belongs to group 1 and $z_i = 1$ if $y_i$ belongs to group 2, $i = 1, \ldots, n_1 + n_2$. It is assumed
that there are $w$ independent variables $x_1, \ldots, x_w$. The error variable $\varepsilon$ is assumed to follow a
normal distribution with expectation 0 and variance $\sigma^2$.
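As will be shown in the following Sections 2, 3, and 4, a natural generalization of Cohen’s $d$
is given by
$$d = \frac{\bar{\tilde{y}}_1 - \bar{\tilde{y}}_2}{\sqrt{\dfrac{\tilde{s}_1^2 + \tilde{s}_2^2}{n_1 + n_2 - 2 - w}}}, \qquad \tilde{s}_j^2 = \sum_i (\tilde{y}_i - \bar{\tilde{y}}_j)^2, \quad j = 1, 2, \tag{4}$$
where $\tilde{y} = y - \hat{\beta}_2 x_1 - \cdots - \hat{\beta}_{w+1} x_w$ is the dependent variable adjusted for the independent
variables and $\bar{\tilde{y}}_j$ is its sample mean in group $j$. The $\hat{\beta}_k$ are the ordinary least squares estimates of the regression coefficients $\beta_k$,
$k = 2, \ldots, w + 1$, in model (3). In case $w = 0$, the adjusted $\tilde{y}$ coincides with the original $y$, so
that (4) reduces to (1) and therefore can be seen as a natural generalization of Cohen’s $d$.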
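The generalized measure (4) can be sketched numerically as follows: fit model (3) by ordinary least squares, subtract the fitted contribution of the additional independent variables from y, and apply the two-group formula to the adjusted values with n1 + n2 - 2 - w in the denominator. The function name `generalized_cohens_d`, the simulated data, and the chosen coefficient values are assumptions made for illustration only.

```python
import numpy as np

def generalized_cohens_d(y, z, X2=None):
    """Generalized d as in (4).

    y  : observations of the dependent variable
    z  : group indicator (0 for group 1, 1 for group 2)
    X2 : optional (n, w) array of additional independent variables
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    n = y.size
    X2 = np.empty((n, 0)) if X2 is None else np.asarray(X2, dtype=float).reshape(n, -1)
    w = X2.shape[1]

    # OLS fit of model (3); columns: intercept, z, x_1, ..., x_w
    X = np.column_stack([np.ones(n), z, X2])
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Dependent variable adjusted for the additional independent variables
    y_adj = y - X2 @ beta_hat[2:]

    g1, g2 = y_adj[z == 0], y_adj[z == 1]
    ss = np.sum((g1 - g1.mean()) ** 2) + np.sum((g2 - g2.mean()) ** 2)
    return (g1.mean() - g2.mean()) / np.sqrt(ss / (n - 2 - w))

# Simulated data (hypothetical coefficients, for illustration only)
rng = np.random.default_rng(0)
n1, n2 = 30, 40
n = n1 + n2
z = np.concatenate([np.zeros(n1), np.ones(n2)])
x = rng.normal(size=(n, 2))  # w = 2 additional independent variables
y = 1.0 + 0.5 * z + x @ np.array([2.0, -1.0]) + rng.normal(size=n)

d_plain = generalized_cohens_d(y, z)     # w = 0: ordinary Cohen's d from (1)
d_adj = generalized_cohens_d(y, z, x)    # w = 2: adjusted measure from (4)
print("d without covariates:", d_plain)
print("d adjusted for covariates:", d_adj)
```

Calling the function without `X2` corresponds to the case w = 0, in which the adjusted variable equals y and the result is the ordinary Cohen's d from (1).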
2. Partitioned Linear Regression
Let $n = n_1 + n_2$ be the total sample size. The above model (3) may also be written in
vector-matrix notation as
$$y = X_1 \delta_1 + X_2 \delta_2 + \varepsilon, \tag{5}$$
where now $y$ represents the $n \times 1$ vector of observations of the dependent variable. Without
loss of generality it is assumed that the first $n_1$ observations belong to group 1, while the last
$n_2$ observations belong to group 2. By introducing the notation $1_m$ for an $m \times 1$ vector of
ones, the $n \times 2$ matrix $X_1$ and the corresponding $2 \times 1$ parameter vector $\delta_1$ may be written as
$$X_1 = \begin{pmatrix} 1_{n_1} & 0 \\ 1_{n_2} & 1_{n_2} \end{pmatrix} \quad \text{and} \quad \delta_1 = \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}. \tag{6}$$
The $n \times w$ matrix $X_2$ contains the observations of the independent variables with corresponding
regression coefficients $\delta_2^T = (\beta_2, \ldots, \beta_{w+1})$, where the superscript $T$ denotes transposition. The
$n \times 1$ random vector $\varepsilon$ is assumed to follow a multivariate normal distribution with expectation
vector 0 and variance-covariance matrix $\sigma^2 I_n$, where $I_n$ stands for the $n \times n$ identity matrix. It
is assumed that the $n \times (2 + w)$ model matrix $(X_1, X_2)$ has full column rank $2 + w$. Equation
(5) represents a partitioned linear regression model as considered e.g. in Fiebig et al. (1996).
Generalizations and further properties are investigated by Puntanen (1996); Groß and Puntanen
(2000, 2005); Ding (2021), among others.
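To make the partitioned form (5) and the block structure (6) concrete, the following sketch builds X1 for small group sizes, appends a matrix X2 of simulated independent variables, checks the full column rank assumption, and splits the least squares solution into the parts belonging to delta_1 and delta_2. The helper name `build_X1`, the group sizes, and the parameter values are hypothetical and serve only as an illustration.

```python
import numpy as np

def build_X1(n1, n2):
    """n x 2 matrix X1 as in (6): a column of ones and the group-2 indicator."""
    ones = np.ones(n1 + n2)
    indicator = np.concatenate([np.zeros(n1), np.ones(n2)])
    return np.column_stack([ones, indicator])

rng = np.random.default_rng(1)
n1, n2, w = 4, 3, 2
n = n1 + n2

X1 = build_X1(n1, n2)
X2 = rng.normal(size=(n, w))  # hypothetical observations of w independent variables
X = np.hstack([X1, X2])       # n x (2 + w) model matrix (X1, X2)

# Full column rank assumption from the text
rank = np.linalg.matrix_rank(X)
print(X1)
print("full column rank:", rank == 2 + w)

# Simulate y from model (5) with assumed parameter values, then estimate by least squares
delta1 = np.array([1.0, 0.5])   # (beta_0, beta_1)
delta2 = np.array([2.0, -1.0])  # (beta_2, ..., beta_{w+1})
y = X1 @ delta1 + X2 @ delta2 + rng.normal(scale=0.1, size=n)

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
delta1_hat, delta2_hat = coef[:2], coef[2:]  # estimates for the two blocks
print("delta1_hat:", delta1_hat)
print("delta2_hat:", delta2_hat)
```

Ordering the observations so that group 1 comes first is only a convenience; it matches the convention stated before (6).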