

A NOTE ON COHEN'S D FROM A PARTITIONED LINEAR REGRESSION MODEL

JÜRGEN GROSS

Institute for Mathematics and Applied Informatics, University of Hildesheim, Germany

ANNETTE MÖLLER

Faculty of Business Administration and Economics, Bielefeld University, Germany

Abstract. In this note we introduce a generalized formula for Cohen's $d$ under the presence of additional independent variables, providing a measure for the size of a possible effect concerning the location difference of a variable in two groups. This is done by employing the so-called Frisch-Waugh-Lovell theorem in a partitioned linear regression model. The generalization is motivated by demonstrating the relationship to appropriate $t$ and $F$ statistics. Our discussion is further illustrated by inference from a publicly available data set.

1. Introduction

When applying statistical testing of hypotheses to data it is often recommended not only to report the corresponding p-value, but in addition to provide a measure for the effect associated with a possible rejection of the null hypothesis, see e.g. Wilkinson (1999). Such a measure may be useful when sample sizes are to be fixed during the planning phase of a study, or when it is desired to assess the relevance of an actual rejection when given sample sizes are large. Effect size measures are strongly related to power analysis as carried out in the seminal book by Cohen (1988).

A widely used measure is the so-called Cohen's $d$, see also Hedges (1981) and Kraemer (1983), which is an effect size measure for the two-sample $t$ test with equal variances. Consider independent samples of sizes $n_1$ and $n_2$ of a statistical variable $y$ in two groups such that $y$ follows a normal distribution with expectation $\mu_1$ and variance $\sigma^2$ in group 1 and expectation $\mu_2$ and the same variance $\sigma^2$ in group 2. Let $t$ denote the usual two-sample test statistic for the null hypothesis $H_0: \mu_1 = \mu_2$ versus the alternative $H_1: \mu_1 \neq \mu_2$. As a measure for the size of an effect, Cohen (1988, p. 66ff.) considers the absolute value of

$$ d = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\dfrac{s_1^2 + s_2^2}{n_1 + n_2 - 2}}} \tag{1} $$

where $\bar{y}_j$ is the sample mean in group $j$, $j = 1, 2$, and $s_j^2 = \sum_i (y_i - \bar{y}_j)^2$, where summation is carried out with respect to all observations from group $j$. The effect size $d$ is related to the test statistic $t$ by the formula

$$ d = t \sqrt{\frac{n_1 + n_2}{n_1 n_2}} \tag{2} $$

see (2.5.3) in Cohen (1988). According to Cohen, values $|d| = 0.2$, $|d| = 0.5$ and $|d| = 0.8$ indicate a small, medium and large effect, respectively.

E-mail addresses: juergen.gross@uni-hildesheim.de, annette.moeller@uni-bielefeld.de.

2010 Mathematics Subject Classification. 62J20, 62F03, 91C99.

Key words and phrases. Hypothesis testing, effect size, Cohen's d, partitioned linear regression, Frisch-Waugh-Lovell theorem, multivariate normal distribution.

Support of the second author by the Helmholtz Association's pilot project "Uncertainty Quantification" is gratefully acknowledged.

arXiv:2210.13048v1 [stat.AP] 24 Oct 2022
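The relation between formulas (1) and (2) can be checked numerically. The following is a minimal sketch, not code from the paper: the function names and the small data set are made up for illustration, and the $t$ statistic is the usual equal-variance two-sample statistic computed from the pooled standard deviation.

```python
import math

def _mean(values):
    return sum(values) / len(values)

def _pooled_sd(y1, y2):
    """sqrt((s1^2 + s2^2) / (n1 + n2 - 2)), where s_j^2 is the
    within-group sum of squared deviations, as in formula (1)."""
    m1, m2 = _mean(y1), _mean(y2)
    ss = sum((v - m1) ** 2 for v in y1) + sum((v - m2) ** 2 for v in y2)
    return math.sqrt(ss / (len(y1) + len(y2) - 2))

def cohens_d(y1, y2):
    """Cohen's d computed directly from formula (1)."""
    return (_mean(y1) - _mean(y2)) / _pooled_sd(y1, y2)

def two_sample_t(y1, y2):
    """Equal-variance two-sample t statistic."""
    se = _pooled_sd(y1, y2) * math.sqrt(1 / len(y1) + 1 / len(y2))
    return (_mean(y1) - _mean(y2)) / se

def d_from_t(t, n1, n2):
    """Cohen's d recovered from t via formula (2)."""
    return t * math.sqrt((n1 + n2) / (n1 * n2))

# Hypothetical data, chosen only for illustration.
group1 = [4.1, 5.0, 6.2, 5.5]
group2 = [6.8, 7.4, 6.1, 8.0, 7.7]

d_direct = cohens_d(group1, group2)
d_via_t = d_from_t(two_sample_t(group1, group2), len(group1), len(group2))
print(d_direct, d_via_t)  # the two values agree up to rounding error
```

The sign of $d$ follows the order of the groups; Cohen's thresholds refer to $|d|$.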

It may also be of interest to have a corresponding measure when the variable $y$ depends on further independent variables. In his Chapter 9, Cohen (1988) deals with such a multiple regression situation and discusses the effect size measure $f^2$ at length, as will further be explicated in our Section 4.

However, a measure analogous to $d$ is rarely found, see Wilson (2016, Sect. 3.14) and Lipsey and Wilson (2001) for such a proposal. Nonetheless, it may be of particular interest to have comparable measures of an effect size for the very same grouped variable $y$ when it additionally depends on different sets of independent variables. An example is given in our Section 5. In the following we introduce such a measure as a generalization of $d$ by considering a linear regression model

$$ y = \beta_0 + \beta_1 z + \beta_2 x_1 + \cdots + \beta_{w+1} x_w + \varepsilon, \tag{3} $$

where $z$ takes the value $z_i = 0$ if the corresponding observation $y_i$ of the dependent variable $y$ belongs to group 1 and $z_i = 1$ if $y_i$ belongs to group 2, $i = 1, \ldots, n_1 + n_2$. It is assumed that there are $w$ independent variables $x_1, \ldots, x_w$. The error variable $\varepsilon$ is assumed to follow a normal distribution with expectation 0 and variance $\sigma^2$.

As will be shown in the following Sections 2, 3, and 4, a natural generalization of Cohen's $d$ is given by

$$ d_* = \frac{\bar{y}_{*1} - \bar{y}_{*2}}{\sqrt{\dfrac{s_{*1}^2 + s_{*2}^2}{n_1 + n_2 - 2 - w}}}, \qquad s_{*j}^2 = \sum (y_{*i} - \bar{y}_{*j})^2, \quad j = 1, 2, \tag{4} $$

where $y_* = y - \hat{\beta}_2 x_1 - \cdots - \hat{\beta}_{w+1} x_w$ is the dependent variable adjusted for the independent variables. The $\hat{\beta}_k$ are the ordinary least squares estimates of the regression coefficients $\beta_k$, $k = 2, \ldots, w + 1$ in model (3). In case $w = 0$, the adjusted $y_*$ coincides with the original $y$, so that (4) reduces to (1) and therefore can be seen as a natural generalization of Cohen's $d$.
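Formula (4) can be evaluated in three steps: fit model (3) by ordinary least squares, subtract the fitted covariate contribution from $y$, and apply the two-group formula with $n_1 + n_2 - 2 - w$ in the denominator. The sketch below is an illustration under stated assumptions, not the authors' code: the helper names `solve`, `ols` and `generalized_d` and the data are hypothetical, and the normal equations are solved by a plain Gauss-Jordan routine only to keep the example free of external libraries (a numerical library would normally be used instead).

```python
import math

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, row)) + [float(b[i])] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # pivot row
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * q for a, q in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(X, y):
    """OLS coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

def generalized_d(y, z, X2):
    """d* as in formula (4).
    y: responses; z: 0/1 group indicator (0 = group 1, 1 = group 2);
    X2: one row of w covariate values per observation (rows may be empty)."""
    w = len(X2[0])
    X = [[1.0, float(zi)] + [float(v) for v in xi] for zi, xi in zip(z, X2)]
    beta = ols(X, y)  # (b0, b1, b2, ..., b_{w+1})
    # Adjusted response y* = y - b2*x1 - ... - b_{w+1}*x_w
    y_adj = [yi - sum(b * v for b, v in zip(beta[2:], xi))
             for yi, xi in zip(y, X2)]
    g1 = [v for v, zi in zip(y_adj, z) if zi == 0]
    g2 = [v for v, zi in zip(y_adj, z) if zi == 1]
    m1, m2 = sum(g1) / len(g1), sum(g2) / len(g2)
    ss = sum((v - m1) ** 2 for v in g1) + sum((v - m2) ** 2 for v in g2)
    return (m1 - m2) / math.sqrt(ss / (len(y) - 2 - w))

# Hypothetical data: two groups of four observations and one covariate (w = 1).
y = [1.0, 2.1, 2.9, 4.2, 3.8, 5.1, 6.2, 6.9]
z = [0, 0, 0, 0, 1, 1, 1, 1]
X2 = [[0.5], [1.0], [1.5], [2.2], [1.1], [1.9], [2.4], [3.0]]

print(generalized_d(y, z, X2))                 # adjusted for the covariate
print(generalized_d(y, z, [[] for _ in y]))    # w = 0: ordinary Cohen's d
```

Passing empty covariate rows corresponds to $w = 0$, in which case the function returns the value of formula (1).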

2. Partitioned Linear Regression

Let $n = n_1 + n_2$ be the total sample size. The above model (3) may also be written in vector-matrix notation as

$$ y = X_1 \delta_1 + X_2 \delta_2 + \varepsilon, \tag{5} $$

where now $y$ represents the $n \times 1$ vector of observations of the dependent variable. Without loss of generality it is assumed that the first $n_1$ observations belong to group 1, while the last $n_2$ observations belong to group 2. By introducing the notation $1_m$ for an $m \times 1$ vector of ones, the $n \times 2$ matrix $X_1$ and the corresponding $2 \times 1$ parameter vector $\delta_1$ may be written as

$$ X_1 = \begin{pmatrix} 1_{n_1} & 0 \\ 1_{n_2} & 1_{n_2} \end{pmatrix} \quad \text{and} \quad \delta_1 = \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}. \tag{6} $$

The $n \times w$ matrix $X_2$ contains the observations of the independent variables with corresponding regression coefficients $\delta_2^T = (\beta_2, \ldots, \beta_{w+1})$, where the superscript $T$ denotes transposition. The $n \times 1$ random vector $\varepsilon$ is assumed to follow a multivariate normal distribution with expectation vector $0$ and variance-covariance matrix $\sigma^2 I_n$, where $I_n$ stands for the $n \times n$ identity matrix. It is assumed that the $n \times (2 + w)$ model matrix $(X_1, X_2)$ has full column rank $2 + w$. Equation (5) represents a partitioned linear regression model as considered e.g. in Fiebig et al. (1996). Generalizations and further properties are investigated by Puntanen (1996), Groß and Puntanen (2000, 2005), and Ding (2021), among others.
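To make the coding in (6) concrete, the sketch below builds $X_1$ for given group sizes and fits the reduced model $y = X_1 \delta_1 + \varepsilon$, that is, model (5) under the simplifying assumption that $X_2$ is absent ($w = 0$). The function names and data are hypothetical. Here $X_1^T X_1$ is a $2 \times 2$ matrix, so the normal equations are solved in closed form; the fit returns the group 1 mean for $\beta_0$ and the difference of the group means (group 2 minus group 1) for $\beta_1$.

```python
def build_X1(n1, n2):
    """Rows (1, 0) for the n1 observations of group 1, followed by
    rows (1, 1) for the n2 observations of group 2, as in (6)."""
    return [[1.0, 0.0] for _ in range(n1)] + [[1.0, 1.0] for _ in range(n2)]

def fit_delta1(y, n1, n2):
    """OLS estimate of delta_1 = (beta_0, beta_1) in y = X1 delta_1 + eps,
    using the closed-form inverse of the symmetric 2x2 matrix X1'X1."""
    X1 = build_X1(n1, n2)
    a = sum(r[0] * r[0] for r in X1)            # equals n1 + n2
    b = sum(r[0] * r[1] for r in X1)            # equals n2
    c = sum(r[1] * r[1] for r in X1)            # equals n2
    u = sum(r[0] * yi for r, yi in zip(X1, y))  # sum of all y
    v = sum(r[1] * yi for r, yi in zip(X1, y))  # sum of y in group 2
    det = a * c - b * b                         # equals n1 * n2
    beta0 = (c * u - b * v) / det
    beta1 = (a * v - b * u) / det
    return beta0, beta1

# Hypothetical data: first 3 observations in group 1, last 4 in group 2.
y = [1.0, 2.0, 3.0, 2.0, 3.0, 4.0, 5.0]
beta0, beta1 = fit_delta1(y, n1=3, n2=4)
print(beta0, beta1)  # group 1 mean and (group 2 mean - group 1 mean)
```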
