
2 J. Groß & A. Möller
statistic $t$ by the formula
\[
d = t \sqrt{\frac{n_1 + n_2}{n_1 n_2}}, \tag{2}
\]
see (2.5.3) in Cohen (1988). According to Cohen, values $|d| = 0.2$, $|d| = 0.5$ and $|d| = 0.8$
indicate a small, medium and large effect, respectively.
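As a minimal sketch of formula (2), the conversion from $t$ to $d$ can be written in a few lines of Python. The function names `cohen_d_from_t` and `effect_label` and the input values are hypothetical and chosen only for illustration. Treating Cohen's benchmarks 0.2, 0.5 and 0.8 as lower cut-offs for a label is an assumption of this sketch, not something stated by Cohen.

```python
from math import sqrt


def cohen_d_from_t(t: float, n1: int, n2: int) -> float:
    """Convert a two-sample t statistic into Cohen's d via formula (2)."""
    return t * sqrt((n1 + n2) / (n1 * n2))


def effect_label(d: float) -> str:
    """Label |d| using the benchmarks 0.2, 0.5, 0.8 as lower cut-offs.

    Assumption of this sketch: the benchmarks are used as thresholds.
    """
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"


# Made-up example values: t = 2.5 with group sizes 20 and 30.
d = cohen_d_from_t(t=2.5, n1=20, n2=30)
print(round(d, 3), effect_label(d))
```

Since only $|d|$ enters the label, the sign of $t$ (and hence the ordering of the two groups) does not affect the classification.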
It may also be of interest to have a corresponding measure when the variable $y$ depends on
further independent variables. In his Chapter 9, Cohen (1988) deals with such a multiple
regression situation and discusses the effect size measure $f^2$ at length, as will be explicated
further in our Section 4.
However, a measure analogous to $d$ is rarely found; see Wilson (2016, Sect. 3.14) and Lipsey
and Wilson (2001) for such a proposal. Nonetheless, it may be of particular interest to have
comparable effect size measures for the very same grouped variable $y$ when it additionally
depends on different sets of independent variables. An example is given in our
Section 5. In the following we introduce such a measure as a generalization of $d$ by considering
a linear regression model
\[
y = \beta_0 + \beta_1 z + \beta_2 x_1 + \cdots + \beta_{w+1} x_w + \varepsilon, \tag{3}
\]
where $z$ takes the value $z_i = 0$ if the corresponding observation $y_i$ of the dependent variable
$y$ belongs to group 1 and $z_i = 1$ if $y_i$ belongs to group 2, $i = 1, \ldots, n_1 + n_2$. It is assumed
that there are $w$ independent variables $x_1, \ldots, x_w$. The error variable $\varepsilon$ is assumed to follow a
normal distribution with expectation 0 and variance $\sigma^2$.
As will be shown in the following Sections 2, 3, and 4, a natural generalization of Cohen's $d$
is given by
\[
d_* = \frac{\bar{y}_{*1} - \bar{y}_{*2}}{\sqrt{\dfrac{s_{*1}^2 + s_{*2}^2}{n_1 + n_2 - 2 - w}}},
\qquad s_{*j}^2 = \sum_i (y_{*i} - \bar{y}_{*j})^2, \quad j = 1, 2, \tag{4}
\]
where $y_* = y - \widehat{\beta}_2 x_1 - \cdots - \widehat{\beta}_{w+1} x_w$ is the dependent variable adjusted for the independent
variables. The $\widehat{\beta}_k$ are the ordinary least squares estimates of the regression coefficients $\beta_k$,
$k = 2, \ldots, w + 1$, in model (3). In case $w = 0$, the adjusted $y_*$ coincides with the original $y$, so
that (4) reduces to (1) and therefore can be seen as a natural generalization of Cohen's $d$.
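The computation behind (4) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the helper names `solve`, `ols` and `generalized_d` and the data are hypothetical, and the normal equations are solved with a small Gauss-Jordan routine only to keep the sketch free of dependencies (a linear algebra library would normally be used). The model matrix is assumed to have full column rank.

```python
from math import sqrt
from statistics import mean


def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def ols(rows, y):
    """OLS coefficients from the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    return solve(xtx, xty)


def generalized_d(y, z, covariates=()):
    """d_* from (4); covariates holds the w columns x_1, ..., x_w."""
    w = len(covariates)
    rows = [[1.0, zi] + [x[i] for x in covariates] for i, zi in enumerate(z)]
    beta = ols(rows, y)  # estimates of (beta_0, beta_1, ..., beta_{w+1})
    # Adjust y for the covariates only; the group effect stays in y_*.
    y_adj = [yi - sum(b * x[i] for b, x in zip(beta[2:], covariates))
             for i, yi in enumerate(y)]
    g1 = [v for v, zi in zip(y_adj, z) if zi == 0]
    g2 = [v for v, zi in zip(y_adj, z) if zi == 1]
    m1, m2 = mean(g1), mean(g2)
    ss = sum((v - m1) ** 2 for v in g1) + sum((v - m2) ** 2 for v in g2)
    return (m1 - m2) / sqrt(ss / (len(g1) + len(g2) - 2 - w))


# Made-up data: four observations per group and one covariate (w = 1).
y = [5.1, 6.0, 5.5, 6.3, 7.2, 7.9, 7.0, 8.1]
z = [0, 0, 0, 0, 1, 1, 1, 1]
x1 = [1.0, 2.0, 1.5, 2.5, 1.2, 2.2, 1.1, 2.4]
print(generalized_d(y, z))        # w = 0: no adjustment, y_* equals y
print(generalized_d(y, z, [x1]))  # w = 1: adjusted for x1
```

Calling `generalized_d` without covariates corresponds to the case $w = 0$, in which no adjustment takes place and the divisor becomes $n_1 + n_2 - 2$.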
2. Partitioned Linear Regression
Let $n = n_1 + n_2$ be the total sample size. The above model (3) may also be written in
vector-matrix notation as
\[
y = X_1 \delta_1 + X_2 \delta_2 + \varepsilon, \tag{5}
\]
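where now $y$ represents the $n \times 1$ vector of observations of the dependent variable. Without
loss of generality it is assumed that the first $n_1$ observations belong to group 1, while the last
$n_2$ observations belong to group 2. By introducing the notation $1_m$ for an $m \times 1$ vector of
ones, the $n \times 2$ matrix $X_1$ and the corresponding $2 \times 1$ parameter vector $\delta_1$ may be written as
\[
X_1 = \begin{pmatrix} 1_{n_1} & 0 \\ 1_{n_2} & 1_{n_2} \end{pmatrix}
\quad \text{and} \quad
\delta_1 = \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}. \tag{6}
\]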
The $n \times w$ matrix $X_2$ contains the observations of the independent variables with corresponding
regression coefficients $\delta_2^T = (\beta_2, \ldots, \beta_{w+1})$, where the superscript $T$ denotes transposition. The
$n \times 1$ random vector $\varepsilon$ is assumed to follow a multivariate normal distribution with expectation
vector 0 and variance-covariance matrix $\sigma^2 I_n$, where $I_n$ stands for the $n \times n$ identity matrix. It
is assumed that the $n \times (2 + w)$ model matrix $(X_1, X_2)$ has full column rank $2 + w$. Equation
(5) represents a partitioned linear regression model as considered, e.g., in Fiebig et al. (1996).
Generalizations and further properties are investigated by Puntanen (1996), Groß and Puntanen
(2000, 2005), and Ding (2021), among others.
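To make the partitioned form (5) with the matrices from (6) concrete, the following sketch builds $X_1$ for given group sizes and evaluates the systematic part $X_1 \delta_1 + X_2 \delta_2$ row by row. The helper names `build_x1`, `matvec` and `mean_vector`, as well as all numerical values, are hypothetical and serve only as an illustration.

```python
def build_x1(n1, n2):
    """n x 2 matrix X1 from (6): rows (1, 0) for group 1, (1, 1) for group 2."""
    return [[1.0, 0.0] for _ in range(n1)] + [[1.0, 1.0] for _ in range(n2)]


def matvec(X, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in X]


def mean_vector(X1, delta1, X2, delta2):
    """Systematic part X1 delta1 + X2 delta2 of model (5), without the error."""
    return [p + q for p, q in zip(matvec(X1, delta1), matvec(X2, delta2))]


# Made-up example with n1 = 2, n2 = 3 and a single covariate (w = 1).
X1 = build_x1(2, 3)
X2 = [[0.5], [1.5], [1.0], [2.0], [0.0]]
delta1 = [1.0, 2.0]  # (beta_0, beta_1)
delta2 = [0.5]       # (beta_2)
mu = mean_vector(X1, delta1, X2, delta2)
print(mu)
```

Each entry of `mu` equals $\beta_0 + \beta_1 z_i + \beta_2 x_{i1}$ with $z_i = 0$ for the first $n_1$ rows and $z_i = 1$ for the remaining $n_2$ rows, which is the expectation of $y_i$ under model (3).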