Maximum likelihood estimation for left-truncated log-logistic
distributions with a given truncation point
Markus Kreer1, Ayşe Kızılersü2, Jake Guscott2, Lukas Christopher Schmitz3, and
Anthony W. Thomas2
1Feldbergschule, Oberhöchstadter Str. 20, 61440 Oberursel (Taunus), Germany.
2CSSM, Department of Physics, University of Adelaide, 5005, Adelaide, Australia.
3Institut für Mathematik, Johannes Gutenberg-Universität Mainz, Staudingerweg 9,
55128 Mainz, Germany.
28th October 2022
Abstract
The maximum likelihood estimation of the left-truncated log-logistic distribution with a given
truncation point is analyzed in detail from both mathematical and numerical perspectives. These
maximum likelihood equations often do not possess a solution, even for small truncations. A simple
criterion is provided for the existence of a regular maximum likelihood solution. In this case a profile
likelihood function can be constructed and the optimisation problem is reduced to one dimension.
When the maximum likelihood equations do not admit a solution for certain data samples, it is shown
that the Pareto distribution is the L1-limit of the degenerated left-truncated log-logistic distribution.
Using this mathematical information, a highly efficient Monte Carlo simulation is performed to obtain
critical values for some goodness-of-fit tests. The confidence tables and an interpolation formula are
provided and several applications to real world data are presented.
1 Preliminaries
The log-logistic distribution, also known as the Fisk distribution, has been popular in the econometric
community since the early 1960s because of its better description of income distributions ([1]), as
compared with the Pareto distribution. On the other hand, hydrologists in the late 1980s suggested that
the log-logistic distributions were useful for modelling Canadian precipitation data ([2]), or flood
frequencies in Scotland for annual flood maxima ([3]). The log-logistic distribution is related to the logistic
distribution by a logarithmic transform. Practitioners make use of log-logistic distributions because of their
easy calculability and closed-form expressions for both the cumulative distribution function (cdf) and the
probability density function (pdf) [4],[5]. For $x > 0$ the pdf and cdf are given by
\[
f(x|\alpha,\beta) = \frac{\beta}{\alpha}\left(\frac{x}{\alpha}\right)^{\beta-1}\frac{1}{\left[1+(x/\alpha)^{\beta}\right]^{2}},
\]
\[
F(x|\alpha,\beta) = \frac{1}{1+(x/\alpha)^{-\beta}},
\]
where $\alpha > 0$ is the scale parameter and $\beta > 0$ the shape parameter. We can recover the Pareto
distribution for the tail by expanding the cdf for “large” arguments, $x \gg \alpha$, because up to first order
\[
F(x) \simeq 1 - (x/\alpha)^{-\beta}.
\]
arXiv:2210.15155v1 [stat.ME] 27 Oct 2022
In the analysis of annual flood maxima, the log-logistic distribution in Ref. [3] was modified by the
introduction of a threshold parameter for practical reasons: a flood maximum in a rainy country like
Scotland should always be above a certain threshold level. In this paper we pursue a different approach:
we keep the two-parameter distribution but introduce a fixed left-truncation point $x_L > 0$ instead. Thus,
for $x > x_L$, the left-truncated log-logistic pdf and cdf are, respectively (see also Refs. [6, 7, 8]
for obtaining a truncated distribution from a complete distribution),
\[
f_{LT}(x|\alpha,\beta;x_L) = \left[1+\left(\frac{x_L}{\alpha}\right)^{\beta}\right]\frac{\beta}{\alpha}\left(\frac{x}{\alpha}\right)^{\beta-1}\frac{1}{\left[1+\left(\frac{x}{\alpha}\right)^{\beta}\right]^{2}}, \tag{1}
\]
\[
F_{LT}(x|\alpha,\beta;x_L) = \frac{\left(\frac{x}{\alpha}\right)^{\beta}-\left(\frac{x_L}{\alpha}\right)^{\beta}}{1+\left(\frac{x}{\alpha}\right)^{\beta}}. \tag{2}
\]
The subscript “LT” stands for left-truncated. If a random variable $X$ is log-logistically distributed with
positive parameters $\alpha, \beta$ and left-truncation point $x_L$, we write $X \sim LL(\alpha,\beta;x_L)$. From Eq. (2) we see
immediately how to generate a random variable $X \sim LL(\alpha,\beta;x_L)$ from a uniformly distributed random
variable $U$ in the interval $(0,1)$, namely
\[
X = \left(\frac{\alpha^{\beta}U + x_L^{\beta}}{1-U}\right)^{1/\beta} = \alpha\left(\frac{U+\eta}{1-U}\right)^{1/\beta}, \tag{3}
\]
where $\eta = (x_L/\alpha)^{\beta}$.
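As an illustration of Eq. (3), the following Python sketch draws left-truncated log-logistic variates by inverse-transform sampling and evaluates the cdf of Eq. (2). The function names `cdf_lt` and `sample_lt`, the parameter values and the fixed seed are assumptions made for this example only; they are not taken from the paper.

```python
import random


def cdf_lt(x, alpha, beta, x_l):
    """Left-truncated log-logistic cdf of Eq. (2), valid for x > x_l."""
    y = (x / alpha) ** beta
    y_l = (x_l / alpha) ** beta
    return (y - y_l) / (1.0 + y)


def sample_lt(alpha, beta, x_l, u):
    """Map a uniform number u in (0, 1) to a draw X via Eq. (3)."""
    eta = (x_l / alpha) ** beta
    return alpha * ((u + eta) / (1.0 - u)) ** (1.0 / beta)


rng = random.Random(0)            # fixed seed, used only for reproducibility
alpha, beta, x_l = 2.0, 1.5, 1.0  # illustrative parameter values
draws = [sample_lt(alpha, beta, x_l, rng.random()) for _ in range(1000)]
```

Feeding a draw back into `cdf_lt` should return the uniform number it was generated from, up to rounding, which gives a quick consistency check between Eqs. (2) and (3).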
When probability distributions are truncated, some interesting effects can occur: in [9] it is demonstrated
that if a normal distribution is truncated, there exist finite random samples for which the regular
maximum likelihood equations (MLE) do not possess a solution. Instead, a new maximum likelihood estimator
was obtained as a limiting case, leading to a degenerate one-parameter distribution, namely the
exponential distribution, “to fit” the sample data appropriately. Similar effects were observed in the analysis
of the left-truncated Weibull distributions [10], where for certain samples the MLE do not possess
a solution. Indeed, in [8] it is found numerically that for a rather large set of data samples the MLE for
the left-truncated log-logistic distribution do not possess a solution.
This was one of the main motivations for this paper, because it demonstrates a clear need for a careful
analysis of the left-truncated log-logistic distribution from a rigorous mathematical point of view. To the
best of our knowledge there was no study to prove the existence of the maximum likelihood estimator
for a random sample drawn from the left-truncated log-logistic distribution. Therefore any numerical
studies without this proof, assuming the existence of a solution, will end up in the worst case either not
converging at all, or converging to a degenerate solution in which the parameter estimates can take values
like zero or infinity.
Our paper is structured as follows: Section 2 contains the main theorems for the existence of a non-
trivial solution of the maximum likelihood equations for the left-truncated log-logistic distribution. It
also examines the properties of a suitable profile likelihood function. These results will be relevant for
the efficient numerical implementation explained in Section 3. Moreover, in that section we also discuss
our findings for the critical values for Kolmogorov-Smirnov (KS) and Anderson-Darling (AD) hypothesis
tests based on extensive Monte Carlo simulations at the supercomputer of Adelaide University. As an
illustration, in Section 4 we apply our technique to cancer data and German precipitation data. All
proofs are given in Section 5.
2 Mathematical results
2.1 Scaling property
We first provide a lemma dealing with a scaling property.
Lemma 1. (Scaling Property) Let $X \sim LL(\alpha,\beta;x_L)$ be a left-truncated log-logistic random variable.
Then for any $k > 0$ we have $kX \sim LL(k\alpha,\beta;kx_L)$.
Proof:
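By assumption $X > x_L$ with $X \sim LL(\alpha,\beta;x_L)$. Thus $kX > kx_L$ and
\[
\mathrm{prob}(kX < x \,|\, kX > kx_L) = \mathrm{prob}(X < x/k \,|\, X > x_L)
= \frac{\left(\frac{x/k}{\alpha}\right)^{\beta}-\left(\frac{x_L}{\alpha}\right)^{\beta}}{1+\left(\frac{x/k}{\alpha}\right)^{\beta}}
= \frac{\left(\frac{x}{k\alpha}\right)^{\beta}-\left(\frac{kx_L}{k\alpha}\right)^{\beta}}{1+\left(\frac{x}{k\alpha}\right)^{\beta}},
\]
which is Eq. (2) with $\alpha$ and $x_L$ replaced by $k\alpha$ and $kx_L$, and the proof is finished.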
As a consequence, for simplicity we will later assume the left-truncation point $x_L = 1$. In other
words we rescale the independent identically distributed (i.i.d.) sample $X_1, ..., X_N$ with $k = 1/x_L$, leading
to $X_1/x_L, ..., X_N/x_L$ (which is truncated at 1). At any stage we can go back to the original sample and
original parameters by using a rescaling factor $k = x_L$.
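The scaling property can be checked numerically. The sketch below divides an observation by $x_L$ so that the truncation point becomes 1, and maps a scale parameter obtained on the rescaled data back to the original units. The helper names `cdf_lt`, `rescale_sample` and `original_scale` and all numerical values are hypothetical and serve only as an illustration.

```python
def cdf_lt(x, alpha, beta, x_l):
    """Left-truncated log-logistic cdf of Eq. (2), valid for x > x_l."""
    y = (x / alpha) ** beta
    y_l = (x_l / alpha) ** beta
    return (y - y_l) / (1.0 + y)


def rescale_sample(sample, x_l):
    """Apply k = 1/x_l so that the rescaled sample is truncated at 1."""
    return [x / x_l for x in sample]


def original_scale(alpha_rescaled, x_l):
    """Undo the rescaling with k = x_l; the shape parameter beta is unchanged."""
    return alpha_rescaled * x_l


# Illustrative values only.
alpha, beta, x_l = 3.0, 2.0, 1.5
x = 4.0
lhs = cdf_lt(x, alpha, beta, x_l)              # cdf in original units
rhs = cdf_lt(x / x_l, alpha / x_l, beta, 1.0)  # cdf after rescaling by 1/x_l
```

By Lemma 1 the two values `lhs` and `rhs` agree up to rounding.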
2.2 The first-order maximum likelihood equations
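We introduce a new parametrization with $\lambda = \alpha^{\beta}$, and re-write Eq. (1) as
\[
f_{LT}(x|\lambda,\beta) = \left(1+\frac{x_L^{\beta}}{\lambda}\right)\frac{\beta}{\lambda}\,x^{\beta-1}\frac{1}{\left[1+\frac{x^{\beta}}{\lambda}\right]^{2}}. \tag{4}
\]
Note that with this notation we have $\eta = x_L^{\beta}/\lambda$.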
Using Eq. (4) and a left-truncated log-logistic sample $X_1, X_2, ..., X_N$ in which all observations
are larger than $x_L > 0$, the log-likelihood function is given by
\[
\ln L_{LT}(\{X_i\}|\lambda,\beta;x_L) = N\ln\left(1+\frac{x_L^{\beta}}{\lambda}\right) + N\ln\beta - N\ln\lambda
+ (\beta-1)\sum_{i=1}^{N}\ln X_i - 2\sum_{i=1}^{N}\ln\left[1+\frac{X_i^{\beta}}{\lambda}\right]. \tag{5}
\]
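For concreteness, Eq. (5) can be coded almost term by term. The following is a minimal sketch; the function name `log_likelihood_lt` and the input checks are assumptions of this example and not part of the paper.

```python
import math


def log_likelihood_lt(sample, lam, beta, x_l):
    """Log-likelihood of Eq. (5) in the (lambda, beta) parametrization."""
    if lam <= 0.0 or beta <= 0.0:
        raise ValueError("lam and beta must be positive")
    if min(sample) <= x_l:
        raise ValueError("all observations must exceed the truncation point")
    n = len(sample)
    sum_log_x = sum(math.log(x) for x in sample)
    sum_log_terms = sum(math.log1p(x ** beta / lam) for x in sample)
    return (n * math.log1p(x_l ** beta / lam)   # N ln(1 + x_L^beta / lambda)
            + n * math.log(beta)                # N ln(beta)
            - n * math.log(lam)                 # -N ln(lambda)
            + (beta - 1.0) * sum_log_x          # (beta - 1) sum ln X_i
            - 2.0 * sum_log_terms)              # -2 sum ln(1 + X_i^beta / lambda)
```

The value returned should coincide with the sum of the logarithms of the density in Eq. (4) evaluated at each observation.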
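The MLE equations are obtained by differentiating Eq. (5) with respect to $\lambda$ and $\beta$ respectively and
setting the derivatives equal to zero. From $\frac{\partial}{\partial\lambda}\ln L_{LT} = 0$ we obtain for $(\lambda,\beta) \in \mathbb{R}^+\times\mathbb{R}^+$
\[
0 = \frac{2}{N}\sum_{i=1}^{N}\frac{X_i^{\beta}/\lambda}{1+X_i^{\beta}/\lambda} - 1 - \frac{x_L^{\beta}/\lambda}{1+x_L^{\beta}/\lambda} \tag{6}
\]
and from $\frac{\partial}{\partial\beta}\ln L_{LT} = 0$ we obtain for $(\lambda,\beta) \in \mathbb{R}^+\times\mathbb{R}^+$
\[
0 = \frac{(x_L^{\beta}/\lambda)\ln x_L}{1+x_L^{\beta}/\lambda} + \frac{1}{\beta} + \frac{1}{N}\sum_{i=1}^{N}\ln X_i
- \frac{2}{N}\sum_{i=1}^{N}\frac{X_i^{\beta}/\lambda}{1+X_i^{\beta}/\lambda}\ln X_i. \tag{7}
\]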
A solution of the MLE equations, Eqs. (6)–(7), if it exists, will be denoted by $(\hat\lambda,\hat\beta)$. For finite untruncated
samples, Ref. [11] (respectively Ref. [12]) has shown that the MLE have a unique solution for the
log-logistic distribution (respectively the logistic distribution). These proofs fail when a left-truncation
point $x_L > 0$ is introduced.
We will take $x_L = 1$ from now on without loss of generality. We also define for convenience the
quantity $S = \sum_{i=1}^{N}\ln X_i$ and introduce a new objective function as in [11]
\[
\varphi(\lambda,\beta) = \ln L_{LT}(\{X_i\}|\lambda,\beta;x_L=1) + S.
\]
The extrema of both the function $\varphi(\cdot,\cdot)$ and the log-likelihood function are the same (and so are their
MLE equations), because they differ only by the constant $S$.
2.3 Our main theorems
Our first theorem states the existence of a maximum for the objective function under certain conditions.
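Theorem 1. (Existence) Consider the i.i.d. left-truncated sample $X_1, ..., X_N > 1$, for which at least two
observations are different, and define the objective function $\varphi(\cdot,\cdot): \mathbb{R}^+\times\mathbb{R}^+ \to \mathbb{R}$ by
\[
\varphi(\lambda,\beta) = N\ln\left(1+\frac{1}{\lambda}\right) + N\ln\beta - N\ln\lambda + \beta S
- 2\sum_{i=1}^{N}\ln\left[1+\frac{X_i^{\beta}}{\lambda}\right]. \tag{8}
\]
Define $\beta_C > 0$ as the unique solution of
\[
\frac{1}{2} = \frac{1}{N}\sum_{i=1}^{N}\frac{1}{X_i^{\beta_C}} \tag{9}
\]
and $\beta_0 > 0$ by
\[
\frac{1}{\beta_0} = \frac{1}{N}\sum_{i=1}^{N}\ln X_i. \tag{10}
\]
Then the following holds true:
(1) For $\beta_0 > \beta_C$ the objective function $\varphi(\cdot,\cdot)$ possesses a global maximum $(\hat\lambda,\hat\beta) \in \mathbb{R}^+\times\mathbb{R}^+$.
(2) For $\beta_0 \le \beta_C$ the objective function $\varphi(\cdot,\cdot)$ possesses a (local) maximum $(\hat\lambda,\hat\beta) = (0,\beta_0)$ on the boundary.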
The proof is given in Section 5.
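The existence criterion of Theorem 1 is cheap to evaluate for a given sample. The sketch below computes $\beta_0$ from Eq. (10) in closed form and $\beta_C$ from Eq. (9) by bisection, using the fact that the right-hand side of Eq. (9) decreases from 1 to 0 as $\beta_C$ grows when all $X_i > 1$. The function names, the bisection scheme and the two toy samples are assumptions of this example; the sample is assumed to be already rescaled so that the truncation point is 1.

```python
import math


def beta_zero(sample):
    """beta_0 of Eq. (10): N divided by the sum of ln X_i."""
    return len(sample) / sum(math.log(x) for x in sample)


def beta_crit(sample, iterations=200):
    """beta_C of Eq. (9), found by bisection (all X_i must exceed 1)."""
    def h(beta):
        # h is strictly decreasing with h(0) = 1/2 and h(inf) = -1/2.
        return sum(x ** (-beta) for x in sample) / len(sample) - 0.5

    lo, hi = 0.0, 1.0
    while h(hi) > 0.0:          # enlarge the bracket until the sign changes
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if h(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def has_regular_mle(sample):
    """Criterion of Theorem 1: an interior maximum exists iff beta_0 > beta_C."""
    return beta_zero(sample) > beta_crit(sample)


regular = has_regular_mle([2.0, 4.0])       # beta_0 > beta_C for this toy sample
degenerate = has_regular_mle([1.1, 100.0])  # beta_0 <= beta_C for this toy sample
```

For the toy sample {2, 4}, Eq. (9) reduces to $t + t^2 = 1$ with $t = 2^{-\beta_C}$, so $\beta_C$ equals the base-2 logarithm of the golden ratio, which provides an analytic check of the bisection.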
The next result is important for the numerical computation of the maxima: it leads to a profile
likelihood function and provides a curve for the loci$^1$ of critical points.
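Theorem 2. (Loci of critical points)
Consider the i.i.d. left-truncated sample $X_1, ..., X_N > 1$ for which at least two observations are
different. Then the following holds true:
(1) For fixed $\beta > \beta_C$ the equation
\[
\left.\frac{\partial}{\partial\lambda}\varphi(\lambda,\beta)\right|_{\lambda=\hat\lambda} = 0
\]
has exactly one positive solution $\hat\lambda$, which depends on the fixed parameter $\beta$. Furthermore we have the
following inequalities
\[
\frac{\partial}{\partial\lambda}\varphi(\lambda,\beta) > 0 \quad \text{for } 0 < \lambda < \hat\lambda,
\]
\[
\frac{\partial}{\partial\lambda}\varphi(\lambda,\beta) < 0 \quad \text{for } \lambda > \hat\lambda,
\]
and thus
\[
\left.\frac{\partial^2}{\partial\lambda^2}\varphi(\lambda,\beta)\right|_{\lambda=\hat\lambda} < 0. \tag{11}
\]
(2) For fixed $0 < \beta \le \beta_C$ the equation
\[
\left.\frac{\partial}{\partial\lambda}\varphi(\lambda,\beta)\right|_{\lambda=\hat\lambda} = 0
\]
has the only solution $\hat\lambda = 0$.
Hence the non-negative continuous function $\Lambda(\cdot): \mathbb{R}^+ \to \mathbb{R}^+_0$, defined by
\[
\Lambda(\beta) = \begin{cases} 0, & 0 < \beta \le \beta_C \\ \hat\lambda, & \beta > \beta_C \end{cases} \tag{12}
\]
is the locus of all possible critical points of the objective function $\varphi(\cdot,\cdot)$.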
$^1$ locus = a set of points that satisfy or are determined by some specific condition.
The proof is given in Section 5.
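The function $\Lambda(\beta)$ of Eq. (12) can be evaluated with a one-dimensional root search. The sketch below uses a rearrangement that is ours and not stated in the text: for $x_L = 1$ and $\lambda > 0$, the right-hand side of Eq. (6) equals $\lambda\, r(\lambda)$ with $r(\lambda) = \frac{1}{1+\lambda} - \frac{2}{N}\sum_i \frac{1}{\lambda + X_i^{\beta}}$, so that $r(0) > 0$ is exactly the condition $\beta > \beta_C$ of Eq. (9), and the sign pattern of Theorem 2 justifies bisection. The function name `lambda_locus` and the bisection settings are assumptions of this example.

```python
def lambda_locus(sample, beta, iterations=200):
    """Lambda(beta) of Eq. (12) for a sample truncated at x_L = 1."""
    n = len(sample)
    powers = [x ** beta for x in sample]

    def r(lam):
        # Right-hand side of Eq. (6) divided by lam (with x_L = 1).
        return 1.0 / (1.0 + lam) - 2.0 / n * sum(1.0 / (lam + p) for p in powers)

    if r(0.0) <= 0.0:
        return 0.0              # beta <= beta_C: only the boundary solution
    lo, hi = 0.0, 1.0
    while r(hi) > 0.0:          # r is positive below the root, negative above
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if r(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


lam_interior = lambda_locus([2.0, 4.0], 1.0)   # beta above beta_C
lam_boundary = lambda_locus([2.0, 4.0], 0.5)   # beta below beta_C
```

For the toy sample {2, 4} and $\beta = 1$ the condition $r(\lambda) = 0$ reduces to $\lambda^2 + 2\lambda - 2 = 0$, so the positive root is $\sqrt{3} - 1$, which can be used to check the routine.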
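By inserting the function $\Lambda(\beta)$, as constructed in Eq. (12), into the original objective function Eq. (8)
we obtain the “profile likelihood function”
\[
\tilde\varphi(\beta) = \varphi(\Lambda(\beta),\beta) =
\begin{cases}
N\ln\beta - \beta S, & \beta \in (0,\beta_C] \\[4pt]
N\ln\left(1+\frac{1}{\Lambda(\beta)}\right) + N\ln\beta - N\ln\Lambda(\beta) + \beta S - 2\sum_{i=1}^{N}\ln\left[1+\frac{X_i^{\beta}}{\Lambda(\beta)}\right], & \beta \in (\beta_C,\infty)
\end{cases} \tag{13}
\]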
Therefore, we have reduced the two-dimensional maximization problem for ϕ(·,·) on a finite region to
a one-dimensional problem for ˜ϕ(·) on a finite interval. Figure 1 illustrates the problem: the critical points
(and therefore our maximum guaranteed by Theorem 1) are located inside the blue rectangular region and
must be located on the red curve. By condition Eq. (11), critical points can only be maxima or saddle
points. This simplification is important because it improves the speed of finding the critical points in our
numerical study dramatically. Unfortunately, we were not able to prove uniqueness for the critical points,
as the necessary concavity arguments for the Hessian matrix seem intractable. Numerically, though, we
always found exactly one critical point for millions of samples generated and this point was always a
maximum.
Figure 1: The function Λ(β) for the profile likelihood.
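The reduction to one dimension can be sketched as follows. The code evaluates the profile likelihood of Eq. (13) in the algebraically equivalent form $N\ln(1+\Lambda) + N\ln\beta + \beta S - 2\sum_i \ln(\Lambda + X_i^{\beta})$, which also covers the branch $\Lambda = 0$, and maximises it by a coarse grid scan followed by a golden-section refinement. This is a minimal illustration and not the implementation used in the paper: the function names, the search bound `beta_max`, the grid size and the use of golden-section search are all assumptions, and the sample is assumed to be rescaled to $x_L = 1$.

```python
import math


def lambda_locus(sample, beta, iterations=200):
    """Lambda(beta) of Eq. (12), by bisection on Eq. (6) divided by lambda."""
    n = len(sample)
    powers = [x ** beta for x in sample]

    def r(lam):
        return 1.0 / (1.0 + lam) - 2.0 / n * sum(1.0 / (lam + p) for p in powers)

    if r(0.0) <= 0.0:
        return 0.0
    lo, hi = 0.0, 1.0
    while r(hi) > 0.0:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if r(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def profile(sample, beta):
    """Profile likelihood of Eq. (13) in a form that is finite at Lambda = 0."""
    n = len(sample)
    s = sum(math.log(x) for x in sample)
    lam = lambda_locus(sample, beta)
    return (n * math.log1p(lam) + n * math.log(beta) + beta * s
            - 2.0 * sum(math.log(lam + x ** beta) for x in sample))


def maximise_profile(sample, beta_max=20.0, grid=400, iterations=100):
    """Grid scan on (0, beta_max], then golden-section refinement."""
    step = beta_max / grid
    best_k = max(range(1, grid + 1), key=lambda k: profile(sample, k * step))
    lo = max(best_k - 1, 0) * step
    hi = min(best_k + 1, grid) * step
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iterations):
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if profile(sample, c) > profile(sample, d):
            hi = d          # maximum lies in [lo, d]
        else:
            lo = c          # maximum lies in [c, hi]
    beta_hat = 0.5 * (lo + hi)
    return lambda_locus(sample, beta_hat), beta_hat


lam_hat, beta_hat = maximise_profile([2.0, 4.0])  # toy sample with beta_0 > beta_C
```

The grid scan is a precaution motivated by the fact that uniqueness of the critical point is not proven; if one trusts the numerical observation that there is exactly one maximum, the scan could be replaced by a direct one-dimensional search.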
A corollary summarises the situation of our two theorems.
Corollary 1. Under the assumptions of Theorems 1 and 2 the following holds true:
(1) If and only if $\beta_0 > \beta_C$ the MLE equations Eqs. (6)–(7) possess a solution $(\hat\lambda,\hat\beta) \neq (0,0)$ (which might
not be unique). Critical points are obtained by constructing the function $\Lambda(\cdot): \mathbb{R}^+ \to \mathbb{R}^+_0$ as defined
for all $\beta > 0$ in Theorem 2, Eq. (12), and then solving Eq. (7), which now reads
\[
0 = \frac{N}{\beta} + \sum_{i=1}^{N}\ln X_i - 2\sum_{i=1}^{N}\frac{X_i^{\beta}\ln X_i}{\Lambda(\beta)+X_i^{\beta}}, \tag{14}
\]