Simulation studies to compare bayesian wavelet shrinkage methods
in aggregated functional data
Alex Rodrigo dos S. Sousa
State University of Campinas, Brazil
Abstract
The present work describes simulation studies to compare the performances of bayesian wavelet shrinkage methods in estimating component curves from aggregated functional data. To do so, five methods were considered: the bayesian shrinkage rule under logistic prior by Sousa (2020), the bayesian shrinkage rule under beta prior by Sousa et al. (2020), the Large Posterior Mode method by Cutillo et al. (2008), the Amplitude-scale invariant Bayes Estimator by Figueiredo and Nowak (2001) and the Bayesian Adaptive Multiresolution Smoother by Vidakovic and Ruggeri (2001). Further, the so called Donoho-Johnstone test functions and the Logit and SpaHet functions were considered as component functions. It was observed that the signal to noise ratio of the data had an impact on the performances of the methods.
1 Introduction
The statistical problem of estimating component curves from aggregated curves, also known as the calibration
problem, has been widely studied in recent years in several areas of science. In Chemometrics, for example,
there is an interest in estimating absorbance curves of the constituents of a substance from samples of
absorbance curves of the substance itself. In this case, the substance’s absorbance curve is formed by the
linear combination of the absorbance curves of its constituents, according to the Beer-Lambert Law (Brereton,
2003). Another example appears in the study of electricity consumption in a given region, in which the energy
consumption curve over a period of time is composed of the aggregations of the individual consumption curves
of households and establishments (Dias et al., 2013).
The first proposed methods to estimate component curves from aggregated data approach this problem in a
multivariate way, since, in practice, we observe such curves in a finite number of locations. Thus, it is possible
to see the observations as a random vector with a certain correlation structure between nearby locations.
Methods based on Principal Components Regression (PCR) by Cowe and McNicol (1985) and Partial Least
Squares Regression (PLS) by Wold et al. (1983) have been successfully proposed in applications. Bayesian
multivariate approaches were also proposed by Brown et al. (1998a,b) and Brown et al. (2001).
Methods based on functional data analysis for the calibration problem were proposed later by Dias et al.
(2009) and Dias et al. (2013), taking into account the functional structure of the observations. In this way,
each component curve can be represented in terms of some convenient function basis, such as a spline basis,
so that the problem of estimating the curve becomes a finite-dimensional problem of estimating the
coefficients of the basis expansion. In this functional approach, one possibility is to expand the functions in a
wavelet basis, as recently proposed by Sousa (2022). An advantage of such an approach is the good localization
property of the wavelet coefficients and the possibility of estimating important features of a function,
such as peaks, oscillations and discontinuities, through the wavelet coefficients. Further, the sparsity of
the wavelet coefficients allows these function features to be identified by the magnitude of the nonzero coefficients
at their locations. See Vidakovic (1999) for details and properties of wavelets and their applications in
statistical modelling.
Sousa (2022) proposes expanding the component curves in a wavelet basis and, to estimate the
wavelet coefficients, using a bayesian approach that considers, for each wavelet coefficient, a prior
distribution composed of a point mass function at zero and the logistic distribution. Under the quadratic loss
function, the bayesian rule for estimating a coefficient is the posterior expected value of the
wavelet coefficient. In fact, the associated rule acts by shrinking the empirical wavelet coefficients, reducing
the noise effects present in the coefficients. Estimators of this kind are also called shrinkage rules; see Donoho
and Johnstone (1994a,b, 1995) and Donoho (1993a,b, 1995a,b) for details about wavelet shrinkage
procedures. Further, in the work by Sousa (2022), a simulation study was carried out comparing the proposed
method with the expansion of the component curves by a B-spline basis. It was concluded that the proposed
method performed better in terms of mean squared error than the method based on B-splines for curves with
features such as discontinuities, peaks and oscillations.
Although the study by Sousa (2022) indicates the feasibility of applying the wavelet basis expansion and
the bayesian shrinkage rule under logistic prior to estimate the wavelet coefficients of component
curves from aggregated curves, a study comparing different wavelet shrinkage
methods for estimating component curves would be of interest. In this sense, this work compares bayesian
wavelet shrinkage methods in the problem of estimating component curves from aggregated curves. For that, five
methods were considered: the rule based on the logistic prior of Sousa (2020, 2022), the rule based on the
beta prior symmetric around zero proposed by Sousa et al. (2020), the Large Posterior Mode (LPM) method
by Cutillo et al. (2008), the Amplitude-scale invariant Bayes Estimator (ABE) by Figueiredo and Nowak (2001)
and the Bayesian Adaptive Multiresolution Smoother (BAMS) by Vidakovic and Ruggeri (2001).
This paper is organized as follows: the statistical model for the aggregated curves and the estimation
procedure for the component curves are defined in Section 2. The descriptions of the bayesian
methods considered in the simulation studies are given in Section 3. Section 4 is dedicated to the results
and analysis of the simulation studies. We finish with further considerations and discussion in Section 5.
2 Statistical model and estimation procedure
We consider a univariate function $A(t) \in L^2(\mathbb{R}) = \{f: \mathbb{R} \to \mathbb{R} \mid \int_{\mathbb{R}} f^2 < \infty\}$ that can be written as
$$A(t) = \sum_{l=1}^{L} y_l \alpha_l(t) + e(t), \qquad (2.1)$$
where $\alpha_l(t) \in L^2(\mathbb{R})$ are unknown component functions, $y_l$ are known real-valued weights, $l = 1, \cdots, L$,
and $\{e(t), t \in \mathbb{R}\}$ is a zero mean gaussian process with unknown variance $\sigma^2$, $\sigma > 0$. The estimation of the
functions $\alpha_l(t)$ is considered in this work. In fact, this estimation process is usually done by multivariate
methods (Brown et al., 1998a,b) or by a functional data analysis (FDA) approach (Dias et al., 2009, 2013). From the
latter point of view, each function $\alpha_l$ of (2.1) is represented in terms of some functional basis such as splines,
B-splines or the Fourier basis, for example. Here, we expand each component function in a wavelet basis,
$$\alpha_l(t) = \sum_{j,k \in \mathbb{Z}} \gamma_{jk}^{(l)} \psi_{jk}(t), \qquad l = 1, \cdots, L, \qquad (2.2)$$
where $\{\psi_{jk}(x) = 2^{j/2}\psi(2^{j}x - k), \; j, k \in \mathbb{Z}\}$ is an orthonormal wavelet basis for $L^2(\mathbb{R})$ constructed by dilations
$j$ and translations $k$ of a function $\psi$ called the wavelet or mother wavelet, and the $\gamma_{jk}^{(l)}$'s are the unknown wavelet
coefficients of the expansion of the component function $\alpha_l$. Note that we consider the same wavelet family for
the expansion of all component functions. Thus, the problem of estimating the functions $\alpha_l$ becomes the problem of
estimating the finite number of wavelet coefficients $\gamma_{jk}^{(l)}$ of the representation (2.2). Further, the magnitude
of the wavelet coefficients allows local features of the component functions, such as discontinuity
points, spikes and oscillations, to be recovered, due to the good localization of wavelets in the time and frequency
domains. This characteristic does not occur in spline-based and Fourier basis representations.
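As an illustration of this sparsity and localization property, the minimal sketch below uses the PyWavelets package to compute the DWT of a piecewise smooth signal; the choice of the Daubechies wavelet and of the test signal are illustrative assumptions only. Only a few coefficients, concentrated around the discontinuity, have large magnitude.

```python
import numpy as np
import pywt

M = 2 ** 9                                   # M = 2^J sample points
t = np.linspace(0, 1, M, endpoint=False)
alpha = np.sin(2 * np.pi * t) + (t > 0.5)    # smooth part plus a jump at t = 0.5

# Full discrete wavelet transform (orthogonal Daubechies wavelet, periodized)
coeffs = pywt.wavedec(alpha, "db4", mode="periodization")
flat, _ = pywt.coeffs_to_array(coeffs)

# Most coefficients are negligible; the large detail coefficients cluster
# around the location of the discontinuity.
print("proportion of coefficients above 0.1:", np.mean(np.abs(flat) > 0.1))
```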
In practice, one observes $I$ samples of the aggregated curve $A(t)$ at $M = 2^J$ equally spaced locations
$t_1, \cdots, t_M$, i.e., our dataset is $\{(t_m, A_i(t_m)), \; m = 1, \cdots, M \text{ and } i = 1, \cdots, I\}$. Thus, the discrete version of
(2.1) is
$$A_i(t_m) = \sum_{l=1}^{L} y_{il} \alpha_l(t_m) + e_i(t_m), \qquad i = 1, \cdots, I, \quad m = 1, \cdots, M = 2^J, \qquad (2.3)$$
where the $e_i(t_m)$ are independent and identically distributed normal random noises with zero mean and variance
$\sigma^2$, for all $i, m$. Further, the $I$ samples are obtained at the same locations $t_1, \cdots, t_M$, but the weights of the linear
combinations are allowed to differ from one sample to another. We can rewrite (2.3) in matrix notation
as
$$A = \alpha y + e, \qquad (2.4)$$
where $A = (A_{mi} = A_i(t_m))_{1 \leq m \leq M,\, 1 \leq i \leq I}$, $\alpha = (\alpha_{ml} = \alpha_l(t_m))_{1 \leq m \leq M,\, 1 \leq l \leq L}$, $y = (y_{li})_{1 \leq l \leq L,\, 1 \leq i \leq I}$ and
$e = (e_{mi} = e_i(t_m))_{1 \leq m \leq M,\, 1 \leq i \leq I}$.
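To make the model concrete, the short sketch below simulates a dataset following (2.3)-(2.4). The two component functions, the uniform distribution of the weights and the noise level are illustrative assumptions only, not the settings used in the simulation studies of this paper.

```python
import numpy as np

rng = np.random.default_rng(0)

J, I, L = 9, 30, 2                       # M = 2^J locations, I samples, L components
M = 2 ** J
t = np.linspace(0, 1, M, endpoint=False)

# Hypothetical component functions alpha_l(t): one smooth, one with a jump
alpha1 = np.sin(4 * np.pi * t)
alpha2 = np.where(t < 0.5, 1.0, -1.0)
alpha = np.column_stack([alpha1, alpha2])        # (M, L) matrix alpha_ml = alpha_l(t_m)

y = rng.uniform(0.5, 1.5, size=(L, I))           # known weights y_li
sigma = 0.2
e = rng.normal(0.0, sigma, size=(M, I))          # i.i.d. N(0, sigma^2) errors e_i(t_m)

A = alpha @ y + e                                # aggregated curves, model (2.4)
print(A.shape)                                   # (M, I): each column is one sample A_i
```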
The wavelet shrinkage procedure is performed in the wavelet domain to estimate the coefficients $\gamma$'s of
(2.2). For this reason, we apply a discrete wavelet transform (DWT), which can be represented by an $M \times M$
wavelet transformation matrix $W$, on both sides of (2.4), i.e.,
$$W A = W(\alpha y + e)$$
$$W A = W \alpha y + W e$$
$$D = \Gamma y + \varepsilon, \qquad (2.5)$$
where $D = W A = (d_{mi})_{1 \leq m \leq M,\, 1 \leq i \leq I}$ is the matrix of the empirical wavelet coefficients of the aggregated
curves, $\Gamma = W \alpha = (\gamma_{ml})_{1 \leq m \leq M,\, 1 \leq l \leq L}$ is the matrix of the unknown wavelet coefficients of the component
curves and $\varepsilon = W e = (\varepsilon_{mi})_{1 \leq m \leq M,\, 1 \leq i \leq I}$ is the matrix of the random errors in the wavelet domain, which
remain zero mean normally distributed with variance $\sigma^2$ due to the orthogonality property of the wavelet transform.
Thus, for a particular empirical wavelet coefficient $d_{mi}$ of $D$, one has the additive model
$$d_{mi} = \sum_{l=1}^{L} y_{li} \gamma_{ml} + \varepsilon_{mi} = \theta_{mi} + \varepsilon_{mi}, \qquad (2.6)$$
where $\theta_{mi} = \sum_{l=1}^{L} y_{li} \gamma_{ml}$ and $\varepsilon_{mi}$ is zero mean normal with variance $\sigma^2$, i.e., a single empirical wavelet
coefficient of the aggregated curve is also a linear combination of the unknown wavelet coefficients of the
component curves plus a random error. Moreover, the weights of this linear combination in the wavelet domain
remain the same as in the original combination of the curves in the time domain.
The estimation of the wavelet coefficient matrix $\Gamma$ in (2.5) is done by applying a wavelet shrinkage rule
$\delta$ to each single empirical wavelet coefficient $d$, obtaining the matrix $\delta(D)$ such that
$$\delta(D) = (\delta(d_{mi}))_{1 \leq m \leq M,\, 1 \leq i \leq I}. \qquad (2.7)$$
We can see the matrix $\delta(D)$ as a denoised version of $D$, i.e., the shrinkage rule $\delta(d)$ acts by denoising
the empirical coefficient $d$ in order to estimate $\theta$ in (2.6), $\delta(d) = \hat{\theta}$. Thus, the estimate $\hat{\Gamma}$ of the wavelet
coefficient matrix $\Gamma$ is given by the least squares method,
$$\hat{\Gamma} = \delta(D)\, y^{t} (y y^{t})^{-1}, \qquad (2.8)$$
and finally $\alpha$ can be estimated at the locations $t_1, \cdots, t_M$ by the inverse discrete wavelet transform (IDWT),
$$\hat{\alpha} = W^{t} \hat{\Gamma}. \qquad (2.9)$$
For more details about wavelet shrinkage in aggregated curves, see Sousa (2022).
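The whole procedure (2.5)-(2.9) can be sketched as below, again using PyWavelets. The DWT is applied column-wise rather than by forming the matrix $W$ explicitly, which is equivalent for an orthogonal, periodized wavelet transform. The soft-threshold rule in the usage comment is only a placeholder for the bayesian rules described in Section 3, and the variables `A` and `y` are assumed to come from the simulation sketch above; all names and parameter values here are illustrative.

```python
import numpy as np
import pywt

def dwt_vector(x, wavelet="db4", mode="periodization"):
    """Flatten the full DWT of a length-2^J signal into a single vector (column of D)."""
    coeffs = pywt.wavedec(x, wavelet, mode=mode)
    arr, slices = pywt.coeffs_to_array(coeffs)
    return arr, slices

def idwt_vector(arr, slices, wavelet="db4", mode="periodization"):
    """Inverse DWT of a flattened coefficient vector (the W^t step in (2.9))."""
    coeffs = pywt.array_to_coeffs(arr, slices, output_format="wavedec")
    return pywt.waverec(coeffs, wavelet, mode=mode)

def estimate_components(A, y, delta, wavelet="db4"):
    """Estimate the component curves alpha_l at t_1,...,t_M following (2.5)-(2.9)."""
    M, I = A.shape
    # D = WA: empirical wavelet coefficients of the aggregated curves, (2.5)
    cols = [dwt_vector(A[:, i], wavelet) for i in range(I)]
    slices = cols[0][1]
    D = np.column_stack([c[0] for c in cols])
    # delta(D): shrinkage rule applied coefficient by coefficient, (2.7)
    D_denoised = delta(D)
    # Gamma_hat = delta(D) y^t (y y^t)^{-1}: least squares step, (2.8)
    Gamma_hat = D_denoised @ y.T @ np.linalg.inv(y @ y.T)
    # alpha_hat = W^t Gamma_hat: back to the time domain by the IDWT, (2.9)
    return np.column_stack(
        [idwt_vector(Gamma_hat[:, l], slices, wavelet) for l in range(Gamma_hat.shape[1])]
    )

# Usage (placeholder shrinkage rule; A and y as in the simulation sketch of Section 2):
# soft = lambda D: pywt.threshold(D, value=3 * 0.2, mode="soft")
# alpha_hat = estimate_components(A, y, soft)   # (M, L) estimated component curves
```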
3 Bayesian methods
There are several wavelet shrinkage methods available in the literature. Most of them are thresholding rules,
i.e., the rule shrinks sufficiently small empirical wavelet coefficients to exactly zero. Two widely applied
thresholding rules are the so called hard and soft rules proposed by Donoho and Johnstone (1994b). In this
work, bayesian wavelet shrinkage methods that have been proposed in recent years are considered. In
general, these methods propose a prior distribution for the wavelet coefficients and estimate them by the Bayes
rule according to a loss function, such as the quadratic loss function, for example. The main advantage of bayesian
methods is the ability to incorporate prior information about the wavelet coefficients, such as sparsity, boundedness,
self similarity and others, by convenient choices of the prior distributions and their hyperparameter values.
In the next subsections, we briefly describe the bayesian shrinkage rules considered in this work.
3.1 Shrinkage rule under logistic prior
The shrinkage rule under logistic prior was proposed by Sousa (2020). It assumes a mixture of a point
mass function at zero and a symmetric logistic distribution as the prior for a single linear combination of wavelet
coefficients $\theta$,
$$\pi(\theta; p, \tau) = p\,\delta_0(\theta) + (1 - p)\,g(\theta; \tau), \qquad (3.1)$$
where $p \in (0, 1)$, $\delta_0(\theta)$ is the point mass function at zero and $g(\theta; \tau)$ is the logistic density function symmetric
around zero, for $\tau > 0$,
$$g(\theta; \tau) = \frac{\exp\{-\theta/\tau\}}{\tau\left(1 + \exp\{-\theta/\tau\}\right)^{2}}\,\mathbb{I}_{\mathbb{R}}(\theta). \qquad (3.2)$$
Under the squared loss function, the associated bayesian shrinkage rule is the posterior expected value of $\theta$,
$E_\pi(\theta|d)$, which, under the prior (3.1), is given by (Sousa, 2020)
$$\delta(d) = E_\pi(\theta|d) = \frac{(1 - p)\int_{\mathbb{R}} (\sigma u + d)\, g(\sigma u + d; \tau)\, \phi(u)\, du}{\frac{p}{\sigma}\,\phi\!\left(\frac{d}{\sigma}\right) + (1 - p)\int_{\mathbb{R}} g(\sigma u + d; \tau)\, \phi(u)\, du}, \qquad (3.3)$$
where $\phi(\cdot)$ is the standard normal density function. The shrinkage rule (3.3) under the model (3.2) is called the
logistic shrinkage rule and has interesting features from an estimation point of view. First, its hyperparameters
$p$ and $\tau$ control the degree of shrinkage of the rule: higher values of $\tau$ or $p$ imply a higher shrinkage level, i.e., the
rule will severely reduce the magnitudes of the empirical coefficients. Further, as described in Sousa (2020),
the logistic shrinkage rule performed well in terms of averaged mean squared error in simulation
studies against standard shrinkage and thresholding procedures.
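A direct numerical evaluation of the rule (3.3) is straightforward, for instance by quadrature. The sketch below is a minimal version; the hyperparameter values for $p$, $\tau$ and $\sigma$ are illustrative only.

```python
import numpy as np
from scipy import integrate
from scipy.stats import logistic, norm

def logistic_shrinkage(d, p=0.8, tau=5.0, sigma=1.0):
    """Posterior mean E(theta | d) under the prior (3.1)-(3.2), i.e. rule (3.3)."""
    g = lambda x: logistic.pdf(x, loc=0.0, scale=tau)    # logistic density (3.2)
    num, _ = integrate.quad(
        lambda u: (sigma * u + d) * g(sigma * u + d) * norm.pdf(u), -np.inf, np.inf
    )
    den, _ = integrate.quad(
        lambda u: g(sigma * u + d) * norm.pdf(u), -np.inf, np.inf
    )
    return (1 - p) * num / ((p / sigma) * norm.pdf(d / sigma) + (1 - p) * den)

# Small empirical coefficients are shrunk heavily towards zero, large ones much less:
print([round(logistic_shrinkage(d), 3) for d in (0.5, 2.0, 5.0, 10.0)])
```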
3.2 Shrinkage rule under beta prior
Sousa et al. (2020) proposed the use of a mixture of a point mass function at zero and a beta distribution
with symmetric support around zero as the prior distribution for the wavelet coefficients,
$$\pi(\theta; p, a, m) = p\,\delta_0(\theta) + (1 - p)\,g(\theta; a, m), \qquad (3.4)$$
and
$$g(\theta; a, m) = \frac{(m^2 - \theta^2)^{(a-1)}}{(2m)^{(2a-1)}\, B(a, a)}\,\mathbb{I}_{[-m,m]}(\theta), \qquad (3.5)$$
where $B(\cdot,\cdot)$ is the standard beta function, $a > 0$ and $m > 0$ are the parameters of the distribution, and
$\mathbb{I}_{[-m,m]}(\cdot)$ is the indicator function equal to 1 when its argument is in the interval $[-m, m]$ and 0 otherwise.
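The density (3.5) is easy to evaluate directly; a minimal sketch follows, with illustrative values of $a$ and $m$ (scipy.special.beta provides the beta function $B(a, a)$).

```python
import numpy as np
from scipy.special import beta as beta_fn

def beta_prior_density(theta, a=2.0, m=3.0):
    """Symmetric beta density g(theta; a, m) on [-m, m], equation (3.5)."""
    theta = np.asarray(theta, dtype=float)
    core = np.clip(m**2 - theta**2, 0.0, None) ** (a - 1)
    return np.where(np.abs(theta) <= m,
                    core / ((2 * m) ** (2 * a - 1) * beta_fn(a, a)),
                    0.0)

# The density is symmetric around zero and vanishes outside [-m, m]:
print(beta_prior_density([-4.0, -1.0, 0.0, 1.0, 4.0]))
```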