An Efficient Workflow for Modelling
High-Dimensional Spatial Extremes
Silius M. Vandeskog
Department of Mathematics, The Norwegian University of Science and Technology (NTNU)
and
Sara Martino
Department of Mathematics, The Norwegian University of Science and Technology (NTNU)
and
Raphaël Huser
Statistics program, CEMSE Division, King Abdullah University of Science and Technology (KAUST)
December 14, 2022
Abstract
A successful model for high-dimensional spatial extremes should, in principle, be
able to describe both weakening extremal dependence at increasing levels and changes
in the type of extremal dependence class as a function of the distance between locations.
Furthermore, the model should allow for computationally tractable inference using
inference methods that efficiently extract information from data and that are robust to
model misspecification. In this paper, we demonstrate how to fulfil all these requirements
by developing a comprehensive methodological workflow for efficient Bayesian modelling
of high-dimensional spatial extremes using the spatial conditional extremes model
while performing fast inference with R-INLA. We then propose a post hoc adjustment
method that results in more robust inference by properly accounting for possible model
misspecification. The developed methodology is applied for modelling extreme hourly
precipitation from high-resolution radar data in Norway. Inference is computationally
efficient, and the resulting model fit successfully captures the main trends in the extremal
dependence structure of the data. Robustifying the model fit by adjusting for possible
misspecification further improves model performance.
Keywords: Spatial conditional extremes, Robust Bayesian inference, Computational statistics, R-INLA
arXiv:2210.00760v2 [stat.ME] 13 Dec 2022
1 Introduction
The effects of climate change and the increasing availability of large and high-quality data
sets have led to a surge of research on the modelling of spatial extremes (e.g., Castro-Camilo
et al., 2019; Koch et al., 2021; Koh et al., 2021; Opitz et al., 2018; Richards et al., 2022;
Shooter et al., 2019; E. S. Simpson & Wadsworth, 2021; Vandeskog et al., 2022). Modelling
spatial extremes is challenging for two main reasons: 1) classical models are often not flexible
enough to provide realistic descriptions of extremal dependence, and 2) inference can be
computationally demanding or intractable, so modellers must often rely on less efficient
inference methods; see Huser and Wadsworth (2022) for a review of these challenges. In this
paper, we propose a comprehensive methodological workflow, as well as practical strategies,
on how to perform efficient and flexible high-dimensional modelling of spatial extremes.
An important component of spatial extreme value theory is the characterisation of a
spatial process’ asymptotic dependence properties (e.g., Coles et al., 1999). Two random
variables with a positive limiting probability to experience their extremes simultaneously are
denoted asymptotically dependent. Otherwise, they are denoted asymptotically independent.
As demonstrated by Sibuya et al. (1960), two asymptotically independent random variables
may still be highly correlated and thus exhibit large amounts of so-called sub-asymptotic
dependence. Thus, correct estimation of both asymptotic and sub-asymptotic dependence
properties is of utmost importance when assessing the risks of spatial extremes.
Most classical models for spatial extremes are based on max-stable processes (Davison
et al., 2012; Davison et al., 2019). These allow for rich modelling of asymptotic dependence,
but are often too rigid in their descriptions of asymptotic independence and sub-asymptotic
dependence. Other approaches have been proposed, such as scale-mixture models (Engelke
et al., 2019; Huser & Wadsworth, 2019), which allow for rich modelling of both asymptotic
dependence and independence, and a more flexible description of sub-asymptotic dependence.
However, these models require that all location pairs share the same asymptotic dependence
class, which is problematic as one would expect neighbouring locations to be asymptotically
dependent and far-away locations to be asymptotically independent. Max-mixture models
(Wadsworth & Tawn, 2012) allow for even more flexible modelling of sub-asymptotic dependence,
and for changing the asymptotic dependence class as a function of distance. However,
it is often difficult to estimate the key model parameter, which describes the transition
between extremal dependence classes. Additionally, these models must often rely on less
efficient inference methods. Further improvements are given by the kernel convolution model
of Krupskii and Huser (2022), more recent scale-mixture models such as that of Hazra et al.
(2021), and the spatial conditional extremes model of Wadsworth and Tawn (2022), which
all allow for flexible modelling of different extremal dependence classes as a function of
distance. The spatial conditional extremes model allows for a particularly simple way of
modelling spatial extremes. It is based on the conditional extremes model of Heffernan and
Resnick (2007) and Heffernan and Tawn (2004), which describes the behaviour of a random
vector conditional on one of its components being extreme, and it can be interpreted as a
semi-parametric regression model, which makes it intuitive and simple to tailor or extend. Due
to its high flexibility and conceptual simplicity, this is our chosen model for high-dimensional
spatial extremes.
To make the spatial conditional extremes model computationally efficient in higher
dimensions, Wadsworth and Tawn (2022) propose to model spatial dependence using a
residual random process constructed from a Gaussian copula and delta-Laplace marginal
distributions. However, inference for Gaussian processes typically requires computing the
inverse of the covariance matrix, whose cost scales cubically with the model dimension. Thus,
E. S. Simpson et al. (2020) propose to exchange the delta-Laplace process with a Gaussian
Markov random field (Rue & Held, 2005) created using the so-called stochastic partial
differential equations (SPDE) approach of Lindgren et al. (2011). Furthermore, in order
to perform spatial high-dimensional Bayesian inference, E. S. Simpson et al. (2020) modify
the spatial conditional extremes model into a latent Gaussian model, which allows for
performing inference using the integrated nested Laplace approximation (INLA; Rue et al.,
2009), implemented in the R-INLA software (Rue et al., 2017). This allows for a considerable
improvement in the Bayesian modelling of high-dimensional spatial extremes. However,
there is still much room for improvement. In this paper, we thus build upon the modelling
framework of E. S. Simpson et al. (2020) and develop a more general methodology for
modelling spatial conditional extremes with R-INLA. We also point out a theoretical weakness
in the constraining methods proposed by Wadsworth and Tawn (2022) and used by E. S.
Simpson et al. (2020), and we demonstrate a computationally efficient way of fixing it.
As most statistical models for extremes are based on asymptotic arguments and assumptions,
a certain degree of misspecification will always be present when modelling finite
amounts of data. Additionally, model choices made for reasons of computational efficiency,
such as adding Markov assumptions to a spatial random field, may lead to further misspecification.
This complicates Bayesian inference and can result in misleading posterior
distributions (Kleijn & van der Vaart, 2012; Ribatet et al., 2012). One should therefore strive
to make inference more robust towards misspecification when modelling high-dimensional
spatial extremes. Shaby (2014) proposes a method for more robust inference through a post
hoc transformation of posterior samples created using Markov chain Monte Carlo (MCMC)
methods. Here, we develop a refined version of his adjustment method, and we use it for
performing more robust inference with R-INLA.
As extreme behaviour is, by definition, rare, inference with the conditional extremes model
often relies on a composite likelihood that combines data from different conditioning sites
under the working assumption of independence (Heffernan & Tawn, 2004; Richards et al.,
2022; E. S. Simpson & Wadsworth, 2021; Wadsworth & Tawn, 2022). However, composite
likelihoods can lead to large amounts of misspecification (Ribatet et al., 2012), and E. S.
Simpson et al. (2020) thus abstain from using a composite likelihood to avoid the problems
that occur when performing Bayesian inference with a composite likelihood using R-INLA.
We show that the post hoc adjustment method accounts for the misspecification from the
composite likelihood, thus allowing for more efficient inference using considerably more data.
To sum up, in this paper we develop a general workflow for performing high-dimensional
modelling of spatial extremes using the spatial conditional extremes model. We improve
upon the work of E. S. Simpson et al. (2020) by developing a more general, flexible and
computationally efficient methodology for modelling spatial conditional extremes with R-INLA
and the SPDE approach. Then, we make inference more robust towards misspecification
by extending the post hoc adjustment method of Shaby (2014), and we further apply this
adjustment method for more efficient inference by combining information from multiple
conditioning sites.
The remainder of the paper is organised as follows: In Section 2, the spatial conditional
extremes model is presented as a flexible choice for modelling spatial extremes. Modifications
and assumptions that allow for computationally efficient inference with improved data
utilisation are also presented. Then, in Section 3, we develop a general methodology for
implementing a large variety of spatial conditional extremes models in R-INLA. Section 4
examines the problems that can occur when performing Bayesian inference based on a
misspecified likelihood, and demonstrates how to perform more robust inference with R-INLA
by accounting for possible misspecification. In Section 5, a simulation study is presented
where we demonstrate and validate our entire workflow for high-dimensional modelling of
spatial extremes. Then, in Section 6, our proposed workflow is applied for modelling extreme
hourly precipitation from high-resolution radar data in Norway. Finally, we conclude in
Section 7 with some discussion and perspectives on future research.
2 Flexible modelling with spatial conditional extremes
2.1 The spatial conditional extremes model
Let $Y(s)$ be a random process defined over space ($s \in \mathcal{S} \subset \mathbb{R}^2$) with Laplace margins. For
this random process, Wadsworth and Tawn (2022) assume the existence of standardising
functions $a(s; s_0, y_0)$ and $b(s; s_0, y_0)$ such that, for a large enough threshold $t$,
$$[Y(s) \mid Y(s_0) = y_0 > t] \overset{d}{=} a(s; s_0, y_0) + b(s; s_0, y_0) Z(s; s_0), \quad s, s_0 \in \mathcal{S}, \tag{1}$$
where $Z(s; s_0)$ is a random process satisfying $Z(s_0; s_0) = 0$ almost surely, and $a(s; s_0, y_0) \leq y_0$,
with equality when $s = s_0$. The degree of asymptotic dependence may be measured through
the extremal correlation coefficient
$$\chi(s_1, s_2) = \lim_{p \to 1} \chi_p(s_1, s_2) = \lim_{p \to 1} P\left(Y(s_1) > F_Y^{-1}(p) \mid Y(s_2) > F_Y^{-1}(p)\right),$$
where $F_Y^{-1}(p)$ is the marginal quantile function of the process $Y(s)$. If $\chi(s_1, s_2) > 0$, then $Y(s_1)$
and $Y(s_2)$ are asymptotically dependent, whereas if $\chi(s_1, s_2) = 0$, they are asymptotically
independent. It is known that $Y(s)$ and $Y(s_0)$, defined in (1), are asymptotically dependent
when $a(s; s_0, y_0) = y_0$ and $b(s; s_0, y_0) = 1$, while they are asymptotically independent when
$a(s; s_0, y_0) < y_0$ (Heffernan & Tawn, 2004). However, under asymptotic independence, the
convergence of $\chi_p(\cdot)$ to $\chi(\cdot)$ is slower for larger values of $a(\cdot)$ and $b(\cdot)$.
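As an illustration of the sub-asymptotic quantity $\chi_p$, the following minimal sketch estimates it empirically for a pair of variables. It is not part of the authors' workflow: the function name `empirical_chi`, the sample size, and the correlation value are assumptions chosen for the example. A bivariate Gaussian pair is used because it is a standard example of asymptotic independence with strong sub-asymptotic dependence. Since $\chi_p$ is defined through marginal quantiles, the estimate does not depend on whether the margins are Gaussian or Laplace.

```python
import numpy as np

def empirical_chi(y1, y2, p):
    """Empirical estimate of chi_p = P(Y1 > F^{-1}(p) | Y2 > F^{-1}(p)).

    Thresholds are the empirical p-quantiles of each sample, so the
    estimate is invariant to monotone transformations of the margins.
    """
    q1 = np.quantile(y1, p)
    q2 = np.quantile(y2, p)
    exceed2 = y2 > q2
    if not exceed2.any():
        return np.nan  # no exceedances to condition on
    return float(np.mean(y1[exceed2] > q1))

# Assumed example data: a correlated Gaussian pair (asymptotically independent)
rng = np.random.default_rng(1)
rho = 0.8
z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=200_000)

for p in (0.9, 0.99, 0.999):
    # The estimates should decrease as p approaches 1
    print(p, empirical_chi(z[:, 0], z[:, 1], p))
```

For this Gaussian pair, the printed estimates should decrease as $p$ increases, which is the weakening extremal dependence that a flexible model is expected to capture.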
Wadsworth and Tawn (2022) provide some guidance on parametric functions for $a(\cdot)$
and $b(\cdot)$ together with parametric distributions for $Z(\cdot)$ that cover a large range of already
existing models. For modelling $a(\cdot)$ they specifically propose the parametric function
$$a(s; s_0, y_0) = y_0 \alpha(\|s - s_0\|) = y_0 \exp\left\{-\left[\max(0, \|s - s_0\| - \Delta) / \lambda_a\right]^{\kappa_a}\right\}, \tag{2}$$
with parameters $\Delta \geq 0$ and $\lambda_a, \kappa_a > 0$. This function yields a model with asymptotic
dependence for locations closer to the conditioning site than a distance $\Delta$, while displaying
asymptotic independence for distances larger than $\Delta$, with a weakening sub-asymptotic
dependence as we move further away from $s_0$. To the best of our knowledge, this model (and
its sub-models) has been adopted by a majority of spatial conditional extremes modellers.
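The parametric form (2) can be written as a short function. The sketch below is only an illustration with assumed names (`alpha`, `a_func`) and assumed parameter values. It takes the distance $\|s - s_0\|$ as input rather than the two locations.

```python
import numpy as np

def alpha(d, lam, kappa, delta=0.0):
    """alpha(d) = exp{-[max(0, d - delta) / lam]^kappa}, as in equation (2)."""
    d = np.asarray(d, dtype=float)
    return np.exp(-(np.maximum(0.0, d - delta) / lam) ** kappa)

def a_func(d, y0, lam, kappa, delta=0.0):
    """Standardising function a(s; s0, y0) = y0 * alpha(||s - s0||)."""
    return y0 * alpha(d, lam, kappa, delta)

# Assumed example values: threshold exceedance y0 = 5, delta = 2, lam = 10, kappa = 1
dist = np.array([0.0, 1.0, 2.0, 5.0, 20.0, 100.0])
print(a_func(dist, y0=5.0, lam=10.0, kappa=1.0, delta=2.0))
```

With these values, `a_func` equals `y0` for all distances up to `delta` (the asymptotically dependent range) and decays towards zero beyond it.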
Several forms are proposed for $b(\cdot)$, including the form $b(s; s_0, y_0) = y_0^\beta$, when $\Delta = 0$. This
allows for modelling asymptotic independence with positive dependence, with the $\beta$ parameter
helping to control the speed of convergence of $\chi_p(s_1, s_2)$ to $\chi(s_1, s_2)$. A weakness of this form
is that it enforces the same positive dependence for all distances, including large distances
where the observations should be independent of $Y(s_0)$. To remedy this issue, Wadsworth
and Tawn (2022) also propose the model $b(s; s_0, y_0) = 1 + a(s; s_0, y_0)^\beta$, which converges to
one as the distance increases. Alternatively, Shooter, Tawn, et al. (2021) and Richards et al.
(2022) have proposed different models of the form $b(s; s_0, y_0) = y_0^{\beta(\|s - s_0\|)}$, where they let the
function $\beta(d)$ converge to zero as the distance $d \to \infty$.
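The three forms of $b(\cdot)$ described above can be compared side by side in a small sketch. All function names are assumptions made for this example. In particular, the exponential decay used for $\beta(d)$ is a hypothetical choice that merely satisfies $\beta(d) \to 0$ as $d \to \infty$; the cited papers propose their own forms, which are not reproduced here. The sketch assumes $y_0 > 0$ and $a(\cdot) \geq 0$.

```python
import numpy as np

def b_power(y0, beta):
    """b = y0^beta: the same scaling at every distance."""
    return y0 ** beta

def b_one_plus_a(a, beta):
    """b = 1 + a^beta: tends to one as a(s; s0, y0) tends to zero."""
    return 1.0 + np.asarray(a, dtype=float) ** beta

def beta_decay(d, beta0, lam_b, kappa_b):
    """Hypothetical beta(d) that decays to zero with distance (an assumption)."""
    d = np.asarray(d, dtype=float)
    return beta0 * np.exp(-(d / lam_b) ** kappa_b)

def b_distance_power(d, y0, beta0, lam_b, kappa_b):
    """b = y0^beta(d): tends to one as beta(d) tends to zero."""
    return y0 ** beta_decay(d, beta0, lam_b, kappa_b)

# Assumed example values
d = np.array([0.0, 5.0, 50.0, 500.0])
y0 = 4.0
a_vals = y0 * np.exp(-d / 10.0)  # equation (2) with delta = 0, lam_a = 10, kappa_a = 1

print(b_power(y0, 0.5))
print(b_one_plus_a(a_vals, 0.5))
print(b_distance_power(d, y0, 0.5, 10.0, 1.0))
```

The first form returns a constant, while the other two approach one at large distances, so that observations far from $s_0$ are no longer scaled by the conditioning value.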
Clearly, the best model for the standardising functions $a(\cdot)$ and $b(\cdot)$ depends on the
application. Therefore, in Section 3, we develop a general methodology for implementing
the conditional spatial extremes model in R-INLA for any kind of functions $a(\cdot)$ and $b(\cdot)$. In
addition, we provide practical guidance and diagnostics for selecting appropriate standardising
functions in our simulation study in Section 5 and data application in Section 6.
2.2 Modifications for high-dimensional modelling
To perform high-dimensional inference, Wadsworth and Tawn (2022) propose to model $Z(\cdot)$
as a random process with a Gaussian copula and delta-Laplace marginal distributions. Their
proposed model for $Z(\cdot)$ has later been used by, e.g., Shooter, Ross, Ribal, et al. (2021),
Shooter et al. (2022), Shooter, Tawn, et al. (2021), and Richards et al. (2022). However,
in order to perform Bayesian inference with R-INLA, E. S. Simpson et al. (2020) modify (1)
into a latent Gaussian model by adding a Gaussian nugget effect and requiring $Z(\cdot)$ to be a
fully Gaussian random field. This gives the model
$$[Y(s) \mid Y(s_0) = y_0 > t] \overset{d}{=} a(s; s_0, y_0) + b(s; s_0, y_0) Z(s; s_0) + \epsilon(s; s_0), \tag{3}$$
where $\epsilon(s; s_0)$ is Gaussian white noise with constant variance, satisfying $\epsilon(s_0; s_0) = 0$ almost
surely. They further assume that $Z(\cdot)$ has zero mean and a Matérn covariance structure,