Vecchia Approximations and Optimization for
Multivariate Matérn Models
Youssef Fahmy, Joseph Guinness
Cornell University, Department of Statistics and Data Science
Abstract
We describe our implementation of the multivariate Matérn model for multivariate spatial datasets,
using Vecchia’s approximation and a Fisher scoring optimization algorithm. We consider various
parameterizations for the multivariate Matérn that have been proposed in the literature for
ensuring model validity, as well as an unconstrained model. A strength of our study is that the code
is tested on many real-world multivariate spatial datasets. We use it to study the effect of ordering
and conditioning in Vecchia’s approximation and the restrictions imposed by the various
parameterizations. We also consider a model in which co-located nuggets are correlated across components
and find that forcing this cross-component nugget correlation to be zero can have a serious impact
on the other model parameters, so we suggest allowing cross-component correlation in co-located
nugget terms.
1 Introduction
We attempt to understand the complexities of the Earth system by measuring and modeling many
variables that interact on a continuum of spatial-temporal scales. For example, modern climate
models include dozens of variables evolving in concert over space and time. In this paper, we
analyze elemental data from a soil monitoring network in France (Saby et al., 2009), elemental data
from a well-water monitoring program in Bangladesh (Kinniburgh and Smedley, 2001), multispectral
data from the GOES-16 satellite (maintained by NASA and NOAA), and the difference between
forecasted and actual pressures and temperatures in the Pacific Northwest (Eckel and Mass, 2005).
Gaussian processes have become a workhorse for statistical analysis of spatial and spatial-temporal
data. A multivariate Gaussian process with $p$ components is a random vector function
$Z(x) = (Z_1(x), \ldots, Z_p(x))^\top \in \mathbb{R}^p$ indexed by a set $D \subset \mathbb{R}^d$, $d \geq 1$, such that for any
$x_1, \ldots, x_n \in D$, the random vector $(Z(x_1)^\top, \ldots, Z(x_n)^\top)^\top \in \mathbb{R}^{np}$ has a multivariate normal
distribution. The process is completely specified by its mean function $E[Z(x)]$ and covariance function
$\{\mathrm{Cov}(Z_i(x), Z_j(x'))\}_{i,j=1}^{p}$. When the covariances depend only on the separation lag $h$ (that is, the
process is covariance-stationary), we write $C_{ij}(h)$ for $\mathrm{Cov}(Z_i(x+h), Z_j(x))$, which is referred to as a
cross covariance function. Genton and Kleiber (2015) provide a thorough review of cross covariance
functions.
Following the popularity and success of the Matérn model for univariate spatial data, a
multivariate version has been proposed and studied (Gneiting et al., 2010; Apanasovich et al., 2012;
Emery et al., 2022). The general form of the multivariate Matérn model is given by
$$C_{ij}(h) = \sigma_{ij} M(h \mid \nu_{ij}, \alpha_{ij}), \quad h \in \mathbb{R}^d, \quad i, j = 1, \ldots, p,$$
where
$$M(h \mid \nu, \alpha) = \frac{1}{2^{\nu-1}\Gamma(\nu)} \left(\frac{\|h\|}{\alpha}\right)^{\nu} K_\nu\!\left(\frac{\|h\|}{\alpha}\right),$$
with $K_\nu$ being the modified Bessel function of the second kind of order $\nu$. The parameter $\sigma_{ij}$ is the
covariance between co-located observations from components $i$ and $j$. We refer to the parameters
using the following terminology: $\sigma_{ij}$ is a cross covariance parameter, $\alpha_{ij}$ is a cross range parameter,
and $\nu_{ij}$ is a cross smoothness parameter. To be consistent with how the Matérn is parameterized in
our existing software, our $\alpha_{ij}$ parameters are ranges, whereas they are inverse ranges in Gneiting et al.
(2010), Apanasovich et al. (2012), and Emery et al. (2022). When $i = j$, we refer to them as marginal
parameters. Note that this model implies the symmetry $C_{ij}(h) = C_{ji}(h)$, which need not hold in
general. Li and Zhang (2011) and Qadir et al. (2021) proposed methods for modeling asymmetries.
arXiv:2210.09376v2 [stat.ME] 19 Oct 2022
The univariate Matérn model $C_{ii}(h) = \sigma_{ii} M(h \mid \nu_{ii}, \alpha_{ii})$ provides a valid (i.e., nonnegative definite)
second-order structure for the marginal process $Z_i(x)$ as long as the marginal parameters $\sigma_{ii}$, $\alpha_{ii}$, $\nu_{ii}$
are positive. Additional conditions are needed on the cross parameters $\sigma_{ij}$, $\alpha_{ij}$, and $\nu_{ij}$ to ensure
that the multivariate Matérn model is valid for the multivariate process $Z(x)$. We discuss these in
the next section.
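To make the definition concrete, the following is a minimal sketch of the Matérn function $M(h \mid \nu, \alpha)$ and the cross covariance $C_{ij}(h)$ in plain Python. It is not the GpGp implementation. The function names (`bessel_k`, `matern`, `cross_cov`) are hypothetical, and the Bessel function is approximated with a simple trapezoid rule on the integral representation $K_\nu(z) = \int_0^\infty \exp(-z \cosh t) \cosh(\nu t)\, dt$, which is an assumption made here only to keep the example free of external libraries. The sketch is intended for moderate values of $\|h\|/\alpha$ and $\nu$.

```python
import math


def bessel_k(nu, z, step=0.05):
    """Approximate K_nu(z) for z > 0 with a trapezoid rule on
    K_nu(z) = int_0^inf exp(-z cosh t) cosh(nu t) dt."""
    # Beyond t_max the factor exp(-z cosh t) is below exp(-750) and underflows to 0.
    t_max = math.acosh(1.0 + 750.0 / z)
    n_steps = int(t_max / step) + 1
    total = 0.5 * math.exp(-z)  # half weight for the endpoint t = 0
    for k in range(1, n_steps + 1):
        t = k * step
        total += math.exp(-z * math.cosh(t)) * math.cosh(nu * t)
    return total * step


def matern(dist, nu, alpha):
    """M(h | nu, alpha) with dist = ||h|| and alpha a range (not an inverse range)."""
    if dist == 0.0:
        return 1.0  # limiting value as ||h|| -> 0
    z = dist / alpha
    return z ** nu * bessel_k(nu, z) / (2.0 ** (nu - 1.0) * math.gamma(nu))


def cross_cov(dist, sigma, nu, alpha, i, j):
    """C_ij(h) = sigma_ij * M(h | nu_ij, alpha_ij); sigma, nu, alpha are p x p nested lists."""
    return sigma[i][j] * matern(dist, nu[i][j], alpha[i][j])
```

For $\nu = 1/2$ the Matérn function reduces to the exponential covariance $\exp(-\|h\|/\alpha)$, and for $\nu = 3/2$ to $(1 + \|h\|/\alpha)\exp(-\|h\|/\alpha)$, which gives a convenient check of the normalization used above.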
Kleiber (2017) studied the properties of various multivariate spatial models, including separable
models, kernel convolution models, the linear model of coregionalization, and the multivariate
Matérn. He found that the multivariate Matérn is sufficiently flexible in that it allows the high
frequency coherence to exhibit a range of behaviors, depending on the parameter settings. Loosely
speaking, the coherence between two components is the correlation between the linear combination
of each component and a sinusoidal function; it measures correlation between components at a
particular frequency of variation. Qadir and Sun (2021) demonstrated that further improvements in
flexibility can be achieved with semiparametric models.
Over the past decades, spatial statisticians have produced a mountain of literature on the topic
of estimating covariance parameters, especially on the problem of computing estimates when the
dataset is very large. This work has led to various software packages, including INLA (Rue et al.,
2009; Lindgren et al., 2011), fields (Nychka et al., 2021), RandomFields (Schlather et al., 2022),
spBayes (Finley et al., 2015), spNNGP (Finley et al., 2022), LatticeKrig (Nychka et al., 2016), FRK
(Zammit-Mangion and Cressie, 2021), exageostat (Abdulah et al., 2018), GpGp (Guinness et al.,
2021), GPvecchia (Katzfuss et al., 2020), and GeoModels (Bevilacqua et al., 2018), to name a few.
Most of the research and software development is focused on the univariate case, with the exception
of RandomFields, exageostat, and spBayes, which are capable of fitting bivariate Matérn models,
and GeoModels, which has a number of bivariate spatial models.
Clearly, there is a need for reliable software capable of fitting multivariate spatial models with
two or more components to large datasets. In this work, we report on extending the GpGp R package
(Guinness et al., 2021) to handle the multivariate Matérn model, demonstrate its capabilities on
several datasets, and explore the implications of various proposed sufficient conditions on multivariate
Matérn parameters. GpGp implements Vecchia’s Gaussian process approximation (Vecchia, 1988),
along with improvements to the approximation (Guinness, 2018), and likelihood optimization
procedures that efficiently compute and make use of the gradient and Fisher information (Guinness, 2021).
The application of Vecchia’s approximation is agnostic to the covariance function, which has allowed
for the implementation of more than 25 covariance models in GpGp at the time of writing (package
version 0.4.0), including isotropic, geometrically anisotropic, nonstationary, and spatial-temporal
models on Euclidean spaces and spheres. GpGp has enjoyed success in spatial data competitions,
including being used by the winning team in the first KAUST spatial data analysis competition
(Huang et al., 2021), and by the winners of the multivariate spatial data analysis section of the
second competition (Abdulah et al., 2022).
Adding the multivariate Matérn model to GpGp presents difficulties that go far beyond the normal
challenges of implementing a typical univariate covariance function. As opposed to the univariate
Matérn, whose parameters must simply be positive in order for the model to be valid, the known sufficient
conditions for the multivariate model are more complicated, so some care must be taken when enforcing them. The parameter
space is also large; our formulation of the model, which allows for correlated nuggets, has $2p(p+1)$
parameters. Depending on the dataset, many of the parameters, or combinations thereof, are not well
identified. In short, it is a nasty optimization problem. In order to quickly maximize the likelihood,
one has to take large steps through a high dimensional space fraught with Errors, Infs, and NaNs.
R’s optim function does not cut it. In addition, as with all Vecchia approximations, decisions must
be made about how to order the observations and select neighbors, which is complicated by the fact
that the multivariate component of the data is usually categorical rather than numeric.
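The flavor of the optimization problem can be illustrated with a toy Fisher scoring loop. The sketch below is not the algorithm implemented in GpGp. It assumes a deliberately simple model, $n$ independent $N(0, e^{\theta})$ observations with a log link on the variance, and the function names (`loglik`, `fisher_scoring`) are hypothetical. It shows the two ingredients mentioned above: a step scaled by the Fisher information, and a safeguard that shrinks the step when the proposal produces a non-finite or worse likelihood.

```python
import math


def loglik(theta, sum_sq, n):
    """Log-likelihood (up to a constant) of n iid N(0, exp(theta)) observations."""
    return -0.5 * n * theta - 0.5 * sum_sq * math.exp(-theta)


def fisher_scoring(y, theta=0.0, max_iter=50, tol=1e-10):
    """Maximize the toy likelihood over theta = log(variance) by Fisher scoring."""
    n = len(y)
    sum_sq = sum(v * v for v in y)
    current = loglik(theta, sum_sq, n)
    for _ in range(max_iter):
        grad = -0.5 * n + 0.5 * sum_sq * math.exp(-theta)  # d loglik / d theta
        info = 0.5 * n  # expected (Fisher) information for theta
        step = grad / info
        # Halve the step until the proposal is finite and no worse than the current value.
        for _ in range(60):
            try:
                proposal = loglik(theta + step, sum_sq, n)
            except OverflowError:
                proposal = float("nan")
            if math.isfinite(proposal) and proposal >= current:
                break
            step *= 0.5
        else:
            return theta  # no acceptable step found; keep the current estimate
        theta, current = theta + step, proposal
        if abs(step) < tol:
            break
    return theta
```

In this toy model the maximizer is available in closed form, $\hat\theta = \log(\sum_k y_k^2 / n)$, so the output can be checked directly. In the multivariate Matérn setting no such closed form exists, and the gradient and Fisher information must be computed from the (approximate) likelihood.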
Our major contribution is the software we provide for fitting multivariate Matérn models using
Vecchia’s approximation and a Fisher scoring algorithm. The software also allows us to explore
the behavior of the multivariate Matérn model on various datasets. Our findings are generally
consistent with those of other authors; more flexible conditions on the parameters tend to give
better fits, but there are diminishing returns on added flexibility. Perhaps our most interesting
modeling finding is that allowing the nugget term to be correlated across components can have a
large impact on the estimates of the other covariance parameters.
Section 2 reviews the multivariate Matérn model and its various parameterizations. Section 3
outlines Vecchia’s approximation for multivariate spatial data. Section 4 includes some notes on
the optimization procedures. Section 5 describes the datasets. Section 6 presents the results. We
conclude with a discussion.
2 Multivariate Matérn Parameterizations
We model the responses as
$$Y_i(x) = \mu_i + Z_i(x) + \varepsilon_i(x),$$
where $\mu_i$ is a component-specific mean, $Z_i$ is the $i$th component of a multivariate Matérn process, and $\varepsilon_i(x)$ is a nugget
term with covariances
$$\mathrm{Cov}(\varepsilon_i(x+h), \varepsilon_j(x)) = \tau_{ij} \, 1(h = 0).$$
In other words, we assume a constant mean within each component, and we add a nugget term that is
allowed to be correlated across components. The $p \times p$ matrix formed by the $\tau_{ij}$ parameters
must be positive definite. We parameterize the cross nugget variances as $\tau_{ij} = (\tau_{ii}\tau_{jj})^{1/2} S_{ij}$, where
$S$ is a correlation matrix.
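As an illustration of the nugget parameterization, the sketch below builds the matrix $(\tau_{ij})$ from the marginal nugget variances and a correlation matrix $S$, and checks positive definiteness by attempting a Cholesky factorization. The function names (`nugget_matrix`, `is_positive_definite`, `nugget_cov`) are hypothetical and the code is a plain-Python sketch rather than the package's implementation.

```python
import math


def nugget_matrix(tau_diag, corr):
    """tau_ij = sqrt(tau_ii * tau_jj) * S_ij for marginal variances tau_diag and correlation S."""
    p = len(tau_diag)
    return [[math.sqrt(tau_diag[i] * tau_diag[j]) * corr[i][j] for j in range(p)]
            for i in range(p)]


def is_positive_definite(mat):
    """Attempt a Cholesky factorization; a non-positive pivot means mat is not positive definite."""
    p = len(mat)
    chol = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = mat[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return False
                chol[i][i] = math.sqrt(s)
            else:
                chol[i][j] = s / chol[j][j]
    return True


def nugget_cov(tau, i, j, dist):
    """Cov(eps_i(x + h), eps_j(x)) = tau_ij * 1(h = 0), with dist = ||h||."""
    return tau[i][j] if dist == 0.0 else 0.0
```

Setting the off-diagonal entries of $S$ to zero recovers the usual model with independent nuggets, so the correlated-nugget model contains it as a special case.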
Gneiting et al. (2010) provided necessary and sufficient validity conditions for the bivariate
Matérn model parameters. These conditions define the full bivariate model. For three or more
components, necessary and sufficient conditions on the parameters are not known, though various
authors have proposed sufficient conditions, some of which we explore here.
2.1 Parsimonious Model
Gneiting et al. (2010) proved that the multivariate Matérn model is valid for $p \geq 2$ if the following
conditions hold for every $i$ and $j$:
1. $\alpha_{ij} = \alpha$ (common marginal and cross ranges)
2. $\nu_{ij} = (\nu_{ii} + \nu_{jj})/2$
3. $\sigma_{ij} = (\sigma_{ii}\sigma_{jj})^{1/2} \, V_{ij} \, \dfrac{\Gamma(\nu_{ii} + d/2)^{1/2}}{\Gamma(\nu_{ii})^{1/2}} \, \dfrac{\Gamma(\nu_{jj} + d/2)^{1/2}}{\Gamma(\nu_{jj})^{1/2}} \, \dfrac{\Gamma\{(\nu_{ii} + \nu_{jj})/2\}}{\Gamma\{(\nu_{ii} + \nu_{jj})/2 + d/2\}}$, where $V$ is a correlation matrix.
These conditions define their parsimonious model. If we define $\rho_{ij} := \sigma_{ij}/(\sigma_{ii}\sigma_{jj})^{1/2}$, then condition
3 implies
$$|\rho_{ij}| \leq \frac{\Gamma(\nu_{ii} + d/2)^{1/2}}{\Gamma(\nu_{ii})^{1/2}} \, \frac{\Gamma(\nu_{jj} + d/2)^{1/2}}{\Gamma(\nu_{jj})^{1/2}} \, \frac{\Gamma\{(\nu_{ii} + \nu_{jj})/2\}}{\Gamma\{(\nu_{ii} + \nu_{jj})/2 + d/2\}},$$
which reduces to $|\rho_{ij}| \leq (\nu_{ii}\nu_{jj})^{1/2} / \{(\nu_{ii} + \nu_{jj})/2\}$ when $d = 2$. In the bivariate case, condition 3 provides a complete
characterization of $\rho_{ij}$ when 1 and 2 hold. In our software, all correlation matrices, such as $V$ here,
use a Cholesky-based parameterization (Pinheiro and Bates, 1996). All parameters that must be
positive use an exponential/log link.
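The bound on $|\rho_{ij}|$ implied by condition 3 is easy to evaluate numerically. The following sketch, with hypothetical function names `parsimonious_bound` and `parsimonious_sigma`, computes it on the log scale with `math.lgamma` for numerical stability and uses it to form $\sigma_{ij}$ from the marginal variances and an entry $V_{ij}$ of the correlation matrix.

```python
import math


def parsimonious_bound(nu_i, nu_j, d):
    """Upper bound on |rho_ij| in the parsimonious model for smoothnesses nu_i, nu_j in dimension d."""
    nu_bar = 0.5 * (nu_i + nu_j)
    log_bound = (
        0.5 * (math.lgamma(nu_i + d / 2) - math.lgamma(nu_i))
        + 0.5 * (math.lgamma(nu_j + d / 2) - math.lgamma(nu_j))
        + math.lgamma(nu_bar)
        - math.lgamma(nu_bar + d / 2)
    )
    return math.exp(log_bound)


def parsimonious_sigma(sigma_ii, sigma_jj, v_ij, nu_i, nu_j, d):
    """sigma_ij from condition 3, where v_ij is an entry of the correlation matrix V."""
    return math.sqrt(sigma_ii * sigma_jj) * v_ij * parsimonious_bound(nu_i, nu_j, d)
```

When $d = 2$ the function agrees with the closed form $(\nu_{ii}\nu_{jj})^{1/2} / \{(\nu_{ii} + \nu_{jj})/2\}$, and when $\nu_{ii} = \nu_{jj}$ the bound equals one, so the restriction on the cross correlation only bites when the marginal smoothnesses differ.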
2.2 Flexible-A Model
Apanasovich et al. (2012) provide a different set of sufficient conditions in the $p \geq 2$ case, which do
not require a common range parameter or the restriction that $\nu_{ij} = (\nu_{ii} + \nu_{jj})/2$:
1. $\nu_{ij} = \dfrac{\nu_{ii} + \nu_{jj}}{2} + \Delta_A (1 - A_{ij})$, where $\Delta_A \geq 0$ and $A$ is a positive correlation matrix,
2. $(\alpha_{ij}^{-2})_{i,j=1}^{p}$ is conditionally negative semidefinite,
3. $\sigma_{ij} = (\sigma_{ii}\sigma_{jj})^{1/2} \, V_{ij} \, (u_{ii} u_{jj})^{-1/2} \, u_{ij}$, where
$$u_{ij} = \alpha_{ij}^{2\Delta_A + \nu_{ii} + \nu_{jj}} \, \Gamma(\nu_{ij}) \, \Gamma\{(\nu_{ii} + \nu_{jj})/2 + d/2\} / \Gamma(\nu_{ij} + d/2),$$
and $V$ is a correlation matrix.
We will refer to the model defined by these conditions as the Flexible-A model. As suggested by
Apanasovich et al. (2012), we ensure condition 2 holds by parameterizing
$$\alpha_{ij}^{-2} = \frac{\alpha_{ii}^{-2} + \alpha_{jj}^{-2}}{2} + \Delta_B (1 - B_{ij}),$$
where $\Delta_B \geq 0$ and $B$ is a positive correlation matrix. In the bivariate case, $A$ and $B$ become
redundant.
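The construction of the cross smoothness and cross range parameters can be sketched as follows. The function names (`flexible_a_params`, `quad_form`) are hypothetical, and the sketch assumes, as in the rest of the paper, that the $\alpha_{ij}$ are ranges, so the parameterization is applied to $\alpha_{ij}^{-2}$. A matrix $M$ is conditionally negative semidefinite when $x^\top M x \leq 0$ for every vector $x$ whose entries sum to zero, which `quad_form` lets us spot-check on a few such vectors.

```python
def flexible_a_params(nu_diag, alpha_diag, delta_a, a_mat, delta_b, b_mat):
    """Return (nu, alpha) as p x p nested lists built from the marginal parameters.

    nu_ij         = (nu_ii + nu_jj) / 2 + delta_a * (1 - A_ij)
    alpha_ij^{-2} = (alpha_ii^{-2} + alpha_jj^{-2}) / 2 + delta_b * (1 - B_ij)
    """
    p = len(nu_diag)
    nu = [[0.5 * (nu_diag[i] + nu_diag[j]) + delta_a * (1.0 - a_mat[i][j])
           for j in range(p)] for i in range(p)]
    inv_sq = [[0.5 * (alpha_diag[i] ** -2 + alpha_diag[j] ** -2)
               + delta_b * (1.0 - b_mat[i][j])
               for j in range(p)] for i in range(p)]
    alpha = [[v ** -0.5 for v in row] for row in inv_sq]
    return nu, alpha


def quad_form(mat, x):
    """Quadratic form x' M x for a nested-list matrix M and a list x."""
    n = len(x)
    return sum(x[i] * mat[i][j] * x[j] for i in range(n) for j in range(n))
```

For any $x$ with entries summing to zero, the terms involving the marginal parameters cancel and the quadratic form of $(\alpha_{ij}^{-2})$ equals $-\Delta_B \, x^\top B x$, which is nonpositive because $B$ is a correlation matrix. The same argument applies to $(\nu_{ij})$ with $\Delta_A$ and $A$.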
The conditions of the Flexible-A model are not necessary, so in the bivariate case the full model
of Gneiting et al. (2010) is less restrictive. However, the conditions of the full model are more
complicated to enforce, so it is still worthwhile to evaluate the performance of the Flexible-A model
on bivariate datasets. Apanasovich et al. (2012) illustrate on the same bivariate weather dataset
considered by Gneiting et al. (2010) that both models obtain similar fits.
2.3 Flexible-E Model
Recently, Emery et al. (2022) (Theorem 3B) gave another set of sufficient conditions with the goal
of alleviating the restriction imposed on $|\sigma_{ij}/(\sigma_{ii}\sigma_{jj})^{1/2}|$ by the Flexible-A model:
1. $(\nu_{ij})_{i,j=1}^{p}$ is conditionally negative semidefinite,
2. For $\beta > 0$, $(\alpha_{ij}^{-2} - \beta\nu_{ij})_{i,j=1}^{p}$ is conditionally negative semidefinite,
3. $\sigma_{ij} = (\sigma_{ii}\sigma_{jj})^{1/2} \, V_{ij} \, (u_{ii} u_{jj})^{-1/2} \, u_{ij}$, where
$$u_{ij} = \alpha_{ij}^{2\nu_{ij}} \, \beta^{\nu_{ij}} \exp(\nu_{ij}) \, \Gamma(\nu_{ij}),$$
and $V$ is a correlation matrix.
Emery et al. (2022) discuss in some detail how the conditions of the two models are related. A
practical way to ensure that condition 1 of the Flexible-E model holds is to parameterize $\nu_{ij}$ exactly
as in condition 1 of the Flexible-A model. We use this parameterization in our software. Similarly,
we ensure 2 holds by defining
$$\alpha_{ij}^{-2} = \frac{\alpha_{ii}^{-2} + \alpha_{jj}^{-2}}{2} + \Delta_B (1 - B_{ij}) + \beta\left(\nu_{ij} - \frac{\nu_{ii} + \nu_{jj}}{2}\right),$$
where $\Delta_B \geq 0$, $B$ is a positive correlation matrix, and $\beta > 0$. We will refer to the model defined by
these conditions as the Flexible-E model.
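The Flexible-E construction can be sketched in the same style. The function names (`flexible_e_inv_sq_ranges`, `flexible_e_sigma`) are hypothetical, the $\alpha_{ij}$ are again assumed to be ranges, and the sketch works with $\alpha_{ij}^{-2}$ directly, using $\alpha_{ij}^{2\nu_{ij}} = (\alpha_{ij}^{-2})^{-\nu_{ij}}$ when forming $u_{ij}$. It is an illustration of the formulas in this section, not the package's code.

```python
import math


def flexible_e_inv_sq_ranges(alpha_diag, nu, delta_b, b_mat, beta):
    """alpha_ij^{-2} = (alpha_ii^{-2} + alpha_jj^{-2}) / 2 + delta_b * (1 - B_ij)
    + beta * (nu_ij - (nu_ii + nu_jj) / 2), with nu a full p x p nested list."""
    p = len(alpha_diag)
    return [[0.5 * (alpha_diag[i] ** -2 + alpha_diag[j] ** -2)
             + delta_b * (1.0 - b_mat[i][j])
             + beta * (nu[i][j] - 0.5 * (nu[i][i] + nu[j][j]))
             for j in range(p)] for i in range(p)]


def flexible_e_sigma(sigma_diag, v_mat, inv_sq, nu, beta):
    """sigma_ij = sqrt(sigma_ii sigma_jj) * V_ij * u_ij / sqrt(u_ii u_jj)."""
    p = len(sigma_diag)

    def u(i, j):
        # alpha_ij^{2 nu_ij} written in terms of alpha_ij^{-2}
        return (inv_sq[i][j] ** (-nu[i][j]) * beta ** nu[i][j]
                * math.exp(nu[i][j]) * math.gamma(nu[i][j]))

    return [[math.sqrt(sigma_diag[i] * sigma_diag[j]) * v_mat[i][j]
             * u(i, j) / math.sqrt(u(i, i) * u(j, j))
             for j in range(p)] for i in range(p)]
```

With this parameterization, $\alpha_{ij}^{-2} - \beta\nu_{ij}$ has the form $(c_i + c_j)/2 + \Delta_B(1 - B_{ij})$ with $c_i = \alpha_{ii}^{-2} - \beta\nu_{ii}$, which is conditionally negative semidefinite by the same argument as for the Flexible-A ranges. The factor $(u_{ii}u_{jj})^{-1/2}u_{ij}$ equals one when $i = j$, so the marginal variances are left unchanged.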