Deep Spatial Domain Generalization
Dazhou Yu, Guangji Bai, Yun Li, Liang Zhao
Department of Computer Science
Emory University
Atlanta, USA
{Dazhou.Yu, Guangji.Bai, Yun.Li, Liang.Zhao}@emory.edu
Abstract—Spatial autocorrelation and spatial heterogeneity widely exist in spatial data, which makes traditional machine learning models perform poorly. Spatial domain generalization is a spatial extension of domain generalization, which can generalize to unseen spatial domains in continuous 2D space. Specifically, it learns a model under varying data distributions that generalizes to unseen domains. Although tremendous success has been achieved in domain generalization, very few works exist on spatial domain generalization. The advancement of this area is challenged by: 1) the difficulty of characterizing spatial heterogeneity, and 2) the difficulty of obtaining predictive models for unseen locations without training data. To address these challenges, this paper proposes a generic framework for spatial domain generalization. Specifically, we develop a spatial interpolation graph neural network1 that handles spatial data as a graph and learns the spatial embedding of each node and the relationships between nodes. The spatial interpolation graph neural network infers the spatial embedding of an unseen location during the test phase. The spatial embedding of the target location is then used to decode the parameters of the downstream-task model directly for the target location. Finally, extensive experiments on ten real-world datasets demonstrate the strength of the proposed method.
Index Terms—unseen domain generalization, spatial, GNN,
edge embedding, interpolation
I. INTRODUCTION
Traditional machine learning models are typically built under the independent and identically distributed (i.i.d.) assumption, meaning the data samples are independent of each other and follow the same distribution. However, this assumption generally does not hold for spatial data, which exhibit spatial autocorrelation and heterogeneity. Spatial autocorrelation makes the spatial location of a sample and the corresponding spatial attributes informative, so the samples are not independent and identically distributed (non-i.i.d.). Spatial heterogeneity includes spatial non-stationarity and spatial anisotropy. Spatial non-stationarity means that the sample distribution varies across locations. Spatial anisotropy means that the spatial dependency between sample locations is non-uniform along different directions. For example, the air pollution concentration at a location is usually a complex function of various independent variables, but the relative importance of these variables changes with location: population density and the distance from emission sources play an essential role in PM2.5 pollution concentration in urban built-up areas, whereas in rural areas the relative humidity contributes greatly to the diffusion of PM2.5.
1https://github.com/dyu62/Deep-domain-generalization
This requires us to customize the model for different locations. However, in the training set we usually only have observations from a limited number of locations. Hence, it is common that we need to execute prediction tasks at locations unseen in the training set. This results in a very challenging task in which we need to obtain a predictive model for a new location without any training data. This paper focuses on this new problem, which we call spatial domain generalization, a spatial extension of domain generalization [1].
Domain generalization learns a model under varying data distributions that generalizes to unseen domains. It is derived from and goes beyond domain adaptation, which builds a bridge between source and target domains by characterizing the transformation between the data from these domains [2]. Current domain generalization only covers domains with categorical indices [1] or time-sequential domains [3]; it has not covered spatial domains, which require considering unique problems such as spatial autocorrelation and spatial heterogeneity. Another thread of research comes from the spatial data mining area, where techniques such as geographically weighted regression (GWR) [4] have been proposed to handle spatial heterogeneity. Most of the time, prescribed models are used in which the underlying spatial distribution and correlation must be presumed and predefined by the model designer, which may not reflect the true spatial process that is usually complex and unknown. In particular, these models only consider distances and ignore other spatial information such as direction. Moreover, these models share the feature extractor across all locations and only generate different coefficients in the last layer, so they cannot capture complex heterogeneity within the data.
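For reference, a standard form of the GWR estimator makes this distance-only dependence explicit: the coefficients at a query location $s$ are obtained by weighted least squares, where the weights decay with distance (the Gaussian kernel below is one common choice and the bandwidth $h$ is a hyperparameter).

```latex
% GWR estimator at query location s: weighted least squares where W(s)
% down-weights observations far from s. Only the distance d(s, s_i) enters,
% so direction is ignored and the same linear form is reused at every location.
\hat{\beta}(s) = \left( X^\top W(s)\, X \right)^{-1} X^\top W(s)\, \mathbf{y},
\qquad
W(s) = \operatorname{diag}\!\big(w_1(s), \ldots, w_N(s)\big),
\quad
w_i(s) = \exp\!\left(-\frac{d(s, s_i)^2}{h^2}\right)
```

This makes concrete the two limitations noted above: the weights use only distance, and the linear structure is shared across all locations.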
Spatial domain generalization is challenged by several critical bottlenecks, including: 1) Difficulty in characterizing spatial heterogeneity. The data distribution is not identical across the entire space; it changes with respect to the characteristics of, and the confounding factors at, each location. A single global model cannot explain the relationships between variables, so the nature of the model must alter over space to reflect the structure within the data. Modeling the spatially changing relationships requires making the model location-sensitive. Feeding the coordinate values as part of the input features is intuitive; however, such a method cannot capture the dependency of the other features on location, nor the confounding factors that vary among locations. It is necessary yet difficult to quantitatively
figure out how spatial heterogeneity impacts the models, as there is no "one-fits-all" rule for it. It is highly imperative yet challenging to develop techniques that can automatically learn this from the data.
2) Difficulty in obtaining predictive models for unseen locations without training data. Due to spatial heterogeneity, the local models at different locations can be very different in order to capture the relationships between the predictors and the target variable. When training data are not provided at some locations, the method must have the capacity to generalize to these unseen locations. This is as difficult as zero-shot learning.
In order to address the above challenges, we propose a generic framework for deep spatial domain generalization, which generates predictive models for any unseen spatial domain. More specifically, to address the first challenge, we propose a novel spatial interpolation graph neural network (SIGNN) to learn the spatial embedding of each location in the training set and the relationships between locations, and to infer the spatial embedding of unseen locations during the test phase. The spatial embedding of the target location is then used to decode the parameterized model directly, without training data at the target location, which solves the second challenge.
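To make the decoding step concrete, the following is a minimal PyTorch-style sketch of how a spatial embedding could parameterize a downstream task model; the module name `WeightDecoder`, the hidden sizes, and the single-hidden-layer task MLP are illustrative assumptions, not the exact architecture of SIGNN.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightDecoder(nn.Module):
    """Hypothetical decoder: maps a spatial embedding to the parameters of a
    small task MLP (one hidden layer). All dimensions are assumptions."""
    def __init__(self, emb_dim, in_dim, hid_dim, out_dim):
        super().__init__()
        self.in_dim, self.hid_dim, self.out_dim = in_dim, hid_dim, out_dim
        n_params = (in_dim * hid_dim + hid_dim) + (hid_dim * out_dim + out_dim)
        self.decode = nn.Sequential(
            nn.Linear(emb_dim, 128), nn.ReLU(), nn.Linear(128, n_params)
        )

    def forward(self, spatial_emb, x):
        # Decode a flat parameter vector for the target location ...
        p = self.decode(spatial_emb)
        i = 0
        w1 = p[i:i + self.in_dim * self.hid_dim].view(self.hid_dim, self.in_dim)
        i += self.in_dim * self.hid_dim
        b1 = p[i:i + self.hid_dim]; i += self.hid_dim
        w2 = p[i:i + self.hid_dim * self.out_dim].view(self.out_dim, self.hid_dim)
        i += self.hid_dim * self.out_dim
        b2 = p[i:i + self.out_dim]
        # ... then run the location-specific task model on the input features.
        h = F.relu(F.linear(x, w1, b1))
        return F.linear(h, w2, b2)
```

In this sketch, `spatial_emb` stands in for the embedding that SIGNN would produce (or interpolate) for the target location.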
Our contributions include:
• We propose a framework for spatial domain generalization. The framework does not assume the data distribution and learns the spatial embeddings of all locations in the training set in an end-to-end manner. It is also compatible with general predictive task models such as regression models and multi-layer perceptrons (MLPs).
• We develop the spatial interpolation graph neural network. It handles spatial data as a graph and uses edge representations to learn the spatial embedding of each node and the relationships between nodes via graph convolution operations. It also interpolates the spatial embedding at any location, so our method can generalize to unseen locations; see the sketch after this list.
• We conduct extensive experiments. We validate the efficacy of our method on ten real-world datasets for classification and regression tasks. Our method outperforms state-of-the-art models on most of the tasks.
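As a rough illustration of the interpolation idea in the second contribution (not the authors' exact operator), the sketch below aggregates an unseen location's embedding from its k nearest seen locations using edge features that carry both distance and direction; the k-nearest-neighbor neighborhood, the edge MLP, and the attention-style weights are all simplifying assumptions.

```python
import torch
import torch.nn as nn

class EdgeInterpolator(nn.Module):
    """Hypothetical sketch: interpolate a spatial embedding at an unseen
    location from its k nearest seen locations, using edge features
    (offset and distance) so that direction is not discarded."""
    def __init__(self, emb_dim, k=5):
        super().__init__()
        self.k = k
        # Edge features: (dx, dy, distance) concatenated with the neighbor embedding.
        self.edge_mlp = nn.Sequential(nn.Linear(3 + emb_dim, emb_dim), nn.ReLU(),
                                      nn.Linear(emb_dim, emb_dim))
        self.score = nn.Linear(emb_dim, 1)

    def forward(self, target_coord, seen_coords, seen_embs):
        # target_coord: (2,), seen_coords: (N, 2), seen_embs: (N, emb_dim); assumes N >= k.
        offsets = seen_coords - target_coord             # carries direction
        dists = offsets.norm(dim=-1, keepdim=True)       # carries distance
        knn = torch.topk(dists.squeeze(-1), self.k, largest=False).indices
        edge_feat = torch.cat([offsets[knn], dists[knn], seen_embs[knn]], dim=-1)
        msgs = self.edge_mlp(edge_feat)                  # per-neighbor messages
        attn = torch.softmax(self.score(msgs), dim=0)    # learned aggregation weights
        return (attn * msgs).sum(dim=0)                  # interpolated embedding
```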
II. RELATED WORK
In this section, we summarize works in the fields of domain adaptation and domain generalization. Machine learning systems often assume that training and test data follow the same distribution, which, however, usually cannot be satisfied in practice. Domain Adaptation (DA) aims to build a bridge between source and target domains by characterizing the transformation between the data from these domains, and it has received great attention from researchers in the past decade [2], [5], [6]. Under the
big umbrella of DA, continuous domain adaptation considers
the problem of adapting to target domains where the domain
index is a continuous variable (temporal DA is a special case
when the domain index is 1D). Approaches to tackling such
problems can be broadly classified into three categories: (1)
biasing the training loss towards future data via transportation
of past data [7], (2) using time-sensitive network parameters
and explicitly controlling their evolution along time [8], (3)
learning representations that are time-invariant using adversarial
methods [9]. The first category augments the training data,
the second category reparameterizes the model, and the third
category redesigns the training objective. However, data may
not be available for the target domain, or it may not be possible
to adapt the base model, thus requiring Domain Generalization.
A diversity of DG methods has been proposed in recent years. According to [10], existing DG methods can be categorized into the following three groups: (1) Data manipulation: this category of methods focuses on manipulating the inputs to assist in learning general representations. There are two kinds of popular techniques along this line: a) data augmentation [11], which is mainly based on augmentation, randomization, and transformation of the input data; b) data generation [12], which generates diverse samples to help generalization. (2) Representation learning: this category of methods is the most popular in domain generalization. There are two representative techniques: a) domain-invariant representation learning [5], which performs kernel methods, adversarial training, explicit feature alignment between domains, or invariant risk minimization to learn domain-invariant representations; b) feature disentanglement [13], which tries to disentangle the features into domain-shared and domain-specific parts for better generalization. (3) Learning strategy: this category of methods focuses on exploiting general learning strategies to promote the generalization capability.
III. METHODOLOGY
In this section, we first provide the problem formulation and its challenges, and then introduce our proposed framework and how it addresses those challenges.
A. Problem formulation
In this paper, we denote a geo-location by its 2D coordinate values $s \in \mathbb{R}^2$, and each $s$ is associated with a spatial domain $(\mathcal{X}_s \times \mathcal{Y}_s)$, where we could have a set of samples $(\mathbf{x}_s, \mathbf{y}_s) = \{(x_i, y_i) \in (\mathcal{X}_s \times \mathcal{Y}_s)\}_{i=1}^{N_s}$, where $x_i \in \mathcal{X}_s$ is the $i$-th input sample from the domain $\mathcal{X}_s$, while $y_i \in \mathcal{Y}_s$ is the $i$-th output sample from the domain $\mathcal{Y}_s$. For the classification problem, $y_i$ can be further narrowed to a binary value.
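To ground this notation, a minimal container for such data could look as follows; the array shapes, the random data, and the dictionary layout are illustrative assumptions, not an interface defined in the paper.

```python
import numpy as np

# Hypothetical container for the notation above: each seen location s (a 2D
# coordinate) owns its own sample set (x_s, y_s) drawn from (X_s x Y_s).
rng = np.random.default_rng(0)
seen_locations = rng.uniform(0.0, 1.0, size=(20, 2))   # s in R^2, |S_0| = 20
spatial_domains = {
    tuple(s): {
        "x": rng.normal(size=(50, 4)),                  # N_s = 50 inputs, 4 features each
        "y": rng.integers(0, 2, size=(50,)),            # binary labels (classification case)
    }
    for s in seen_locations
}
```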
In opposition to the assumption that the relationship $f$ between the independent variables $x_i \in \mathcal{X}_s$ and the dependent variables $y_i \in \mathcal{Y}_s$ remains unchanged over the space $\mathbb{R}^2$, spatial heterogeneity describes a condition in which the relationships between some sets of variables $\{x_i, y_i\}$ are heterogeneous throughout space, i.e., $f_s \neq f_{s'}$ if $s \neq s'$. A static global model cannot capture these changes in relationships; thus, Domain Generalization (DG) models that can reflect the heterogeneous relationships within the data play a vital role in spatial analysis.
Our goal in this paper is to build a model that proactively captures the data concept drift across different geo-locations. Given a set of data samples $\{(\mathbf{x}_s, \mathbf{y}_s)\}_{s \in S_0}$ from seen domains, where $S_0$ denotes the set of seen locations, we aim to learn a model that generalizes to unseen locations $s' \notin S_0$.
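Under this formulation, one way to write the resulting training objective (our reading of the setup, not a formula quoted from the paper) is to fit, at every seen location, a task model whose parameters are decoded from that location's spatial embedding:

```latex
% Hypothetical objective: e_psi(s) is the spatial embedding of location s
% (produced by SIGNN), g_phi decodes it into task-model parameters theta(s),
% and ell is the task loss (e.g., cross-entropy or squared error).
\min_{\psi,\, \phi} \;
\sum_{s \in S_0} \sum_{i=1}^{N_s}
\ell\big( f_{\theta(s)}(x_i),\; y_i \big),
\qquad
\theta(s) = g_\phi\big( e_\psi(s) \big)
```

At test time, the embedding $e_\psi(s')$ of an unseen location $s' \notin S_0$ would be interpolated from neighboring seen locations and decoded into $\theta(s')$ without any further training.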