figure out how spatial heterogeneity impacts the models, and there is no "one-size-fits-all" rule for doing so. It is therefore imperative, yet challenging, to develop techniques that can learn this automatically from the data.
2) Difficulty in obtaining predictive
models for unseen locations without training data.
Due to spatial heterogeneity, the local models at different locations can differ substantially in order to capture the relationships between predictors and the target variable. When no training data is available at certain locations, the method must have the capacity to generalize to these unseen locations, which is as difficult as zero-shot learning.
To address the above challenges, we propose a generic framework for deep spatial domain generalization, which generates predictive models for arbitrary unseen spatial domains. More specifically, to address the first challenge, we propose a novel spatial interpolation graph neural network (SIGNN) that learns the spatial embedding of each location and the relationships between locations in the training set, and infers the spatial embeddings of unseen locations during the test phase. The spatial embedding of a target location is then used to directly decode a parameterized task model without any training data from that location, which addresses the second challenge.
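A minimal sketch of this decoding step is given below. It is our own illustration, not the paper's implementation: the `WeightDecoder` module, the embedding dimension, and all shapes are assumptions, and the spatial embedding is simply assumed to be produced by SIGNN.

```python
# Illustrative sketch (not the authors' code): a spatial embedding for a target
# location -- assumed to come from SIGNN -- is decoded into the parameters of a
# simple linear task model, so predictions can be made at that location without
# any local training data.
import torch
import torch.nn as nn

class WeightDecoder(nn.Module):
    """Maps a location's spatial embedding to task-model parameters."""
    def __init__(self, embed_dim: int, in_features: int):
        super().__init__()
        self.to_weight = nn.Linear(embed_dim, in_features)  # decode weight vector
        self.to_bias = nn.Linear(embed_dim, 1)               # decode bias term

    def forward(self, z: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        w = self.to_weight(z)             # (in_features,)
        b = self.to_bias(z)               # (1,)
        return x @ w.unsqueeze(-1) + b    # predictions for samples at this location

decoder = WeightDecoder(embed_dim=16, in_features=8)
z_unseen = torch.randn(16)           # embedding interpolated for an unseen location
x_query = torch.randn(5, 8)          # 5 query samples with 8 predictors each
y_hat = decoder(z_unseen, x_query)   # shape (5, 1)
```
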
Our contributions include:
• We propose a framework for spatial domain generalization. The framework does not make assumptions about the data distribution and learns the spatial embeddings of all locations in the training set in an end-to-end manner. It is also compatible with general predictive task models such as regression models and multi-layer perceptrons (MLPs).
• We develop the spatial interpolation graph neural network (SIGNN). It handles spatial data as a graph and uses edge representations in graph convolution operations to learn the spatial embedding of each node and the relationships between nodes. It also interpolates the spatial embedding at any location, so our method can generalize to unseen locations (a minimal interpolation sketch follows this list).
• We conduct extensive experiments. We validate the efficacy of our method on ten real-world datasets for classification and regression tasks; our method outperforms state-of-the-art models on most of the tasks.
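The sketch below illustrates the interpolation idea referenced in the second contribution. Inverse-distance weighting is used purely for illustration; it is an assumption, not necessarily the exact SIGNN operator.

```python
# Illustrative sketch: obtain a spatial embedding at an unseen query location
# from embeddings learned at training locations via inverse-distance weighting.
import torch

def interpolate_embedding(query_coord: torch.Tensor,
                          train_coords: torch.Tensor,
                          train_embeddings: torch.Tensor,
                          eps: float = 1e-6) -> torch.Tensor:
    """query_coord: (2,); train_coords: (N, 2); train_embeddings: (N, D)."""
    dists = torch.linalg.norm(train_coords - query_coord, dim=-1)  # (N,)
    weights = 1.0 / (dists + eps)                                  # closer -> larger weight
    weights = weights / weights.sum()
    return weights @ train_embeddings                              # (D,)

# Four seen locations with learned 16-d embeddings; one unseen query location.
coords = torch.tensor([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
embeddings = torch.randn(4, 16)
z_unseen = interpolate_embedding(torch.tensor([0.5, 0.5]), coords, embeddings)
```
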
II. RELATED WORK
In this section, we summarize work in the fields of domain adaptation and domain generalization. Machine learning systems often assume that training and test data follow the same distribution, an assumption that usually cannot be satisfied in practice. Domain Adaptation (DA), which has received great attention from researchers in the past decade, aims to bridge source and target domains by characterizing the transformation between the data from these domains [2], [5], [6]. Under the
big umbrella of DA, continuous domain adaptation considers
the problem of adapting to target domains where the domain
index is a continuous variable (temporal DA is a special case
when the domain index is 1D). Approaches to tackling such
problems can be broadly classified into three categories: (1)
biasing the training loss towards future data via transportation
of past data [7], (2) using time-sensitive network parameters
and explicitly controlling their evolution along time [8], (3)
learning representations that are time-invariant using adversarial
methods [9]. The first category augments the training data,
the second category reparameterizes the model, and the third
category redesigns the training objective. However, data from the target domain may not be available, or adapting the base model may not be feasible, which calls for Domain Generalization (DG).
A variety of DG methods have been proposed in recent years. According to [10], existing DG methods can be categorized into the following three groups: (1) Data
manipulation: This category of methods focuses on manipulating the inputs to assist in learning general representations. There
are two kinds of popular techniques along this line: a). Data
augmentation [11], which is mainly based on augmentation,
randomization, and transformation of input data; b). Data
generation [12], which generates diverse samples to help
generalization. (2) Representation learning: This category of
methods is the most popular in domain generalization. There are
two representative techniques: a). Domain-invariant representation learning [5], which applies kernel methods, adversarial training, explicit feature alignment between domains, or invariant risk minimization to learn domain-invariant representations;
b). Feature disentanglement [13], which tries to disentangle
the features into domain-shared or domain-specific parts for
better generalization. (3) Learning strategy: This category of methods focuses on exploiting general learning strategies to promote the generalization capability.
III. METHODOLOGY
In this section, we first provide the problem formulation and its challenges, and then introduce our proposed framework and how it addresses those challenges.
A. Problem formulation
In this paper, we denote a geo-location by its 2D coordinate values $s \in \mathbb{R}^2$, and each $s$ is associated with a spatial domain $(\mathcal{X}_s \times \mathcal{Y}_s)$, from which we have a set of samples $(\mathbf{x}_s, \mathbf{y}_s) = \{(x_i, y_i) \in (\mathcal{X}_s \times \mathcal{Y}_s)\}_{i=1}^{N_s}$, where $x_i \in \mathcal{X}_s$ is the $i$-th input sample from the domain $\mathcal{X}_s$, while $y_i \in \mathcal{Y}_s$ is the $i$-th output sample from the domain $\mathcal{Y}_s$. For the classification problem, $y_i$ can be further narrowed to a binary value.
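For concreteness, the snippet below shows one way the data in this formulation could be organized, with each seen location mapping to its local samples $(\mathbf{x}_s, \mathbf{y}_s)$. It is purely illustrative: the coordinates, sample counts, and feature dimension are made up.

```python
# Purely illustrative organization of the data: each seen location s (a 2D
# coordinate) maps to its local samples (x_s, y_s).
import numpy as np

rng = np.random.default_rng(0)

spatial_domains = {
    (40.71, -74.01): {"x": rng.normal(size=(100, 8)),      # N_s = 100, 8 predictors
                      "y": rng.integers(0, 2, size=100)},  # binary labels (classification)
    (34.05, -118.24): {"x": rng.normal(size=(80, 8)),
                       "y": rng.integers(0, 2, size=80)},
}
```
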
In opposition to the assumption that the relationship $f$ between independent variables $x_i \in \mathcal{X}_s$ and dependent variables $y_i \in \mathcal{Y}_s$ remains unchanged across the space $\mathbb{R}^2$, spatial heterogeneity describes a condition in which the relationships between some sets of variables $\{x_i, y_i\}$ are heterogeneous throughout space, i.e., $f_s \neq f_{s'}$ if $s \neq s'$. A static global model cannot capture the changes in these relationships; thus, Domain Generalization (DG) models that can reflect the heterogeneous relationships within the data play a vital role in spatial analysis.
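To make this concrete, the toy example below (our own illustration with synthetic data, unrelated to the paper's datasets) fits local and global linear models when two hypothetical locations have opposite predictor-target relationships; the global fit averages the two regimes away.

```python
# Synthetic illustration of spatial heterogeneity: the same predictor relates
# to the target with opposite signs at two hypothetical locations s and s',
# i.e., f_s != f_{s'}.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
y_s = 2.0 * x[:, 0] + 0.1 * rng.normal(size=200)     # local relationship at s
y_sp = -2.0 * x[:, 0] + 0.1 * rng.normal(size=200)   # local relationship at s'

# Local least-squares fits recover the two slopes (about +2 and -2) ...
slope_s = np.linalg.lstsq(x, y_s, rcond=None)[0][0]
slope_sp = np.linalg.lstsq(x, y_sp, rcond=None)[0][0]

# ... while a single static global model fitted to the pooled data averages
# the heterogeneous relationships away (slope near 0).
slope_global = np.linalg.lstsq(np.vstack([x, x]),
                               np.concatenate([y_s, y_sp]), rcond=None)[0][0]
print(slope_s, slope_sp, slope_global)
```
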
Our goal in this paper is to build a model that proactively
captures the data concept drift across different geo-locations.
Given a set of data samples $\{(\mathbf{x}_s, \mathbf{y}_s)\}_{s \in S_0}$ from seen domains, where $S_0$ denotes the set of seen locations, we aim to learn the