Bayesian Convolutional Deep Sets with Task-Dependent
Stationary Prior
Yohan Jung, Jinkyoo Park
KAIST
Abstract
Convolutional deep sets is a deep neural network (DNN) architecture that can model stationary stochastic processes. This architecture uses a kernel smoother and a DNN to construct translation-equivariant functional representations, and thus reflects the inductive bias of stationarity in the DNN. However, since this architecture employs the kernel smoother, a non-parametric model, it may produce ambiguous representations when the number of data points is insufficient. To remedy this issue, we introduce Bayesian convolutional deep sets, which construct random translation-equivariant functional representations with a stationary prior. Furthermore, we present how to impose a task-dependent prior for each dataset, because a wrongly imposed prior yields an even worse representation than that of the kernel smoother. We validate the proposed architecture and its training on various experiments with time-series and image datasets.
1 Introduction
Neural process (NP) and Conditional neural process (CNP) [1, 2] are pioneering deep learning frameworks for modeling stochastic processes, i.e., distributions over functions. That is, for any finite input and output pairs, referred to as context sets, these NP models output a predictive distribution on targeted inputs (target sets) by extracting features from the context sets. Specifically, the NP models employ Deep sets [3] to reflect the exchangeability of the stochastic process in the predictive distribution of the NP. Many variants of NP [4, 5, 6, 7, 8, 9, 10, 11] have been proposed to model stochastic processes more elaborately.
Some NP models impose a certain inductive bias to model stochastic processes with structured characteristics. For example, the Convolutional conditional neural process (ConvCNP) [7] is an NP model designed for modeling stationary processes, whose statistical characteristics over any finite subset of the process, such as the mean and covariance, do not change when the time indexes of those finite random variables are shifted. ConvCNP employs Convolutional Deep sets (ConvDeepsets), which construct the functional representation for a stationary process, and thus reflects the inductive bias of stationarity in ConvCNP.
To construct the translation-equivariant representation, ConvDeepsets employs the RBF kernel function and a convolutional neural network (CNN). Specifically, ConvDeepsets first constructs a discretized functional representation of the context set by using the Nadaraya–Watson kernel smoother [12], and then maps the discretized representation to an abstract representation via the CNN. Since the kernel smoother produces a consistent representation regardless of translations of the inputs of the context set, and the convolution operation preserves translation equivariance, the corresponding representation can be used to build the predictive distribution for modeling the stationary process. However, since the kernel smoother is a non-parametric model whose expressive power depends on the amount of given context data, its representation can be ambiguous when the number of context data points is insufficient. This can result in poor performance of the corresponding NP models because the ConvDeepsets cannot produce a proper representation for modeling the target set. This is analogous to the task ambiguity issue [13] noted in model agnostic meta learning (MAML) [14].
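The ambiguity can be seen directly: with a sparse context set, the kernel-smoother density collapses toward zero away from the observed points, leaving the representation uninformative there. A small numpy illustration (the helper name, the length-scale of 0.3, and the point counts are arbitrary choices of ours, not values from the paper):

```python
import numpy as np

def density(x, xs, ls=0.3):
    """Kernel-smoother density channel at query x: sum_n k(x - x_n)."""
    return np.exp(-0.5 * ((x - xs) / ls) ** 2).sum()

rng = np.random.default_rng(0)
query = 1.5
sparse = rng.uniform(-3, 3, size=3)    # few context points
dense = rng.uniform(-3, 3, size=200)   # many context points

# With few points the density at the query can be near zero, so the
# smoothed data channel there is effectively undefined (ambiguous).
print(density(query, sparse), density(query, dense))
```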
One intuitive approach to attenuate the task ambiguity is to introduce a reasonable prior distribution on the representation of the kernel smoother. In fact, the Bayesian approach, which imposes a prior distribution on the model parameters, has shown meaningful results for tackling the task ambiguity in MAML [13]. However, using a prior distribution also raises the question of which prior distribution should be used. In the extreme case, if a wrong prior distribution is assumed for the given datasets, the assigned prior may negatively affect the representations and the outputs of the NP model.
In this work, we propose Bayesian convolutional Deep sets, which construct random functional representations via a task-dependent stationary prior. To this end, we first consider a set of stationary kernels, each of which is characterized by its distinct spectral density. Then, we construct the task-dependent prior by using an amortized latent categorical variable modeled by a translation-invariant neural network; the latent variable assigns a proper kernel out of the candidate set depending on the task. Next, we construct sample functions of the Gaussian process (GP) posterior using the chosen kernel and forward those sample functions through a CNN, which yields a representation of Bayesian ConvDeepsets. We show that Bayesian ConvDeepsets still satisfies the translation equivariance that is necessary for modeling the stationary process.
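To make the pipeline above concrete, here is a minimal numpy sketch of the forward pass, under assumptions not fixed by the text: the candidate kernels are RBF kernels distinguished only by their length-scales (the paper characterizes candidates by their spectral densities), the amortized categorical variable is replaced by a fixed probability vector `z`, and the CNN rho(.) is left as a comment. All helper names are ours; this is an illustration of the idea, not the paper's implementation.

```python
import numpy as np

def rbf_gram(x1, x2, ls):
    """RBF kernel Gram matrix with length-scale ls."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_posterior_samples(xc, yc, xq, ls, n_samples=8, noise=1e-2):
    """Draw sample functions from the GP posterior with the chosen kernel."""
    Kcc = rbf_gram(xc, xc, ls) + noise * np.eye(len(xc))
    Kqc = rbf_gram(xq, xc, ls)
    Kqq = rbf_gram(xq, xq, ls)
    mean = Kqc @ np.linalg.solve(Kcc, yc)
    cov = Kqq - Kqc @ np.linalg.solve(Kcc, Kqc.T) + 1e-6 * np.eye(len(xq))
    L = np.linalg.cholesky(cov)
    eps = np.random.randn(len(xq), n_samples)
    return mean[:, None] + L @ eps             # (M, n_samples)

# Candidate stationary kernels, each identified here only by a length-scale.
candidate_ls = [0.1, 1.0]
xc = np.array([-1.0, 0.0, 2.0]); yc = np.sin(xc)
xq = np.linspace(-2, 3, 64)                    # discretized inputs t_m

# Hypothetical stand-in for the amortized categorical latent that picks
# a kernel per task; the paper infers z with a translation-invariant network.
z = np.array([0.2, 0.8])
ls = candidate_ls[int(z.argmax())]

samples = gp_posterior_samples(xc, yc, xq, ls) # random functional representation
# `samples` would then be pushed through a CNN rho(.) to produce the
# Bayesian ConvDeepsets representation.
print(samples.shape)                           # (64, 8)
```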
For training, we employ variational inference and consider an additional regularizer that allows the neural network to choose the stationary prior reasonably depending on the task. We validate that the proposed method relaxes the task ambiguity issue by assigning a task-dependent prior on time-series and image datasets. Our contributions can be summarized as:

• We propose the Bayesian ConvDeepsets using a task-dependent stationary prior, and its inference, to attenuate the potential task ambiguity issue of the ConvDeepsets.

• We validate that the Bayesian ConvDeepsets can improve the modeling performance of NP models on various stationary process modeling tasks, such as prediction on time-series and spatial datasets.
2 Preliminaries
Neural Process.
NP uses Deepsets to reflect the exchangeability of the stochastic process into the predictive distribution of the NP, and employs meta-learning for training. Let $X^c = \{x^c_n\}_{n=1}^{N_c}$ and $Y^c = \{y^c_n\}_{n=1}^{N_c}$ be the $N_c$ pairs of context inputs and outputs, and $D^c = \{X^c, Y^c\}$ be the context set. Similarly, let $X^t = \{x^t_n\}_{n=1}^{N_t}$ and $Y^t = \{y^t_n\}_{n=1}^{N_t}$ be the $N_t$ pairs of target inputs and outputs, and $D^t = \{X^t, Y^t\}$ be the target set. Then, NP trains the mapping $f_{\Theta_{nn}}$, parameterized by a neural network, that maps the context set $D^c$ and the target inputs $X^t$ to the parameters of the predictive distribution, $\mu(X^t)$ and $\sigma(X^t)$, on the target inputs $X^t$, i.e.,
$$f_{\Theta_{nn}} : D^c, X^t \longmapsto \mu_{nn}(X^t), \sigma_{nn}(X^t) \tag{1}$$
by optimizing the following objective:
$$\max_{\Theta_{nn}} \; \mathbb{E}_{D^c, D^t \sim p(\mathcal{T})}\!\left[\log p\!\left(Y^t \mid f_{\Theta_{nn}}(X^t, D^c)\right)\right] \tag{2}$$
where $p(\mathcal{T})$ denotes the distribution of the tasks for the context set $D^c$ and target set $D^t$.
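As a concrete reading of Eqs. (1) and (2), the following sketch estimates the objective by Monte Carlo over sampled tasks, assuming a Gaussian predictive distribution. The `f_theta` here is a hypothetical stand-in (it just predicts the context mean with a fixed scale); a real NP would be a trained neural network.

```python
import numpy as np

def gaussian_logpdf(y, mu, sigma):
    """log N(y | mu, sigma^2), summed over target points."""
    return np.sum(-0.5 * np.log(2 * np.pi * sigma ** 2)
                  - 0.5 * ((y - mu) / sigma) ** 2)

def np_objective(f_theta, tasks):
    """Monte-Carlo estimate of Eq. (2): average predictive log-likelihood
    of the target outputs given the context set, over sampled tasks."""
    total = 0.0
    for (Xc, Yc, Xt, Yt) in tasks:           # tasks ~ p(T)
        mu, sigma = f_theta(Xc, Yc, Xt)      # predictive params on Xt
        total += gaussian_logpdf(Yt, mu, sigma)
    return total / len(tasks)

def f_theta(Xc, Yc, Xt):
    """Hypothetical placeholder predictor, not a trained NP."""
    return np.full(len(Xt), Yc.mean()), np.full(len(Xt), 1.0)

task = (np.array([0.0, 1.0]), np.array([0.1, 0.9]),
        np.array([0.5]), np.array([0.5]))
print(np_objective(f_theta, [task]))
```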
Translation Equivariance.
A stationary process is characterized by the property that its statistical characteristics do not change when time is shifted. Thus, functions that can model the stationary process satisfy a special condition, referred to as Translation Equivariance (TE). Mathematically, TE can be defined as follows:

Definition 1 ([7]). Let $\mathcal{X} = \mathbb{R}^d$ and $\mathcal{Y} \subseteq \mathbb{R}^{d'}$ be the spaces of inputs and outputs, and let $\mathcal{D} = \bigcup_{m=1}^{\infty} (\mathcal{X} \times \mathcal{Y})^m$ be the joint space of finite observations. Also, let $\mathcal{H}$ be the function space on $\mathcal{X}$, and let $T$ and $T'$ be the mappings
$$T : \mathcal{X} \times \mathcal{D} \to \mathcal{D}, \qquad T_\tau(D) = \big((x_1 + \tau, y_1), \ldots, (x_n + \tau, y_n)\big)$$
$$T' : \mathcal{X} \times \mathcal{H} \to \mathcal{H}, \qquad T'_\tau\big(h(\cdot)\big) = h(\cdot - \tau)$$
where $D = \{(x_n, y_n)\}_{n=1}^{N} \in \mathcal{D}$ denotes $N$ pairs of inputs and outputs, $\tau \in \mathcal{X}$ denotes the translation variable for the inputs, and $h(\cdot) \in \mathcal{H}$ denotes a function on $\mathcal{X}$. Then, a functional mapping $\Phi : \mathcal{D} \to \mathcal{H}$ is translation equivariant if the following holds:
$$\Phi(T_\tau(D)) = T'_\tau(\Phi(D)). \tag{3}$$
Roughly speaking, Definition 1 implies that a function satisfying TE should produce a consistent functional representation up to translation.
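Eq. (3) can be checked numerically for the kernel-smoother data channel used by ConvDeepsets: evaluating the representation of the shifted context set at $x$ must equal evaluating the original representation at $x - \tau$. A small sketch (the RBF length-scale of 0.3 and the test values are arbitrary choices of ours):

```python
import numpy as np

def data_channel(x, xs, ys, ls=0.3):
    """Nadaraya-Watson data channel of E(D) at query points x."""
    w = np.exp(-0.5 * ((x[:, None] - xs[None, :]) / ls) ** 2)
    return (w @ ys) / w.sum(axis=1)

xs = np.array([0.0, 1.0, 2.5]); ys = np.array([1.0, -0.5, 0.3])
x = np.linspace(-1, 4, 50)
tau = 0.7

lhs = data_channel(x, xs + tau, ys)   # Phi(T_tau(D))(x)
rhs = data_channel(x - tau, xs, ys)   # T'_tau(Phi(D))(x) = Phi(D)(x - tau)
assert np.allclose(lhs, rhs)          # Eq. (3) holds for the kernel smoother
```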
Convolutional Deep Sets.
ConvDeepsets is a specific neural network architecture satisfying the TE in Eq. (3), and thus can be used to model a stationary process. The following proposition introduces the specific structure of the ConvDeepsets $\Phi(D)(\cdot)$.

Proposition 1 ([7]). Given a dataset $D = \{(x_n, y_n)\}_{n=1}^{N}$, its functional representation $\Phi(D)(\cdot)$ is translation equivariant if and only if $\Phi(D)(\cdot)$ is represented as
$$\underbrace{E(D)(\cdot)}_{\substack{\text{functional}\\\text{representation}}} = \Big[\underbrace{\textstyle\sum_{n=1}^{N} k(\cdot - x_n)}_{\text{density}},\ \underbrace{\frac{\sum_{n=1}^{N} y_n\, k(\cdot - x_n)}{\sum_{n=1}^{N} k(\cdot - x_n)}}_{\text{data representation}}\Big], \qquad \underbrace{\Phi(D)(\cdot)}_{\substack{\text{ConvDeepsets}\\\text{representation}}} = \underbrace{\rho}_{\substack{\text{mapping via}\\\text{CNN}}}\Big(\underbrace{E(D)(\cdot)}_{\substack{\text{functional}\\\text{representation}}}\Big) \tag{4}$$
where $k(\cdot - x_n)$ denotes the stationary kernel centered at $x_n$, and $\rho(\cdot)$ is a continuous and translation equivariant mapping. Here, the RBF kernel function is used, and $\rho(\cdot)$ can be parameterized by a CNN.
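For illustration, here is a minimal numpy implementation of the two channels of $E(D)(\cdot)$ in Eq. (4), with an RBF kernel; the small guard against division by zero is our addition, not part of the proposition.

```python
import numpy as np

def conv_deepsets_E(x, xs, ys, ls=0.3):
    """E(D)(x) of Eq. (4): density channel sum_n k(x - x_n) and
    data channel sum_n y_n k(x - x_n) / density."""
    k = np.exp(-0.5 * ((x[:, None] - xs[None, :]) / ls) ** 2)  # RBF kernel
    density = k.sum(axis=1)
    data = (k @ ys) / np.maximum(density, 1e-8)  # guard against 0/0
    return np.stack([density, data], axis=-1)    # (M, 2) channels

xs = np.array([0.0, 2.0]); ys = np.array([1.0, -1.0])
x = np.linspace(-1, 3, 40)
E = conv_deepsets_E(x, xs, ys)
# Far from the two context points the density channel is near zero and the
# data channel becomes ill-defined: the "ambiguous representation" the
# paper attributes to sparse context sets.
print(E.shape)  # (40, 2); a CNN rho(.) then maps E to Phi(D)
```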
Neural Process with Convolutional Deepsets.
ConvCNP [7] and ConvLNP [8] are well-known NP models that can model a stationary process by using ConvDeepsets as the main structure of the NP model. To employ the functional representation of ConvDeepsets in practice, these NP models first consider $M$ discretized inputs $\{t_m\}_{m=1}^{M} \subset [\min X, \max X]$ obtained by linearly spacing the range of the inputs $X = X^c \cup X^t$. Then, these models construct $M$ discretized functional representations $\{\Phi(D^c)(t_m)\}_{m=1}^{M}$ on the discretized inputs $\{t_m\}_{m=1}^{M}$ with Eq. (4) as
$$\Phi(D^c)(t_m) = (\rho \circ E(D^c))(t_m), \qquad m = 1, \ldots, M. \tag{5}$$
These discretized representations $\{\Phi(D^c)(t_m)\}_{m=1}^{M}$ are used to obtain the parameters of the predictive distribution, $\mu(X^t)$ and $\sigma(X^t)$, as shown in Eq. (1). Specifically, the smoothed representations on the target inputs $x^t_n \in X^t$, i.e.,
$$\sum_{m=1}^{M} \Phi(D^c)(t_m)\, k(x^t_n - t_m) \tag{6}$$
are used for modeling the predictive distribution $p(Y^t \mid X^t, D^c)$. For grid datasets, we can omit the discretization procedure and employ the CNN directly [7].
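The discretize-then-smooth pipeline of Eqs. (5) and (6) can be sketched as follows. For simplicity $\rho$ is taken to be the identity on the channels of $E$, whereas ConvCNP would apply a CNN across the $t_m$ grid, so the numbers are only illustrative; the grid size and length-scale are our choices.

```python
import numpy as np

def smooth_to_targets(phi_tm, tm, xt, ls=0.3):
    """Eq. (6): kernel-smooth the discretized representation
    Phi(Dc)(t_m) onto the target inputs x_t."""
    k = np.exp(-0.5 * ((xt[:, None] - tm[None, :]) / ls) ** 2)
    return k @ phi_tm                        # (Nt, channels)

xc = np.array([0.0, 1.0]); yc = np.array([0.5, -0.2])
xt = np.array([0.3, 1.7])
lo, hi = min(xc.min(), xt.min()), max(xc.max(), xt.max())
tm = np.linspace(lo, hi, 32)                 # M linearly spaced inputs

# Stand-in for (rho . E)(t_m) with rho = identity: density and data
# channels of E evaluated on the grid, as in Eq. (4).
k = np.exp(-0.5 * ((tm[:, None] - xc[None, :]) / 0.3) ** 2)
phi_tm = np.stack([k.sum(1), (k @ yc) / np.maximum(k.sum(1), 1e-8)], -1)

print(smooth_to_targets(phi_tm, tm, xt).shape)  # (2, 2)
```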
3 Methodology
In this section, we first interpret the representation of the ConvDeepsets and its motivation in Section 3.1. Then, we introduce the task-dependent stationary prior in Section 3.2, the Bayesian ConvDeepsets in Section 3.3, and its application to stationary process modeling in Section 3.4. Fig. 2 outlines the prediction procedure via Bayesian ConvDeepsets described in Sections 3.2 to 3.4.