Bayesian Convolutional Deep Sets with Task-Dependent
Stationary Prior
Yohan Jung, Jinkyoo Park
KAIST
Abstract
Convolutional deep sets is a deep neural network (DNN) architecture that can model stationary stochastic processes. This architecture uses a kernel smoother and a DNN to construct translation-equivariant functional representations, and thus reflects the inductive bias of stationarity in the DNN. However, since this architecture employs the kernel smoother, a non-parametric model, it may produce ambiguous representations when the number of data points is insufficient. To remedy this issue, we introduce Bayesian convolutional deep sets, which construct random translation-equivariant functional representations with a stationary prior. Furthermore, we present how to impose a task-dependent prior for each dataset, because a wrongly imposed prior yields an even worse representation than that of the kernel smoother. We validate the proposed architecture and its training on various experiments with time-series and image datasets.
1 Introduction
Neural process (NP) and Conditional neural process (CNP) [1, 2] are pioneering deep learning frameworks for modeling stochastic processes, i.e., distributions over functions. That is, for any finite input and output pairs, referred to as context sets, these NP models output a predictive distribution on targeted inputs (target sets) by extracting features from the context sets. Specifically, the NP models employ Deep sets [3] to reflect the exchangeability of the stochastic process in the predictive distribution of the NP. Many variants of NP [4, 5, 6, 7, 8, 9, 10, 11] have been proposed to model stochastic processes more elaborately.
Some NP models impose a certain inductive bias to model stochastic processes with structured characteristics. For example, the Convolutional conditional neural process (ConvCNP) [7] is an NP model designed for modeling stationary processes, whose statistical characteristics over any finite subset of the process, such as the mean and covariance, do not change when the time indexes of those finite random variables are shifted. ConvCNP employs Convolutional Deep sets (ConvDeepsets), which construct the functional representation for a stationary process, and thus reflects the inductive bias of stationarity in ConvCNP.
To construct the translation-equivariant representation, ConvDeepsets employs the RBF kernel function and a convolutional neural network (CNN). Specifically, ConvDeepsets first constructs a discretized functional representation of the context set by using the Nadaraya–Watson kernel smoother [12], and then maps the discretized representation to an abstract representation via the CNN. Since the kernel smoother produces a consistent representation regardless of translations of the inputs of the context set, and the convolution operation preserves translation equivariance, the corresponding representation can be used to build the predictive distribution for modeling the stationary process. However, since the kernel smoother is a non-parametric model whose expressive power depends on the amount of given context data, its representation can be ambiguous when the number of context data points is insufficient. This can result in poor performance of the corresponding NP models because the ConvDeepsets cannot produce a proper representation for modeling the target set. This is analogous to the task ambiguity issue [13] noted in model agnostic meta learning (MAML) [14].
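The ambiguity can be seen directly: with a sparse context set, the kernel-smoother density collapses toward zero away from the observed points, leaving the representation uninformative there. A small numpy illustration (the helper name, the length-scale of 0.3, and the point counts are arbitrary choices of ours, not values from the paper):

```python
import numpy as np

def density(x, xs, ls=0.3):
    """Kernel-smoother density channel at query x: sum_n k(x - x_n)."""
    return np.exp(-0.5 * ((x - xs) / ls) ** 2).sum()

rng = np.random.default_rng(0)
query = 1.5
sparse = rng.uniform(-3, 3, size=3)    # few context points
dense = rng.uniform(-3, 3, size=200)   # many context points

# With few points the density at the query can be near zero, so the
# smoothed data channel there is effectively undefined (ambiguous).
print(density(query, sparse), density(query, dense))
```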
One intuitive approach to attenuate the task ambiguity is to introduce a reasonable prior distribution on the representation of the kernel smoother. In fact, the Bayesian approach, which imposes a prior distribution on the model parameters, has shown meaningful results for tackling the task ambiguity in MAML [13]. However, using a prior distribution also raises the question of which prior distribution should be used. In the extreme case, if a wrong prior distribution is assumed for the given datasets, the assigned prior may negatively affect the representations and the outputs of the NP model.
In this work, we propose Bayesian convolutional Deep sets, which construct random functional representations via a task-dependent stationary prior. To this end, we first consider a set of stationary kernels, each of which is characterized by its distinct spectral density. Then, we construct the task-dependent prior by using an amortized latent categorical variable modeled by a translation-invariant neural network; the latent variable assigns a proper kernel out of the candidate set depending on the task. Next, we construct sample functions of the Gaussian process (GP) posterior using the chosen kernel and forward those sample functions through a CNN, which yields a representation of Bayesian ConvDeepsets. We show that Bayesian ConvDeepsets still satisfies the translation equivariance that is necessary for modeling the stationary process.
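To make the pipeline above concrete, here is a minimal numpy sketch of the forward pass, under assumptions not fixed by the text: the candidate kernels are RBF kernels distinguished only by their length-scales (the paper characterizes candidates by their spectral densities), the amortized categorical variable is replaced by a fixed probability vector `z`, and the CNN rho(.) is left as a comment. All helper names are ours; this is an illustration of the idea, not the paper's implementation.

```python
import numpy as np

def rbf_gram(x1, x2, ls):
    """RBF kernel Gram matrix with length-scale ls."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_posterior_samples(xc, yc, xq, ls, n_samples=8, noise=1e-2):
    """Draw sample functions from the GP posterior with the chosen kernel."""
    Kcc = rbf_gram(xc, xc, ls) + noise * np.eye(len(xc))
    Kqc = rbf_gram(xq, xc, ls)
    Kqq = rbf_gram(xq, xq, ls)
    mean = Kqc @ np.linalg.solve(Kcc, yc)
    cov = Kqq - Kqc @ np.linalg.solve(Kcc, Kqc.T) + 1e-6 * np.eye(len(xq))
    L = np.linalg.cholesky(cov)
    eps = np.random.randn(len(xq), n_samples)
    return mean[:, None] + L @ eps             # (M, n_samples)

# Candidate stationary kernels, each identified here only by a length-scale.
candidate_ls = [0.1, 1.0]
xc = np.array([-1.0, 0.0, 2.0]); yc = np.sin(xc)
xq = np.linspace(-2, 3, 64)                    # discretized inputs t_m

# Hypothetical stand-in for the amortized categorical latent that picks
# a kernel per task; the paper infers z with a translation-invariant network.
z = np.array([0.2, 0.8])
ls = candidate_ls[int(z.argmax())]

samples = gp_posterior_samples(xc, yc, xq, ls) # random functional representation
# `samples` would then be pushed through a CNN rho(.) to produce the
# Bayesian ConvDeepsets representation.
print(samples.shape)                           # (64, 8)
```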
For training, we employ variational inference and consider an additional regularizer that allows the neural network to choose the stationary prior reasonably depending on the task. We validate that the proposed method relaxes the task ambiguity issue by assigning a task-dependent prior on time-series and image datasets. Our contributions can be summarized as:

• We propose the Bayesian ConvDeepsets using a task-dependent stationary prior, and its inference, to attenuate the potential task ambiguity issue of the ConvDeepsets.

• We validate that the Bayesian ConvDeepsets can improve the modeling performance of NP models on various stationary process modeling tasks, such as prediction on time-series and spatial datasets.
2 Preliminaries
Neural Process.
NP uses Deepsets to reflect the exchangeability of the stochastic process into the predictive distribution of the NP, and employs meta-learning for training. Let $X^c = \{x^c_n\}_{n=1}^{N_c}$ and $Y^c = \{y^c_n\}_{n=1}^{N_c}$ be the $N_c$ pairs of context inputs and outputs, and $D^c = \{X^c, Y^c\}$ be the context set. Similarly, let $X^t = \{x^t_n\}_{n=1}^{N_t}$ and $Y^t = \{y^t_n\}_{n=1}^{N_t}$ be the $N_t$ pairs of target inputs and outputs, and $D^t = \{X^t, Y^t\}$ be the target set. Then, NP trains the mapping $f_{\Theta_{nn}}$, parameterized by a neural network, that maps the context set $D^c$ and the target inputs $X^t$ to the parameters of the predictive distribution, $\mu(X^t)$ and $\sigma(X^t)$, on the target inputs $X^t$, i.e.,
$$f_{\Theta_{nn}} : D^c, X^t \longmapsto \mu_{nn}(X^t), \sigma_{nn}(X^t) \tag{1}$$
by optimizing the following objective:
$$\max_{\Theta_{nn}} \; \mathbb{E}_{D^c, D^t \sim p(\mathcal{T})}\!\left[\log p\!\left(Y^t \mid f_{\Theta_{nn}}(X^t, D^c)\right)\right] \tag{2}$$
where $p(\mathcal{T})$ denotes the distribution of the tasks for the context set $D^c$ and target set $D^t$.
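As a concrete reading of Eqs. (1) and (2), the following sketch estimates the objective by Monte Carlo over sampled tasks, assuming a Gaussian predictive distribution. The `f_theta` here is a hypothetical stand-in (it just predicts the context mean with a fixed scale); a real NP would be a trained neural network.

```python
import numpy as np

def gaussian_logpdf(y, mu, sigma):
    """log N(y | mu, sigma^2), summed over target points."""
    return np.sum(-0.5 * np.log(2 * np.pi * sigma ** 2)
                  - 0.5 * ((y - mu) / sigma) ** 2)

def np_objective(f_theta, tasks):
    """Monte-Carlo estimate of Eq. (2): average predictive log-likelihood
    of the target outputs given the context set, over sampled tasks."""
    total = 0.0
    for (Xc, Yc, Xt, Yt) in tasks:           # tasks ~ p(T)
        mu, sigma = f_theta(Xc, Yc, Xt)      # predictive params on Xt
        total += gaussian_logpdf(Yt, mu, sigma)
    return total / len(tasks)

def f_theta(Xc, Yc, Xt):
    """Hypothetical placeholder predictor, not a trained NP."""
    return np.full(len(Xt), Yc.mean()), np.full(len(Xt), 1.0)

task = (np.array([0.0, 1.0]), np.array([0.1, 0.9]),
        np.array([0.5]), np.array([0.5]))
print(np_objective(f_theta, [task]))
```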
Translation Equivariance.
A stationary process is characterized by the property that its statistical characteristics do not change when time is shifted. Thus, functions that can model the stationary process satisfy a special condition, referred to as Translation Equivariance (TE). Mathematically, TE can be defined as follows:

Definition 1 ([7]). Let $\mathcal{X} = \mathbb{R}^d$ and $\mathcal{Y} \subseteq \mathbb{R}^{d'}$ be the spaces of inputs and outputs, and let $\mathcal{D} = \bigcup_{m=1}^{\infty} (\mathcal{X} \times \mathcal{Y})^m$ be the joint space of finite observations. Also, let $\mathcal{H}$ be the function space on $\mathcal{X}$, and let $T$ and $T'$ be the mappings
$$T : \mathcal{X} \times \mathcal{D} \to \mathcal{D}, \qquad T_\tau(D) = \big((x_1 + \tau, y_1), \ldots, (x_n + \tau, y_n)\big)$$
$$T' : \mathcal{X} \times \mathcal{H} \to \mathcal{H}, \qquad T'_\tau\big(h(\cdot)\big) = h(\cdot - \tau)$$
where $D = \{(x_n, y_n)\}_{n=1}^{N} \in \mathcal{D}$ denotes $N$ pairs of inputs and outputs, $\tau \in \mathcal{X}$ denotes the translation variable for the inputs, and $h(\cdot) \in \mathcal{H}$ denotes a function on $\mathcal{X}$. Then, a functional mapping $\Phi : \mathcal{D} \to \mathcal{H}$ is translation equivariant if the following holds:
$$\Phi(T_\tau(D)) = T'_\tau(\Phi(D)). \tag{3}$$
Roughly speaking, Definition 1 implies that a function satisfying TE should produce a consistent functional representation up to translation.
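Eq. (3) can be checked numerically for the kernel-smoother data channel used by ConvDeepsets: evaluating the representation of the shifted context set at $x$ must equal evaluating the original representation at $x - \tau$. A small sketch (the RBF length-scale of 0.3 and the test values are arbitrary choices of ours):

```python
import numpy as np

def data_channel(x, xs, ys, ls=0.3):
    """Nadaraya-Watson data channel of E(D) at query points x."""
    w = np.exp(-0.5 * ((x[:, None] - xs[None, :]) / ls) ** 2)
    return (w @ ys) / w.sum(axis=1)

xs = np.array([0.0, 1.0, 2.5]); ys = np.array([1.0, -0.5, 0.3])
x = np.linspace(-1, 4, 50)
tau = 0.7

lhs = data_channel(x, xs + tau, ys)   # Phi(T_tau(D))(x)
rhs = data_channel(x - tau, xs, ys)   # T'_tau(Phi(D))(x) = Phi(D)(x - tau)
assert np.allclose(lhs, rhs)          # Eq. (3) holds for the kernel smoother
```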
Convolutional Deep Sets.
ConvDeepsets is a specific neural network architecture satisfying the TE in Eq. (3), and thus can be used to model a stationary process. The following proposition introduces the specific structure of the ConvDeepsets $\Phi(D)(\cdot)$.

Proposition 1 ([7]). Given a dataset $D = \{(x_n, y_n)\}_{n=1}^{N}$, its functional representation $\Phi(D)(\cdot)$ is translation equivariant if and only if $\Phi(D)(\cdot)$ is represented as
$$\underbrace{E(D)(\cdot)}_{\substack{\text{functional}\\\text{representation}}} = \Big[\underbrace{\textstyle\sum_{n=1}^{N} k(\cdot - x_n)}_{\text{density}},\ \underbrace{\frac{\sum_{n=1}^{N} y_n\, k(\cdot - x_n)}{\sum_{n=1}^{N} k(\cdot - x_n)}}_{\text{data representation}}\Big], \qquad \underbrace{\Phi(D)(\cdot)}_{\substack{\text{ConvDeepsets}\\\text{representation}}} = \underbrace{\rho}_{\substack{\text{mapping via}\\\text{CNN}}}\Big(\underbrace{E(D)(\cdot)}_{\substack{\text{functional}\\\text{representation}}}\Big) \tag{4}$$
where $k(\cdot - x_n)$ denotes the stationary kernel centered at $x_n$, and $\rho(\cdot)$ is a continuous and translation equivariant mapping. Here, the RBF kernel function is used, and $\rho(\cdot)$ can be parameterized by a CNN.
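For illustration, here is a minimal numpy implementation of the two channels of $E(D)(\cdot)$ in Eq. (4), with an RBF kernel; the small guard against division by zero is our addition, not part of the proposition.

```python
import numpy as np

def conv_deepsets_E(x, xs, ys, ls=0.3):
    """E(D)(x) of Eq. (4): density channel sum_n k(x - x_n) and
    data channel sum_n y_n k(x - x_n) / density."""
    k = np.exp(-0.5 * ((x[:, None] - xs[None, :]) / ls) ** 2)  # RBF kernel
    density = k.sum(axis=1)
    data = (k @ ys) / np.maximum(density, 1e-8)  # guard against 0/0
    return np.stack([density, data], axis=-1)    # (M, 2) channels

xs = np.array([0.0, 2.0]); ys = np.array([1.0, -1.0])
x = np.linspace(-1, 3, 40)
E = conv_deepsets_E(x, xs, ys)
# Far from the two context points the density channel is near zero and the
# data channel becomes ill-defined: the "ambiguous representation" the
# paper attributes to sparse context sets.
print(E.shape)  # (40, 2); a CNN rho(.) then maps E to Phi(D)
```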
Neural Process with Convolutional Deepsets.
ConvCNP [7] and ConvLNP [8] are well-known NP models that can model a stationary process by using ConvDeepsets as the main structure of the NP model. To employ the functional representation of ConvDeepsets in practice, these NP models first consider $M$ discretized inputs $\{t_m\}_{m=1}^{M} \subset [\min X, \max X]$ obtained by linearly spacing the range of the inputs $X = X^c \cup X^t$. Then, these models construct $M$ discretized functional representations $\{\Phi(D^c)(t_m)\}_{m=1}^{M}$ on the discretized inputs $\{t_m\}_{m=1}^{M}$ with Eq. (4) as
$$\Phi(D^c)(t_m) = (\rho \circ E(D^c))(t_m), \qquad m = 1, \ldots, M. \tag{5}$$
These discretized representations $\{\Phi(D^c)(t_m)\}_{m=1}^{M}$ are used to obtain the parameters of the predictive distribution, $\mu(X^t)$ and $\sigma(X^t)$, as shown in Eq. (1). Specifically, the smoothed representations on the target inputs $x^t_n \in X^t$, i.e.,
$$\sum_{m=1}^{M} \Phi(D^c)(t_m)\, k(x^t_n - t_m) \tag{6}$$
are used for modeling the predictive distribution $p(Y^t \mid X^t, D^c)$. For grid datasets, we can omit the discretization procedure and employ the CNN directly [7].
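The discretize-then-smooth pipeline of Eqs. (5) and (6) can be sketched as follows. For simplicity $\rho$ is taken to be the identity on the channels of $E$, whereas ConvCNP would apply a CNN across the $t_m$ grid, so the numbers are only illustrative; the grid size and length-scale are our choices.

```python
import numpy as np

def smooth_to_targets(phi_tm, tm, xt, ls=0.3):
    """Eq. (6): kernel-smooth the discretized representation
    Phi(Dc)(t_m) onto the target inputs x_t."""
    k = np.exp(-0.5 * ((xt[:, None] - tm[None, :]) / ls) ** 2)
    return k @ phi_tm                        # (Nt, channels)

xc = np.array([0.0, 1.0]); yc = np.array([0.5, -0.2])
xt = np.array([0.3, 1.7])
lo, hi = min(xc.min(), xt.min()), max(xc.max(), xt.max())
tm = np.linspace(lo, hi, 32)                 # M linearly spaced inputs

# Stand-in for (rho . E)(t_m) with rho = identity: density and data
# channels of E evaluated on the grid, as in Eq. (4).
k = np.exp(-0.5 * ((tm[:, None] - xc[None, :]) / 0.3) ** 2)
phi_tm = np.stack([k.sum(1), (k @ yc) / np.maximum(k.sum(1), 1e-8)], -1)

print(smooth_to_targets(phi_tm, tm, xt).shape)  # (2, 2)
```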
3 Methodology
In this section, we first interpret the representation of the ConvDeepsets and its motivation in Section 3.1. Then, we introduce the task-dependent stationary prior in Section 3.2, the Bayesian ConvDeepsets in Section 3.3, and its application to stationary process modeling in Section 3.4. Fig. 2 outlines the prediction procedure via Bayesian ConvDeepsets described in Sections 3.2 to 3.4.