A Continuous Convolutional Trainable Filter for Modelling
Unstructured Data
Dario Coscia1, Laura Meneghetti1, Nicola Demo1, Giovanni Stabile2,1, and
Gianluigi Rozza1
1Mathematics Area, mathLab, SISSA, via Bonomea 265, I-34136, Trieste, Italy
2Department of Pure and Applied Sciences, Informatics and Mathematics Section,
University of Urbino Carlo Bo, Piazza della Repubblica 13, I-61029, Urbino, Italy
May 26, 2023
Abstract
The convolutional neural network (CNN) is one of the most important architectures in deep
learning. The fundamental building block of a CNN is a trainable filter, represented as a
discrete grid, used to perform convolution on discrete input data. In this work, we propose a
continuous version of a trainable convolutional filter that is able to work also with unstructured data.
This new framework allows exploring CNNs beyond discrete domains, extending the applicability of
this important learning technique to many more complex problems. Our experiments show
that the continuous filter can achieve a level of accuracy comparable to the state-of-the-art
discrete filter, and that it can be used in current deep learning architectures as a building
block to solve problems with unstructured domains as well.
1 Introduction
In the deep learning field, the convolutional neural network (CNN) [28] is one of the most important
architectures, widely used in academia and industrial research. For an overview of the topic,
the interested reader might refer to [30, 16, 2, 5, 52]. Despite their great success in many fields
including, but not limited to, computer vision [26, 40, 22] or natural language processing [50, 11],
current CNNs are constrained to structured data. Indeed, the basic building block of a CNN is
a trainable filter, represented by a discrete grid, which performs cross-correlation, also known as
convolution, on a discrete domain. Nevertheless, the idea behind convolution can be easily extended
mathematically to unstructured domains; for reference see [18]. One possible approach for this kind
of problem is graph neural networks (GNNs) [24, 49], where a graph is built starting from the
topology of the discretized space. This allows us to apply convolution even to unstructured data by
looking at the graph edges, bypassing in this way the limitations of the standard CNN approach.
However, GNNs typically require huge computational resources, due to their intrinsic complexity.

Instead, in this article we present a methodology to apply CNNs to unstructured data by introducing
a continuous extension of a convolutional filter, named continuous filter, without modeling
the data using a graph. The main idea, which is depicted graphically in Figure 1, relies on
approximating the continuous filter with a trainable function using a feed-forward neural network
and performing standard continuous convolution between the input data and the continuous filter.
Previous works have introduced different approaches to continuous convolution in various settings,
ranging from informatics and graph neural networks to physics and the modeling of quantum interactions;
see for example [39, 41, 4]. Even so, these approaches are difficult to generalize, and an analogy
with a discrete CNN filter is not straightforward.
Figure 1: Continuous convolutional filter process. The unstructured domain input points falling
into the filter are mapped into the filter domain (a). The filter values are approximated with an MLP
kernel (b). Finally, the convolution between the mapped values and the filter values is performed
(c).
To our knowledge, [48, 36] are the closest works in the literature to our approach, both approximating
the trainable filter function with a feed-forward neural network and performing continuous convolution.
However, [48] and [36] focus on filters with unbounded domains for convolution. In our work, we instead
fix the dimension of the filter, as in state-of-the-art discrete filters, and learn the approximating
function on the filter domain. This introduces a neat analogy to discrete CNN filters. Furthermore,
differently from [48, 36], we also cover important properties of convolution, such as transposed
convolution or different approaches to multichannel convolution. To summarize, in this work we aim
to reproduce as closely as possible a discrete CNN filter, but in a continuous, unstructured domain
setting, in order to exploit the main deep learning architectures based on CNNs to solve problems
on non-discrete domains. To the best of the authors' knowledge, our approach to continuous
convolution has not been explored in the literature yet.
The main novelties of this work are:
• Building a new framework, based on continuous filters, for working with unstructured data
(continuous filter).
• Defining a neat analogy between continuous (transposed) convolution and state-of-the-art
discrete (transposed) convolution in CNNs.
• Applying continuous convolutional layers in a CNN with partially-completed input.
• Exploiting general strategies to work with continuous convolutional autoencoders for
dimensionality reduction and system output predictions at unseen time steps.
All this, we highlight, is achieved while preserving the features of standard CNNs, which make such
an approach effective even when dealing with large datasets. The present contribution is organised
as follows: in Section 2, we briefly review the deep learning architectures useful for the later
analysis, and we introduce the continuous filter for one-dimensional and multi-dimensional channels.
In the same Section, we introduce the main idea to perform transposed continuous convolution.
Section 3 is focused on numerical results. First, we validate the proposed methodology on a discrete
domain problem using a continuous CNN and compare it with its discrete representation. Second, we
show that continuous convolution can also work with partially-completed images. Last, we present
different deep learning architectures using continuous filters to solve the step Navier-Stokes problem
and the multiphase problem. Finally, conclusions follow in Section 4.
2 Methodology
This Section focuses on the various methodologies we rely on for building the continuous filter, as
well as the introduction of the framework. First of all, we briefly describe the feed-forward neural
network and the discrete filter of a CNN in Section 2.1 and Section 2.2, respectively. As already
mentioned in Section 1, one of the main novelties of this work is the construction of a new framework
based on continuous convolution. Hence, Section 2.3 concerns the introduction of our framework in
different settings: single-channel, multi-channel and transposed convolution using the continuous filter.
2.1 Feed-Forward Neural Network
The feed-forward neural network, or multi-layer perceptron (MLP), is the most basic, yet one of the
most important, building blocks of most current deep learning architectures [16, 13, 5]. Widely
used in deep learning, MLPs have the ability to approximate any continuous function due to the
universal approximation theorem [20, 8, 29]. More technically, given an input vector $x \in \mathbb{R}^{n_{in}}$ and a
function to approximate $\phi: \mathbb{R}^{n_{in}} \to \mathbb{R}^{n_{out}}$, the MLP approximation is done using a parameterised
function class $\mathcal{F} = \{f_\theta : \theta \in \Theta\}$, where $\theta$ are the trainable parameters of the network, belonging to the
parameters' space $\Theta$. An MLP can be represented as a directed acyclic graph, as depicted in Figure
2. In particular, it is composed of an input layer, an output layer and a certain number of hidden
layers, where the processing units of the network, called neurons, perform the computation. Each layer
$i$, with $i \in \{0, \dots, M\}$, can be thought of as a function $f^{(i)}$ belonging to $\mathcal{F}$, and the overall network
function is given by the layers' composition [9]:
$$f = f^{(M)} \circ f^{(M-1)} \circ \cdots \circ f^{(1)} \circ f^{(0)}. \qquad (2.1.1)$$

Figure 2: Schematic structure of a feed-forward neural network, with an input layer ($x_1, x_2, \dots, x_{n_{in}}$), hidden layers, and an output layer ($\hat{y}_1, \dots, \hat{y}_{n_{out}}$).
Hence, a single layer $i$ is a function $f^{(i)}: \mathbb{R}^{n_i} \to \mathbb{R}^{n_{i+1}}$, where $n_i$ represents the number of
neurons in layer $i$, with $n_0 = n_{in}$ and $n_{M+1} = n_{out}$. Each layer $i$ is composed of the parameters $\theta_i = (w^{(i)}, b^{(i)})$,
where $w^{(i)}$ is a real $n_{i+1} \times n_i$ matrix, called weight matrix, and $b^{(i)}$ is a real vector of
dimension $n_{i+1}$, called bias. The output vector $h^{(i+1)}$ of layer $i$, corresponding to the input vector of
layer $i+1$ (except for the output layer), is then calculated using:
$$h^{(i+1)} = f^{(i)}(h^{(i)} \mid \theta_i) = \delta^{(i)}(w^{(i)} \cdot h^{(i)} + b^{(i)}), \qquad (2.1.2)$$
where $h^{(0)} = x$, and $h^{(M+1)} = \hat{y}$ is the output of the network. The function $\delta^{(i)}: \mathbb{R}^{n_{i+1}} \to \mathbb{R}^{n_{i+1}}$ is
called activation, and introduces non-linearity through the network; common choices are the ReLU
function, the sigmoid (logistic) function, or radial activation functions. By
using Equation 2.1.1 and Equation 2.1.2, one can express an MLP architecture mathematically.
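As an illustration of Equations 2.1.1 and 2.1.2, the following minimal sketch builds an MLP as a composition of affine maps and activations. We assume PyTorch purely for illustration; the layer sizes and the tanh activation are arbitrary choices, not the ones used in this work.

```python
import torch


# Minimal MLP following Eq. (2.1.1)-(2.1.2): each layer applies an affine map
# followed by an activation; the network is the composition of all layers.
class MLP(torch.nn.Module):
    def __init__(self, sizes=(2, 16, 16, 1)):
        super().__init__()
        # One Linear module per layer i, holding the parameters (w^(i), b^(i)).
        self.layers = torch.nn.ModuleList(
            [torch.nn.Linear(n_in, n_out) for n_in, n_out in zip(sizes[:-1], sizes[1:])]
        )

    def forward(self, x):
        h = x
        for i, layer in enumerate(self.layers):
            h = layer(h)                      # w^(i) . h^(i) + b^(i)
            if i < len(self.layers) - 1:      # no activation on the output layer
                h = torch.tanh(h)             # delta^(i), here tanh as an example
        return h


x = torch.rand(8, 2)      # batch of 8 input vectors with n_in = 2
y_hat = MLP()(x)          # shape (8, 1), i.e. n_out = 1
```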
During the training process, in which a data-set $\mathcal{D} = \{(x_i, \phi(x)_i)\}_{i=1}^{n}$ composed of $n$ observations
is fed into the network, the MLP parameters $\theta$ are modified in order to minimize a loss function
$\mathcal{L}(\theta \mid \mathcal{D}, f)$. The choice of the loss function depends on the specific problem of application [16, 25,
5]. Hence, the learning phase can be summarised mathematically as:
$$\min_{\theta} \mathcal{L}(\theta \mid \mathcal{D}, f). \qquad (2.1.3)$$
In practice, to solve the minimization problem, different optimization algorithms based on back-propagation
can be used; see [35, 45, 51] for further reference. The optimization phase is carried out over
multiple training epochs, i.e. complete repetitions of the parameter update involving the whole
training data-set $\mathcal{D}$.
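As a concrete example of the minimization in Equation 2.1.3, the snippet below sketches a standard training loop; the target function, the mean squared error loss and the Adam optimizer are assumptions made only for illustration and are not prescribed by this work.

```python
import torch

# Toy data-set D = {(x_i, phi(x_i))}: here phi is assumed to be a sine for illustration.
x = torch.linspace(0, 1, 100).unsqueeze(-1)
y = torch.sin(2 * torch.pi * x)

model = torch.nn.Sequential(
    torch.nn.Linear(1, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1)
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = torch.nn.MSELoss()   # one possible choice of L(theta | D, f)

for epoch in range(1000):      # each epoch is a full pass over D
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()            # back-propagation
    optimizer.step()           # parameter update, approximating min_theta L
```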
2.2 Discrete filter in Convolutional Neural Networks
Convolutional Neural Network (CNN) is a class of deep learning architectures, vastly applied in
computer vision [34, 26, 40, 22]. Over the past years, different CNN architectures have been
presented, for instance AlexNet [27], ResNet [17], Inception [46], VGGNet [42]. Differently from
MLPs, in which affine transformations are performed for learning, a convolutional layer actually
performs the convolution of the input data Iand the so called convolutive filter K, such that
(I ∗ K)(x) = Z
−∞
I(x+τ)K(τ)dτ.(2.2.1)
CNNs perform such convolution1in a discrete setting, using a tensorial representation of the two
functions Iand Kinstead of their continuous formulation. Thus, discrete correlation is computed
as (I ∗K)(x) = P
τ=−∞ I(x+τ)K(τ), with x,τZd(with ddimensions), where the latter infinite
summation can be truncated by discarding the null products. In this way, it is not necessary to
know the original function I, but its evaluation at discrete coordinates. In this context, the filter
Kcan be represented as the tensor KRN1×···×Ndsuch that the element Ki1,...,id≡ K(i1, . . . , id)
with ij∈ {1, . . . , Nj},j∈ {1, . . . , d}. Applying a similar representation also for the input, the
convolution results in the sum of the element-wise multiplication between input and filter, as
sketched in Figure 3. The convolution is of course repeated for all the input components, by
moving the filter across the input in a regularized fashion [16, 5].
Figure 3: Discrete convolution operation on a two-dimensional tensor: a 3 × 3 filter is slid over a 7 × 7 binary input, producing a 5 × 5 output whose entries are the sums of the element-wise products between the filter and the corresponding input patch.
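For concreteness, the discrete (cross-correlation style) convolution described above can be sketched as follows; the 7 × 7 binary input and the 3 × 3 filter reproduce the example of Figure 3, and PyTorch is assumed here only as an illustrative framework.

```python
import torch
import torch.nn.functional as F

# 7x7 binary input and 3x3 filter, as in the example of Figure 3.
image = torch.tensor([
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0],
], dtype=torch.float32)
kernel = torch.tensor([[1, 0, 1],
                       [0, 1, 0],
                       [1, 0, 1]], dtype=torch.float32)

# conv2d in deep learning frameworks implements cross-correlation (see footnote 1):
# each output entry is the sum of element-wise products of the filter and a patch.
out = F.conv2d(image.view(1, 1, 7, 7), kernel.view(1, 1, 3, 3))
print(out.shape)   # torch.Size([1, 1, 5, 5]): a 5x5 output, as in Figure 3
```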
The filter components (the so-called weights) represent the trainable parameters of the convolutional
layer, which are tuned during the training phase. In general, convolution reduces the size
of a (multidimensional) array, performing downsampling. Conversely, the opposite transformation,
called upsampling, which is used by many deep learning architectures, e.g. autoencoders,
is obtained through transposed convolution. The interested reader might refer to [12, 52] for more
information regarding discrete (transposed) convolution.
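To illustrate the downsampling/upsampling relationship, the short sketch below (again in PyTorch, assumed only for illustration; the channel counts, kernel size and stride are arbitrary) shows how a convolution reduces the spatial size while its transposed counterpart restores it.

```python
import torch

x = torch.rand(1, 1, 28, 28)                                   # a single-channel input

down = torch.nn.Conv2d(1, 4, kernel_size=3, stride=2)          # convolution: downsampling
up = torch.nn.ConvTranspose2d(4, 1, kernel_size=3, stride=2,
                              output_padding=1)                # transposed convolution: upsampling

z = down(x)
x_rec = up(z)
print(z.shape, x_rec.shape)   # torch.Size([1, 4, 13, 13]) torch.Size([1, 1, 28, 28])
```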
2.3 Continuous filter
In contrast to discrete convolution as described in the previous Section, continuous two-dimensional
convolution is mathematically defined as:
$$I_{out}(x, y) = \int_X \int_Y I(x + \tau_x, y + \tau_y) \cdot \mathcal{K}(\tau_x, \tau_y)\, d\tau_x\, d\tau_y, \qquad (2.3.1)$$
where $\mathcal{K}: X \times Y \to \mathbb{R}$ is the continuous filter function, and $I: \Omega \subset \mathbb{R}^2 \to \mathbb{R}$ is the input function.
The continuous filter function is approximated using an MLP, and is thus trainable during the training
phase. In order to maintain the parallelism with discrete convolution in CNNs, the definition
adopted for continuous convolution differs from the mathematical one, for which $X = Y = \mathbb{R}$. In
fact, the continuous filter presented here is defined on a closed domain, smaller than the input function
domain, as in the case of the discrete filter. The integral in Equation 2.3.1 can be evaluated
¹ In many deep learning implementations the term convolution indicates what is known in mathematics as cross-correlation [16]. In this text, the term convolution will be used to indicate cross-correlation, thus adapting to the deep learning community convention.
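To make the idea of Figure 1 concrete, the following sketch shows one possible way to realize a continuous filter: an MLP maps coordinates inside the fixed, closed filter domain to filter values, and the convolution with scattered input points is approximated by a sum of products, in the spirit of Equation 2.3.1. This is a minimal illustration under our own assumptions (PyTorch, a square filter window mapped to [0, 1]², a plain sum over the points falling inside the window); it is not claimed to be the exact implementation used in this work.

```python
import torch

# MLP kernel: maps a 2D coordinate in the filter domain to a filter value (Fig. 1b).
kernel = torch.nn.Sequential(
    torch.nn.Linear(2, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1)
)

def continuous_conv(points, values, center, width=0.1):
    """Approximate (I * K)(center) on unstructured data.

    points: (N, 2) coordinates of the unstructured input, values: (N,) input values.
    Points falling into the filter window are mapped to the filter domain (Fig. 1a)
    and combined with the MLP kernel values by a sum of products (Fig. 1c).
    """
    inside = ((points - center).abs() <= width / 2).all(dim=1)
    local = (points[inside] - center) / width + 0.5   # map to the [0, 1]^2 filter domain
    weights = kernel(local).squeeze(-1)               # trainable filter values
    return (weights * values[inside]).sum()

# Toy unstructured input: random points with values of an assumed function.
pts = torch.rand(500, 2)
vals = torch.sin(2 * torch.pi * pts[:, 0]) * pts[:, 1]
out = continuous_conv(pts, vals, center=torch.tensor([0.5, 0.5]))
```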