
DYNAMICAL SYSTEMS’ BASED NEURAL NETWORKS
ELENA CELLEDONI∗, DAVIDE MURARI∗, BRYNJULF OWREN∗, CAROLA-BIBIANE SCHÖNLIEB†, AND FERDIA SHERRY†
Abstract. Neural networks have gained much interest because of their effectiveness in many
applications. However, their mathematical properties are generally not well understood. If there is
some underlying geometric structure inherent to the data or to the function to approximate, it is often
desirable to take this into account in the design of the neural network. In this work, we start with
a non-autonomous ODE and build neural networks using a suitable, structure-preserving, numerical
time-discretisation. The structure of the neural network is then inferred from the properties of the
ODE vector field. Besides injecting more structure into the network architectures, this modelling
procedure allows a better theoretical understanding of their behaviour. We present two universal
approximation results and demonstrate how to impose some particular properties on the neural
networks. A particular focus is on 1-Lipschitz architectures that include layers which are not themselves 1-Lipschitz.
These networks are expressive and robust against adversarial attacks, as shown for the CIFAR-10
and CIFAR-100 datasets.
Key words. Neural networks, dynamical systems, Lipschitz networks, structure-preserving
deep learning, universal approximation theorem.
MSC codes. 65L05, 65L06, 37M15
1. Introduction. Neural networks have been employed to accurately solve many
different tasks (see, e.g., [4,12,54,35]). Indeed, because of their excellent approximation
properties, ability to generalise to unseen data, and efficiency, neural networks are
one of the preferred techniques for the approximation of functions in high-dimensional
spaces.
In spite of this popularity, a substantial number of results and success stories in
deep learning still rely on empirical evidence, and more theoretical insight is needed.
Recently, a number of scientific papers on the mathematical foundations of neural
networks have appeared in the literature [9,71,60,61,66,33]. In a similar spirit,
many authors design deep learning architectures that take into account specific
mathematical properties such as stability, symmetries, or constraints on the
Lipschitz constant [36,31,26,63,22,27,67,34,69,73]. Even so, structure is often
imposed on neural networks in an ad hoc manner, which makes the resulting
input-to-output mapping F : X → Y hard to analyse. In this paper, we describe
a general and systematic way to impose desired mathematical structure on neural
networks, leading to a more tractable analysis.
There have been multiple attempts to formulate unifying principles for the design
of neural networks. Among them are Geometric Deep Learning (see e.g. [13,12]),
Neural ODEs (see e.g. [21,47,57,72]), and the continuous-in-time interpretation of
Recurrent Neural Networks (see e.g. [58,20]) and of Residual Neural Networks (see
e.g. [71,44,18,59,1]). In this work, we focus on Residual Neural Networks (ResNets)
and build upon their continuous interpretation.
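To fix ideas, a ResNet block x_{k+1} = x_k + h f(x_k, θ_k) can be read as one explicit Euler step of the non-autonomous ODE ẋ(t) = f(x(t), θ(t)), with the full network being the composition of such steps. The following minimal sketch (in PyTorch, which the paper does not prescribe) illustrates this reading; the choice of vector field f, the hidden width, and the step size h are illustrative assumptions rather than the specific structure-preserving architectures developed later.

import torch
import torch.nn as nn

class EulerResNetBlock(nn.Module):
    # One residual block viewed as an explicit Euler step
    # x_{k+1} = x_k + h * f(x_k, theta_k) of x'(t) = f(x(t), theta(t)).
    def __init__(self, dim, hidden, h=0.1):
        super().__init__()
        self.h = h
        # Illustrative parametric vector field f(., theta_k); not the paper's choice.
        self.f = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim))

    def forward(self, x):
        return x + self.h * self.f(x)

class EulerResNet(nn.Module):
    # Composition of blocks: the numerical flow after n_layers Euler steps.
    def __init__(self, dim, hidden, n_layers, h=0.1):
        super().__init__()
        self.blocks = nn.ModuleList(
            [EulerResNetBlock(dim, hidden, h) for _ in range(n_layers)]
        )

    def forward(self, x):
        for block in self.blocks:
            x = block(x)
        return x

# Example: propagate a batch of 8 points in R^4 through 10 Euler steps.
net = EulerResNet(dim=4, hidden=16, n_layers=10, h=0.1)
y = net(torch.randn(8, 4))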
Neural networks are compositions of parametric maps, i.e. we can characterise a neu-
∗Department of Mathematical Sciences, NTNU, N-7491 Trondheim, Norway
(elena.celledoni@ntnu.no, davide.murari@ntnu.no, brynjulf.owren@ntnu.no)
†Department of Applied Mathematics and Theoretical Physics, University of Cambridge,
Wilberforce Road, Cambridge CB3 0WA, UK (cbs31@cam.ac.uk, fs436@cam.ac.uk)