Towards Global Neural Network Abstractions with
Locally-Exact Reconstruction
Edoardo Manino (a,*), Iury Bessa (b), Lucas Cordeiro (a)
(a) University of Manchester, Department of Computer Science, Oxford Road, Manchester M13 9PL, United Kingdom
(b) Federal University of Amazonas, Department of Electricity, Avenida General Rodrigo Octávio Jordão Ramos, 1200 - Coroado I, Manaus - AM, 69067-005, Brazil
* Corresponding author. Email addresses: edoardo.manino@manchester.ac.uk (Edoardo Manino), iurybessa@ufam.edu.br (Iury Bessa), lucas.cordeiro@manchester.ac.uk (Lucas Cordeiro)
Abstract
Neural networks are a powerful class of non-linear functions. However, their
black-box nature makes it difficult to explain their behaviour and certify
their safety. Abstraction techniques address this challenge by transforming
the neural network into a simpler, over-approximated function. Unfortu-
nately, existing abstraction techniques are slack, which limits their applica-
bility to small local regions of the input domain. In this paper, we propose
Global Interval Neural Network Abstractions with Center-Exact Reconstruc-
tion (GINNACER). Our novel abstraction technique produces sound over-
approximation bounds over the whole input domain while guaranteeing exact
reconstructions for any given local input. Our experiments show that GIN-
NACER is several orders of magnitude tighter than state-of-the-art global
abstraction techniques, while being competitive with local ones.
Keywords: Neural Networks, Abstract Interpretation, Global Abstraction.
1. Introduction
Neural networks have been applied to a large number of predictive tasks.
The main reason for their versatility is their ability to approximate a large
class of nonlinear functions [1, 2]. However, at the same time, trained neural
networks tend to have a black-box nature, which limits our ability to explain
their behaviour [3]. More worryingly, even when we have a priori formal
expectations on their behaviour, e.g., robustness to input noise, the act of
certifying whether a neural network satisfies our requirements comes at a
very high computational cost [4, 5].
A common strategy in dealing with the complexity of analysing neural
networks is transforming them into equivalent models described by better-
behaved classes of functions. Examples from theoretical works include max-
affine splines [6], residual networks with one neuron per layer [7], prototypical
“sawtooth” rectified linear unit (ReLU) networks [8, 9] and Radon or ridgelet
integral representations [10, 11]. However, the purpose of these transforma-
tions is to prove properties over whole classes of network architectures, e.g.,
the existence of approximating networks with given error bounds, rather than
providing a constructive algorithm that outputs such a network.
In stark contrast with these works, there exists a variety of approximate
techniques that can compress the size of a given neural network. Pruning
techniques can remove network parameters whose contribution is negligi-
ble [12, 13]. Distillation techniques train a simpler network to approximate
the output of the hidden layers of the original network [14]. Quantisation
techniques reduce the accuracy of arithmetic operations to take advantage
of hardware acceleration [15]. Frequently, these techniques are combined to
minimise the computational cost of executing a neural network in terms of
power, latency or memory [16, 17]. However, they offer no formal guarantees
on the approximation errors introduced during the compression process.
Ideally, we would like to compress any given neural network while keeping
formal guarantees on the original model. A recent development in such di-
rection is lossless compression [18, 19, 20]. The algorithms in this family can
identify which ReLU activation functions in the network are always active or
inactive and remove them accordingly. Unfortunately, lossless algorithms can
only produce equivalent networks; thus, their compression scope is somewhat
limited [21].
More promising techniques can be grouped under the umbrella term of
abstractions [22, 23]. An abstraction is a set-valued mapping, i.e., a function
whose output is a set instead of a point. In particular, this output set is
guaranteed to be an over-approximation of the original and more complex
function. More formally, for a given abstraction’s domain of validity, the im-
age of the original function is always within the abstraction’s output set [24].
Depending on that domain of validity, we discriminate between local and
global abstractions.
On the one hand, local abstractions are valid for a small region of the
input domain only [25, 26, 27, 28]. As such, they are suitable for verifying
the robustness of neural networks to local perturbations. Typically, their
design is neuron-centric: each activation function is over-approximated by
bounding its output between a set of linear [29] or quadratic constraints [26].
Still, these abstractions’ main goal is not to reduce the network complexity
but to encode it into a set of constraints that a solver can handle.
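As a concrete illustration of this neuron-centric style, the sketch below implements the standard triangle relaxation of a single ReLU, assuming its pre-activation bounds $l < 0 < u$ are already known. It is a generic example of the linear constraints in [29], not the specific encoding of any particular tool.

```python
import numpy as np

def relu_triangle_relaxation(l, u):
    """Linear bounds for y = max(0, x) on [l, u], assuming l < 0 < u.

    Returns (a_up, b_up, a_lo, b_lo) such that
        a_lo * x + b_lo <= max(0, x) <= a_up * x + b_up   for every x in [l, u].
    """
    a_up = u / (u - l)          # upper bound: chord through (l, 0) and (u, u)
    b_up = -a_up * l
    # One common lower-bound choice: the tighter of y = 0 and y = x.
    a_lo, b_lo = (1.0, 0.0) if u >= -l else (0.0, 0.0)
    return a_up, b_up, a_lo, b_lo

# Sanity check on a grid of pre-activation values.
l, u = -2.0, 3.0
a_up, b_up, a_lo, b_lo = relu_triangle_relaxation(l, u)
x = np.linspace(l, u, 101)
y = np.maximum(0.0, x)
assert np.all(a_lo * x + b_lo <= y + 1e-9) and np.all(y <= a_up * x + b_up + 1e-9)
```

The lower-bound selection above is the usual area-minimising heuristic; other relaxations, e.g. the quadratic constraints of [26], trade tightness for solver support.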
On the other hand, global abstractions are valid for any input [30, 31].
In this regard, Prabhakar and Afzal [30] propose to transform the original
neural network so that it can operate in interval arithmetic [32] and main-
tain an upper and lower bound on the value of each neuron. Furthermore, by
over-approximating the individual intervals, the authors can merge similar
neurons and reduce the overall complexity of the global abstraction func-
tion. Similarly, Elboher et al. [31] introduce a global abstraction algorithm
that creates multiple copies of each neuron and then merges them together.
The advantage of their algorithm is that the abstraction function is itself an-
other neural network, which grants a high degree of integration with existing
verification tools [33].
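To give a flavour of the interval-arithmetic view, the following sketch propagates an input box through a fully-connected ReLU network. It only illustrates the underlying mechanism, not the neuron-merging constructions of [30] or [31], and the toy network weights are made up for the example.

```python
import numpy as np

def interval_forward(weights, biases, x_lo, x_hi):
    """Propagate an input box [x_lo, x_hi] through a fully-connected ReLU network.

    weights: list of (out_dim, in_dim) arrays; biases: list of (out_dim,) arrays.
    Returns sound element-wise output bounds (y_lo, y_hi).
    """
    lo, hi = np.asarray(x_lo, float), np.asarray(x_hi, float)
    for k, (W, b) in enumerate(zip(weights, biases)):
        W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
        # Interval matrix-vector product: positive weights take the matching bound,
        # negative weights take the opposite one.
        new_lo = W_pos @ lo + W_neg @ hi + b
        new_hi = W_pos @ hi + W_neg @ lo + b
        if k < len(weights) - 1:            # ReLU on hidden layers only
            new_lo, new_hi = np.maximum(new_lo, 0.0), np.maximum(new_hi, 0.0)
        lo, hi = new_lo, new_hi
    return lo, hi

# Toy 2-2-1 network, purely illustrative.
Ws = [np.array([[1.0, -1.0], [0.5, 2.0]]), np.array([[1.0, -1.0]])]
bs = [np.array([0.0, -1.0]), np.array([0.5])]
print(interval_forward(Ws, bs, [-1.0, -1.0], [1.0, 1.0]))
```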
A significant drawback of local and global abstraction techniques is their
slackness [34, 31]. In general, there is a trade-off between the complexity of
the abstracted neural network and the size of the over-approximation margin
it produces. In practice, even abstracting a few neurons may generate large
output sets that become meaningless. This phenomenon is not surprising:
established techniques from control theory, like Taylor approximations [35]
and Bernstein polynomials [36], are known to suffer from the same over-
approximation explosion when applied to highly non-linear functions.
This paper proposes an algorithm to compute Global Interval Neural
Network Abstractions with Center-Exact Reconstruction (GINNACER). Our
approach to the complexity-approximation trade-off is novel: we focus on
guaranteeing exact reconstruction for a given input centroid while keeping
sound over-approximation bounds elsewhere. This way, we obtain tighter
global abstractions in a specific local region of the input. Empirically, our
GINNACER abstractions are orders of magnitude tighter than state-of-the-
art global abstractions, while remaining on par with local abstractions.
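This contract can be stated operationally: the abstraction must bound the original network everywhere and collapse to an exact reconstruction at the chosen centroid. The snippet below is only an empirical sanity check of that contract (sampling can falsify soundness but never prove it); the function names are illustrative and do not come from the paper's code.

```python
import numpy as np

def check_contract(f, g_lo, g_hi, x0, sample_input, n_samples=10_000, tol=1e-6):
    """Probe the two GINNACER requirements for a candidate abstraction (g_lo, g_hi).

    1. Global soundness: g_lo(x) <= f(x) <= g_hi(x) on every sampled input.
    2. Center-exact reconstruction: g_lo(x0) == f(x0) == g_hi(x0).
    """
    x0 = np.asarray(x0, dtype=float)
    exact = (np.allclose(g_lo(x0), f(x0), atol=tol)
             and np.allclose(g_hi(x0), f(x0), atol=tol))
    for _ in range(n_samples):
        x = sample_input()
        y = f(x)
        if np.any(g_lo(x) > y + tol) or np.any(y > g_hi(x) + tol):
            return False, exact   # soundness violated at x
    return True, exact

# Deliberately trivial toy abstraction of f(x) = sin(x), exact at x0 = 1.
f    = lambda x: np.sin(x)
g_lo = lambda x: np.sin(x) - np.abs(x - 1.0)
g_hi = lambda x: np.sin(x) + np.abs(x - 1.0)
print(check_contract(f, g_lo, g_hi, x0=1.0,
                     sample_input=lambda: np.random.uniform(-5, 5)))
```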
More specifically, our contributions are the following:
• We introduce the Inactive Canonical Form (ICF), an exact transformation of ReLU networks where all activation functions are inactive for a specific input centroid.
• We introduce a layer-wise abstraction that can handle interval inputs and outputs and uses fewer ReLU activations than the original layer.
• We introduce the GINNACER algorithm, which computes an individual abstraction for each layer of the network. The abstracted layers are aligned so that their output intervals have size zero for a specific input centroid.
• We compare GINNACER with state-of-the-art global and local abstractions. Our comparison is not dependent on a downstream task (e.g., robustness verification) like in previous studies but measures the quality of the abstraction in isolation.
The rest of this paper is organized as follows. In Section 2 we clarify the
differences between GINNACER and the existing literature. In Section 3 we
introduce a motivating example and formally state the GINNACER require-
ments. In Section 4, we present our GINNACER algorithm and provide proof
of its theoretical properties. In Section 5, we compare GINNACER against
the state-of-the-art and study its empirical properties. Finally, in Section 6,
we discuss the potential uses of GINNACER, the connections with related
research fields and a roadmap for future work. The data, neural networks
and code of our experiments are freely available at [37].
2. Related Work
Due to the functional complexity of modern neural networks, there have
been many efforts to cast them to a simpler class of functions. The pur-
pose and application of these neural transformation methods are very dif-
ferent in nature. Here, we propose a high-level taxonomy that focuses on
the mathematical relationship between the original neural network $f$ and its
transformed version $g$.
First, we have neural network approximations, where $f(x) \approx g(x)$. In this
category, neural network compression methods are the most prominent [16,
17]. Their goal is to reduce the computational requirements of the neural
network at inference time. These methods include pruning individual weights
or neurons [12, 13], executing the neural network in integer arithmetic or
any similar quantized representation [15], or training a smaller network to
mimic the hidden representations of the original one [14]. A more recent
development in the approximation category comes from privacy-preserving
machine learning, where it has been proposed to approximate the non-linear
activation functions of a neural network with high-degree polynomials [38,
39]. This approximation allows inference on private inputs via homomorphic
encryption. In general, neural network approximations give no guarantees
on the approximation error and are usually evaluated in terms of the loss of
predictive accuracy with respect to the original network.
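As a concrete instance of such an approximation, a low-degree polynomial surrogate for the sigmoid can be obtained by a least-squares fit over the input range expected at inference time. The range and degree below are arbitrary choices for illustration, and the only quality measure is the empirical maximum error on that range, not a formal bound.

```python
import numpy as np

# Fit a degree-3 polynomial to the sigmoid over an assumed input range [-6, 6].
x = np.linspace(-6.0, 6.0, 1001)
sigmoid = 1.0 / (1.0 + np.exp(-x))
coeffs = np.polyfit(x, sigmoid, deg=3)   # least-squares fit
poly = np.polyval(coeffs, x)

# Only an empirical error estimate: nothing is guaranteed outside [-6, 6].
print("max |sigmoid - poly| on [-6, 6]:", float(np.max(np.abs(sigmoid - poly))))
```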
Second, we have equivalent transformations, where $f(x) = g(x)$, $\forall x$. On
the one hand, some of these efforts are theoretical in nature, and their pur-
pose is proving some general mathematical properties of neural networks [6].
On the other hand, there have been efforts towards constructing lossless com-
pression algorithms for neural networks [18, 19, 20]. These algorithms can
identify the neurons whose activation function state is either active or in-
active for all possible inputs x. Then, these neurons can be safely replaced
with a linear function or removed altogether. However, only a minority of
the neurons in the original network can be processed in this way.
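A common way to detect such stably active or inactive neurons is to bound each pre-activation over the whole input domain, e.g. with interval arithmetic as in the earlier sketch. The layer-level check below illustrates the idea; the thresholding logic is generic and not the exact procedure of [18, 19, 20].

```python
import numpy as np

def stable_relu_mask(W, b, in_lo, in_hi):
    """Classify each ReLU of a dense layer given an input box [in_lo, in_hi].

    Returns boolean masks (always_active, always_inactive); neurons in neither
    mask are unstable and must keep their non-linearity.
    """
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    pre_lo = W_pos @ in_lo + W_neg @ in_hi + b   # lower pre-activation bound
    pre_hi = W_pos @ in_hi + W_neg @ in_lo + b   # upper pre-activation bound
    always_active = pre_lo >= 0.0    # ReLU acts as the identity: replace by a linear map
    always_inactive = pre_hi <= 0.0  # ReLU always outputs zero: neuron can be removed
    return always_active, always_inactive

# Toy layer with one stably active, one stably inactive and one unstable neuron.
W = np.array([[2.0, 0.0], [-1.0, -1.0], [0.3, -0.2]])
b = np.array([1.0, -3.0, 0.0])
print(stable_relu_mask(W, b, np.array([0.0, 0.0]), np.array([1.0, 1.0])))
```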
Third, we have global functional abstractions, where $\underline{g}(x) \leq f(x) \leq \overline{g}(x)$, $\forall x$. The main challenge is maintaining a valid upper and lower bound
for any $x$ in the input domain. Prabhakar et al. [30] achieve this by al-
lowing the entire neural network to operate in interval arithmetic. Instead,
Elboher et al. [31] propose to keep explicit copies of each neuron that carry
the upper and lower bound information. Both methods can merge neurons
with similar weights, thus yielding a pair of functions $\underline{g}, \overline{g}$ that contains fewer
non-linearities. However, the functions $\underline{g}, \overline{g}$ produced by the aforementioned
methods tend to be quite slack, as we show in our experiments (see Sec-
tions 3.1 and 5.3). In contrast, GINNACER yields tighter abstractions while
maintaining global soundness.
Fourth, we have local functional abstractions, where $\underline{g}(x) \leq f(x) \leq \overline{g}(x)$
for a small sub-domain $x \in \mathcal{D}$. These have the advantage that they only need
to represent a small portion of the original function f. Traditionally, non-
linear systems have been analysed with Taylor models [40], which give a local
Taylor polynomial approximation and a corresponding worst error interval.
This corpus of techniques has been applied to neural networks with a fair de-
gree of success [35, 41]. In contrast, more recent advances in local functional
abstractions for neural networks have been driven by linear techniques. For