Partially Oblivious Neural Network Inference
Panagiotis Rizomiliotis, Christos Diou, Aikaterini Triakosia,
Ilias Kyrannas and Konstantinos Tserpes
Department of Informatics and Telematics,
Harokopio University of Athens
Omirou 9, 17778, Athens, Greece
{prizomil,cdiou,ktriakos,kyrannas,tserpes}@hua.gr
Abstract
Oblivious inference is the task of outsourcing an ML
model, such as a neural network, without disclosing critical
and sensitive information, like the model's parameters.
One of the most prominent solutions for secure oblivious
inference is based on powerful cryptographic
tools, like Homomorphic Encryption (HE) and/or
multi-party computation (MPC). Even though the implementation
of oblivious inference systems has improved
impressively over the last decade, there are
still significant limitations on the ML models that they
can practically support, especially when both the
ML model's and the input data's confidentiality must
be protected. In this paper, we introduce the notion of
partially oblivious inference. We empirically show that
for neural network models, like CNNs, some information
leakage can be acceptable. We therefore propose a
novel trade-off between security and efficiency. In our
research, we investigate the impact of partial leakage of
the CNN model's weights on security and inference
runtime performance. We experimentally demonstrate
that in a CIFAR-10 network we can leak up to
80% of the model's weights with practically no security
impact, while the necessary HE multiplications are
performed four times faster.
1 Introduction
Artificial intelligence (AI), and in particular machine
learning (ML) technology, is transforming almost every
business in the world. ML provides the ability to obtain
deep insights from data sets and to create models
that outperform any human or expert system in critical
tasks, like face recognition, medical diagnosis, and
financial prediction. Many companies offer such ML-based
operations as a service (Machine Learning as a
Service, MLaaS). MLaaS allows clients to benefit
from ML models without the cost of building and
maintaining an in-house ML system. There are three
parties involved in the transaction: the data owner,
the model owner, and the infrastructure provider.
However, the use of ML models raises crucial security
and privacy concerns. The data set used for
ML model training and/or the MLaaS client's input
in the inference phase can leak sensitive personal or
business information. To complete the picture of security
threats, in several applications, such as medical or
financial ones, the ML models are considered the MLaaS
provider's intellectual property, and they must be protected.
Oblivious inference is the task of running an ML
model without disclosing the client's input, the model's
prediction, and/or while protecting the ownership of the
trained model. This field of research is also referred to
as privacy-preserving machine learning (PPML).
Several solutions for oblivious inference have been
proposed that utilize powerful cryptographic tools, like
Multi-Party Computation (MPC) primitives and
Homomorphic Encryption (HE) schemes. MPC-based
protocols facilitate the computation of an arbitrary
function on private inputs from multiple parties. These
protocols have significant communication overhead, as
they are highly interactive. On the other hand, HE
cryptography allows computations to be performed on
encrypted data, but with significant computation and
storage overhead.
Several PPML schemes have been proposed that are
either based solely on one of these technologies or
leverage a combination of them (hybrid schemes).
So far, the literature has focused on two attack models: it
is assumed either that the model owner is also the infrastructure
provider or that the ML model in use is publicly known.
This is a reasonable choice,
as in both cases the ML model's weights can be used
in plaintext form. That is, the scheme designers
avoid expensive computations between ciphertexts and
thus introduce inference systems that are practical.
In this paper, we consider use cases in which the
ML model's confidentiality must be protected. The
service provider wants to outsource the ML prediction
computation (for instance, to a cloud provider or to an
edge device). However, the ML model constitutes intellectual
property, and its privacy must be preserved.
Protecting both the client's input data and the
model's privacy can prohibitively increase the computational
complexity, as all the computations must be
performed on encrypted data. As a rough
estimate, the runtime of a single HE multiplication
is about ten times higher when it is performed between two
ciphertexts than when it is performed between a
plaintext and a ciphertext. At the same time, HE multiplications
between encrypted data (ciphertexts) significantly
increase the accumulated level of noise and
limit the applicability of the HE schemes. Thus,
they must be avoided whenever possible.
arXiv:2210.15189v1 [cs.CR] 27 Oct 2022
Building on this observation, we introduce the notion
of partially oblivious (PO) inference. In a PO inference
system, the ML model owner decides to leak
some of the model's weights in order to improve the
efficiency of the inference process. PO inference can
be seen as a generalization of oblivious inference that
offers a trade-off between security and efficiency.
PO inference systems lie between two extreme use
cases: the most secure but least efficient, in which
all the ML model weights are encrypted, and the least
secure but most efficient, in which all the weights
are revealed. The optimal point of equilibrium between
efficiency and security depends on the use case.
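The trade-off above can be illustrated with a hypothetical back-of-the-envelope cost model (not the paper's actual cost analysis): if a fraction of the weights is revealed, the corresponding multiplications become cheap plaintext-ciphertext operations, while the rest remain expensive ciphertext-ciphertext ones. The 10x cost ratio below is taken from the rough estimate given earlier; the function name and structure are illustrative.

```python
# Back-of-the-envelope cost model for partially oblivious inference,
# using the rough estimate that a ciphertext-ciphertext HE
# multiplication costs about 10x a plaintext-ciphertext one.
CT_CT_COST = 10.0  # relative cost of a ciphertext-ciphertext multiplication
PT_CT_COST = 1.0   # relative cost of a plaintext-ciphertext multiplication

def relative_mul_cost(leaked_fraction):
    """Average per-weight multiplication cost when a fraction of the
    model's weights is revealed (and thus multiplied in plaintext)."""
    return (leaked_fraction * PT_CT_COST
            + (1.0 - leaked_fraction) * CT_CT_COST)

for p in (0.0, 0.5, 0.8, 1.0):
    print(p, relative_mul_cost(p))
# Leaking 80% of the weights drops the average multiplication cost
# from 10.0 to 2.8, roughly consistent with the ~4x speed-up reported.
```
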
Our contributions are summarized as follows:
1. We introduce the notion of Partially Oblivious inference
for ML models.
2. We provide a security definition for the evaluation
of the information leakage impact. In our analysis,
the attacker is passive ("honest-but-curious") and
aims to compute a model that simulates the
protected one as accurately as possible. We use
accuracy improvement as our security metric.
3. As a proof-of-concept use case, we apply the notion
of PO inference to protect Convolutional Neural
Network (CNN) inference.
4. We experimentally measure the security and performance
trade-off. We use two models trained
on the MNIST [15] and CIFAR-10 [14] datasets,
respectively. For the PO inference implementation,
a Gazelle-like [12] approach is used. Impressively,
it is shown that in some scenarios, leakage
of more than 80% of the model's weights can be
acceptable.
The paper is organized as follows. In Section 2, the
necessary background is provided. In Section 3, we
analyze our motivation, introduce the attack
model and the security definition for PO inference, and
demonstrate the application of PO inference to
CNN models. Finally, in Section 4, we implement and
evaluate the two CNN models, and in Section 5, we
conclude the paper.
1.1 Related work
There are several lines of work on PPML systems that
leverage advanced cryptographic tools, like MPC and
HE. The most promising solutions are hybrid, using
HE to protect the linear layers and MPC to protect
the non-linear layers.
CryptoNets [10] was the first scheme to deploy the
HE primitive for PPML on the MNIST benchmark. In
the same research line, CHET [9], SEALion [22] and
Faster CryptoNets [7] use HE and retrained networks.
There are HE-based schemes that use pre-trained networks,
like Chimera [4] and Pegasus [16]. In the pre-trained
PPML category, we can find several proposals
that use only MPC schemes, like ABY3 [18] and
XONN [20].
The most promising type of PPML system is hybrid,
i.e., proposals that use both MPC and HE
schemes. Hybrid HE-MPC schemes provide an elegant
solution for pre-trained networks: MPC is responsible
for the non-linear part (activation functions) and
HE for the linear transformations (fully connected and
convolutional layers). Gazelle [12] is a state-of-the-art hybrid
scheme for CNN prediction, and several works have followed,
like Delphi [17], nGraph-HE [3], nGraph-HE2
[2], and PlaidML-HE [5]. All these schemes assume
that either the model owner runs the model locally or
the ML model is publicly known.
There are several open-source HE libraries that implement
the operations of HE schemes and offer a
higher-level API [23], and there is an ongoing effort to
standardize APIs for HE schemes [1]. However, dealing
directly with HE libraries and operations is still
a very challenging task for developers. To
facilitate developers' work, HE compilers have been
proposed to offer a high-level abstraction. A
nice overview of existing HE compilers is given in [23].
2 Background
2.1 Homomorphic Encryption
In the last decade, the performance of HE schemes has
improved impressively, by up to several orders of magnitude,
thanks to advances in theory and to more
efficient implementations. However, HE is still significantly
slower than plaintext computation, and realizing
HE-based computations is complex for the non-expert.
Modern HE schemes fall into one of two main categories:
schemes that compute logical gates, and are
thus most efficient for generic applications,
and schemes that operate naturally on arithmetic or
p-ary circuits, and are thus used for the evaluation
of polynomial functions. The CKKS [6] scheme
belongs to the second category. As it operates on arithmetic
circuits over approximate complex and real numbers,
CKKS is well suited to machine learning applications.
We use it in our experiments.
Following the latest version of the HE Standard [1], all
schemes must support the following types of operations:
key and parameter management, encryption
and decryption, HE evaluation of additions
and multiplications, and noise management.
2.2 HE evaluation operations cost
Practically all modern HE schemes are based on
the hardness of the Learning With Errors (LWE) problem
[19] and its polynomial ring variant. Depending on
the scheme, the plaintexts, the keys, and the ciphertexts
are elements of Z_q^n or Z_q[X]/(X^n + 1), i.e., they are
either vectors of integers or polynomials with integer
coefficients.
To protect a message m, a randomly selected
vector (or polynomial) e is drawn from a distribution
and added to produce a noisy version of m. The
level B of this added noise must always lie between
two bounds, B_min and B_max. When B < B_min, the
ciphertext cannot protect the message, while when B >
B_max, the noise cannot be removed and the correct
message can no longer be retrieved.
Thus, it is crucial to manage the level of noise induced
by the HE operations. It has been demonstrated
that the best noise-management approach is to treat
the ciphertext's noise level B as an invariant. That is,
after each HE operation, the level of noise must
remain close to B.
In the CKKS [6] scheme, a ciphertext is a pair
of polynomials c = (c_0, c_1) over the polynomial ring
Z_q[X]/(X^N + 1), for appropriately selected integers q
and N. The four main evaluation operations of the CKKS
scheme are summarized as follows:
1. Plaintext-Ciphertext Addition
Let m and m' be two plaintexts and c' = (c'_0, c'_1)
be the encryption of m'. The output of the
addition is c_out = (m + c'_0, c'_1); it decrypts to
m + m' and the noise level remains B.
2. Ciphertext Addition
Let c = (c_0, c_1) and c' = (c'_0, c'_1) be the encryptions
of plaintexts m and m'. The output of the
addition is c_out = (c_0 + c'_0, c_1 + c'_1) and it is a
ciphertext of m + m' (approximately, with good
accuracy). The level of noise is upper bounded by
2B.
3. Plaintext-Ciphertext Multiplication
Let m and m' be two plaintexts and c' = (c'_0, c'_1) be
the encryption of m'. The output c_out =
(m · c'_0, m · c'_1) decrypts to m · m' and the level of
noise is mB.
4. Ciphertext Multiplication
Let c = (c_0, c_1) and c' = (c'_0, c'_1) be the encryptions
of plaintexts m and m'. The output of the
multiplication consists of three polynomials, c_out = (c_0 ·
c'_0, c_0 · c'_1 + c'_0 · c_1, c_1 · c'_1), and the noise level is B^2.
It is clear that ciphertext multiplication is the
problematic operation. The number of ciphertext polynomials
increases linearly (one more polynomial after each
multiplication) and the noise level increases exponentially
(it becomes B^(2^L) after L consecutive multiplications).
To manage this growth in size and noise, two
refresh-type operations are applied. To bring
the dimension of the output ciphertext back to two,
the relinearization algorithm is used. The resulting
ciphertext c''_out is an encryption of m · m' and
its noise level is B^2. For noise management, an
algorithm called rescale (or modulus switching in other
HE schemes) is used. However, it can be applied only
a limited and predetermined number of times, usually
equal to the multiplicative depth L of the arithmetic
circuit.
Both algorithms, rescaling and relinearization, are
costly in terms of computational complexity, and both
are applied after each multiplication between
two ciphertexts. Relinearization has approximately the
same computational cost as ciphertext multiplication,
and it requires an evaluation key. The evaluation
keys are created by the encryptor and passed to
the evaluator.
To summarize, the HE multiplication between
ciphertexts is a very costly operation in terms of computational
overhead and noise management. Compared
to ciphertext multiplication, the other three HE
evaluation operations are practically free.
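The exponential noise growth and the role of rescaling can be illustrated with a short back-of-the-envelope calculation. All numbers below are toy values chosen for illustration, not parameters of a real scheme.

```python
# Toy illustration of noise growth under consecutive ciphertext
# multiplications, with and without rescaling.
B = 2.0          # initial noise level
B_MAX = 2 ** 60  # toy upper bound on tolerable noise

# Without rescaling: noise squares at every multiplication, B -> B^(2^L).
noise = B
depth_reached = 0
while noise ** 2 <= B_MAX:
    noise = noise ** 2
    depth_reached += 1
print(depth_reached)  # only a handful of multiplications fit

# With rescaling: noise is kept close to the invariant B after each
# multiplication, so depth is limited only by the number of allowed
# rescale operations (the circuit's multiplicative depth L).
L_levels = 10
noise = B
for _ in range(L_levels):
    noise = noise ** 2   # multiplication squares the noise...
    noise = B            # ...and rescaling restores the invariant (idealized)
print(noise)  # stays at the invariant B
```
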
2.3 Plaintext Packing
One of the main features of some HE schemes that
dramatically improves performance is plaintext packing (also
referred to as batching). It allows several scalar values
to be encoded in the same plaintext. Thus, for schemes
with a cyclotomic polynomial of degree N, it is possible
to store up to N/2 values in the same plaintext (we
refer to them as slots). Homomorphic operations
can then be performed component-wise in Single Instruction
Multiple Data (SIMD) manner. This encoding
has several limitations, since there is no random-access
operation and only cyclic rotations of the slots are
allowed.
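The slot model can be sketched as follows. This is a plain-Python simulation of the SIMD semantics only (no encryption); the slot count and helper names are illustrative.

```python
# Toy sketch of SIMD-style plaintext packing: several scalars share one
# "plaintext", operations act slot-wise, and only cyclic rotations are
# available (no random slot access).
SLOTS = 4  # a real CKKS plaintext with ring degree N offers N/2 slots

def pack(values):
    assert len(values) <= SLOTS
    return values + [0.0] * (SLOTS - len(values))

def slot_add(a, b):
    return [x + y for x, y in zip(a, b)]

def slot_mul(a, b):
    return [x * y for x, y in zip(a, b)]

def rotate(a, k):
    """Cyclic left rotation by k slots: the only data movement allowed."""
    k %= SLOTS
    return a[k:] + a[:k]

a = pack([1.0, 2.0, 3.0, 4.0])
b = pack([10.0, 20.0, 30.0, 40.0])
print(slot_mul(a, b))   # [10.0, 40.0, 90.0, 160.0]
print(rotate(a, 1))     # [2.0, 3.0, 4.0, 1.0]

# Slot-wise products plus log(SLOTS) rotate-and-add steps yield an
# inner product in every slot -- the core trick behind packed linear layers.
s = slot_mul(a, b)
for shift in (2, 1):
    s = slot_add(s, rotate(s, shift))
print(s[0])  # 300.0 = 1*10 + 2*20 + 3*30 + 4*40
```
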
There are various choices for plaintext packing in
ML, i.e., how the input data and the model weights are
organized in plaintexts (or ciphertexts). Depending on
the workload, there are two main packing approaches:
batch-axis packing and inter-axis packing.
Batch-axis packing is used by CryptoNets,
nGraph-HE and nGraph-HE2. It packs a 4D tensor
of shape (B, C, H, W), where B is the batch size, C the
number of channels, and H, W the height and width of
the input, along the batch axis. That is, each plaintext
(or ciphertext) holds B slots, and C · H · W of them
are needed. This approach assumes that B inputs are
available for each inference operation.
On the other hand, inter-axis packing is used when
each input is processed separately, i.e., it is not necessary
to collect B inputs before performing a prediction
(this is common in medical diagnosis). There are several
packing choices, all of which encode scalars from
the same input. This approach is used by Gazelle, in
which a different packing is used for each type of linear
transformation. We will use inter-axis packing in our
analysis. In Section 3.4, we provide more details on the
different inter-axis packing choices.
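The ciphertext counts implied by the two approaches can be contrasted with a small hypothetical helper. The function names are ours, and the CIFAR-10-sized shape and slot count below are illustrative examples, not the paper's experimental configuration.

```python
# Hypothetical helper contrasting ciphertext counts for the two packing
# strategies described above.
def batch_axis_ciphertexts(B, C, H, W):
    # One ciphertext per scalar position, each holding B batch slots.
    return C * H * W

def inter_axis_ciphertexts(C, H, W, slots):
    # All scalars come from a single input; pack them slots at a time.
    total = C * H * W
    return -(-total // slots)  # ceiling division

# CIFAR-10-sized input (3 channels, 32x32) with N = 8192 -> 4096 slots.
print(batch_axis_ciphertexts(1, 3, 32, 32))     # 3072 ciphertexts
print(inter_axis_ciphertexts(3, 32, 32, 4096))  # 1 ciphertext
```

This makes the trade-off concrete: batch-axis packing amortizes well only when B inputs are available at once, while inter-axis packing keeps single-input latency low.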
2.4 CNN models
Neural-network inference has been identified as the
main application area for privacy-preserving technology,
and especially for HE and MPC schemes, as we
have seen in Section 1.1. However, there are practical
limits to the complexity of the use cases that can be