VISUAL PROMPTING FOR ADVERSARIAL ROBUSTNESS
Aochuan Chen⋆,1  Peter Lorenz⋆,2  Yuguang Yao1  Pin-Yu Chen3  Sijia Liu1
1Michigan State University, USA
2Fraunhofer ITWM and Fraunhofer Center of Machine Learning, Germany
3IBM Research, USA
ABSTRACT
In this work, we leverage visual prompting (VP) to improve adversarial robustness of a fixed, pre-trained model at test time. Compared to conventional adversarial defenses, VP allows us to design universal (i.e., data-agnostic) input prompting templates, which have plug-and-play capabilities at test time to achieve desired model performance without introducing much computation overhead. Although VP has been successfully applied to improving model generalization, it remains elusive whether and how it can be used to defend against adversarial attacks. We investigate this problem and show that the vanilla VP approach is not effective in adversarial defense since a universal input prompt lacks the capacity for robust learning against sample-specific adversarial perturbations. To circumvent it, we propose a new VP method, termed Class-wise Adversarial Visual Prompting (C-AVP), to generate class-wise visual prompts so as to not only leverage the strengths of ensemble prompts but also optimize their interrelations to improve model robustness. Our experiments show that C-AVP outperforms the conventional VP method, with 2.1× standard accuracy gain and 2× robust accuracy gain. Compared to classical test-time defenses, C-AVP also yields a 42× inference time speedup. Code is available at https://github.com/Phoveran/vp-for-adversarial-robustness.
Index Terms— visual prompting, adversarial defense, adversarial robustness
1. INTRODUCTION
Machine learning (ML) models can easily be manipulated (by an adversary) into outputting drastically different classifications. Model robustification against adversarial attacks is therefore now a major focus of research. Yet, a large volume of existing work has focused on training recipes and/or model architectures to gain robustness. Adversarial training (AT) [1], one of the most effective defenses, adopts min-max optimization to minimize the worst-case training loss induced by adversarial attacks. Extending AT, various defense methods have been proposed, ranging from supervised learning to semi-supervised learning, and further to unsupervised learning [2–11].
⋆ Equal contribution.
Although the design of robust training has achieved tremendous success in improving model robustness [12, 13], it typically incurs an intensive computation cost and scales poorly to a fixed, pre-trained ML model. Towards circumventing this difficulty, the problem of test-time defense arises; see the seminal work of Croce et al. [14]. Test-time defense alters either a test-time input example or a small portion of the pre-trained model. Examples include input (anti-adversarial) purification [15–17] and model refinement by augmenting the pre-trained model with auxiliary components [18–20]. However, these defense techniques inevitably increase the inference time and hamper test-time efficiency [14]. Inspired by this, our work advances test-time defense technology by leveraging the idea of visual prompting (VP) [21], also known as model reprogramming [22–25].

Generally speaking, VP [21] creates a universal (i.e., data-agnostic) input prompting template (in terms of input perturbations) in order to improve the generalization ability of a pre-trained model when such a visual prompt is incorporated into test-time examples. It shares the same idea as model reprogramming [22–25] and unadversarial examples [26], which optimize a universal perturbation pattern to maneuver (i.e., reprogram) the functionality of a pre-trained model towards a desired criterion, e.g., cross-domain transfer learning [24], out-of-distribution generalization [26], and fairness [25]. However, it remains elusive whether or not VP can be designed as an effective solution to adversarial defense. We investigate this problem, which we call adversarial visual prompting (AVP), in this work. Compared to conventional test-time defense methods, AVP significantly reduces the inference time overhead since visual prompts can be designed offline over training data and applied to any test data in a plug-and-play manner. We summarize our contributions below.
① We formulate and investigate the problem of AVP for the first time and empirically show that the conventional data-agnostic VP design is incapable of gaining adversarial robustness.
② We propose a new VP method, termed class-wise AVP (C-AVP), which produces multiple, class-wise visual prompts with explicit optimization of their couplings to gain better adversarial robustness.
③ We provide insightful experiments to demonstrate the pros and cons of VP in adversarial defense.
2. RELATED WORK
Visual prompting. Originating from the idea of in-context learning or prompting in natural language processing (NLP) [27–30], VP was first proposed in Bahng et al. [21] for vision models. Before VP was formalized in Bahng et al. [21], the underlying prompting technique had also been devised in computer vision (CV) under different names. For example, VP is closely related to adversarial reprogramming or model reprogramming [22–24, 31–33], which focuses on altering the functionality of a fixed, pre-trained model across domains by augmenting test-time examples with an additional (universal) input perturbation pattern. Unadversarial learning also shares a similar idea with VP. In [26], unadversarial examples that perturb original inputs using 'prompting' templates were introduced to improve out-of-distribution generalization. Yet, the problem of VP for adversarial defense remains under-explored.
Adversarial defense.
The lack of adversarial robustness is a weakness of ML models. Adversarial defense, such as adversarial detection [19, 34–38] and robust training [2, 6, 9, 10, 18, 39], is a current research focus. In particular, adversarial training (AT) [1] is the most widely used defense strategy and has inspired many recent advances in adversarial defense [12, 13, 20, 40–42]. However, these AT-type defenses (with the goal of robustness-enhanced model training) are computationally intensive due to min-max optimization over model parameters. To reduce the computation overhead of robust training, the problem of test-time defense arises [14], which aims to robustify a given model via lightweight unadversarial input perturbations (a.k.a. input purification) [15, 43] or minor modifications to the fixed model [44, 45]. Among the different kinds of test-time defenses, the most relevant work to ours is anti-adversarial perturbation [17].
3. PROBLEM STATEMENT
Visual prompting. We describe the problem setup of VP following [21, 23–25]. Specifically, let $\mathcal{D}_{\mathrm{tr}}$ denote a training set for supervised learning, where $(\mathbf{x}, y) \in \mathcal{D}_{\mathrm{tr}}$ signifies a training sample with feature $\mathbf{x}$ and label $y$, and let $\boldsymbol{\delta}$ be the visual prompt to be designed. The prompted input with respect to (w.r.t.) $\mathbf{x}$ is then given by $\mathbf{x} + \boldsymbol{\delta}$. Different from adversarial attack generation, which optimizes $\boldsymbol{\delta}$ for erroneous prediction, VP drives $\boldsymbol{\delta}$ to minimize the performance loss $\ell$ of a pre-trained model $\boldsymbol{\theta}$. This leads to

$$\underset{\boldsymbol{\delta}}{\text{minimize}}\;\; \mathbb{E}_{(\mathbf{x}, y) \in \mathcal{D}_{\mathrm{tr}}}\big[\ell(\mathbf{x} + \boldsymbol{\delta}; y, \boldsymbol{\theta})\big] \quad \text{subject to} \quad \boldsymbol{\delta} \in \mathcal{C}, \tag{1}$$

where $\ell$ denotes the prediction error given the training data $(\mathbf{x}, y)$ and base model $\boldsymbol{\theta}$, and $\mathcal{C}$ is a perturbation constraint. Following [21, 23, 24], $\mathcal{C}$ restricts $\boldsymbol{\delta}$ so that $\mathbf{x} + \boldsymbol{\delta} \in [0, 1]$ for any $\mathbf{x}$. Projected gradient descent (PGD) [1, 26] can then be applied to solve problem (1). In the evaluation, $\boldsymbol{\delta}$ is integrated into test data to improve the prediction ability of $\boldsymbol{\theta}$.
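To make problem (1) concrete, below is a minimal PyTorch-style sketch of training a universal visual prompt against a frozen classifier. The loader, image size, optimizer, and hyperparameters are illustrative assumptions rather than the authors' exact setup, and the constraint $\mathcal{C}$ is enforced simply by clamping the prompted input to $[0, 1]$.

```python
# Minimal sketch of vanilla visual prompting, i.e., problem (1):
# only the universal prompt `delta` is trained; the pre-trained model stays frozen.
import torch
import torch.nn.functional as F

def train_visual_prompt(model, train_loader, epochs=10, lr=0.1, device="cuda"):
    model.eval()
    for p in model.parameters():          # freeze the base model theta
        p.requires_grad_(False)
    # Universal (data-agnostic) additive prompt, broadcast over the batch.
    delta = torch.zeros(1, 3, 32, 32, device=device, requires_grad=True)
    opt = torch.optim.SGD([delta], lr=lr, momentum=0.9)
    for _ in range(epochs):
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            # Constraint C: keep the prompted input in the valid pixel range [0, 1].
            x_prompted = torch.clamp(x + delta, 0.0, 1.0)
            loss = F.cross_entropy(model(x_prompted), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return delta.detach()
```

Because $\boldsymbol{\delta}$ is learned offline and merely added to each incoming example, applying it at test time costs essentially nothing beyond the original forward pass, which is the plug-and-play property exploited throughout this work.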
Adversarial visual prompting. Inspired by the usefulness of VP in improving model generalization [21, 24], we ask:

(AVP problem) Can VP (1) be extended to robustify $\boldsymbol{\theta}$ against adversarial attacks?

At first glance, the AVP problem seems trivial if we specify the performance loss $\ell$ as the adversarial training loss [1, 2]:

$$\ell_{\mathrm{adv}}(\mathbf{x} + \boldsymbol{\delta}; y, \boldsymbol{\theta}) = \underset{\|\mathbf{x}' - \mathbf{x}\|_\infty \le \epsilon}{\text{maximize}}\;\; \ell(\mathbf{x}' + \boldsymbol{\delta}; y, \boldsymbol{\theta}), \tag{2}$$

where $\mathbf{x}'$ denotes the adversarial input that lies in the $\ell_\infty$-norm ball centered at $\mathbf{x}$ with radius $\epsilon > 0$.

Recall from (1) that conventional VP requires $\boldsymbol{\delta}$ to be universal across training data. Thus, we term universal AVP (U-AVP) the following problem, obtained by integrating (1) with (2):

$$\underset{\boldsymbol{\delta} \in \mathcal{C}}{\text{minimize}}\;\; \lambda\,\mathbb{E}_{(\mathbf{x}, y) \in \mathcal{D}_{\mathrm{tr}}}\big[\ell(\mathbf{x} + \boldsymbol{\delta}; y, \boldsymbol{\theta})\big] + \mathbb{E}_{(\mathbf{x}, y) \in \mathcal{D}_{\mathrm{tr}}}\big[\ell_{\mathrm{adv}}(\mathbf{x} + \boldsymbol{\delta}; y, \boldsymbol{\theta})\big], \quad \text{(U-AVP)}$$

where $\lambda > 0$ is a regularization parameter to strike a balance between generalization and adversarial robustness [2].
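As a rough sketch of how (U-AVP) can be optimized, one step of the alternating min-max routine described below might look as follows: the inner maximization runs a few PGD steps on the input, and the outer minimization takes a gradient step on the prompt. The helper names, the $\ell_\infty$ attack budget, and the step sizes are assumptions for illustration, not the authors' implementation; `delta` and `opt` are assumed to be set up as in the previous sketch.

```python
# Sketch of a single U-AVP update: inner PGD attack, then an outer prompt step.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, delta, eps=8/255, alpha=2/255, steps=10):
    """Inner maximization: find x' in the eps-ball around x that maximizes the loss."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(torch.clamp(x_adv + delta, 0, 1)), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + torch.clamp(x_adv - x, -eps, eps)   # project back into the eps-ball
        x_adv = torch.clamp(x_adv, 0, 1)
    return x_adv.detach()

def u_avp_step(model, x, y, delta, opt, lam=1.0):
    """Outer minimization: adversarial loss plus lambda-weighted standard loss."""
    x_adv = pgd_attack(model, x, y, delta.detach())
    loss_adv = F.cross_entropy(model(torch.clamp(x_adv + delta, 0, 1)), y)
    loss_std = F.cross_entropy(model(torch.clamp(x + delta, 0, 1)), y)
    loss = loss_adv + lam * loss_std                    # matches the (U-AVP) objective
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```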
Fig. 1: Example of designing U-AVP for adversarial defense on (CIFAR-10, ResNet18), measured by robust accuracy against PGD attacks [1] with different numbers of steps. The robust accuracy at 0 steps is the standard accuracy.
The problem (U-AVP) can be effectively solved using a standard min-max optimization method, which involves two alternating optimization routines: inner maximization and outer minimization. The former generates adversarial examples as in AT, and the latter produces the visual prompt $\boldsymbol{\delta}$ as in (1). At test time, the effectiveness of $\boldsymbol{\delta}$ is measured from two aspects: (1) standard accuracy, i.e., the accuracy on $\boldsymbol{\delta}$-integrated benign examples, and (2) robust accuracy, i.e., the accuracy on $\boldsymbol{\delta}$-integrated adversarial examples (generated against the victim model $\boldsymbol{\theta}$). Despite the succinctness of (U-AVP), Fig. 1 shows its ineffectiveness in defending against adversarial attacks. Compared to the vanilla VP (1), it also suffers a significant standard accuracy drop (over 50% in Fig. 1, corresponding to 0 PGD attack steps), while robust accuracy is only enhanced by a small margin (around 18% against PGD attacks). The negative results in Fig. 1 are not quite surprising, since a data-agnostic input prompt $\boldsymbol{\delta}$ has limited learning capacity to enable adversarial defense. Thus, it is non-trivial to tackle the problem of AVP.
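For reference, the two test-time metrics above can be sketched as follows, reusing the hypothetical `pgd_attack` helper from the previous sketch; whether the test-time attacker is aware of the prompt $\boldsymbol{\delta}$ is an evaluation choice that this sketch simply inherits from that helper.

```python
# Sketch: standard accuracy (prompted benign inputs) vs. robust accuracy
# (prompted adversarial inputs crafted against the victim model).
import torch

@torch.no_grad()
def _accuracy_count(model, x, y, delta):
    pred = model(torch.clamp(x + delta, 0, 1)).argmax(dim=1)
    return (pred == y).float().sum().item()

def evaluate(model, test_loader, delta, attack_steps=10, device="cuda"):
    n = std_correct = rob_correct = 0
    for x, y in test_loader:
        x, y = x.to(device), y.to(device)
        std_correct += _accuracy_count(model, x, y, delta)        # standard accuracy
        x_adv = pgd_attack(model, x, y, delta, steps=attack_steps)
        rob_correct += _accuracy_count(model, x_adv, y, delta)    # robust accuracy
        n += y.size(0)
    return std_correct / n, rob_correct / n
```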
4. CLASS-WISE ADVERSARIAL VISUAL PROMPT
No free lunch for class-wise visual prompts. A direct extension of (U-AVP) is to introduce multiple adversarial visual prompts, each of which corresponds to one class in the training set $\mathcal{D}_{\mathrm{tr}}$. If we split $\mathcal{D}_{\mathrm{tr}}$ into class-wise training sets $\{\mathcal{D}_{\mathrm{tr}}^{(i)}\}_{i=1}^{N}$