
attacks that can reduce clustering performance significantly [10, 11]². However, no such adversarial
attacks exist for deep clustering methods. Furthermore, no work investigates generalized blackbox attacks, where the adversary has zero knowledge of the deep clustering model being used. This is the most realistic setting under which a malicious adversary could aim to disrupt these models. While a large body of such work exists for supervised learning models, deep clustering approaches have not received the same attention from the community.
The closest work to ours proposes robust deep clustering [9, 12] by retraining models with adversarially perturbed inputs to improve clustering performance. However, this line of work has several shortcomings: 1) it lacks fundamental analysis of attacks specific to deep clustering models (e.g., the state-of-the-art robust deep clustering model RUC [9] only considers images perturbed via FGSM/BIM [13, 14], which are common attack approaches for supervised learning), 2) no clearly defined threat models for the adversary are proposed³, and 3) there is no transferability [15] analysis. Thus, in this work, we seek to bridge this gap by proposing generalized blackbox attacks that operate in the input space under a well-defined adversarial threat model. We also conduct an empirical analysis to observe how effectively adversarial samples transfer between different clustering models.
Figure 1: Adversarial samples generated by our attack (first 4 image pairs from the left correspond to SPICE and the others to RUC); each pair is shown with its pre-attack and post-attack cluster label.
We utilize Generative Adversarial Networks (GANs) [16] for our attack, inspired by previous approaches (AdvGAN [17], AdvGAN++ [18], etc.) for supervised learning. We also utilize a number of defense approaches (essentially deep-learning-based anomaly detection [19] and state-of-the-art "robust" deep clustering models [9]) to determine if our adversarial samples can be mitigated by an informed defender. One of the major findings of our work is that these approaches are unable to prevent our adversarial attacks. Through our work, we seek to promote the development of better defenses against adversarial attacks on deep clustering.
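To make the GAN-based attack pipeline concrete, the sketch below gives a minimal PyTorch-style illustration of an AdvGAN-style setup (illustrative only; not our exact architecture, losses, or training procedure): a generator produces an L∞-bounded perturbation, a discriminator encourages perturbed images to look natural, and an attack term pushes a clustering model's soft assignments away from their clean values. All class and function names are hypothetical, and the sketch assumes gradient access to a differentiable victim or surrogate; in the blackbox setting, the victim's outputs would instead be obtained through queries.

# Illustrative AdvGAN-style sketch (hypothetical names; not our exact method).
# Assumes: `victim` is a frozen clustering network returning cluster logits,
# images are in [0, 1], and gradients through `victim` (or a surrogate) exist.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerturbationGenerator(nn.Module):
    """Maps an image to an L-infinity-bounded adversarial version of itself."""
    def __init__(self, channels=3, eps=8 / 255):
        super().__init__()
        self.eps = eps
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1), nn.Tanh(),  # output in [-1, 1]
        )

    def forward(self, x):
        delta = self.eps * self.net(x)            # perturbation bounded by eps
        return torch.clamp(x + delta, 0.0, 1.0)   # keep valid pixel range

class Discriminator(nn.Module):
    """Scores how natural an image looks (real vs. perturbed)."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 1),
        )

    def forward(self, x):
        return self.net(x)

def generator_loss(victim, G, D, x):
    """Generator objective: look real to D and change the victim's assignments."""
    x_adv = G(x)
    with torch.no_grad():
        p_clean = victim(x).softmax(dim=1)        # clean soft cluster assignments
    logp_adv = victim(x_adv).log_softmax(dim=1)   # assignments after perturbation
    d_out = D(x_adv)
    # GAN term: perturbed images should be scored as real by the discriminator.
    gan_term = F.binary_cross_entropy_with_logits(d_out, torch.ones_like(d_out))
    # Attack term: maximize divergence from the clean assignments (hence the minus).
    attack_term = -F.kl_div(logp_adv, p_clean, reduction="batchmean")
    return gan_term + attack_term

In a full attack of this style, the generator and discriminator would be trained alternately, and the number of times the victim (or its surrogate) must be evaluated determines the query budget.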
Finally, to truly showcase how powerful our attacks can be, we attack a production-level ML-as-a-Service (MLaaS) API that performs a clustering task (and possibly utilizes deep clustering models in the backend). We find that our attack can also significantly degrade the performance of such MLaaS clustering API services. To summarize, we make the following contributions:
• We propose the first blackbox adversarial attack against deep clustering models. We show that our attacks can significantly reduce the performance of these models while requiring a minimal number of queries. We also undertake a transferability analysis to demonstrate the broader reach of our attack.
• We undertake a thorough experimental analysis of most state-of-the-art (SOTA) deep clustering models on a number of real-world datasets, such as CIFAR-10 [20], CIFAR-100 [21], and STL-10 [22], which shows that our attack is applicable across a number of models and datasets.
• We show that existing (unsupervised) defense approaches (such as anomaly detection and robustness via adversarial retraining) cannot thwart our attack samples, thus prompting the need for better defense approaches against adversarial attacks on deep clustering models.
• We also attack a production-level MLaaS clustering API to showcase the extent of our attack. We find that our attack is highly successful for this real-world task, underscoring the need for making deep clustering models truly robust.
Figure 1 shows some of the adversarial samples generated by our attack for the SPICE [8] and RUC [9] deep clustering models on the STL-10 dataset, and their corresponding pre-attack and post-attack predicted cluster/class labels⁴. To the human eye, these samples appear indistinguishable from each
² These attacks cannot be used for deep clustering because: 1) they employ computationally intensive optimizers that use exhaustive search and hence make the attack too expensive for high-dimensional data, and 2) they are designed for traditional clustering and are training-time attacks, whereas deep clustering models are deployed frozen for inference, requiring a test-time attack.
³ For example, for RUC [9], it is implicitly assumed that the adversary has knowledge of the dataset as well as ground truth labels, and will attack using supervised whitebox attacks.
⁴ Class labels are inferred by taking the majority over the ground-truth labels of samples in that cluster.