Towards Generating Adversarial Examples on
Mixed-type Data
Han Xu
Michigan State University, East Lansing, Michigan, USA
xuhan1@msu.edu
Zhimeng Jiang
Texas A&M University, College Station, Texas, USA
zhimengj@tamu.edu
Menghai Pan, Huiyuan Chen, Xiaoting Li, Mahashweta Das, Hao Yang
VISA Research, Palo Alto, California, USA
Abstract—The existence of adversarial attacks (or adversarial examples) raises serious concerns about the safety of machine learning (ML) models. In many safety-critical ML tasks, such as financial forecasting, fraud detection, and anomaly detection, the data samples are usually mixed-type: they contain numerical and categorical features at the same time. However, how to generate adversarial examples on mixed-type data remains largely unstudied. In this paper, we propose a novel attack algorithm, M-Attack, which effectively generates adversarial examples for mixed-type data. With M-Attack, an attacker can mislead a targeted classification model's prediction by only slightly perturbing both the numerical and categorical features of a given data sample. More importantly, by adding carefully designed regularizations, the generated adversarial examples can evade potential detection models, which makes the attack truly insidious. Through extensive empirical studies, we validate the effectiveness and efficiency of our attack method and evaluate the robustness of existing classification models against the proposed attack. The experimental results highlight the feasibility of generating adversarial examples against machine learning models in real-world applications.
I. INTRODUCTION
The existence of adversarial attacks (or adversarial examples) [1]–[4] has raised serious concerns about applying machine learning (ML) models to safety-critical tasks. In many ML applications in web services or mobile applications, the data inputs are often mixed-type, containing both numerical and categorical features. For example, an online financial institution may train ML models to evaluate whether a loan applicant is able to repay a loan. In this scenario, the data has numerical features, e.g., the applicant's age, account balance, and annual income, as well as categorical features, including his/her educational background, occupation type, and marital status. Similarly, in recommender systems for online shopping, the data also contains both numerical and categorical information, such as the price and categories of the recommended items. However, how to conduct adversarial attacks on mixed-type data still lacks thorough exploration. Therefore, in this paper, we focus on the problem of generating adversarial examples for mixed-type data. Specifically, we aim to answer the question: given a well-trained ML model, how can we perturb a data sample with an unnoticeable distortion such that the model is misled into a wrong prediction? The studied problem is crucial in practice. Recall the financial institution example above: if an unqualified applicant provides a fake profile containing a few fraudulent records, so that he/she can fool the trained ML model into approving the loan, it will cause a substantial loss to the financial institution.
Achieving this attack goal poses significant challenges. First, the attacker is expected to modify as few features of the original (clean) data sample as possible, so that the perturbation remains unnoticeable and insidious. This requires the perturbation added to the original sample to be sparse in the input data space. Notably, there are existing methods for sparse adversarial attacks in either the numerical or the categorical data domain, separately. In the numerical domain, Projected Gradient Descent (PGD) methods are adopted to guide the search for adversarial examples and project the perturbation into a continuous l1-norm-bounded space to enforce sparsity of the perturbation [1], [5]. In the categorical domain, search-based methods [6], [7] are employed to iteratively find the top-K categorical features that have the largest influence on the model prediction, and then search for the optimal perturbation of these K features. However, our task involves both numerical and categorical features, and there is still no established method to generate optimal sparse adversarial attacks over the targeted search space, which is a Cartesian product of a discrete space (for categorical features) and a continuous space (for numerical features). Moreover, our experimental results in Section V suggest that simply combining existing strategies usually provides sub-optimal solutions. For example, an algorithm that first applies the search-based methods [6], [7] to perturb categorical features, and then applies the l1-PGD method [1], [5] to find numerical perturbations, cannot reliably find the strongest adversarial examples (i.e., those leading to the maximal loss value of the targeted classifier); a sketch of such a two-stage baseline is given below. This fact highlights the necessity of devising new adversarial attack
algorithms exclusively designed for mixed-type data.
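To make the two-stage baseline concrete, the following is a minimal, hypothetical sketch (not the paper's exact baseline): a greedy search over categorical substitutions followed by l1-PGD on the numerical features. It assumes a PyTorch classifier `model` that takes the concatenation of numerical features and one-hot encoded categorical features; the helper names, budget `k`, and step sizes are illustrative.

```python
import torch

def project_l1_ball(v, eps):
    """Euclidean projection of v onto the l1 ball of radius eps
    (sorting-based method of Duchi et al., 2008)."""
    if v.abs().sum() <= eps:
        return v
    u, _ = torch.sort(v.abs().flatten(), descending=True)
    css = torch.cumsum(u, dim=0)
    idx = torch.arange(1, u.numel() + 1, dtype=v.dtype, device=v.device)
    rho = torch.nonzero(u * idx > css - eps).max()
    theta = (css[rho] - eps) / (rho + 1.0)
    return torch.sign(v) * torch.clamp(v.abs() - theta, min=0.0)

def greedy_categorical_step(model, x_num, x_cat, y, cat_dims, k=2):
    """Stage 1: greedily substitute the k category values that most
    increase the classification loss (search-based attack)."""
    loss_fn = torch.nn.CrossEntropyLoss()
    x_cat = x_cat.clone()
    for _ in range(k):
        best = (None, None, None, None)            # (loss, offset, dim, value)
        offset = 0
        for dim in cat_dims:                       # each categorical field
            for v in range(dim):                   # each candidate category
                cand = x_cat.clone()
                cand[0, offset:offset + dim] = 0.0
                cand[0, offset + v] = 1.0
                loss = loss_fn(model(torch.cat([x_num, cand], dim=1)), y).item()
                if best[0] is None or loss > best[0]:
                    best = (loss, offset, dim, v)
            offset += dim
        _, o, d, v = best                          # commit the best substitution
        x_cat[0, o:o + d] = 0.0
        x_cat[0, o + v] = 1.0
    return x_cat

def l1_pgd_numerical_step(model, x_num, x_cat, y, eps=1.0, steps=20, alpha=0.2):
    """Stage 2: projected gradient ascent on numerical features,
    with the perturbation projected back into an l1 ball."""
    loss_fn = torch.nn.CrossEntropyLoss()
    delta = torch.zeros_like(x_num, requires_grad=True)
    for _ in range(steps):
        loss = loss_fn(model(torch.cat([x_num + delta, x_cat], dim=1)), y)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += alpha * grad.sign()
            delta.copy_(project_l1_ball(delta, eps))
    return (x_num + delta).detach()
```

Because the categorical search is frozen before the numerical step begins, the two stages cannot trade perturbation budget against each other, which is one intuition for why such decoupled baselines tend to be sub-optimal.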
Second, the attacker should also keep the generated adversarial examples seemingly benign. In other words, adversarial examples should stay close to the distribution of the original clean data samples. This goal is hard to achieve because the features in mixed-type data are usually highly correlated. For example, in the Home Credit Default Risk dataset [8], where the task is to predict a person's qualification for a loan, the feature "age" of an applicant is strongly related to other numerical features (such as "number of children") and categorical features (such as "family status"). In this dataset, if an attacker perturbs the feature "family status" from "child" to "parent" for an 18-year-old loan applicant, the perturbed sample obviously deviates from the true distribution of clean data samples (because 18-year-old parents are rare in reality). As a result, the generated adversarial example can easily be detected as an "abnormal" sample, either by a defender that applies out-of-distribution (OOD) detection (or anomaly detection) methods [9]–[11], or by human experts who can judge the authenticity of data samples based on their domain knowledge. Thus, the attacker should generate adversarial examples that do not significantly violate the correlation between any pair of numerical features, any pair of categorical features, or any pair of one categorical and one numerical feature.
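As an illustration of why such correlation violations are risky for the attacker, the snippet below is a minimal sketch, not taken from the paper, of how a defender could flag an out-of-distribution sample with a plain Mahalanobis-distance test on numerically encoded features; the synthetic data, encoding, and threshold are all illustrative assumptions.

```python
import numpy as np

def fit_gaussian(X_clean):
    """Estimate mean and (regularised) inverse covariance of clean encoded data."""
    mu = X_clean.mean(axis=0)
    cov = np.cov(X_clean, rowvar=False) + 1e-6 * np.eye(X_clean.shape[1])
    return mu, np.linalg.inv(cov)

def mahalanobis(x, mu, cov_inv):
    """Mahalanobis distance of a single sample to the clean-data Gaussian."""
    d = x - mu
    return float(np.sqrt(d @ cov_inv @ d))

# Usage: flag any sample whose distance exceeds, say, the 99th percentile
# of distances observed on clean data.
rng = np.random.default_rng(0)
X_clean = rng.normal(size=(1000, 8))        # stand-in for encoded clean samples
mu, cov_inv = fit_gaussian(X_clean)
threshold = np.percentile([mahalanobis(x, mu, cov_inv) for x in X_clean], 99)
x_adv = X_clean[0] + 3.0                    # crude stand-in for a correlation-breaking perturbation
print(mahalanobis(x_adv, mu, cov_inv) > threshold)   # likely flagged as abnormal
```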
In this paper, to tackle the aforementioned challenges, we propose M-Attack, a novel attack framework for mixed-type data. In detail, we first transform the search space of mixed-type adversarial examples into a unified continuous space (see Figure 1). To achieve this, we convert the problem of finding adversarial categorical perturbations into the problem of finding the probabilistic (categorical) distribution from which the adversarial categorical features are sampled. This allows us to find sparse adversarial examples in the unified continuous space by simultaneously updating the numerical and categorical perturbations via gradient-based methods (a minimal sketch of this relaxation follows the contribution list below). Furthermore, to generate in-distribution adversarial examples, we propose a Mixed-Mahalanobis Distance to measure and regularize the distance of an (adversarial) example to the distribution of the clean mixed-type data samples. Through extensive experiments, we verify that: (1) M-Attack achieves better attack performance than representative baselines, which combine existing numerical and categorical attack methods; (2) M-Attack achieves better (or comparable) efficiency compared with these baselines; and (3) the Mixed-Mahalanobis Distance helps the generated adversarial examples stay close to the true distribution of the clean data. Our contributions can be summarized as follows:
• We propose an efficient and effective algorithm, M-Attack, to generate adversarial examples for mixed-type data.
• We propose a novel measurement, the Mixed-Mahalanobis Distance, which helps the generated adversarial examples stay close to the true distribution.
• We conduct extensive experiments to validate the feasibility of M-Attack, and demonstrate the vulnerability of popular classification models against M-Attack.
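The sketch below illustrates the unified-continuous-space idea referenced above; it is an assumption-laden simplification, not the paper's exact M-Attack procedure. Each categorical field is relaxed into a probability vector via a softmax over trainable logits, so the numerical perturbation and all categorical distributions can be updated jointly by gradient ascent on the classification loss; the function name, learning rate, and step count are hypothetical.

```python
import torch
import torch.nn.functional as F

def unified_space_attack(model, x_num, y, cat_dims, steps=50, lr=0.1):
    """Jointly optimise a numerical perturbation and one relaxed categorical
    distribution per field in a single continuous search space."""
    loss_fn = torch.nn.CrossEntropyLoss()
    delta = torch.zeros_like(x_num, requires_grad=True)                  # numerical perturbation
    logits = [torch.zeros(1, d, requires_grad=True) for d in cat_dims]   # per-field logits
    opt = torch.optim.Adam([delta] + logits, lr=lr)
    for _ in range(steps):
        # relaxed "one-hot" vectors: one probability distribution per field
        probs = torch.cat([F.softmax(l, dim=1) for l in logits], dim=1)
        out = model(torch.cat([x_num + delta, probs], dim=1))
        loss = -loss_fn(out, y)     # minimising the negative loss = ascending the loss
        # a sparsity penalty on delta and a Mahalanobis-style regulariser
        # would be added to `loss` here in a fuller implementation
        opt.zero_grad()
        loss.backward()
        opt.step()
    # discretise each learned distribution back to a concrete category
    x_cat_adv = torch.cat(
        [F.one_hot(l.argmax(dim=1), d).float() for l, d in zip(logits, cat_dims)],
        dim=1)
    return (x_num + delta).detach(), x_cat_adv
```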
II. RELATED WORKS
The adversarial robustness of machine learning models has become increasingly important in recent years. Most existing works, including evasion attacks and poisoning attacks, focus on continuous input spaces, especially in the image domain [1], [12], [13]. In the image domain, because the data space (the space of pixel values) is continuous, gradient-based methods [1], [5] are used to find adversarial examples. Based on these gradient attack methods, various defense methods, such as adversarial training [1], [14], have been proposed to improve model robustness. Meanwhile, adversarial attacks on discrete input data, such as text data with categorical features, have also started to attract researchers' attention. In the natural language processing domain, the work [15] discusses the potential to attack text classification models, such as sentiment analysis, by replacing words with their synonyms. The study [16] proposes to modify text tokens based on the gradient of the input one-hot vectors. The method [17] develops a scoring function to select the most effective attack, together with a simple character-level transformation that replaces projected-gradient or multiple linguistic-driven steps on text data. In the domain of graph data learning, there are methods [18] that greedily search for perturbations of the graph structure to mislead the targeted models. In conclusion, when the data space is discrete, these methods share a similar core idea: they apply search-based methods to find adversarial perturbations. Although there are well-established methods for either the numerical or the categorical domain separately, studies on mixed-type data are still lacking. However, mixed-type data widely exist in various machine learning tasks in the physical world. For example, in the fraudulent transaction detection systems [19], [20] of credit card institutions, the transaction records from cardholders may include features such as the transaction amount (numerical) and the type of the purchased product (categorical). Similarly, for ML models applied in AI healthcare [21], [22], e.g., in epidemiological studies [23], [24], the data can be collected from surveys that ask for the respondents' information, including their age, gender, race, the type of medical treatment, and the expenditure on each type of medical supplies used. In recommender systems [25]–[27] of online-shopping websites built for product recommendation, the data can include the purchase history of the clients or the properties of the products, both containing numerical and categorical features. In this paper, we focus on the problem of how to slightly perturb the input (mixed-type) data sample of a model to mislead the model's prediction. This study is of great significance in helping us understand the potential risk of ML models under malicious perturbations in the applications mentioned above.