algorithms exclusively designed for mixed-type data.
Second, the attacker should also keep the generated adversarial examples seemingly benign. In other words, adversarial examples should stay close to the distribution of the original clean data samples. This goal is hard to achieve because features in mixed-type data are usually highly correlated.
For example, in the Home Credit Default Risk dataset [8], where the task is to predict a person's qualification for a loan, the feature "age" of an applicant is strongly correlated with other numerical features (such as "number of children") and categorical features (such as "family status"). In this dataset, if an attacker perturbs the feature "family status" from "child" to "parent" for an 18-year-old loan applicant, the perturbed sample obviously deviates from the true distribution of clean data samples (because 18-year-old parents are rare in reality). As a result, the generated adversarial example can easily be flagged as an "abnormal" sample by a defender that applies Out-of-Distribution (OOD) detection (or anomaly detection) methods [9]–[11], or by human experts who can judge the authenticity of data samples based on their domain knowledge. Thus, the attacker should generate adversarial examples that do not significantly violate the correlations between pairs of numerical features, pairs of categorical features, or pairs of categorical and numerical features.
In this paper, to tackle the aforementioned challenges, we propose M-Attack, a novel attack framework for mixed-type data. Specifically, we first transform the search space of mixed-type adversarial examples into a unified continuous space (see Figure 1). To this end, we convert the problem of finding adversarial categorical perturbations into the problem of finding the (categorical) probability distribution from which the adversarial categorical features are sampled. This allows us to find sparse adversarial examples in the unified continuous space by simultaneously updating the numerical and categorical perturbations with gradient-based methods.
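For intuition, one common way to realize such a relaxation, sketched here purely as an illustration (the notation below is ours, and the exact parameterization used by M-Attack may differ), is to attach logits $\theta \in \mathbb{R}^{K}$ to a categorical feature with $K$ candidate values and to treat the adversarial category as a sample from the induced distribution $p = \mathrm{softmax}(\theta)$. The attack then becomes a single continuous optimization problem over the numerical perturbation $\delta$ and the logits $\theta$:
$$
p_k=\frac{\exp(\theta_k)}{\sum_{j=1}^{K}\exp(\theta_j)},\qquad
\min_{\delta,\,\theta}\ \mathbb{E}_{\tilde{x}_{\mathrm{cat}}\sim\mathrm{Cat}(p)}\Big[\mathcal{L}_{\mathrm{adv}}\big(f(x_{\mathrm{num}}+\delta,\ \tilde{x}_{\mathrm{cat}})\big)\Big],
$$
where $f$ is the target classifier, $x_{\mathrm{num}}$ and $x_{\mathrm{cat}}$ denote the numerical and categorical parts of the input, and $\mathcal{L}_{\mathrm{adv}}$ is a loss encouraging misclassification. Because $p$ is differentiable in $\theta$, gradients with respect to both $\delta$ and $\theta$ can be estimated (e.g., with a Gumbel-softmax style relaxation or a score-function estimator), so both kinds of perturbations are updated simultaneously.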
Furthermore, to generate in-distribution adversarial examples, we propose the Mix-Mahalanobis Distance to measure and regularize the distance between an (adversarial) example and the distribution of the clean mixed-type data samples. Through extensive experiments, we verify that: (1) M-Attack achieves better attack performance than representative baselines, which combine existing numerical and categorical attack methods; (2) M-Attack achieves better (or comparable) efficiency compared with these baselines; and (3) the Mix-Mahalanobis Distance keeps the generated adversarial examples close to the true distribution of the clean data. Our contributions can be summarized as follows:
• We propose an efficient and effective algorithm, M-Attack, to generate adversarial examples for mixed-type data.
• We propose a novel measure, the Mix-Mahalanobis Distance, which keeps the generated adversarial examples close to the true data distribution (illustrated below).
• We conduct extensive experiments to validate the feasibility of M-Attack and to demonstrate the vulnerability of popular classification models against it.
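As background for the second contribution, and only as a rough sketch (the mixed-type construction itself is our proposed measure and is not fully specified here), recall the classical Mahalanobis distance of a sample $x$ to a distribution with mean $\mu$ and covariance $\Sigma$ estimated from the clean data:
$$
d_{M}(x)=\sqrt{(x-\mu)^{\top}\Sigma^{-1}(x-\mu)}.
$$
The Mix-Mahalanobis Distance builds on this notion to quantify, and penalize during the attack, how far a perturbed mixed-type example (with both numerical and categorical features) lies from the distribution of the clean data samples.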
II. RELATED WORK
The adversarial robustness of machine learning models has gained increasing importance in recent years. Most existing works, including evasion attacks and poisoning attacks, focus on continuous input spaces, especially in the image domain [1], [12], [13]. In the image domain, because the data space (i.e., the space of pixel values) is continuous, gradient-based methods [1], [5] are used to find adversarial examples. Building on these gradient-based attacks, various defense methods, such as adversarial training [1], [14], have been proposed to improve model robustness. Meanwhile, adversarial attacks on discrete input data with categorical features, such as text, have also started to attract the attention of researchers. In the natural language processing domain, the work [15] discusses the potential of attacking text classification models, such as sentiment analysis, by replacing words with their synonyms. The study [16] proposes to modify text tokens based on the gradients of the input one-hot vectors.
The method [17] develops a scoring function to select the most effective attack and uses simple character-level transformations in place of projected gradients or multiple linguistic-driven steps on text data. In the domain of graph learning, there are methods [18] that greedily search for perturbations of the graph structure to mislead the targeted models.
In conclusion, when the data space is discrete, these methods share the same core idea of applying search-based methods to find adversarial perturbations. Although there are well-established methods for either the numerical domain or the categorical domain separately, studies on mixed-type data are still lacking. However, mixed-type data widely exist in real-world machine learning tasks. For example, in the fraudulent transaction detection systems [19], [20] of credit card institutions, cardholders' transaction records may include features such as the transaction amount (numerical) and the type of the purchased product (categorical). Similarly, for ML models applied in AI healthcare [21], [22], e.g., in epidemiological studies [23], [24], the data can be collected from surveys that ask for respondents' information, including their age, gender, race, the type of medical treatment received, and the expenditure on each type of medical supplies used. In recommender systems [25]–[27] on online-shopping websites built for product recommendation, the data can include the purchase histories of clients and the properties of products, both of which contain numerical and categorical features. In this paper, we focus on the problem of slightly perturbing a model's (mixed-type) input data sample to mislead the model's prediction. This study is of great significance in helping us understand the potential risks of ML models under malicious perturbations in the applications mentioned above.