•We propose CAAD, a novel method for AD that utilizes contrastive learning (CL) and generative adversarial networks (GANs). We demonstrate that our proposed model significantly outperforms state-of-the-art (SOTA) models on AD in wireless networks and on standard datasets. To the best of our knowledge, CAAD is the first model to combine CL and adversarial learning for AD.
•We propose CAAD-EF, a novel extension of CAAD that further enables us to incorporate expert feedback via contrastive learning and to quantify uncertainty using Monte Carlo dropout. To the best of our knowledge, our framework is the first successful undertaking to utilize contrastive learning to incorporate expert feedback.
•Finally, we highlight the importance of various facets
of CAAD-EF through rigorous qualitative, quantitative, and
ablation analyses.
II. RELATED WORK
Many ML approaches have been developed for anomaly detection across diverse applications. The recent resurgence of deep learning techniques, demonstrating their effectiveness across a wide variety of domains, has led to the development of many novel and powerful modeling paradigms such as generative adversarial networks (GANs) [2], self-supervised representation learning [3], and contrastive learning (CL) [4].
Contrastive Learning (CL) imposes structure on the latent space by encouraging similarity between the representations learned for related instances and dissimilarity between the representations of unrelated instances. Such techniques have proven effective, especially when combined with self-supervised learning [5], [6], and also with labeled data [7]. CL has demonstrated promising results in image recognition tasks. However, most of these efforts focus on improving representation learning performance on traditional classification tasks and do not specifically focus on AD.
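To make the mechanism concrete, below is a minimal PyTorch sketch of an NT-Xent-style contrastive loss; the encoder outputs, batch size, and temperature are illustrative placeholders rather than the exact formulation of [4]:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(z_i, z_j, temperature=0.5):
    """NT-Xent-style loss: pull each row of z_i toward its positive
    partner in z_j; push it away from every other row (the negatives)."""
    z_i, z_j = F.normalize(z_i, dim=1), F.normalize(z_j, dim=1)
    z = torch.cat([z_i, z_j], dim=0)              # (2N, d) stacked views
    sim = z @ z.t() / temperature                 # pairwise cosine similarities
    n = z_i.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool)     # a sample is not its own pair
    sim = sim.masked_fill(mask, float("-inf"))
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)])
    return F.cross_entropy(sim, targets)          # positive pair = "correct class"

# Usage: z_i, z_j would be encoder outputs for two views of the same batch.
loss = contrastive_loss(torch.randn(8, 128), torch.randn(8, 128))
```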
Generative Adversarial Networks (GANs) [2] are a powerful generative learning paradigm grounded in an adversarial training setup. However, they are fraught with training instability. Recently, improvements have been proposed to stabilize GAN training by employing Wasserstein distance functions [8] and gradient penalties on the critic.
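For reference, below is a minimal sketch of the Wasserstein critic objective with a gradient penalty; the toy linear critic and the penalty weight of 10 are illustrative assumptions, not a prescription:

```python
import torch

def critic_loss_wgan_gp(critic, real, fake, gp_weight=10.0):
    """Wasserstein critic loss with a gradient penalty that drives the
    critic's gradient norm toward 1 on points between real and fake data."""
    w_loss = critic(fake).mean() - critic(real).mean()   # Wasserstein estimate
    eps = torch.rand(real.size(0), 1)                    # per-sample mixing weight
    x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)
    grads, = torch.autograd.grad(critic(x_hat).sum(), x_hat, create_graph=True)
    gp = ((grads.flatten(1).norm(2, dim=1) - 1) ** 2).mean()
    return w_loss + gp_weight * gp

# Toy usage with a linear critic on 2-D data (purely illustrative):
critic = torch.nn.Linear(2, 1)
loss = critic_loss_wgan_gp(critic, torch.randn(16, 2), torch.randn(16, 2))
loss.backward()
```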
Deep Learning for Anomaly Detection: The aforementioned
developments in deep learning have led to techniques such as
autoencoders and GANs being employed for the ubiquitous
and challenging problem of AD. Specifically, in [9], a deep
robust autoencoder (Robust AE) model is proposed, inspired
by the Robust Principal Component Analysis technique, for
AD with noisy training data. However, this methodology by design requires knowledge of a subset of anomalies during model training; it may therefore be considered semi-supervised and is not directly applicable to our context of unsupervised AD.
Recently, another line of AD research [10] proposes employing DCGAN [2] for unsupervised AD. The authors then build upon their previous work to propose fAnoGAN [11], a two-step encoder-decoder architecture based on DCGANs, where a separately trained encoder learns to invert the mapping learned by the generator (i.e., decoder) of the DCGAN model.
We employ fAnoGAN as one of the baselines for empirical
comparison.
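For intuition, the following sketch shows an fAnoGAN-style anomaly score under the assumption of an already-trained generator G, encoder E, and discriminator feature extractor f; the weighting kappa and the identity-mapping usage are purely illustrative:

```python
import torch

def fanogan_score(x, G, E, f, kappa=1.0):
    """fAnoGAN-style score: residual between the input and its
    reconstruction G(E(x)), plus a discriminator-feature residual."""
    x_rec = G(E(x))                                        # invert, then regenerate
    img_res = ((x - x_rec) ** 2).flatten(1).mean(dim=1)    # input-space residual
    feat_res = ((f(x) - f(x_rec)) ** 2).flatten(1).mean(dim=1)
    return img_res + kappa * feat_res                      # higher = more anomalous

# Toy usage with identity mappings standing in for trained networks:
I = torch.nn.Identity()
scores = fanogan_score(torch.randn(4, 3, 8, 8), I, I, I)   # all zeros here
```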
Contrastive Learning for Anomaly Detection: Contrastive learning has been utilized for AD in multiple prior efforts.
Masked Contrastive Learning [12] is a supervised method that
varies the weights of different classes in the contrastive loss
function to produce good representations that separate each
class. Even though this method shows promise, it requires
knowledge of anomaly labels. Contrasting Shifted Instances
(CSI) [13] and Mean Shifted Contrastive Loss [14] are two
unsupervised AD methods based on CL. CSI investigates the
power of self-supervised CL for detecting out-of-distribution
(OOD) data by using distributionally shifted variations of input
data. We employ CSI as one of our baselines. Mean Shifted Contrastive Loss applies a contrastive loss, modified around the mean of the training representations (sketched below), to features produced by models pre-trained on ImageNet. However, this model is not suitable for wireless AD, as it relies on pre-training on a particular kind of data. Moreover, none of these methods provide a means to incorporate expert feedback.
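As a rough sketch of the mean-shifting idea (the exact loss in [14] differs in its details; center here is assumed to be the mean feature of the normal training data):

```python
import torch
import torch.nn.functional as F

def mean_shifted(feats, center):
    """Re-center unit-norm features at the mean of the (pretrained)
    training features, then renormalize; losses operate on these."""
    shifted = F.normalize(feats, dim=1) - F.normalize(center, dim=0)
    return F.normalize(shifted, dim=1)

# Usage: center would be the mean feature of the normal training data.
z = mean_shifted(torch.randn(8, 512), torch.randn(512))
```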
Incorporating Expert Feedback: The solutions presented
in [15]–[19] all employ human feedback in various ways.
Active Anomaly Discovery (AAD) [15] is designed to op-
erate in an anomaly exploration loop where the algorithm
selects data to be presented to experts and also provides
a means to incorporate feedback into the model. However,
its performance is dependent on the number of feedback
loops that can be afforded. Hence, such a method cannot be applied to wireless AD, where the volume of input data is extremely high. RAMODO [17] combines representation learning and outlier detection in a single objective function. It utilizes pseudo labels generated by other state-of-the-art outlier detection methods together with Chebyshev's inequality. This dependence on external methods for pseudo-label generation makes the approach unreliable when those methods perform poorly. SAAD [16], DevNet [18], and DPLAN [19] are semi-supervised methods, all of which require at least a small set of labeled anomalies and are hence not suitable for our problem.
The advantage of using contrastive learning for AD is that it can be utilized in a self-supervised setup. That is, we can augment the training samples to generate anomalous samples that are very close to the training distribution and utilize them as negative samples in the contrastive loss, allowing our model to detect unseen anomalies effectively.
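The following schematic sketch illustrates this setup; the linear encoder, the weak-view noise, and the shift_fn placeholder for a near-distribution negative transformation are all hypothetical stand-ins, not our actual architecture:

```python
import torch
import torch.nn.functional as F

def ad_contrastive_loss(encoder, x, shift_fn, temperature=0.5):
    """Treat shifted copies of the training batch as synthetic anomalies:
    positives are (x, weakly augmented x); negatives are shift_fn(x)."""
    z_pos = F.normalize(encoder(x), dim=1)
    z_aug = F.normalize(encoder(x + 0.01 * torch.randn_like(x)), dim=1)  # weak view
    z_neg = F.normalize(encoder(shift_fn(x)), dim=1)                     # synthetic anomaly
    pos_sim = (z_pos * z_aug).sum(dim=1, keepdim=True) / temperature
    neg_sim = z_pos @ z_neg.t() / temperature
    logits = torch.cat([pos_sim, neg_sim], dim=1)        # positive sits at index 0
    targets = torch.zeros(x.size(0), dtype=torch.long)
    return F.cross_entropy(logits, targets)

# Toy usage: a linear encoder and a strong shift as the hypothetical negative.
enc = torch.nn.Linear(32, 64)
loss = ad_contrastive_loss(enc, torch.randn(8, 32), lambda x: -x)
```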
Also, the penultimate layer of a GAN discriminator has recently been shown to provide a good representation of the input data [11], [20]–[22] (sketched below). Hence, the combination of these two powerful techniques, CL and GANs, serves our AD task well. None of the related approaches outlined above combine these techniques for AD, and none of the state-of-the-art AD approaches provide a means to incorporate expert feedback via contrastive learning.
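For illustration, below is a minimal discriminator whose penultimate activations can be read out as a representation; the layer sizes and architecture are arbitrary assumptions, not our model:

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Toy critic whose penultimate activations double as a feature
    representation of the input, as exploited in [11], [20]-[22]."""
    def __init__(self, in_dim=32, hidden=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.LeakyReLU(0.2),
            nn.Linear(hidden, hidden), nn.LeakyReLU(0.2),
        )
        self.head = nn.Linear(hidden, 1)                 # real/fake score

    def forward(self, x, return_features=False):
        h = self.features(x)                             # penultimate layer
        return h if return_features else self.head(h)

# Usage: read out representations instead of the real/fake score.
D = Discriminator()
feats = D(torch.randn(4, 32), return_features=True)     # (4, 64) features
```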