Fast OT for Latent Domain Adaptation
Siddharth Roheda⋆, Ashkan Panahi†, Hamid Krim⋆
⋆Electrical and Computer Engineering Department, North Carolina State University
{sroheda, ahk}@ncsu.edu
†Dept. of Computer Science and Engineering, Chalmers University
ashkan.panahi@chalmers.se
Abstract—In this paper, we address the problem of unsupervised
Domain Adaptation. The need for such an adaptation
arises when the distribution of the target data differs from
that which is used to develop the model and the ground truth
information of the target data is unknown. We propose an
algorithm that uses optimal transport theory with a verifiably
efficient and implementable solution to learn the best latent
feature representation. This is achieved by minimizing the cost
of transporting the samples from the target domain to the
distribution of the source domain.
Index Terms—Optimal Transport, Unsupervised Domain
Adaptation
I. INTRODUCTION
Adapting a classifier trained on a source domain to recognize
instances from a new target domain is an important
problem of increasing research interest [?], [1], [2]. Difficulties
often arise in practice, as is the case when the data is different
from that which is used to train a model. Specifically, consider
an inference problem where a model is learned using a certain
source domain Xs with the corresponding labels Ys and is
used to classify samples from the target domain Xt with the
corresponding labels Yt. Domain adaptation is required when
P(Ys|Xs) ≈ P(Yt|Xt), but P(Xs) is significantly different
from P(Xt).
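As a toy illustration of this setting (the one-dimensional data, the +3 shift, and the threshold classifier below are all made-up assumptions for illustration, not the paper's setup), the conditional relation between inputs and labels can be preserved while the input marginals differ, and a classifier trained on the source fails on the target until the target data is transformed:

```python
import numpy as np

rng = np.random.default_rng(0)
# Source: class 0 ~ N(-1, 0.5), class 1 ~ N(+1, 0.5)
xs = np.concatenate([rng.normal(-1, 0.5, 500), rng.normal(1, 0.5, 500)])
ys = np.concatenate([np.zeros(500), np.ones(500)])
# Target: the same class structure, but every input is shifted by +3,
# so P(x) changes while the labeling rule is preserved after alignment
xt = xs + 3.0
yt = ys

thresh = 0.0  # classifier learned on the source: predict 1 iff x > thresh
acc_raw = np.mean((xt > thresh) == yt)       # near chance: all targets > 0
xt_aligned = xt - xt.mean() + xs.mean()      # crude marginal alignment
acc_aligned = np.mean((xt_aligned > thresh) == yt)
print(acc_raw, acc_aligned)                  # accuracy recovers after alignment
```

Here the alignment is a simple mean shift; the methods discussed below replace this with richer transformations of the target distribution.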
Such a shift in data distribution is seen and addressed in
almost every field ranging from Natural Language Processing
(NLP) to Object Recognition. Given labeled samples from
a source domain, any Domain Adaptation (DA) approach falls into
one of two groups: i) semi-supervised DA, where some samples in
the target domain are labeled, or ii) unsupervised DA, where
none of the samples in the target domain are labeled.
Several works [4]–[6] have demonstrated the effects of
the divergence between the probability distributions of
domains. These works have led to solutions that transform the
data from the target domain so as to make the associated
distribution as close as possible to that of the source domain.
This allows the application of the classifier trained on the
source domain to classify data from the target domain post
transformation. In [12], an approach for multi-source domain
adaptation was proposed to transfer knowledge learned from
multiple labeled sources to a target domain by aligning
moments of their feature distributions, while [13] uses a GAN
to learn the transformation from the target domain to the source
domain. In [14], [15], the authors simply align the second
order statistics of the source and target domains.
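A minimal NumPy sketch of this second-order alignment, in the spirit of CORAL: whiten the centered target features with the target covariance and re-color them with the source covariance. The direction here (target → source) matches the transformation described in the text; the original CORAL formulation transforms the source toward the target, and the regularization constant and toy data below are illustrative assumptions.

```python
import numpy as np

def coral_align(Xs, Xt, eps=1e-5):
    """Map target features so their second-order statistics match the source."""
    def mat_pow(M, p):
        # symmetric matrix power via eigendecomposition (M is SPD here)
        w, V = np.linalg.eigh(M)
        return (V * np.maximum(w, eps) ** p) @ V.T

    Cs = np.cov(Xs, rowvar=False) + eps * np.eye(Xs.shape[1])
    Ct = np.cov(Xt, rowvar=False) + eps * np.eye(Xt.shape[1])
    Xt_centered = Xt - Xt.mean(axis=0)
    # whiten with the target covariance, re-color with the source covariance
    return Xt_centered @ mat_pow(Ct, -0.5) @ mat_pow(Cs, 0.5) + Xs.mean(axis=0)

# Toy check: after alignment the target statistics match the source
rng = np.random.default_rng(0)
Xs = rng.normal(size=(1000, 3)) @ np.diag([1.0, 2.0, 0.5])
Xt = rng.normal(size=(1000, 3)) @ np.diag([3.0, 0.3, 1.0]) + 5.0
Xt_new = coral_align(Xs, Xt)
```

After the transformation, the empirical mean and covariance of `Xt_new` coincide with those of `Xs` up to the regularization.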
Contributions: In this paper, we address the problem of
unsupervised DA. We build on existing works that have led to
various techniques, including recent generative adversarial
networks [7], and instead propose Optimal Transport, for some
of its advantages, as a viable path to adapting the model toward
classifying the target domain data. We first seek latent
representations of the source and target domains and subsequently
minimize the optimal transport cost between them. These
representations for the source and target can then be classified
using a common classifier trained on the source data. Furthermore,
we demonstrate that ensuring P(Ŷs|Xs) ≈ P(Ŷt|Xt), where Ŷs and Ŷt
are the predictions made by the classifier on the source and
target domains respectively, is also crucial for optimal
performance.
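The transport-cost minimization between the two domains can be sketched with entropy-regularized optimal transport (Sinkhorn-Knopp iterations). This is a generic illustration of the transport cost on toy latent features, not the paper's actual architecture; the features, regularization strength, and iteration count below are assumptions.

```python
import numpy as np

def sinkhorn(a, b, C, reg=0.1, n_iters=500):
    """Entropy-regularized optimal transport via Sinkhorn-Knopp.

    a, b : source / target sample weights (each sums to 1)
    C    : cost matrix, C[i, j] = cost of moving source sample i to target j
    Returns the transport plan P and the total transport cost <P, C>.
    """
    K = np.exp(-C / reg)              # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)             # scale columns to match marginal b
        u = a / (K @ v)               # scale rows to match marginal a
    P = u[:, None] * K * v[None, :]
    return P, float(np.sum(P * C))

# Toy latent features: 4 source and 5 target points in R^2
rng = np.random.default_rng(0)
zs = rng.normal(0.0, 1.0, size=(4, 2))   # stand-in source representations
zt = rng.normal(2.0, 1.0, size=(5, 2))   # stand-in target representations
C = np.sum((zs[:, None, :] - zt[None, :, :]) ** 2, axis=2)  # squared Euclidean
C = C / C.max()                           # normalize for numerical stability
a = np.full(4, 1 / 4)                     # uniform weights on samples
b = np.full(5, 1 / 5)
P, cost = sinkhorn(a, b, C)
```

In the adaptation setting, this scalar cost (or its gradient) would drive the latent representations of the two domains toward each other.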
II. RELATED WORK
A. Generative modeling
The Generative Adversarial Network was first introduced by
Goodfellow et al. [7] in 2014. In this framework, a generative
model is pitted against an adversary: the discriminator. The
generator aims to deceive the discriminator by synthesizing
realistic samples from some underlying distribution. The
discriminator, on the other hand, attempts to discriminate between
a real data sample and one produced by the generator. Both models
are approximated by neural networks. When trained alternately, the
generator learns to produce samples from the data distribution
that closely resemble real data samples.
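The alternating game described above is commonly written as the minimax objective of [7], where G is the generator, D the discriminator, p_data the data distribution, and p_z the noise prior:

\[
\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
\]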
Following this, Conditional Generative Adversarial Networks
(CGANs) were proposed in [8]. These networks were trained
to generate realistic samples from a class-conditional
distribution by replacing the random noise input to the generator
with conditioning information. As a result, the generator aims
to generate realistic data samples given the conditional
information. CGANs have been used to generate random faces
when given facial attributes [9] as well as to produce relevant
images given text descriptions [10].
Many works have recently attempted to use GANs for
performing domain adaptation. In [13] the authors use the
generator to learn the features for classification and the
discriminator to differentiate between the source and target
domain features produced by the generator. Figure 1 depicts
the block diagram for this approach. In [16] a cyclic GAN was
used to perform image translation between unpaired images.
arXiv:2210.00479v1 [cs.LG] 2 Oct 2022