versarial loss function [31]. A generative component is attached to the discriminative network to generate more similar feature spaces for the source and target domains in adversarial generative models [10]. Although adversarial models achieve significant improvements, the training process of adversarial networks is complicated and often results in very fragile convergence [1]. Unlike adversarial networks, discrepancy-based methods learn a shared feature space by minimizing the distribution discrepancy between the source and target domains [24, 39, 29].
The main advantage of discrepancy-based methods is that they are easy to train, and their convergence is not as fragile as that of adversarial networks [6]. However, most discrepancy-based methods minimize only the discrepancy between the source and target domains, and very few of them address the discrepancy within domains [40, 14, 35]. This is especially problematic when there is a low correlation between labeled and unlabeled samples in the target domain, which results in only partial alignment of the target feature space [35]. As a result, discrepancy-based methods cannot guarantee distribution matching and often suffer from over-fitting to the target domain [2].
In this paper, we introduce a novel training concept called simultaneous learning to address the above shortcomings. Aligning to a fixed source feature space can be problematic because of the intra-domain discrepancy present within the target domain [16]. To address this issue, we train both the source and target networks simultaneously, so that the two feature spaces align with one another over the course of learning (Figure 1). We implement simultaneous learning
with an auto-encoder-based domain adaptation framework in which the feature spaces of the source and target domains are aligned in the bottleneck layer. In this framework, first, the simultaneous learning scheme allows the source domain to align with the target domain along with the intra-domain discrepancies present in the target domain. Concurrently, the target domain also gets aligned with the source domain along with its own intra-domain discrepancies. As a result, after iterative learning, the source and target feature spaces align with each other, including their intra-domain discrepancies, which solves the problem of partial alignment. Second, we use the bottleneck layer of the auto-encoder as the feature space, rather than the layers of the classification network used in other methods. As the learning objective of the auto-encoder is not to classify samples, this yields a smooth transition and a continuous feature space among the classes. If any feature discrepancy is present within the classes, the continuous feature space helps it align completely with the other domain.
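To make the scheme concrete, below is a minimal PyTorch sketch of one simultaneous update: two auto-encoders, one per domain, are optimized in the same step, with their bottleneck features pulled together by an alignment term. The layer sizes, the train_step helper, and the generic mmd_loss callable are illustrative assumptions, not the exact configuration of our framework.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AutoEncoder(nn.Module):
    # Encoder-decoder pair; the bottleneck activation is the domain
    # feature space that is aligned across domains.
    def __init__(self, in_dim=2048, bottleneck_dim=256):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(),
            nn.Linear(512, bottleneck_dim))
        self.decoder = nn.Sequential(
            nn.Linear(bottleneck_dim, 512), nn.ReLU(),
            nn.Linear(512, in_dim))

    def forward(self, x):
        z = self.encoder(x)           # bottleneck features
        return z, self.decoder(z)     # features and reconstruction

def train_step(src_ae, tgt_ae, x_s, x_t, opt, mmd_loss, lam=1.0):
    # One simultaneous update: both auto-encoders are optimized in the
    # same step, so the two bottleneck spaces move toward each other
    # instead of the target aligning to a frozen source space.
    z_s, rec_s = src_ae(x_s)
    z_t, rec_t = tgt_ae(x_t)
    recon = F.mse_loss(rec_s, x_s) + F.mse_loss(rec_t, x_t)
    align = mmd_loss(z_s, z_t)        # inter-domain alignment term
    loss = recon + lam * align
    opt.zero_grad()
    loss.backward()                   # gradients flow into both networks
    opt.step()
    return loss.item()

The optimizer here would hold the parameters of both networks, for example torch.optim.Adam over the concatenation of src_ae.parameters() and tgt_ae.parameters(), so that each step updates the source and target feature spaces jointly.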
To the best of our knowledge, our proposed method is a first-of-its-kind framework that combines a simultaneous learning scheme over source and target networks with a carefully modified MMD loss across the domain-invariant feature space. Our proposed framework is both easy to train and achieves state-of-the-art performance by solving the problem of partial alignment. Our key contributions are:
• Propose a novel simultaneous learning scheme to help the target feature space align with the source domain feature space for SSDA.
• Modify the MMD loss function so that it works with simultaneous learning and achieves strong inter-domain alignment (a generic estimator is sketched after this list).
• Propose an auto-encoder-based SSDA framework that addresses the intra-domain discrepancy issue by implementing the simultaneous learning scheme, and evaluate its performance against state-of-the-art models on three datasets.
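As a reference point for the second contribution, below is a sketch of a standard multi-kernel MMD estimator over the bottleneck features. The gaussian_mmd helper and its bandwidths are illustrative assumptions; our modified loss builds on this general form but is not reproduced exactly here.

import torch

def gaussian_mmd(z_s, z_t, sigmas=(1.0, 2.0, 4.0)):
    # Empirical MMD^2 between source and target bottleneck features,
    # using a mixture of Gaussian (RBF) kernels. The bandwidths in
    # sigmas are assumed values, not tuned settings.
    def kernel(a, b):
        d2 = torch.cdist(a, b, p=2).pow(2)   # pairwise squared distances
        return sum(torch.exp(-d2 / (2.0 * s ** 2)) for s in sigmas)
    k_ss = kernel(z_s, z_s).mean()           # source-source similarity
    k_tt = kernel(z_t, z_t).mean()           # target-target similarity
    k_st = kernel(z_s, z_t).mean()           # cross-domain similarity
    return k_ss + k_tt - 2.0 * k_st          # MMD^2 estimate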
2. Related Works
Unsupervised domain adaptation (UDA) has become popular recently and can be categorized into three approaches. The first is to reduce the distribution discrepancy by adversarial learning [4, 36, 7, 37], which has several variants, such as using adversarial learning to introduce an intermediate domain with a Gradually Vanishing Bridge (GVB) layer between the source and target domains [7]. The second type of method minimizes the domain divergence with different loss functions, such as MMD [24], CORAL [30], and HoMM [2]. Recently, the Domain Conditioned Adaptation Network (DCAN) [21] uses a domain-conditioned channel attention mechanism to identify domain-specific features. The last type of method uses an optimal transport (OT) mechanism, which minimizes the cost of transporting one feature distribution to the other domain [3, 8, 38, 20]. Semi-supervised domain adaptation (SSDA) fuses the well-labeled source domain with the label-scarce target domain with the help of a few labeled target samples. Kim et al. [16] introduce the intra-domain discrepancy between the labeled and unlabeled data in the target domain and minimize it to align source and target domain features. Another approach uses a meta-learning framework for multi-source and semi-supervised domain adaptation [19]. The BiAT framework uses bidirectional adversarial training to guide adversarial examples across the domain gap and achieve domain alignment [15]. Another adversarial-learning-based approach, MME, uses minimax entropy optimization to achieve adaptation [27].
3. Proposed Method
Let the labeled source domain be $\mathcal{D}_s = \{(X^s_i, Y^s_i)\}^{n_s}$. Here $X^s_i = \{x_1, ..., x_{n_s}\} \in \mathbb{R}^{d_s}$ is the $i$-th sample from the source images with a feature space of dimension $d_s$. Similarly, $Y^s_i \in \mathcal{Y}_s$ is the class label of the corresponding $i$-th sample of the source images $X^s_i$, where $\mathcal{Y}_s = \{1, 2, ..., n_c\}$. Now, for our semi-supervised setup, let us consider the target domain $\mathcal{D}_t$ as the union of two subsets: the labeled target domain $\mathcal{D}_l$ and the unlabeled target domain $\mathcal{D}_u$. So, $\mathcal{D}_t = \mathcal{D}_l \cup \mathcal{D}_u = \{(X^l_i, Y^l_i)\}^{n_l} \cup \{X^u_i\}^{n_u}$, where $X^l_i \in \mathbb{R}^{d_t}$ is the labeled $i$-th