a priori pairing knowledge is considered. Purely un-
supervised algorithms are designed for scenarios where
neither pairings between domains nor any other side-
information is available. As a consequence, they rely
solely on the particular topology of each domain to in-
fer inter-domain similarities (e.g. [6, 10, 11, 34]).
Methods that leverage some additional information
are often categorized as semi-supervised MA. As a spe-
cial case, several methods consider partial correspon-
dence information, where a few one-to-one matching
samples work as anchor points to find a consistent align-
ment for the rest of the data. Some papers leverage the
graph structure of the data [12, 18, 19, 33] and are closely
related to Laplacian eigenmaps [5]. Others resort to
neural networks such as the GAN-based MAGAN [2] or
the autoencoder presented in [3].
However, even partial correspondences can be ex-
pensive or impossible to acquire. This is the case in
biological applications where the measurement process
destroys the cells, making it impossible to measure other
modalities of the exact same cells. But even if there are
no known correspondences between domains, we do not
have to resort to unsupervised MA. If we have access to
side information about the datasets from both domains,
such as discrete class labels, we can leverage this extra
knowledge to perform manifold alignment [31, 35, 36].
Motivated by this, we propose a new semi-supervised
MA algorithm called MALI (Manifold Alignment with
Label Information). MALI leverages the manifold struc-
ture of the data in both domains, combined with the
discrete label information, and it does not require any
known corresponding points in the different domains.
MALI is built upon the widely-used manifold learn-
ing method Diffusion Maps [9] and optimal transport
(OT) [27]. We show experimentally that MALI outperforms current state-of-the-art MA algorithms in this setting across multiple datasets and on several metrics.
The setting described above is similar to the domain
adaptation (DA) [14] problem. In traditional machine
learning, the training set and the test set are assumed to
be sampled from the same distribution and to share the
same features. But in practice these assumptions may
not hold, for example, due to the different collection
circumstances mentioned previously. When data is ex-
pensive or time-consuming to label, it may be desirable
to train a model on existing related datasets and then
adapt it to the new task. It is of interest to leverage
the knowledge acquired from training on one dataset to
improve performance on the same task on a different
dataset, or potentially even a different task. One pos-
sible approach to tackle DA is to use MA, since knowledge can be transferred via the learned inter-domain correspondences or by training on a shared latent representation of both domains.
2 Preliminaries
2.1 Problem Description Assume we have two datasets $X = \{x_1, x_2, \dots, x_n\} \in \mathbb{R}^{n \times p}$ and $Y = \{y_1, y_2, \dots, y_m\} \in \mathbb{R}^{m \times q}$. We assume that all of the points in $X$ are labeled with discrete (i.e. class) labels $L_x = \{\ell_1^x, \dots, \ell_n^x\}$, while the points in $Y$ may be partially or fully labeled with discrete labels $L_y = \{\ell_1^y, \dots, \ell_r^y\}$, with $r \leq m$. In the domain adaptation problem, $X$ is measured from the source domain while $Y$ is measured from the target domain.
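For concreteness, the following minimal Python sketch instantiates the setup above on synthetic data; the sizes, feature dimensions, and number of classes are hypothetical and only serve to fix notation.

    import numpy as np

    rng = np.random.default_rng(0)
    n, p = 100, 5   # source domain: n points with p features
    m, q = 80, 8    # target domain: m points with q features
    X = rng.normal(size=(n, p))       # source data, fully labeled
    Y = rng.normal(size=(m, q))       # target data, partially labeled
    L_x = rng.integers(0, 3, size=n)  # class labels for all of X
    r = 20                            # only the first r points of Y are labeled
    L_y = rng.integers(0, 3, size=r)  # partial class labels for Y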
The problem consists of learning an alignment be-
tween both data manifolds, by leveraging their respec-
tive geometric structures as well as the label knowledge
available from both domains. There are several pos-
sible ways to represent such an alignment using MA
algorithms. One way is to directly learn hard or soft
correspondences between points in Xand Y. A regres-
sion model could then be trained using these correspon-
dences to learn a parametric mapping between domains.
In the domain adaptation problem, unlabeled data in the target domain can then be labeled via this regression model, transferring the richer label information available in the source domain.
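As an illustration, the sketch below propagates source labels to target points directly through a soft correspondence (coupling) matrix by a weighted vote; this is one generic option and is not presented as MALI's exact transfer rule. The regression-based route described above would instead fit a parametric map on matched pairs.

    import numpy as np

    def label_target_from_coupling(gamma, source_labels, n_classes):
        # gamma: (n, m) soft correspondence matrix between X and Y
        # votes[c, j] = total mass sent from class-c source points to target j
        votes = np.zeros((n_classes, gamma.shape[1]))
        for c in range(n_classes):
            votes[c] = gamma[source_labels == c].sum(axis=0)
        # each target point takes the label receiving the most mass
        return votes.argmax(axis=0)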
A second way to represent the alignment is to
learn a shared embedding space which can be used
for downstream analysis. For the domain adaptation
problem, a classifier could be trained on the shared
embedding space using the labels from X. Direct
correspondences can be learned by, for example, using
a nearest neighbor approach in the shared space.
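A minimal sketch of both uses of a shared embedding, assuming emb_x and emb_y are the source and target coordinates produced by some MA method (scikit-learn is used here for brevity):

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier, NearestNeighbors

    def classify_and_match(emb_x, emb_y, labels_x, k=5):
        # 1) train a classifier on the embedded source points, label the target
        clf = KNeighborsClassifier(n_neighbors=k).fit(emb_x, labels_x)
        target_labels = clf.predict(emb_y)
        # 2) hard correspondences as cross-domain nearest neighbors
        nn = NearestNeighbors(n_neighbors=1).fit(emb_x)
        _, match = nn.kneighbors(emb_y)  # match[j] = closest source index
        return target_labels, match.ravel()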
As we show in this work, MALI is suited to both of these scenarios. We first find pairwise cross-domain
distances, which are then leveraged to find hard or soft
assignments between the domains via optimal transport.
If required, a shared embedding can be learned using
these assignments.
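The OT step can be illustrated with the POT library: given a precomputed cross-domain distance matrix (how MALI constructs it from diffusion geometry is described later), an entropy-regularized coupling yields soft assignments, while ot.emd would give an unregularized, sparser one. A sketch, with uniform marginals assumed:

    import numpy as np
    import ot  # POT: Python Optimal Transport

    def soft_assignments(cross_dist, reg=1e-1):
        # cross_dist: (n, m) pairwise cross-domain distances
        n, m = cross_dist.shape
        a = np.full(n, 1.0 / n)  # uniform weights on source points
        b = np.full(m, 1.0 / m)  # uniform weights on target points
        # entropy-regularized OT; returns an (n, m) coupling matrix
        return ot.sinkhorn(a, b, cross_dist, reg)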
2.2 Related work Here we summarize two existing
methods that perform manifold alignment using discrete
label information without assuming prior known corre-
spondences. In [35] both datasets are concatenated in a
new block matrix
$$Z = \begin{pmatrix} X & 0 \\ 0 & Y \end{pmatrix} \in \mathbb{R}^{(n+m) \times (p+q)}.$$
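In code, this concatenation is a block-diagonal stack, e.g. (assuming X and Y are NumPy arrays):

    from scipy.linalg import block_diag

    # X: (n, p), Y: (m, q)  ->  Z: (n + m, p + q), as in the equation above
    Z = block_diag(X, Y)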
Domain-specific similarity matrices $W_X$ and $W_Y$ are created from the data, e.g. via a kernel function. These matrices are then similarly combined in a new block matrix $W_Z$. To leverage the label information, the authors create a label-similarity matrix with entries $W_s(i, j) = 1$ if samples $z_i$ and $z_j$ share the same label, and 0 otherwise. A label dissimilarity matrix $W_d(i, j) = |W_s(i, j) - 1|$ is also