WASSERSTEIN ARCHETYPAL ANALYSIS 3
in the same way regardless of their spatial closeness, whereas a Wasserstein approach places a
larger focus on outliers that are farther from the bulk of the dataset. More generally, one could
also consider p-Wasserstein metrics, where larger choices of pwould further penalize the distance
from the dataset. In this way, the choice of metric or statistical divergence is problem dependent
and encodes modeling assumptions about the structure of the dataset.
When d= 1, there is a closed form solution for the 2-Wasserstein metric, and it is straightforward
to directly compute the unique minimizer of (WAA), as previously done in related work by Cuesta-
Albertos, Matr´an, and Rodr´ıguez-Rodr´ıguez [7]. (See below for a discussion of this and other
related previous works.) Our first main result is that, when d= 2, a solution of (WAA) exists,
provided that µis absolutely continuous. We summarize the results in the one and two dimensional
setting in the following theorem.
Theorem 1.1 (Existence of minimizer in one and two dimensions).Let d= 1 and k>2. Then
for all µ∈ P2(R), there exists a unique solution of (WAA), and this solution may be expressed
explicitly as Ω=[a, b]for
a= 4C0−6C1, b = 6C1−2C0, C0=ZR
xdµ(x), C1=ZR
xF (x)dµ(x),(1.4)
where Fis the cumulative distribution function (CDF) of µ.
Let d= 2 and k>3. Then for all µ∈ P2,ac(R2), there exists a solution of (WAA).
In the two dimensional case, our proof of Theorem 1.1 relies on an explicit characterization of
the closure of {1Ω/|Ω|: Ω ∈Sk}in the narrow topology; see Proposition 3.1. The key difficulty
in proving a minimizer exists arises when considering the possibility that the minimizing sequence
converges to a limit point with lower dimensional support. Note, in particular, that limit points
with lower dimensional support are in general not uniformly distributed on their support; see
Figure 1. For this reason, not all limit points have a valid interpretation in terms of archetypal
analysis, in which the kvertices or archetypes of their support completely describe the measure via
their convex combinations. For this reason, it is essential that we rule out the possibility that a
limit point of this type achieves the infimum of W2(µ, 1Ω/|Ω|) over Ω ∈Sk. We succeed in doing
this when µ∈ P2,ac(R2) by adapting a perturbation argument of Cuesta-Albertos, et. al., [6] to
the case of convex polygons. In particular, we construct simple perturbations around degenerate
limit points that are both feasible for our constraint set and improve the value of the objective
function. It remains an open question how to extend this technique to µwhich are no absolutely
continuous with respect to L2; see Remark 3.2. Furthermore, our approach to proving the value of
the objective function improves strongly leverages the structure of the 2-Wasserstein metric, and a
different approach would be needed to extend the result to p-Wasserstein metrics for p6= 2.
While this perturbative approach allows us to overcome the difficulty of degenerate limit points
in two dimensions, our approach fails in dimensions d>3, since in the higher dimensional setting,
edges of polygons can cross in the limit, creating artificial vertices in the perturbations we consider,
so that they no longer belong to the constraint set; see Remark 3.3 and Figure 2. For this reason,
it remains an open problem whether minimizers of (WAA) exist for dimensions d>3.
Finally, in terms of uniqueness, we observe that, in dimensions higher than one, the minimizer
is clearly not unique: for example, if µis the uniform distribution on the unit ball, any rotation of
an optimal polytope would also be optimal. Understanding whether uniqueness holds up to such
invariances, remains an interesting open question.
As described above, our approach to proving existence of (WAA) in two dimensions proceeds
by adapting a perturbation argument of Cuesta, el. al. [6], which considered a related problem:
given µ∈ P2,ac(Rd), find the convex set Ω that minimizes W2(µ, 1Ω/|Ω|). This built on previous
work by the same authors, which also examined optimal W2approximation of measures µ∈ P(R)
in one dimension by empirical measures, uniform measures on intervals, and ellipsoids [7]. The