Abstract: Collecting and aggregating information from several probability measures or
histograms is a fundamental task in machine learning. A popular approach to
this task is to compute the barycenter of the probability measures under the
Wasserstein metric. However, approximating the Wasserstein
barycenter is numerically challenging because of the curse of dimensionality.
This paper proposes the projection robust Wasserstein barycenter (PRWB), which
mitigates the curse of dimensionality. The new model projects the probability
measures onto a lower-dimensional subspace that maximizes the Wasserstein
barycenter objective. The resulting problem is a max-min problem over the
Stiefel manifold, which is numerically challenging in practice. Combining the
iterative Bregman projection algorithm and Riemannian optimization, we propose
two new algorithms for computing the PRWB. We analyze the arithmetic
complexity of the proposed algorithms for obtaining an $\epsilon$-stationary
solution. We incorporate the PRWB into a discrete distribution
clustering algorithm, and numerical results on real text datasets confirm
that the PRWB model significantly improves clustering performance.
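
For orientation, the iterative Bregman projection (IBP) building block mentioned
above can be sketched as follows. This is not the paper's PRWB algorithm: it is
only the classical entropy-regularized barycenter iteration on a fixed, shared
support, without the projection onto a subspace or the Riemannian update over
the Stiefel manifold. The function name `ibp_barycenter`, its parameters, and
the default regularization strength are illustrative assumptions.

```python
import numpy as np


def ibp_barycenter(hists, cost, weights, reg=0.01, n_iter=500):
    """Entropic Wasserstein barycenter of histograms on a shared support.

    hists   : (m, n) array, each row a strictly positive histogram summing to 1
    cost    : (n, n) ground-cost matrix between support points
    weights : (m,) barycenter weights summing to 1
    reg     : entropic regularization strength (assumed value, tune per problem)
    """
    K = np.exp(-cost / reg)      # Gibbs kernel
    v = np.ones_like(hists)      # one scaling vector per input histogram
    for _ in range(n_iter):
        u = hists / (v @ K.T)    # row k: p_k / (K v_k), matches the input marginals
        Ktu = u @ K              # row k: K^T u_k
        # Weighted geometric mean across the m histograms gives the barycenter.
        q = np.exp(weights @ np.log(Ktu))
        v = q / Ktu              # row k: q / (K^T u_k), matches the shared marginal
    return q
```

The returned vector is the current barycenter estimate; its total mass
approaches 1 as the iteration converges, so renormalizing after a finite number
of iterations is a reasonable choice. For small `reg`, a log-domain variant is
usually needed to avoid underflow in the kernel.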