Fast Gradient Computation for Gromov-Wasserstein Distance
- URL: http://arxiv.org/abs/2404.08970v1
- Date: Sat, 13 Apr 2024 11:23:34 GMT
- Title: Fast Gradient Computation for Gromov-Wasserstein Distance
- Authors: Wei Zhang, Zihao Wang, Jie Fan, Hao Wu, Yong Zhang
- Abstract summary: The Gromov-Wasserstein distance is a notable extension of optimal transport.
The computation of the Gromov-Wasserstein distance and transport plan is expensive.
We propose a novel method to accelerate accurate gradient computation by dynamic programming techniques.
- Score: 17.47684854121659
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Gromov-Wasserstein distance is a notable extension of optimal transport. In contrast to the classic Wasserstein distance, it solves a quadratic assignment problem that minimizes the pairwise distance distortion under the transportation of distributions, and can therefore be applied to distributions in different spaces. These properties make Gromov-Wasserstein widely applicable in many fields, such as computer graphics and machine learning. However, computing the Gromov-Wasserstein distance and transport plan is expensive. The well-known entropic Gromov-Wasserstein approach has cubic complexity, since matrix multiplications must be repeated when computing the gradient of the Gromov-Wasserstein loss. This is the key bottleneck of the method. Existing acceleration methods rely on sampling and approximation, which leads to low accuracy or an incomplete transport plan. In this work, we propose a novel method to accelerate accurate gradient computation via dynamic programming techniques, reducing the complexity from cubic to quadratic. The original computational bottleneck is thereby broken, and the new entropic solution can be obtained in total quadratic time, which is near-optimal complexity. Furthermore, the method extends easily to several variants. Extensive experiments validate the efficiency and effectiveness of our method.
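For context, entropic Gromov-Wasserstein minimizes $\sum_{i,j,k,l} (C_1[i,k] - C_2[j,l])^2 T[i,j] T[k,l]$ over transport plans $T$ with marginals $p$ and $q$. Under the standard squared-loss decomposition of Peyré, Cuturi and Solomon (2016), the gradient at $T$ is a constant matrix minus the product $C_1 T (2C_2)^\top$, and these dense matrix multiplications, repeated at every iteration, are the cubic bottleneck the abstract refers to. Below is a minimal NumPy sketch of this baseline iteration, not the paper's dynamic-programming method; the function name, $\varepsilon$ value, and iteration counts are illustrative assumptions.

```python
# Minimal NumPy sketch of the baseline entropic Gromov-Wasserstein
# iteration (Peyre, Cuturi & Solomon, 2016) -- NOT the paper's
# dynamic-programming method. Names and parameters are illustrative.
import numpy as np

def entropic_gw(C1, C2, p, q, eps=5e-2, outer_iters=50, sinkhorn_iters=100):
    """Entropic GW between (C1, p) and (C2, q) with squared loss."""
    n, m = len(p), len(q)
    T = np.outer(p, q)  # feasible initial plan
    # Constant part of the gradient for L(a, b) = (a - b)^2:
    # f1(a) = a^2, f2(b) = b^2, h1(a) = a, h2(b) = 2b.
    const = (C1**2) @ np.outer(p, np.ones(m)) + np.outer(np.ones(n), q) @ (C2**2).T
    for _ in range(outer_iters):
        # Gradient of the GW loss at T. These two dense matrix products
        # cost O(n^2 m + n m^2) -- cubic when n ~ m -- and are repeated
        # every iteration; this is the step the paper's dynamic
        # programming technique reduces to quadratic time.
        grad = const - C1 @ T @ (2 * C2).T
        # Entropic mirror step: KL-project exp(-grad/eps) onto Pi(p, q)
        # with plain Sinkhorn; the constant shift by grad.min() cancels
        # in the scalings (log-domain stabilization omitted for brevity).
        K = np.exp(-(grad - grad.min()) / eps)
        u = np.ones(n)
        for _ in range(sinkhorn_iters):
            v = q / (K.T @ u)
            u = p / (K @ v)
        T = u[:, None] * K * v[None, :]
    return T

# Tiny usage example: distributions in different spaces (R^2 vs R^3).
rng = np.random.default_rng(0)
x, y = rng.normal(size=(30, 2)), rng.normal(size=(40, 3))
C1 = np.linalg.norm(x[:, None] - x[None, :], axis=-1)
C2 = np.linalg.norm(y[:, None] - y[None, :], axis=-1)
C1, C2 = C1 / C1.max(), C2 / C2.max()  # keep exp(-grad/eps) in a safe range
p, q = np.full(30, 1 / 30), np.full(40, 1 / 40)
T = entropic_gw(C1, C2, p, q)
print(T.sum(), np.abs(T.sum(axis=1) - p).max())  # mass 1, row marginals ~ p
```

The paper's contribution, per the abstract, is to replace the cubic matrix products in this gradient step with dynamic programming, so the full entropic solve runs in quadratic total time.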
Related papers
- Covariance alignment: from maximum likelihood estimation to Gromov-Wasserstein [27.2585517709102]
We show that the celebrated Gromov-Wasserstein algorithm from optimal transport is also minimax optimal.
These results give the first statistical justification for the deployment of the Gromov-Wasserstein algorithm in practice.
arXiv Detail & Related papers (2023-11-22T18:55:27Z) - Approximation Theory, Computing, and Deep Learning on the Wasserstein Space [0.5735035463793009]
We address the challenge of approximating functions in infinite-dimensional spaces from finite samples.
Our focus is on the Wasserstein distance function, which serves as a relevant example.
We adopt three machine learning-based approaches to define functional approximants.
arXiv Detail & Related papers (2023-10-30T13:59:47Z) - Linearized Wasserstein dimensionality reduction with approximation guarantees [65.16758672591365]
LOT Wassmap is a computationally feasible algorithm to uncover low-dimensional structures in the Wasserstein space.
We show that LOT Wassmap attains correct embeddings and that the quality improves with increased sample size.
We also show how LOT Wassmap significantly reduces the computational cost when compared to algorithms that depend on pairwise distance computations.
arXiv Detail & Related papers (2023-02-14T22:12:16Z) - Fast Computation of Optimal Transport via Entropy-Regularized Extragradient Methods [75.34939761152587]
Efficient computation of the optimal transport distance between two distributions is a core subroutine that empowers various applications.
This paper develops a scalable first-order optimization-based method that computes optimal transport to within $\varepsilon$ additive accuracy.
arXiv Detail & Related papers (2023-01-30T15:46:39Z) - Algorithms for perturbative analysis and simulation of quantum dynamics [0.0]
We develop general purpose algorithms for computing and utilizing both the Dyson series and Magnus expansion.
We demonstrate how to use these tools to approximate fidelity in a region of model parameter space.
We show how the pre-computation step can be phrased as a multivariable expansion problem with fewer terms than in the original method.
arXiv Detail & Related papers (2022-10-20T21:07:47Z) - Communication-Efficient Adam-Type Algorithms for Distributed Data Mining [93.50424502011626]
We propose a class of novel distributed Adam-type algorithms (i.e., SketchedAMSGrad) utilizing sketching.
Our new algorithm achieves a fast convergence rate of $O(\frac{1}{\sqrt{nT}} + \frac{1}{(k/d)^2 T})$ with a communication cost of $O(k \log(d))$ at each iteration.
arXiv Detail & Related papers (2022-10-14T01:42:05Z) - Large-Scale Wasserstein Gradient Flows [84.73670288608025]
We introduce a scalable scheme to approximate Wasserstein gradient flows.
Our approach relies on input convex neural networks (ICNNs) to discretize the JKO steps.
As a result, we can sample from the measure at each step of the gradient flow and compute its density.
arXiv Detail & Related papers (2021-06-01T19:21:48Z) - Learning High Dimensional Wasserstein Geodesics [55.086626708837635]
We propose a new formulation and learning strategy for computing the Wasserstein geodesic between two probability distributions in high dimensions.
By applying the method of Lagrange multipliers to the dynamic formulation of the optimal transport (OT) problem, we derive a minimax problem whose saddle point is the Wasserstein geodesic.
We then parametrize the functions by deep neural networks and design a sample based bidirectional learning algorithm for training.
arXiv Detail & Related papers (2021-02-05T04:25:28Z) - A Riemannian Block Coordinate Descent Method for Computing the Projection Robust Wasserstein Distance [36.97843660480747]
The Wasserstein distance has become increasingly important in machine learning and deep learning.
A recently proposed approach to alleviate the curse of dimensionality is to project the sampled data onto a lower-dimensional subspace, and then compute the Wasserstein distance between the projected data.
However, this approach requires solving a max-min problem over the Stiefel manifold, which is very challenging in practice.
We propose a Riemannian block coordinate descent (RBCD) method to solve this problem, based on a novel reformulation of the regularized max-min problem over the Stiefel manifold.
arXiv Detail & Related papers (2020-12-09T17:47:56Z) - Gradient Free Minimax Optimization: Variance Reduction and Faster Convergence [120.9336529957224]
In this paper, we consider the gradient-free minimax optimization problem in the nonconvex-strongly-concave setting.
We show that a novel zeroth-order variance reduced descent algorithm achieves the best known query complexity.
arXiv Detail & Related papers (2020-06-16T17:55:46Z)