Accelerating Sinkhorn Algorithm with Sparse Newton Iterations
- URL: http://arxiv.org/abs/2401.12253v1
- Date: Sat, 20 Jan 2024 21:23:09 GMT
- Title: Accelerating Sinkhorn Algorithm with Sparse Newton Iterations
- Authors: Xun Tang, Michael Shavlovsky, Holakou Rahmanian, Elisa Tardini, Kiran
Koshy Thekumparampil, Tesi Xiao, Lexing Ying
- Abstract summary: We present Sinkhorn-Newton-Sparse (SNS), an extension to the Sinkhorn algorithm.
SNS converges orders of magnitude faster across a wide range of practical cases.
- Score: 14.094908995798757
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Computing the optimal transport distance between statistical distributions is
a fundamental task in machine learning. One remarkable recent advancement is
entropic regularization and the Sinkhorn algorithm, which utilizes only matrix
scaling and guarantees an approximated solution with near-linear runtime.
Despite the success of the Sinkhorn algorithm, its runtime may still be slow
due to the potentially large number of iterations needed for convergence. To
achieve possibly super-exponential convergence, we present
Sinkhorn-Newton-Sparse (SNS), an extension to the Sinkhorn algorithm, by
introducing early stopping for the matrix scaling steps and a second stage
featuring a Newton-type subroutine. Adopting the variational viewpoint that the
Sinkhorn algorithm maximizes a concave Lyapunov potential, we offer the insight
that the Hessian matrix of the potential function is approximately sparse.
Sparsification of the Hessian results in a fast $O(n^2)$ per-iteration
complexity, the same as the Sinkhorn algorithm. In terms of total iteration
count, we observe that the SNS algorithm converges orders of magnitude faster
across a wide range of practical cases, including optimal transportation
between empirical distributions and calculating the Wasserstein $W_1, W_2$
distance of discretized densities. The empirical performance is corroborated by
a rigorous bound on the approximate sparsity of the Hessian matrix.
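To connect the abstract to concrete objects, here is one common parametrization of entropic optimal transport and its dual; the notation is illustrative and may differ from the paper's exact conventions. With cost matrix $C \in \mathbb{R}^{n \times n}$, marginals $r, c$, and regularization parameter $\eta > 0$, the Sinkhorn algorithm can be viewed as block coordinate ascent on the concave dual (Lyapunov) potential

$$ f(x, y) = \langle x, r \rangle + \langle y, c \rangle - \frac{1}{\eta} \sum_{i,j} \exp\big(\eta (x_i + y_j - C_{ij})\big), $$

whose gradient and Hessian are determined by the current transport plan $P_{ij} = \exp(\eta(x_i + y_j - C_{ij}))$:

$$ \nabla f = \begin{pmatrix} r - P\mathbf{1} \\ c - P^\top \mathbf{1} \end{pmatrix}, \qquad \nabla^2 f = -\eta \begin{pmatrix} \operatorname{diag}(P\mathbf{1}) & P \\ P^\top & \operatorname{diag}(P^\top \mathbf{1}) \end{pmatrix}. $$

Since the off-diagonal blocks are the plan $P$ itself, an approximately sparse plan makes the Hessian approximately sparse, which is what the Newton stage exploits.

The sketch below is a minimal, illustrative implementation of this two-stage idea (early-stopped Sinkhorn scaling followed by Newton steps with a thresholded, sparsified Hessian). It is not the authors' code; the function name sns_sketch, the sparsification threshold rho, the ridge term, and the backtracking step-size heuristic are assumptions made for the example.

import numpy as np
from scipy.sparse import csr_matrix, diags, bmat, identity
from scipy.sparse.linalg import cg

def sns_sketch(C, r, c, eta=50.0, n_sinkhorn=50, n_newton=20, rho=1e-4):
    """Two-stage solver sketch: Sinkhorn scaling, then sparsified Newton."""
    n, m = C.shape
    x, y = np.zeros(n), np.zeros(m)          # dual potentials

    def plan(x, y):
        # Entropic transport plan induced by the dual potentials.
        return np.exp(eta * (x[:, None] + y[None, :] - C))

    def dual_obj(x, y):
        # Concave Lyapunov potential maximized by Sinkhorn.
        return r @ x + c @ y - plan(x, y).sum() / eta

    # Stage 1: early-stopped Sinkhorn (matrix scaling) iterations.
    for _ in range(n_sinkhorn):
        x += (np.log(r) - np.log(plan(x, y).sum(axis=1))) / eta
        y += (np.log(c) - np.log(plan(x, y).sum(axis=0))) / eta

    # Stage 2: Newton iterations with a sparsified Hessian.
    for _ in range(n_newton):
        P = plan(x, y)
        grad = np.concatenate([r - P.sum(axis=1), c - P.sum(axis=0)])

        # Drop small plan entries; the off-diagonal Hessian blocks become sparse.
        P_sp = csr_matrix(np.where(P >= rho, P, 0.0))
        neg_hess = eta * bmat(
            [[diags(P.sum(axis=1)), P_sp],
             [P_sp.T, diags(P.sum(axis=0))]], format="csr")
        # Small ridge: the dual is invariant to (x + t, y - t), so -H is only PSD.
        neg_hess = neg_hess + 1e-10 * identity(n + m)

        # Newton ascent direction via conjugate gradient on the sparse system.
        d, _ = cg(neg_hess, grad)

        # Backtracking step to keep the concave dual objective non-decreasing.
        alpha, f0 = 1.0, dual_obj(x, y)
        while alpha > 1e-8 and dual_obj(x + alpha * d[:n], y + alpha * d[n:]) < f0:
            alpha *= 0.5
        x += alpha * d[:n]
        y += alpha * d[n:]

    return plan(x, y), (x, y)

As a usage note, a call like sns_sketch(C, r, c) on a moderately sized cost matrix returns an approximate transport plan and the dual potentials. The per-iteration cost of the Newton stage stays at $O(n^2)$, dominated by forming the plan, provided the sparsified Hessian is sparse enough for the linear solve to be cheap, which is the regime addressed by the paper's sparsity bound.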
Related papers
- A Sample Efficient Alternating Minimization-based Algorithm For Robust Phase Retrieval [56.67706781191521]
In this work, we present a robust phase retrieval problem where the task is to recover an unknown signal.
Our proposed oracle avoids the need for a computationally expensive spectral method, using a simple gradient step combined with outlier removal.
arXiv Detail & Related papers (2024-09-07T06:37:23Z)
- Stochastic Optimization for Non-convex Problem with Inexact Hessian Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) methods and adaptive regularization using cubics (ARC) have proven to have some very appealing theoretical properties.
We show that TR and ARC methods can simultaneously allow inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z)
- Compressed online Sinkhorn [3.2534959204741085]
We revisit the recently introduced online Sinkhorn algorithm of [Mensch and Peyré, 2020].
First, we improve the convergence analysis for the online Sinkhorn algorithm; the new rate that we obtain is faster than the previous rate under certain parameter choices.
Secondly, we propose the compressed online Sinkhorn algorithm which combines measure compression techniques with the online Sinkhorn algorithm.
arXiv Detail & Related papers (2023-10-08T05:33:32Z)
- Non-asymptotic convergence bounds for Sinkhorn iterates and their gradients: a coupling approach [10.568851068989972]
We focus on a relaxation of the original OT problem, the entropic OT problem, which allows for efficient and practical algorithmic solutions.
This formulation, also known as the Schrödinger Bridge problem, notably connects with Stochastic Optimal Control (SOC) and can be solved with the popular Sinkhorn algorithm.
arXiv Detail & Related papers (2023-04-13T13:58:25Z)
- Fast Computation of Optimal Transport via Entropy-Regularized Extragradient Methods [75.34939761152587]
Efficient computation of the optimal transport distance between two distributions serves as an algorithmic subroutine that empowers various applications.
This paper develops a scalable first-order optimization-based method that computes optimal transport to within $\varepsilon$ additive accuracy.
arXiv Detail & Related papers (2023-01-30T15:46:39Z)
- Hessian Averaging in Stochastic Newton Methods Achieves Superlinear Convergence [69.65563161962245]
We consider minimizing a smooth and strongly convex objective function using a stochastic Newton method.
We show that there exists a universal weighted averaging scheme that transitions to local convergence at an optimal stage.
arXiv Detail & Related papers (2022-04-20T07:14:21Z)
- Efficient Optimal Transport Algorithm by Accelerated Gradient descent [20.614477547939845]
We propose a novel algorithm to further improve the efficiency and accuracy based on Nesterov's smoothing technique.
The proposed method achieves faster convergence and better accuracy with the same parameter.
arXiv Detail & Related papers (2021-04-12T20:23:29Z)
- ROOT-SGD: Sharp Nonasymptotics and Near-Optimal Asymptotics in a Single Algorithm [71.13558000599839]
We study the problem of solving strongly convex and smooth unconstrained optimization problems using first-order algorithms.
We devise a novel algorithm, referred to as Recursive One-Over-T SGD (ROOT-SGD), based on an easily implementable, recursive averaging of past stochastic gradients.
We prove that it simultaneously achieves state-of-the-art performance in both a finite-sample, nonasymptotic sense and an asymptotic sense.
arXiv Detail & Related papers (2020-08-28T14:46:56Z)
- Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point at each step.
Our results are expressed in the form of simultaneous primal- and dual-side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.