A Specialized Semismooth Newton Method for Kernel-Based Optimal
Transport
- URL: http://arxiv.org/abs/2310.14087v2
- Date: Tue, 30 Jan 2024 20:23:03 GMT
- Title: A Specialized Semismooth Newton Method for Kernel-Based Optimal
Transport
- Authors: Tianyi Lin, Marco Cuturi and Michael I. Jordan
- Abstract summary: Kernel-based optimal transport (OT) estimators offer an alternative, functional estimation procedure to address OT problems from samples.
We show that our SSN method achieves a global convergence rate of $O(1/\sqrt{k})$, and a local quadratic convergence rate under standard regularity conditions.
- Score: 92.96250725599958
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Kernel-based optimal transport (OT) estimators offer an alternative,
functional estimation procedure to address OT problems from samples. Recent
works suggest that these estimators are more statistically efficient than
plug-in (linear programming-based) OT estimators when comparing probability
measures in high-dimensions~\citep{Vacher-2021-Dimension}. Unfortunately, that
statistical benefit comes at a very steep computational price: because their
computation relies on the short-step interior-point method (SSIPM), which comes
with a large iteration count in practice, these estimators quickly become
intractable w.r.t. sample size $n$. To scale these estimators to larger $n$, we
propose a nonsmooth fixed-point model for the kernel-based OT problem, and show
that it can be efficiently solved via a specialized semismooth Newton (SSN)
method: We show, exploiting the problem's structure, that the per-iteration cost
of performing one SSN step can be significantly reduced in practice. We prove
that our SSN method achieves a global convergence rate of $O(1/\sqrt{k})$, and
a local quadratic convergence rate under standard regularity conditions. We
show substantial speedups over SSIPM on both synthetic and real datasets.
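The paper's specialized solver targets a kernel-based OT fixed-point system; as a rough, generic illustration of what one semismooth Newton iteration looks like, here is a minimal sketch on a standard nonsmooth complementarity residual $F(x) = \min(x, Ax - b)$. The problem instance and all names are illustrative assumptions, not the authors' method.

```python
import numpy as np

def semismooth_newton(A, b, x0, tol=1e-10, max_iter=50):
    """Semismooth Newton on the nonsmooth residual F(x) = min(x, A x - b).

    Illustrative only: the paper's SSN method solves a kernel-based OT
    fixed-point system and exploits its structure to cheapen each step.
    """
    x = x0.copy()
    n = x.size
    for _ in range(max_iter):
        w = A @ x - b
        F = np.minimum(x, w)
        if np.linalg.norm(F) < tol:
            break
        # One element of Clarke's generalized Jacobian of F at x:
        # row i is e_i where x_i <= w_i, and row i of A otherwise.
        J = np.where((x <= w)[:, None], np.eye(n), A)
        x = x + np.linalg.solve(J, -F)  # full Newton step
    return x

# Toy usage: a linear complementarity problem with an SPD matrix.
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)
b = rng.standard_normal(5)
x = semismooth_newton(A, b, np.ones(5))
```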
Related papers
- Progressive Entropic Optimal Transport Solvers [33.821924561619895]
We propose a new class of EOT solvers (ProgOT) that can estimate both plans and transport maps.
We provide experimental evidence demonstrating that ProgOT is a faster and more robust alternative to standard solvers.
We also prove statistical consistency of our approach for estimating optimal transport maps.
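ProgOT's exact scheme is specified in the paper; the sketch below is only a guess at the flavor it suggests: run Sinkhorn with a decreasing $\varepsilon$ schedule and move source samples partway along the resulting entropic barycentric maps. The schedule, step rule, and function names are assumptions, not the authors' algorithm.

```python
import numpy as np

def sinkhorn_plan(C, a, b, eps, n_iter=500):
    """Plain Sinkhorn iterations for the entropic OT plan."""
    K = np.exp(-C / eps)  # log-domain updates would be safer for tiny eps
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

def progressive_map(x, y, eps_schedule=(1.0, 0.3, 0.1), alpha=0.5):
    """Move source points stage by stage toward entropic barycentric images."""
    a = np.full(len(x), 1.0 / len(x))
    b = np.full(len(y), 1.0 / len(y))
    for eps in eps_schedule:
        C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)  # squared costs
        P = sinkhorn_plan(C, a, b, eps)
        T = (P @ y) / a[:, None]          # barycentric projection of the plan
        x = (1 - alpha) * x + alpha * T   # partial step toward the targets
    return x
```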
arXiv Detail & Related papers (2024-06-07T16:33:08Z)
- Semi-Discrete Optimal Transport: Nearly Minimax Estimation With Stochastic Gradient Descent and Adaptive Entropic Regularization [38.67914746910537]
We prove an $\mathcal{O}(t^{-1})$ lower bound rate for the OT map, using the similarity between Laguerre cell estimation and density support estimation.
To nearly achieve the desired fast rate, we design an entropic regularization scheme decreasing with the number of samples.
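The paper's decay schedule is calibrated to the sample size; the sketch below only illustrates the mechanism: stochastic ascent on the entropic semi-discrete dual with a regularization $\varepsilon_t$ that shrinks over iterations. The $1/\sqrt{t}$ schedule, step sizes, and names are assumptions.

```python
import numpy as np

def sgd_semidiscrete_dual(sample_x, y, nu, n_steps=20000, lr0=1.0):
    """Stochastic ascent on the entropic semi-discrete OT dual.

    y : (m, d) support of the discrete target, nu : (m,) its weights,
    sample_x : callable returning one fresh sample x_t from the source.
    """
    g = np.zeros(len(nu))                     # dual potential on the atoms
    for t in range(1, n_steps + 1):
        eps_t = 1.0 / np.sqrt(t)              # assumed decreasing regularization
        x = sample_x()
        cost = np.sum((y - x) ** 2, axis=1)   # squared Euclidean cost c(x, y_j)
        z = (g - cost) / eps_t
        z -= z.max()                          # stabilize the softmax
        chi = nu * np.exp(z)
        chi /= chi.sum()                      # conditional plan pi(. | x_t)
        g += (lr0 / np.sqrt(t)) * (nu - chi)  # unbiased dual gradient step
    return g
```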
arXiv Detail & Related papers (2024-05-23T11:46:03Z)
- Efficiently Escaping Saddle Points for Non-Convex Policy Optimization [40.0986936439803]
Policy gradient (PG) is widely used in reinforcement learning due to its scalability and good performance.
We propose a variance-reduced second-order method that uses second-order information in the form of Hessian-vector products (HVP) and converges to an approximate second-order stationary point (SOSP) with sample complexity of $\tilde{O}(\epsilon^{-3})$.
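The method itself is specified in the paper; the fragment below only shows the basic ingredient the summary names: a Hessian-vector product computed without forming the Hessian, plus a toy power-iteration search for a negative-curvature escape direction. The smoothness bound `L` and all names are assumptions.

```python
import numpy as np

def hvp(grad_f, x, v, delta=1e-5):
    """Hessian-vector product H(x) v via central differences of the gradient."""
    return (grad_f(x + delta * v) - grad_f(x - delta * v)) / (2.0 * delta)

def negative_curvature_direction(grad_f, x, L=10.0, n_power=100, seed=0):
    """Power iteration on (L I - H): its top eigenvector is H's bottom one."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(x.size)
    v /= np.linalg.norm(v)
    for _ in range(n_power):
        v = L * v - hvp(grad_f, x, v)
        v /= np.linalg.norm(v)
    return v if v @ hvp(grad_f, x, v) < 0 else None  # None: no escape needed
```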
arXiv Detail & Related papers (2023-11-15T12:36:45Z)
- Online non-parametric likelihood-ratio estimation by Pearson-divergence functional minimization [55.98760097296213]
We introduce a new framework for online non-parametric LRE (OLRE) for the setting where pairs of i.i.d. observations $(x_t \sim p, x'_t \sim q)$ are observed over time.
We provide theoretical guarantees for the performance of the OLRE method along with empirical validation in synthetic experiments.
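The OLRE estimator and its guarantees live in kernel spaces; as a bare-bones illustration of online Pearson-divergence minimization, here is a linear-model variant that fits a ratio $r(x) \approx p(x)/q(x)$ from streaming pairs. The features, step sizes, and names are assumptions, not the paper's estimator.

```python
import numpy as np

def online_pearson_lre(pairs, phi, dim, lr0=0.5):
    """Online least-squares density-ratio estimation.

    Minimizes the Pearson criterion 0.5 E_q[r^2] - E_p[r] over linear
    models r(x) = theta . phi(x), one pair (x_t ~ p, x'_t ~ q) per step.
    """
    theta = np.zeros(dim)
    for t, (x_p, x_q) in enumerate(pairs, start=1):
        grad = (theta @ phi(x_q)) * phi(x_q) - phi(x_p)  # unbiased gradient
        theta -= (lr0 / np.sqrt(t)) * grad
    return lambda x: theta @ phi(x)  # estimated likelihood ratio p(x)/q(x)
```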
arXiv Detail & Related papers (2023-11-03T13:20:11Z)
- On Computationally Efficient Learning of Exponential Family Distributions [33.229944519289795]
We focus on the setting where the support as well as the natural parameters are appropriately bounded.
Our method achieves the order-optimal sample complexity of $O(\log(k)/\alpha^2)$ when tailored for node-wise-sparse random fields.
arXiv Detail & Related papers (2023-09-12T17:25:32Z)
- Achieving the Asymptotically Optimal Sample Complexity of Offline Reinforcement Learning: A DRO-Based Approach [36.88301225561535]
Offline reinforcement learning aims to learn from pre-collected datasets without active exploration.
Existing approaches adopt a pessimistic stance towards uncertainty by penalizing rewards of under-explored state-action pairs to estimate value functions conservatively.
We show that the distributionally robust optimization (DRO) based approach can also address these challenges and is asymptotically minimax optimal.
arXiv Detail & Related papers (2023-05-22T17:50:18Z)
- Robust computation of optimal transport by $\beta$-potential regularization [79.24513412588745]
Optimal transport (OT) has become a widely used tool in the machine learning field to measure the discrepancy between probability distributions.
We propose regularizing OT with the $\beta$-potential term associated with the so-called $\beta$-divergence.
We experimentally demonstrate that the transport matrix computed with our algorithm helps estimate a probability distribution robustly even in the presence of outliers.
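For orientation, here is the standard elementwise $\beta$-divergence from the robust-statistics literature that the $\beta$-potential is associated with; the paper's exact normalization and potential term may differ.

```latex
% Elementwise beta-divergence (standard form; beta -> 1 recovers KL,
% beta -> 0 Itakura-Saito, beta = 2 the squared Euclidean distance):
D_\beta(x \,\|\, y) = \frac{1}{\beta(\beta - 1)}
  \left( x^\beta + (\beta - 1)\, y^\beta - \beta\, x\, y^{\beta - 1} \right).
```

Downweighting the influence of extreme entries is what makes such divergences robust to outliers, which is the behavior the experiments above report for the transport matrix.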
arXiv Detail & Related papers (2022-12-26T18:37:28Z)
- Stochastic Inexact Augmented Lagrangian Method for Nonconvex Expectation Constrained Optimization [88.0031283949404]
Many real-world problems have complicated nonconvex functional constraints and use a large number of data points.
Our proposed method outperforms an existing method that achieved the previously best-known result.
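For orientation, a minimal deterministic augmented-Lagrangian loop with inexact inner solves; the paper's method replaces the inner minimization with stochastic steps and handles expectation constraints. All tolerances and update rules below are generic textbook choices, not the authors'.

```python
import numpy as np

def inexact_alm(f_grad, c, c_jac, x0, rho=10.0, n_outer=20,
                inner_steps=200, lr=1e-2):
    """Augmented Lagrangian for min f(x) s.t. c(x) = 0, inexact inner solves."""
    x = x0.copy()
    lam = np.zeros(c(x0).size)  # multiplier estimates
    for _ in range(n_outer):
        # Inexactly minimize L(x) = f(x) + lam . c(x) + (rho/2) ||c(x)||^2
        for _ in range(inner_steps):
            g = f_grad(x) + c_jac(x).T @ (lam + rho * c(x))
            x -= lr * g
        lam += rho * c(x)  # dual (multiplier) update
    return x, lam
```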
arXiv Detail & Related papers (2022-12-19T14:48:54Z)
- Sharper Rates and Flexible Framework for Nonconvex SGD with Client and Data Sampling [64.31011847952006]
We revisit the problem of finding an approximately stationary point of the average of $n$ smooth and possibly nonconvex functions.
We generalize the PAGE estimator so that it can provably work with virtually any sampling mechanism.
We provide the most general and most accurate analysis of the optimal bound in the smooth nonconvex regime.
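The summary's garbled token appears to name the PAGE estimator; under that assumption, here is a sketch of a PAGE-style variance-reduced gradient loop on a finite sum. The batch size, refresh probability, and step size are illustrative, not the paper's choices.

```python
import numpy as np

def page_sgd(grad_i, n_data, x0, lr=0.01, p=0.1, batch=32,
             n_steps=1000, seed=0):
    """PAGE-style variance-reduced SGD for (1/n) sum_i f_i(x).

    grad_i(x, idx): mean gradient of f_i over the index set idx.
    With probability p refresh with a full gradient; otherwise recycle the
    previous estimate plus a same-minibatch correction.
    """
    rng = np.random.default_rng(seed)
    all_idx = np.arange(n_data)
    x = x0.copy()
    g = grad_i(x, all_idx)                    # initial full gradient
    for _ in range(n_steps):
        x_new = x - lr * g
        if rng.random() < p:
            g = grad_i(x_new, all_idx)        # occasional full refresh
        else:
            idx = rng.integers(0, n_data, size=batch)  # shared minibatch
            g = g + grad_i(x_new, idx) - grad_i(x, idx)
        x = x_new
    return x
```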
arXiv Detail & Related papers (2022-06-05T21:32:33Z)
- Support recovery and sup-norm convergence rates for sparse pivotal estimation [79.13844065776928]
In high dimensional sparse regression, pivotal estimators are estimators for which the optimal regularization parameter is independent of the noise level.
We show minimax sup-norm convergence rates for non-smoothed and smoothed, single-task and multitask square-root Lasso-type estimators.
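As a concrete instance of a pivotal estimator, the square-root Lasso named above replaces the squared loss with its root, which makes the theoretically calibrated regularization level independent of the noise level $\sigma$. This is the standard formulation, not necessarily the paper's exact multitask variant:

```latex
% Square-root Lasso: pivotal because the calibrated lambda needs no sigma.
\hat{\beta} \in \arg\min_{\beta \in \mathbb{R}^p}
  \frac{\| y - X \beta \|_2}{\sqrt{n}} + \lambda \| \beta \|_1 .
```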
arXiv Detail & Related papers (2020-01-15T16:11:04Z)