Bisimulation Metrics are Optimal Transport Distances, and Can be Computed Efficiently
- URL: http://arxiv.org/abs/2406.04056v2
- Date: Tue, 11 Jun 2024 20:53:28 GMT
- Title: Bisimulation Metrics are Optimal Transport Distances, and Can be Computed Efficiently
- Authors: Sergio Calo, Anders Jonsson, Gergely Neu, Ludovic Schwartz, Javier Segovia,
- Abstract summary: We propose a new framework for formulating optimal transport distances between Markov chains.
We show that calculating optimal transport distances in the full space of joint distributions can be equivalently formulated as solving a linear program.
- Score: 14.262270388108112
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a new framework for formulating optimal transport distances between Markov chains. Previously known formulations studied couplings between the entire joint distribution induced by the chains, and derived solutions via a reduction to dynamic programming (DP) in an appropriately defined Markov decision process. This formulation has, however, not led to particularly efficient algorithms so far, since computing the associated DP operators requires fully solving a static optimal transport problem, and these operators need to be applied numerous times during the overall optimization process. In this work, we develop an alternative perspective by considering couplings between a flattened version of the joint distributions that we call discounted occupancy couplings, and show that calculating optimal transport distances in the full space of joint distributions can be equivalently formulated as solving a linear program (LP) in this reduced space. This LP formulation allows us to port several algorithmic ideas from other areas of optimal transport theory. In particular, our formulation makes it possible to introduce an appropriate notion of entropy regularization into the optimization problem, which in turn enables us to directly calculate optimal transport distances via a Sinkhorn-like method we call Sinkhorn Value Iteration (SVI). We show both theoretically and empirically that this method converges quickly to an optimal coupling, essentially at the same computational cost of running vanilla Sinkhorn in each pair of states. Along the way, we point out that our optimal transport distance exactly matches the common notion of bisimulation metrics between Markov chains, and thus our results also apply to computing such metrics, and in fact our algorithm turns out to be significantly more efficient than the best known methods developed so far for this purpose.
Related papers
- Dynamical Measure Transport and Neural PDE Solvers for Sampling [77.38204731939273]
We tackle the task of sampling from a probability density as transporting a tractable density function to the target.
We employ physics-informed neural networks (PINNs) to approximate the respective partial differential equations (PDEs) solutions.
PINNs allow for simulation- and discretization-free optimization and can be trained very efficiently.
arXiv Detail & Related papers (2024-07-10T17:39:50Z) - A Sinkhorn-type Algorithm for Constrained Optimal Transport [14.935188572016635]
This work focuses on a general class of OT problems under a combination of equality and inequality constraints.
We derive the corresponding entropy regularization formulation and introduce a Sinkhorn-type algorithm for such constrained OT problems supported by theoretical guarantees.
Overall, this work systematically combines recent theoretical and numerical advances in entropic optimal transport with the constrained case, allowing practitioners to derive approximate transport plans in complex scenarios.
arXiv Detail & Related papers (2024-03-08T05:01:43Z) - OTClean: Data Cleaning for Conditional Independence Violations using
Optimal Transport [51.6416022358349]
sys is a framework that harnesses optimal transport theory for data repair under Conditional Independence (CI) constraints.
We develop an iterative algorithm inspired by Sinkhorn's matrix scaling algorithm, which efficiently addresses high-dimensional and large-scale data.
arXiv Detail & Related papers (2024-03-04T18:23:55Z) - Distributed Markov Chain Monte Carlo Sampling based on the Alternating
Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z) - Optimal Transport with Tempered Exponential Measures [33.07787452859956]
We show that a generalization of exponential families with indirect measure normalization gets to a very convenient middle ground.
Our formulation fits naturally in the unbalanced optimal transport problem setting.
arXiv Detail & Related papers (2023-09-07T20:53:23Z) - Accelerating Cutting-Plane Algorithms via Reinforcement Learning
Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms.
Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability.
We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
arXiv Detail & Related papers (2023-07-17T20:11:56Z) - Fast Computation of Optimal Transport via Entropy-Regularized Extragradient Methods [75.34939761152587]
Efficient computation of the optimal transport distance between two distributions serves as an algorithm that empowers various applications.
This paper develops a scalable first-order optimization-based method that computes optimal transport to within $varepsilon$ additive accuracy.
arXiv Detail & Related papers (2023-01-30T15:46:39Z) - Outlier-Robust Sparse Estimation via Non-Convex Optimization [73.18654719887205]
We explore the connection between high-dimensional statistics and non-robust optimization in the presence of sparsity constraints.
We develop novel and simple optimization formulations for these problems.
As a corollary, we obtain that any first-order method that efficiently converges to station yields an efficient algorithm for these tasks.
arXiv Detail & Related papers (2021-09-23T17:38:24Z) - Improving Approximate Optimal Transport Distances using Quantization [23.319746583489763]
Optimal transport is a popular tool in machine learning to compare probability measures geometrically.
Linear programming algorithms for computing OT scale cubically in the size of the input, making OT impractical in the large-sample regime.
We introduce a practical algorithm, which relies on a quantization step, to estimate OT distances between measures given cheap sample access.
arXiv Detail & Related papers (2021-02-25T08:45:06Z) - FedSplit: An algorithmic framework for fast federated optimization [40.42352500741025]
We introduce FedSplit, a class of algorithms for solving distributed convex minimization with additive structure.
Our theory shows that these methods are provably robust to inexact computation of intermediate local quantities.
arXiv Detail & Related papers (2020-05-11T16:30:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.