Likelihood Training of Schr\"odinger Bridge using Forward-Backward SDEs Theory
- URL: http://arxiv.org/abs/2110.11291v5
- Date: Mon, 3 Apr 2023 08:50:44 GMT
- Title: Likelihood Training of Schr\"odinger Bridge using Forward-Backward SDEs Theory
- Authors: Tianrong Chen, Guan-Horng Liu, Evangelos A. Theodorou
- Abstract summary: It remains unclear whether the optimization principle of SB relates to the modern training of deep generative models.
We present a novel computational framework for likelihood training of SB models grounded on Forward-Backward SDEs theory.
We show that the resulting training achieves comparable results on generating realistic images on MNIST, CelebA, and CIFAR10.
- Score: 29.82841891919951
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Schr\"odinger Bridge (SB) is an entropy-regularized optimal transport problem
that has received increasing attention in deep generative modeling for its
mathematical flexibility compared to the Score-based Generative Model (SGM).
However, it remains unclear whether the optimization principle of SB relates to
the modern training of deep generative models, which often rely on constructing
log-likelihood objectives. This raises questions about the suitability of SB models
as a principled alternative for generative applications. In this work, we
present a novel computational framework for likelihood training of SB models
grounded on Forward-Backward Stochastic Differential Equations Theory - a
mathematical methodology from stochastic optimal control that transforms
the optimality condition of SB into a set of SDEs. Crucially, these SDEs can be
used to construct the likelihood objectives for SB that, surprisingly,
generalize the ones for SGM as special cases. This leads to a new optimization
principle that inherits the same SB optimality yet without losing applications
of modern generative training techniques, and we show that the resulting
training algorithm achieves comparable results on generating realistic images
on MNIST, CelebA, and CIFAR10. Our code is available at
https://github.com/ghliu/SB-FBSDE.
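A hedged sketch of the construction, following the abstract's setup (notation simplified; regularity conditions and the full derivation are in the paper): the SB optimality condition is the Schr\"odinger system, a pair of coupled PDEs whose solutions $\Psi, \hat\Psi$ factor the marginal density as $p_t = \Psi\,\hat\Psi$. Writing the forward and backward controls as $Z_t = g\,\nabla\log\Psi(X_t,t)$ and $\hat Z_t = g\,\nabla\log\hat\Psi(X_t,t)$, the FBSDE representation yields an exact likelihood of the form
\begin{aligned}
dX_t &= \big[f(X_t,t) + g(t)\,Z_t\big]\,dt + g(t)\,dW_t, \qquad X_0 \sim p_0,\\
\log p_0(x_0) &= \mathbb{E}\left[\log p_T(X_T) - \int_0^T \Big(\tfrac{1}{2}\|Z_t\|^2 + \tfrac{1}{2}\|\hat Z_t\|^2 + \nabla\cdot\big(g\,\hat Z_t - f\big) + \hat Z_t^\top Z_t\Big)\,dt \;\middle|\; X_0 = x_0\right].
\end{aligned}
Here $f$ and $g$ are the base drift and diffusion coefficients. Setting $Z_t \equiv 0$ (an uncontrolled forward diffusion) collapses this expression to the likelihood bound of SGM, which is the sense in which the SB objective generalizes score-based training.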
Related papers
- Preconditioned Inexact Stochastic ADMM for Deep Model [35.37705488695026]
This paper develops an algorithm, PISA, which enables scalable parallel computing and supports various second-moment schemes.
Grounded in rigorous theoretical guarantees, the algorithm converges under the sole assumption of Lipschitz continuity of the gradient.
Comprehensive experimental evaluations on fine-tuning diverse foundation models (FMs), including vision models, large language models, reinforcement learning models, generative adversarial networks, and recurrent neural networks, demonstrate its superior numerical performance compared to various state-of-the-art methods.
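For orientation only, the classic ADMM iteration that such methods build on, for min f(x) + g(z) subject to x = z (generic scaled form; PISA's preconditioned, inexact, stochastic variant differs in the details):
\begin{aligned}
x^{k+1} &= \arg\min_x \; f(x) + \tfrac{\rho}{2}\|x - z^k + u^k\|^2,\\
z^{k+1} &= \arg\min_z \; g(z) + \tfrac{\rho}{2}\|x^{k+1} - z + u^k\|^2,\\
u^{k+1} &= u^k + x^{k+1} - z^{k+1}.
\end{aligned}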
arXiv Detail & Related papers (2025-02-15T12:28:51Z) - Training Deep Learning Models with Norm-Constrained LMOs [56.00317694850397]
We study optimization methods that leverage the linear minimization oracle (LMO) over a norm-ball.
We propose a new family of algorithms that uses the LMO to adapt to the geometry of the problem and, perhaps surprisingly, show that they can be applied to unconstrained problems.
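As a generic illustration of the building block (a minimal Python sketch, not the paper's algorithm; the function names and the Frank-Wolfe-style usage are illustrative):

    import numpy as np

    def lmo_l2(grad, radius):
        # LMO over the Euclidean ball: argmin_{||s||_2 <= radius} <grad, s>,
        # attained at the boundary point opposite the gradient direction.
        norm = np.linalg.norm(grad)
        return -radius * grad / norm if norm > 0 else np.zeros_like(grad)

    def lmo_linf(grad, radius):
        # Same oracle for the max-norm ball: coordinate-wise sign flip.
        return -radius * np.sign(grad)

    def lmo_step(x, grad, radius, eta):
        # Frank-Wolfe-style update moving toward the oracle's output.
        return (1.0 - eta) * x + eta * lmo_l2(grad, radius)

Only the direction of the gradient enters the update, which is what lets LMO-based methods adapt to the geometry of the chosen norm rather than to the raw gradient scale.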
arXiv Detail & Related papers (2025-02-11T13:10:34Z) - Go With the Flow: Fast Diffusion for Gaussian Mixture Models [13.03355083378673]
Schr"odinger Bridges (SB) are diffusion processes that steer, in finite time, a given initial distribution to another final one while minimizing a suitable cost functional.
We propose a parametrization of a set of SB policies for steering a system from one distribution to another.
We showcase the potential of this approach in low-dimensional problems such as image-to-image translation in the latent space of an autoencoder.
arXiv Detail & Related papers (2024-12-12T08:40:22Z) - Bridging Model-Based Optimization and Generative Modeling via Conservative Fine-Tuning of Diffusion Models [54.132297393662654]
We introduce a hybrid method that fine-tunes cutting-edge diffusion models by optimizing reward models through RL.
We demonstrate the capability of our approach to outperform the best designs in offline data, leveraging the extrapolation capabilities of reward models.
arXiv Detail & Related papers (2024-05-30T03:57:29Z) - Model-based Causal Bayesian Optimization [74.78486244786083]
We introduce the first algorithm for Causal Bayesian Optimization with Multiplicative Weights (CBO-MW).
We derive regret bounds for CBO-MW that naturally depend on graph-related quantities.
Our experiments include a realistic demonstration of how CBO-MW can be used to learn users' demand patterns in a shared mobility system.
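For context, the multiplicative-weights ingredient in the name refers to the classic exponential-weights update sketched below (the generic scheme only; CBO-MW's causal-graph-aware variant is in the paper):

    import numpy as np

    def multiplicative_weights(losses, eta=0.1):
        # losses: (T, K) array of per-round losses for K experts/actions.
        # Classic update: downweight each expert exponentially in its loss.
        T, K = losses.shape
        w = np.ones(K) / K
        for t in range(T):
            w *= np.exp(-eta * losses[t])
            w /= w.sum()  # renormalize to a probability distribution
        return w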
arXiv Detail & Related papers (2023-07-31T13:02:36Z) - Building the Bridge of Schr\"odinger: A Continuous Entropic Optimal Transport Benchmark [96.06787302688595]
We propose a novel way to create pairs of probability distributions for which the ground truth OT solution is known by the construction.
We use these benchmark pairs to test how well existing neural EOT/SB solvers actually compute the EOT solution.
arXiv Detail & Related papers (2023-06-16T20:03:36Z) - When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL).
The bounds we subsequently derive reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically-varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z) - The Schr\"odinger Bridge between Gaussian Measures has a Closed Form [101.79851806388699]
We focus on the dynamic formulation of OT, also known as the Schr\"odinger bridge (SB) problem.
In this paper, we provide closed-form expressions for SBs between Gaussian measures.
arXiv Detail & Related papers (2022-02-11T15:59:01Z) - Learning Deep-Latent Hierarchies by Stacking Wasserstein Autoencoders [22.54887526392739]
We propose a novel approach to training models with deep-latent hierarchies based on Optimal Transport.
We show that our method enables the generative model to fully leverage its deep-latent hierarchy, avoiding the well-known "latent variable collapse" issue of VAEs.
arXiv Detail & Related papers (2020-10-07T15:04:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.