Neural Sinkhorn Gradient Flow
- URL: http://arxiv.org/abs/2401.14069v1
- Date: Thu, 25 Jan 2024 10:44:50 GMT
- Title: Neural Sinkhorn Gradient Flow
- Authors: Huminhao Zhu, Fangyikang Wang, Chao Zhang, Hanbin Zhao, Hui Qian
- Abstract summary: We introduce the Neural Sinkhorn Gradient Flow (NSGF) model, which parametrizes the time-varying velocity field of the Wasserstein gradient flow.
Our theoretical analyses show that as the sample size increases to infinity, the mean-field limit of the empirical approximation converges to the true underlying velocity field.
To further enhance model efficiency on high-dimensional tasks, a two-phase NSGF++ model is devised.
- Score: 11.4522103360875
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Wasserstein Gradient Flows (WGF) with respect to specific functionals have
been widely used in the machine learning literature. Recently, neural networks
have been adopted to approximate certain intractable parts of the underlying
Wasserstein gradient flow, resulting in efficient inference procedures. In this
paper, we introduce the Neural Sinkhorn Gradient Flow (NSGF) model, which
parametrizes the time-varying velocity field of the Wasserstein gradient flow
w.r.t. the Sinkhorn divergence to the target distribution, starting from a given
source distribution. We utilize the velocity field matching training scheme in
NSGF, which only requires samples from the source and target distribution to
compute an empirical velocity field approximation. Our theoretical analyses
show that as the sample size increases to infinity, the mean-field limit of the
empirical approximation converges to the true underlying velocity field. To
further enhance model efficiency on high-dimensional tasks, a two-phase NSGF++
model is devised, which first follows the Sinkhorn flow to approach the image
manifold quickly ($\le 5$ NFEs) and then refines the samples along a simple
straight flow. Numerical experiments with synthetic and real-world benchmark
datasets support our theoretical results and demonstrate the effectiveness of
the proposed methods.
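
The velocity-field matching scheme described in the abstract lends itself to a compact illustration. The Python/PyTorch sketch below is hypothetical and not the authors' implementation: it assumes the empirical velocity at each particle is taken as the entropic-OT barycentric displacement toward the target minibatch (obtained from a few Sinkhorn iterations), which is one standard way to approximate the gradient of an entropic optimal-transport objective; the paper's exact Sinkhorn-divergence velocity estimator may differ. All names (sinkhorn_plan, empirical_velocity, VelocityNet, train_nsgf) and hyperparameters are illustrative.

```python
# Hypothetical sketch of velocity-field matching for a Sinkhorn-type flow.
# Assumption: the empirical velocity at each particle is the entropic-OT
# barycentric displacement toward target samples; the paper's estimator may differ.
import torch
import torch.nn as nn

def sinkhorn_plan(x, y, eps=0.05, n_iter=50):
    """Entropic OT plan between the empirical measures on x (n, d) and y (m, d)."""
    n, m = x.shape[0], y.shape[0]
    C = torch.cdist(x, y) ** 2                       # squared Euclidean cost matrix
    K = torch.exp(-C / eps)                          # Gibbs kernel
    a = torch.full((n,), 1.0 / n)                    # uniform source weights
    b = torch.full((m,), 1.0 / m)                    # uniform target weights
    v = torch.ones(m) / m
    for _ in range(n_iter):                          # Sinkhorn fixed-point iterations
        u = a / (K @ v + 1e-30)
        v = b / (K.T @ u + 1e-30)
    return u[:, None] * K * v[None, :]               # transport plan of shape (n, m)

def empirical_velocity(x, y, eps=0.05):
    """Velocity estimate at each particle: barycentric displacement toward the target."""
    P = sinkhorn_plan(x, y, eps)
    bary = (P @ y) / P.sum(dim=1, keepdim=True)      # conditional mean of y given x_i
    return bary - x

class VelocityNet(nn.Module):
    """Time-conditioned velocity field v_theta(x, t)."""
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x, t):
        return self.net(torch.cat([x, t[:, None]], dim=-1))

def train_nsgf(source_sampler, target_sampler, dim, iters=1000, n=256,
               n_time=20, dt=0.05, lr=1e-3):
    """Velocity-field matching: regress v_theta onto empirical flow velocities."""
    model = VelocityNet(dim)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(iters):
        x, y = source_sampler(n), target_sampler(n)   # minibatches of source / target samples
        loss = 0.0
        for k in range(n_time):
            t = torch.full((n,), k * dt)
            with torch.no_grad():
                v_emp = empirical_velocity(x, y)      # empirical velocity (regression target)
            loss = loss + ((model(x, t) - v_emp) ** 2).mean()
            with torch.no_grad():
                x = x + dt * v_emp                    # advance particles along the empirical flow
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```

As a usage sketch, with `source_sampler = lambda n: torch.randn(n, 2)` and a `target_sampler` drawing from a 2-D mixture, new samples would be generated by drawing source points and Euler-integrating dx = v_theta(x, t) dt; the two-phase NSGF++ variant described in the abstract would additionally hand the particles off to a straight flow for refinement after a few such steps.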
Related papers
- Learning Pore-scale Multi-phase Flow from Experimental Data with Graph Neural Network [2.2101344151283944]
Current numerical models are often incapable of accurately capturing the complex pore-scale physics observed in experiments.
We propose a graph neural network-based approach and directly learn pore-scale fluid flow using micro-CT experimental data.
arXiv Detail & Related papers (2024-11-21T15:01:17Z) - Kernel Approximation of Fisher-Rao Gradient Flows [52.154685604660465]
We present a rigorous investigation of Fisher-Rao and Wasserstein type gradient flows concerning their gradient structures, flow equations, and their kernel approximations.
Specifically, we focus on the Fisher-Rao geometry and its various kernel-based approximations, developing a principled theoretical framework.
arXiv Detail & Related papers (2024-10-27T22:52:08Z) - Semi-Implicit Functional Gradient Flow [30.32233517392456]
We propose a functional gradient ParVI method that uses perturbed particles as the approximation family.
The corresponding functional gradient flow, which can be estimated via denoising score matching, exhibits a strong theoretical convergence guarantee.
arXiv Detail & Related papers (2024-10-23T15:00:30Z) - Straightness of Rectified Flow: A Theoretical Insight into Wasserstein Convergence [54.580605276017096]
Diffusion models have emerged as a powerful tool for image generation and denoising.
Recently, Liu et al. designed a novel alternative generative model, Rectified Flow (RF).
RF aims to learn straight flow trajectories from noise to data using a sequence of convex optimization problems.
arXiv Detail & Related papers (2024-10-19T02:36:11Z) - Conditional Lagrangian Wasserstein Flow for Time Series Imputation [3.914746375834628]
We propose a novel method for time series imputation called Conditional Lagrangian Wasserstein Flow.
The proposed method leverages the (conditional) optimal transport theory to learn the probability flow in a simulation-free manner.
The experimental results on real-world datasets show that the proposed method achieves competitive performance on time series imputation.
arXiv Detail & Related papers (2024-10-10T02:46:28Z) - Improving Consistency Models with Generator-Induced Flows [16.049476783301724]
Consistency models imitate the multi-step sampling of score-based diffusion in a single forward pass of a neural network.
They can be learned in two ways: consistency distillation and consistency training.
We propose a novel flow that transports noisy data towards their corresponding outputs derived from the currently trained model.
arXiv Detail & Related papers (2024-06-13T20:22:38Z) - DiffuSeq-v2: Bridging Discrete and Continuous Text Spaces for
Accelerated Seq2Seq Diffusion Models [58.450152413700586]
We introduce a soft absorbing state that facilitates the diffusion model in learning to reconstruct discrete mutations based on the underlying Gaussian space.
We employ state-of-the-art ODE solvers within the continuous space to expedite the sampling process.
Our proposed method effectively accelerates the training convergence by 4x and generates samples of similar quality 800x faster.
arXiv Detail & Related papers (2023-10-09T15:29:10Z) - Fast Sampling of Diffusion Models via Operator Learning [74.37531458470086]
We use neural operators, an efficient method to solve the probability flow differential equations, to accelerate the sampling process of diffusion models.
Compared to other fast sampling methods that have a sequential nature, we are the first to propose a parallel decoding method.
We show our method achieves state-of-the-art FID of 3.78 for CIFAR-10 and 7.83 for ImageNet-64 in the one-model-evaluation setting.
arXiv Detail & Related papers (2022-11-24T07:30:27Z) - Hessian-Free High-Resolution Nesterov Acceleration for Sampling [55.498092486970364]
Nesterov's Accelerated Gradient (NAG) for optimization has better performance than its continuous-time limit (noiseless kinetic Langevin) when a finite step size is employed.
This work explores the sampling counterpart of this phenomenon and proposes a diffusion process whose discretizations can yield accelerated gradient-based MCMC methods.
arXiv Detail & Related papers (2020-06-16T15:07:37Z) - A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models [93.24030378630175]
We propose a novel numerical scheme to optimize the gradient flows for learning energy-based models (EBMs).
We derive a second-order Wasserstein gradient flow of the global relative entropy from the Fokker-Planck equation.
Compared with existing schemes, the Wasserstein gradient flow is a smoother and near-optimal numerical scheme for approximating real data densities.
arXiv Detail & Related papers (2019-10-31T02:26:20Z)