Related papers: Neural Sinkhorn Gradient Flow

URL: http://arxiv.org/abs/2401.14069v1
Date: Thu, 25 Jan 2024 10:44:50 GMT
Title: Neural Sinkhorn Gradient Flow
Authors: Huminhao Zhu, Fangyikang Wang, Chao Zhang, Hanbin Zhao, Hui Qian
Abstract summary: We introduce the Neural Sinkhorn Gradient Flow (NSGF) model, which parametrizes the time-varying velocity field of the Wasserstein gradient flow. Our theoretical analyses show that as the sample size increases to infinity, the mean-field limit of the empirical approximation converges to the true underlying velocity field. To further enhance model efficiency on high-dimensional tasks, a two-phase NSGF++ model is devised.
Score: 11.4522103360875
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Wasserstein Gradient Flows (WGF) with respect to specific functionals have been widely used in the machine learning literature. Recently, neural networks have been adopted to approximate certain intractable parts of the underlying Wasserstein gradient flow and result in efficient inference procedures. In this paper, we introduce the Neural Sinkhorn Gradient Flow (NSGF) model, which parametrizes the time-varying velocity field of the Wasserstein gradient flow w.r.t. the Sinkhorn divergence to the target distribution starting from a given source distribution. We utilize the velocity field matching training scheme in NSGF, which only requires samples from the source and target distribution to compute an empirical velocity field approximation. Our theoretical analyses show that as the sample size increases to infinity, the mean-field limit of the empirical approximation converges to the true underlying velocity field. To further enhance model efficiency on high-dimensional tasks, a two-phase NSGF++ model is devised, which first follows the Sinkhorn flow to approach the image manifold quickly ($\le 5$ NFEs) and then refines the samples along a simple straight flow. Numerical experiments with synthetic and real-world benchmark datasets support our theoretical results and demonstrate the effectiveness of the proposed methods.
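
Illustrative sketch: the Sinkhorn divergence named in the abstract can be estimated from samples alone. The code below is not the authors' implementation. It assumes uniform weights on both sample sets and a cost of half the squared Euclidean distance, and the names `entropic_ot`, `sinkhorn_divergence`, `eps`, and `n_iter` are hypothetical choices made for illustration. It runs log-domain Sinkhorn iterations to estimate the entropic transport cost, then combines three such costs into the debiased divergence.

```python
import math


def sq_dist(x, y):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def logsumexp(vals):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(vals)
    return m + math.log(sum(math.exp(v - m) for v in vals))


def entropic_ot(xs, ys, eps=0.5, n_iter=200):
    """Estimate the entropic OT cost between two uniform empirical measures.

    Assumptions (for illustration only): uniform weights, cost
    c(x, y) = 0.5 * ||x - y||^2, and a fixed number of Sinkhorn iterations.
    Returns the dual value <a, f> + <b, g> after the last update of g.
    """
    n, m = len(xs), len(ys)
    log_a, log_b = -math.log(n), -math.log(m)
    cost = [[0.5 * sq_dist(x, y) for y in ys] for x in xs]
    f = [0.0] * n  # dual potential on the source samples
    g = [0.0] * m  # dual potential on the target samples
    for _ in range(n_iter):
        # Alternating soft-min updates of the two potentials (log domain).
        f = [
            -eps * logsumexp([log_b + (g[j] - cost[i][j]) / eps for j in range(m)])
            for i in range(n)
        ]
        g = [
            -eps * logsumexp([log_a + (f[i] - cost[i][j]) / eps for i in range(n)])
            for j in range(m)
        ]
    return sum(f) / n + sum(g) / m


def sinkhorn_divergence(xs, ys, eps=0.5, n_iter=200):
    """Debiased divergence: OT(x, y) - 0.5 * OT(x, x) - 0.5 * OT(y, y)."""
    return (
        entropic_ot(xs, ys, eps, n_iter)
        - 0.5 * entropic_ot(xs, xs, eps, n_iter)
        - 0.5 * entropic_ot(ys, ys, eps, n_iter)
    )


if __name__ == "__main__":
    source = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
    target = [(5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
    print("divergence(source, target):", sinkhorn_divergence(source, target))
    print("divergence(source, source):", sinkhorn_divergence(source, source))
```

The two self-terms remove the entropic bias, so the divergence of a sample set with itself is zero. In a flow such as the one the abstract describes, a quantity of this kind would serve as the functional being decreased; how the velocity field is derived from it is specified in the paper, not in this sketch.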

Related papers

On the minimax optimality of Flow Matching through the connection to kernel density estimation [0.0]
Flow Matching is a simple and flexible alternative to diffusion models. We prove that Flow Matching matches the optimal rate of convergence in Wasserstein distance up to logarithmic factors. We also provide a first justification of Flow Matching's effectiveness in high-dimensional settings.
arXiv Detail & Related papers (2025-04-17T21:06:41Z)
Implicit factorized transformer approach to fast prediction of turbulent channel flows [6.70175842351963]
We introduce a modified implicit factorized transformer (IFactFormer-m) model which replaces the original chained factorized attention with parallel factorized attention. The IFactFormer-m model successfully performs long-term predictions for turbulent channel flow.
arXiv Detail & Related papers (2024-12-25T09:05:14Z)
Learning Pore-scale Multi-phase Flow from Experimental Data with Graph Neural Network [2.2101344151283944]
Current numerical models are often incapable of accurately capturing the complex pore-scale physics observed in experiments. We propose a graph neural network-based approach and directly learn pore-scale fluid flow using micro-CT experimental data.
arXiv Detail & Related papers (2024-11-21T15:01:17Z)
Kernel Approximation of Fisher-Rao Gradient Flows [52.154685604660465]
We present a rigorous investigation of Fisher-Rao and Wasserstein type gradient flows concerning their gradient structures, flow equations, and their kernel approximations. Specifically, we focus on the Fisher-Rao geometry and its various kernel-based approximations, developing a principled theoretical framework.
arXiv Detail & Related papers (2024-10-27T22:52:08Z)
Semi-Implicit Functional Gradient Flow [30.32233517392456]
We propose a functional gradient ParVI method that uses perturbed particles as the approximation family. The corresponding functional gradient flow, which can be estimated via denoising score matching, exhibits strong theoretical convergence guarantee.
arXiv Detail & Related papers (2024-10-23T15:00:30Z)
Straightness of Rectified Flow: A Theoretical Insight into Wasserstein Convergence [54.580605276017096]
Diffusion models have emerged as a powerful tool for image generation and denoising. Recently, Liu et al. designed a novel alternative generative model, Rectified Flow (RF). RF aims to learn straight flow trajectories from noise to data using a sequence of convex optimization problems.
arXiv Detail & Related papers (2024-10-19T02:36:11Z)
Conditional Lagrangian Wasserstein Flow for Time Series Imputation [3.914746375834628]
We propose a novel method for time series imputation called Conditional Lagrangian Wasserstein Flow. The proposed method leverages the (conditional) optimal transport theory to learn the probability flow in a simulation-free manner. The experimental results on the real-world datasets show that the proposed method achieves competitive performance on time series imputation.
arXiv Detail & Related papers (2024-10-10T02:46:28Z)
Improving Consistency Models with Generator-Induced Flows [16.049476783301724]
Consistency models imitate the multi-step sampling of score-based diffusion in a single forward pass of a neural network. They can be learned in two ways: consistency distillation and consistency training. We propose a novel flow that transports noisy data towards their corresponding outputs derived from the currently trained model.
arXiv Detail & Related papers (2024-06-13T20:22:38Z)
DiffuSeq-v2: Bridging Discrete and Continuous Text Spaces for Accelerated Seq2Seq Diffusion Models [58.450152413700586]
We introduce a soft absorbing state that facilitates the diffusion model in learning to reconstruct discrete mutations based on the underlying Gaussian space. We employ state-of-the-art ODE solvers within the continuous space to expedite the sampling process. Our proposed method effectively accelerates the training convergence by 4x and generates samples of similar quality 800x faster.
arXiv Detail & Related papers (2023-10-09T15:29:10Z)
Fast Sampling of Diffusion Models via Operator Learning [74.37531458470086]
We use neural operators, an efficient method to solve the probability flow differential equations, to accelerate the sampling process of diffusion models. Compared to other fast sampling methods that have a sequential nature, we are the first to propose a parallel decoding method. We show our method achieves state-of-the-art FID of 3.78 for CIFAR-10 and 7.83 for ImageNet-64 in the one-model-evaluation setting.
arXiv Detail & Related papers (2022-11-24T07:30:27Z)
Hessian-Free High-Resolution Nesterov Acceleration for Sampling [55.498092486970364]
Nesterov's Accelerated Gradient (NAG) for optimization has better performance than its continuous time limit (noiseless kinetic Langevin) when a finite step-size is employed. This work explores the sampling counterpart of this phenomenon and proposes a diffusion process whose discretizations can yield accelerated gradient-based MCMC methods.
arXiv Detail & Related papers (2020-06-16T15:07:37Z)
A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models [93.24030378630175]
We propose a novel numerical scheme to optimize the gradient flows for learning energy-based models (EBMs). We derive a second-order Wasserstein gradient flow of the global relative entropy from the Fokker-Planck equation. Compared with existing schemes, Wasserstein gradient flow is a smoother and near-optimal numerical scheme to approximate real data densities.
arXiv Detail & Related papers (2019-10-31T02:26:20Z)

This list is automatically generated from the titles and abstracts of the papers on this site.