Neural Sinkhorn Gradient Flow
- URL: http://arxiv.org/abs/2401.14069v1
- Date: Thu, 25 Jan 2024 10:44:50 GMT
- Title: Neural Sinkhorn Gradient Flow
- Authors: Huminhao Zhu, Fangyikang Wang, Chao Zhang, Hanbin Zhao, Hui Qian
- Abstract summary: We introduce the Neural Sinkhorn Gradient Flow (NSGF) model, which parametrizes the time-varying velocity field of the Wasserstein gradient flow.
Our theoretical analyses show that as the sample size increases to infinity, the mean-field limit of the empirical approximation converges to the true underlying velocity field.
To further enhance model efficiency on high-dimensional tasks, a two-phase NSGF++ model is devised.
- Score: 11.4522103360875
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Wasserstein Gradient Flows (WGF) with respect to specific functionals have
been widely used in the machine learning literature. Recently, neural networks
have been adopted to approximate certain intractable parts of the underlying
Wasserstein gradient flow and result in efficient inference procedures. In this
paper, we introduce the Neural Sinkhorn Gradient Flow (NSGF) model, which
parametrizes the time-varying velocity field of the Wasserstein gradient flow
w.r.t. the Sinkhorn divergence to the target distribution, starting from a given
source distribution. We utilize the velocity field matching training scheme in
NSGF, which only requires samples from the source and target distribution to
compute an empirical velocity field approximation. Our theoretical analyses
show that as the sample size increases to infinity, the mean-field limit of the
empirical approximation converges to the true underlying velocity field. To
further enhance model efficiency on high-dimensional tasks, a two-phase NSGF++
model is devised, which first follows the Sinkhorn flow to approach the image
manifold quickly ($\le 5$ NFEs) and then refines the samples along a simple
straight flow. Numerical experiments with synthetic and real-world benchmark
datasets support our theoretical results and demonstrate the effectiveness of
the proposed methods.
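The abstract says the empirical velocity field is computed from source and target samples alone. A minimal NumPy sketch of one way to do that follows; it is not the authors' implementation. It assumes uniform sample weights, the cost c(x, y) = ||x - y||^2 / 2, and the standard identity that under this cost the gradient of a Sinkhorn potential at x_i equals x_i minus the barycentric projection of the entropic plan. The function names, `eps`, and `n_iters` are hypothetical choices made for illustration.

```python
import numpy as np


def _logsumexp(a, axis):
    # Numerically stable log-sum-exp along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def sinkhorn_plan(X, Y, eps=0.5, n_iters=200):
    # Entropic OT plan between uniform empirical measures on X and Y,
    # computed with log-domain Sinkhorn iterations (assumed cost: 0.5 * ||x - y||^2).
    n, m = X.shape[0], Y.shape[0]
    C = 0.5 * np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    log_a = np.full(n, -np.log(n))
    log_b = np.full(m, -np.log(m))
    f = np.zeros(n)
    g = np.zeros(m)
    for _ in range(n_iters):
        f = -eps * _logsumexp((g[None, :] - C) / eps + log_b[None, :], axis=1)
        g = -eps * _logsumexp((f[:, None] - C) / eps + log_a[:, None], axis=0)
    log_P = (f[:, None] + g[None, :] - C) / eps + log_a[:, None] + log_b[None, :]
    return np.exp(log_P)


def barycentric_map(P, Y):
    # Plan-weighted average of the target points for each source point.
    return (P @ Y) / P.sum(axis=1, keepdims=True)


def empirical_velocity(X, Y, eps=0.5, n_iters=200):
    # Negative gradient of the first variation of the Sinkhorn divergence,
    # v(x_i) = -(grad f_{mu,nu}(x_i) - grad f_{mu,mu}(x_i)),
    # which under the assumed cost reduces to a difference of barycentric maps.
    T_xy = barycentric_map(sinkhorn_plan(X, Y, eps, n_iters), Y)
    T_xx = barycentric_map(sinkhorn_plan(X, X, eps, n_iters), X)
    return T_xy - T_xx


# Usage: one explicit Euler step of the particle flow toward the target samples.
rng = np.random.default_rng(0)
source = rng.normal(size=(64, 2))
target = rng.normal(size=(64, 2)) + np.array([4.0, 0.0])
step_size = 0.5  # hypothetical step size
source_next = source + step_size * empirical_velocity(source, target)
```

In a velocity-field-matching setup, per-sample vectors like these would serve as regression targets for a time-conditioned network; the network, the time discretization, and the NSGF++ refinement phase are all outside this sketch.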
Related papers
- From Fourier to Neural ODEs: Flow Matching for Modeling Complex Systems [20.006163951844357]
We propose a simulation-free framework for training neural ordinary differential equations (NODEs).
We employ Fourier analysis to estimate temporal and potential high-order spatial gradients from noisy observational data.
Our approach outperforms state-of-the-art methods in terms of training time, dynamics prediction, and robustness.
arXiv Detail & Related papers (2024-05-19T13:15:23Z)
- Gradient Flow Based Phase-Field Modeling Using Separable Neural Networks [1.2277343096128712]
We propose a separable neural network-based approximation of the phase field in a minimizing movement scheme to solve a gradient flow problem.
The proposed method outperforms the state-of-the-art machine learning methods for phase separation problems.
arXiv Detail & Related papers (2024-05-09T21:53:27Z)
- Liouville Flow Importance Sampler [2.3603292593876324]
We present the Liouville Flow Importance Sampler (LFIS), an innovative flow-based model for generating samples from unnormalized density functions.
LFIS learns a time-dependent velocity field that deterministically transports samples from a simple initial distribution to a complex target distribution.
We demonstrate the effectiveness of LFIS through its application to a range of benchmark problems, on many of which LFIS achieved state-of-the-art performance.
arXiv Detail & Related papers (2024-05-03T16:44:31Z)
- A Mean-Field Analysis of Neural Stochastic Gradient Descent-Ascent for Functional Minimax Optimization [90.87444114491116]
This paper studies minimax optimization problems defined over infinite-dimensional function classes of overparameterized two-layer neural networks.
We address (i) the convergence of the gradient descent-ascent algorithm and (ii) the representation learning of the neural networks.
Results show that the feature representation induced by the neural networks is allowed to deviate from the initial one by the magnitude of $O(\alpha^{-1})$, measured in terms of the Wasserstein distance.
arXiv Detail & Related papers (2024-04-18T16:46:08Z)
- Adaptive Federated Learning Over the Air [108.62635460744109]
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training.
Our analysis shows that the AdaGrad-based training algorithm converges to a stationary point at the rate of $\mathcal{O}(\ln(T) / T^{1 - \frac{1}{\alpha}})$.
arXiv Detail & Related papers (2024-03-11T09:10:37Z)
- DiffuSeq-v2: Bridging Discrete and Continuous Text Spaces for Accelerated Seq2Seq Diffusion Models [58.450152413700586]
We introduce a soft absorbing state that facilitates the diffusion model in learning to reconstruct discrete mutations based on the underlying Gaussian space.
We employ state-of-the-art ODE solvers within the continuous space to expedite the sampling process.
Our proposed method effectively accelerates the training convergence by 4x and generates samples of similar quality 800x faster.
arXiv Detail & Related papers (2023-10-09T15:29:10Z)
- Fast Sampling of Diffusion Models via Operator Learning [74.37531458470086]
We use neural operators, an efficient method to solve the probability flow differential equations, to accelerate the sampling process of diffusion models.
Compared to other fast sampling methods that have a sequential nature, we are the first to propose a parallel decoding method.
We show our method achieves state-of-the-art FID of 3.78 for CIFAR-10 and 7.83 for ImageNet-64 in the one-model-evaluation setting.
arXiv Detail & Related papers (2022-11-24T07:30:27Z)
- A robust single-pixel particle image velocimetry based on fully convolutional networks with cross-correlation embedded [3.3579727024861064]
We propose a new velocity field estimation paradigm, which achieves a synergetic combination of the deep learning method and the traditional cross-correlation method.
The deep learning method is used to optimize and correct a coarse velocity guess to achieve a super-resolution calculation.
As a reference, the coarse velocity guess helps with improving the robustness of the proposed algorithm.
arXiv Detail & Related papers (2021-10-31T03:26:08Z)
- Neural Particle Image Velocimetry [4.416484585765027]
We introduce a convolutional neural network adapted to the problem, namely the Volumetric Correspondence Network (VCN).
The network is thoroughly trained and tested on a dataset containing both synthetic and real flow data.
Our analysis indicates that the proposed approach provides improved efficiency while keeping accuracy on par with other state-of-the-art methods in the field.
arXiv Detail & Related papers (2021-01-28T12:03:39Z)
- Hessian-Free High-Resolution Nesterov Acceleration for Sampling [55.498092486970364]
Nesterov's Accelerated Gradient (NAG) for optimization has better performance than its continuous time limit (noiseless kinetic Langevin) when a finite step-size is employed.
This work explores the sampling counterpart of this phenomenon and proposes a diffusion process whose discretizations can yield accelerated gradient-based MCMC methods.
arXiv Detail & Related papers (2020-06-16T15:07:37Z)
- A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models [93.24030378630175]
We propose a novel numerical scheme to optimize the gradient flows for learning energy-based models (EBMs).
We derive a second-order Wasserstein gradient flow of the global relative entropy from the Fokker-Planck equation.
Compared with existing schemes, Wasserstein gradient flow is a smoother and near-optimal numerical scheme to approximate real data densities.
arXiv Detail & Related papers (2019-10-31T02:26:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.