Semi-Implicit Functional Gradient Flow for Efficient Sampling
- URL: http://arxiv.org/abs/2410.17935v2
- Date: Fri, 21 Mar 2025 12:56:31 GMT
- Title: Semi-Implicit Functional Gradient Flow for Efficient Sampling
- Authors: Shiyue Zhang, Ziheng Cheng, Cheng Zhang,
- Abstract summary: We propose a functional gradient ParVI method that uses perturbed particles with Gaussian noise as the approximation family.<n>We show that the corresponding functional gradient flow, which can be estimated via denoising score matching with neural networks, exhibits strong theoretical convergence guarantees.<n>In addition, we present an adaptive version of our method that automatically selects the appropriate noise magnitude during sampling.
- Score: 30.32233517392456
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Particle-based variational inference methods (ParVIs) use nonparametric variational families represented by particles to approximate the target distribution according to the kernelized Wasserstein gradient flow for the Kullback-Leibler (KL) divergence. Although functional gradient flows have been introduced to expand the kernel space for better flexibility, the deterministic updating mechanism may limit exploration and require expensive repetitive runs for new samples. In this paper, we propose Semi-Implicit Functional Gradient flow (SIFG), a functional gradient ParVI method that uses perturbed particles with Gaussian noise as the approximation family. We show that the corresponding functional gradient flow, which can be estimated via denoising score matching with neural networks, exhibits strong theoretical convergence guarantees due to a higher-order smoothness brought to the approximation family via Gaussian perturbation. In addition, we present an adaptive version of our method that automatically selects the appropriate noise magnitude during sampling, striking a good balance between exploration efficiency and approximation accuracy. Extensive experiments on both simulated and real-world datasets demonstrate the effectiveness and efficiency of the proposed framework.
Related papers
- Functional Gradient Flows for Constrained Sampling [29.631753643887237]
We propose a new functional gradient ParVI method for constrained sampling, called constrained functional gradient flow (CFG)
We also present novel numerical strategies to handle the boundary integral term arising from the domain constraints.
arXiv Detail & Related papers (2024-10-30T16:20:48Z) - Conditional Lagrangian Wasserstein Flow for Time Series Imputation [3.914746375834628]
We propose a novel method for time series imputation called Conditional Lagrangian Wasserstein Flow.
The proposed method leverages the (conditional) optimal transport theory to learn the probability flow in a simulation-free manner.
The experimental results on the real-word datasets show that the proposed method achieves competitive performance on time series imputation.
arXiv Detail & Related papers (2024-10-10T02:46:28Z) - Dynamic Anisotropic Smoothing for Noisy Derivative-Free Optimization [0.0]
We propose a novel algorithm that extends the methods of ball smoothing and Gaussian smoothing for noisy derivative-free optimization.
The algorithm dynamically adapts the shape of the smoothing kernel to approximate the Hessian of the objective function around a local optimum.
arXiv Detail & Related papers (2024-05-02T21:04:20Z) - Adaptive Federated Learning Over the Air [108.62635460744109]
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training.
Our analysis shows that the AdaGrad-based training algorithm converges to a stationary point at the rate of $mathcalO( ln(T) / T 1 - frac1alpha ).
arXiv Detail & Related papers (2024-03-11T09:10:37Z) - Neural Sinkhorn Gradient Flow [11.4522103360875]
We introduce the Neural Sinkhorn Gradient Flow (NSGF) model, which parametrizes the time-varying velocity field of the Wasserstein gradient flow.
Our theoretical analyses show that as the sample size increases to infinity, the mean-field limit of the empirical approximation converges to the true underlying velocity field.
To further enhance model efficiency on high-dimensional tasks, a two-phase NSGF++ model is devised.
arXiv Detail & Related papers (2024-01-25T10:44:50Z) - Joint State Estimation and Noise Identification Based on Variational
Optimization [8.536356569523127]
A novel adaptive Kalman filter method based on conjugate-computation variational inference, referred to as CVIAKF, is proposed.
The effectiveness of CVIAKF is validated through synthetic and real-world datasets of maneuvering target tracking.
arXiv Detail & Related papers (2023-12-15T07:47:03Z) - Particle-based Variational Inference with Generalized Wasserstein
Gradient Flow [32.37056212527921]
We propose a ParVI framework, called generalized Wasserstein gradient descent (GWG)
We show that GWG exhibits strong convergence guarantees.
We also provide an adaptive version that automatically chooses Wasserstein metric to accelerate convergence.
arXiv Detail & Related papers (2023-10-25T10:05:42Z) - Learning Unnormalized Statistical Models via Compositional Optimization [73.30514599338407]
Noise-contrastive estimation(NCE) has been proposed by formulating the objective as the logistic loss of the real data and the artificial noise.
In this paper, we study it a direct approach for optimizing the negative log-likelihood of unnormalized models.
arXiv Detail & Related papers (2023-06-13T01:18:16Z) - Stochastic Marginal Likelihood Gradients using Neural Tangent Kernels [78.6096486885658]
We introduce lower bounds to the linearized Laplace approximation of the marginal likelihood.
These bounds are amenable togradient-based optimization and allow to trade off estimation accuracy against computational complexity.
arXiv Detail & Related papers (2023-06-06T19:02:57Z) - Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained via simple matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z) - Particle-based Variational Inference with Preconditioned Functional
Gradient Flow [13.519223374081648]
We propose a new particle-based variational inference algorithm called preconditioned functional gradient flow (PFG)
PFG has several advantages over Stein variational gradient descent (SVGD)
Non-linear function classes such as neural networks can be incorporated to estimate the gradient flow.
arXiv Detail & Related papers (2022-11-25T08:31:57Z) - Sampling with Mollified Interaction Energy Descent [57.00583139477843]
We present a new optimization-based method for sampling called mollified interaction energy descent (MIED)
MIED minimizes a new class of energies on probability measures called mollified interaction energies (MIEs)
We show experimentally that for unconstrained sampling problems our algorithm performs on par with existing particle-based algorithms like SVGD.
arXiv Detail & Related papers (2022-10-24T16:54:18Z) - A blob method method for inhomogeneous diffusion with applications to
multi-agent control and sampling [0.6562256987706128]
We develop a deterministic particle method for the weighted porous medium equation (WPME) and prove its convergence on bounded time intervals.
Our method has natural applications to multi-agent coverage algorithms and sampling probability measures.
arXiv Detail & Related papers (2022-02-25T19:49:05Z) - Differentiable Annealed Importance Sampling and the Perils of Gradient
Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z) - Variational Transport: A Convergent Particle-BasedAlgorithm for Distributional Optimization [106.70006655990176]
A distributional optimization problem arises widely in machine learning and statistics.
We propose a novel particle-based algorithm, dubbed as variational transport, which approximately performs Wasserstein gradient descent.
We prove that when the objective function satisfies a functional version of the Polyak-Lojasiewicz (PL) (Polyak, 1963) and smoothness conditions, variational transport converges linearly.
arXiv Detail & Related papers (2020-12-21T18:33:13Z) - SLEIPNIR: Deterministic and Provably Accurate Feature Expansion for
Gaussian Process Regression with Derivatives [86.01677297601624]
We propose a novel approach for scaling GP regression with derivatives based on quadrature Fourier features.
We prove deterministic, non-asymptotic and exponentially fast decaying error bounds which apply for both the approximated kernel as well as the approximated posterior.
arXiv Detail & Related papers (2020-03-05T14:33:20Z) - A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models [93.24030378630175]
We propose a novel numerical scheme to optimize the gradient flows for learning energy-based models (EBMs)
We derive a second-order Wasserstein gradient flow of the global relative entropy from Fokker-Planck equation.
Compared with existing schemes, Wasserstein gradient flow is a smoother and near-optimal numerical scheme to approximate real data densities.
arXiv Detail & Related papers (2019-10-31T02:26:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.