Weighted quantization using MMD: From mean field to mean shift via gradient flows
- URL: http://arxiv.org/abs/2502.10600v1
- Date: Fri, 14 Feb 2025 23:13:20 GMT
- Title: Weighted quantization using MMD: From mean field to mean shift via gradient flows
- Authors: Ayoub Belhadji, Daniel Sharp, Youssef Marzouk
- Abstract summary: Approximating a probability distribution using a set of particles is a fundamental problem in machine learning and statistics.
We show that MSIP extends the (non-interacting) mean shift algorithm, widely used for identifying modes in kernel density estimates.
We also show that MSIP can be interpreted as preconditioned gradient descent, and that it acts as a relaxation of Lloyd's algorithm for clustering.
- Score: 5.216151302783165
- Abstract: Approximating a probability distribution using a set of particles is a fundamental problem in machine learning and statistics, with applications including clustering and quantization. Formally, we seek a finite weighted mixture of Dirac measures that best approximates the target distribution. While much existing work relies on the Wasserstein distance to quantify approximation errors, maximum mean discrepancy (MMD) has received comparatively less attention, especially when allowing for variable particle weights. We study the quantization problem from the perspective of minimizing MMD via gradient flow in the Wasserstein-Fisher-Rao (WFR) geometry. This gradient flow yields an ODE system from which we further derive a fixed-point algorithm called mean shift interacting particles (MSIP). We show that MSIP extends the (non-interacting) mean shift algorithm, widely used for identifying modes in kernel density estimates. Moreover, we show that MSIP can be interpreted as preconditioned gradient descent, and that it acts as a relaxation of Lloyd's algorithm for clustering. Our numerical experiments demonstrate that MSIP and the WFR ODEs outperform other algorithms for quantization of multi-modal and high-dimensional targets.
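The gradient-flow view of MMD quantization can be illustrated with a minimal sketch. This is not the authors' MSIP or WFR scheme: it performs plain gradient descent on the squared MMD with a Gaussian kernel, holding the particle weights fixed and uniform (the WFR geometry would additionally evolve the weights). The bandwidth, step size, and toy target below are arbitrary choices for illustration.

```python
import numpy as np

def gaussian_kernel(X, Y, bw=0.5):
    # Pairwise Gaussian kernel matrix k(x, y) = exp(-||x - y||^2 / (2 bw^2)).
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bw**2))

def mmd2(X, Y, bw=0.5):
    # Squared MMD between the empirical measures on X (particles) and Y (target samples).
    return (gaussian_kernel(X, X, bw).mean()
            - 2 * gaussian_kernel(X, Y, bw).mean()
            + gaussian_kernel(Y, Y, bw).mean())

def mmd_gradient_step(X, Y, bw=0.5, lr=0.1):
    # One explicit gradient step on MMD^2 w.r.t. the particle positions X.
    # Uses d/dx k(x, z) = -(x - z)/bw^2 * k(x, z) for the Gaussian kernel.
    Kxx = gaussian_kernel(X, X, bw)
    Kxy = gaussian_kernel(X, Y, bw)
    n, m = len(X), len(Y)
    grad = (-(Kxx[:, :, None] * (X[:, None, :] - X[None, :, :])).sum(1) * 2 / n**2
            + (Kxy[:, :, None] * (X[:, None, :] - Y[None, :, :])).sum(1) * 2 / (n * m)) / bw**2
    return X - lr * grad

rng = np.random.default_rng(0)
Y = rng.normal(size=(200, 2))           # samples from a toy target
X = rng.uniform(-3, 3, size=(20, 2))    # initial particle positions
before = mmd2(X, Y)
for _ in range(100):
    X = mmd_gradient_step(X, Y)
after = mmd2(X, Y)                      # smaller than `before`
```

Rearranging the fixed-point condition grad = 0 so that each particle equals a kernel-weighted mean is what yields mean-shift-style updates; the paper's MSIP additionally couples the particles through the interaction (Kxx) term and optimizes the weights.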
Related papers
- Gaussian Mixture Solvers for Diffusion Models [84.83349474361204]
We introduce a novel class of SDE-based solvers called GMS for diffusion models.
Our solver outperforms numerous SDE-based solvers in terms of sample quality in image generation and stroke-based synthesis.
arXiv Detail & Related papers (2023-11-02T02:05:38Z) - Posterior Sampling Based on Gradient Flows of the MMD with Negative Distance Kernel [2.199065293049186]
We propose conditional flows of the maximum mean discrepancy (MMD) with the negative distance kernel for posterior sampling and conditional generative modeling.
We approximate the joint distribution of the ground truth and the observations using discrete Wasserstein gradient flows.
arXiv Detail & Related papers (2023-10-04T11:40:02Z) - Multi-kernel Correntropy-based Orientation Estimation of IMUs: Gradient Descent Methods [3.8286082196845466]
We propose correntropy-based gradient descent (CGD) and correntropy-based decoupled orientation estimation (CDOE) algorithms.
Traditional methods rely on the mean squared error (MSE) criterion, making them vulnerable to external acceleration and magnetic interference.
New algorithms demonstrate significantly lower computational complexity than Kalman filter-based approaches.
arXiv Detail & Related papers (2023-04-13T13:57:33Z) - Monte Carlo Neural PDE Solver for Learning PDEs via Probabilistic Representation [59.45669299295436]
We propose a Monte Carlo PDE solver for training unsupervised neural solvers.
We use the PDEs' probabilistic representation, which regards macroscopic phenomena as ensembles of random particles.
Our experiments on convection-diffusion, Allen-Cahn, and Navier-Stokes equations demonstrate significant improvements in accuracy and efficiency.
arXiv Detail & Related papers (2023-02-10T08:05:19Z) - Learning Gaussian Mixtures Using the Wasserstein-Fisher-Rao Gradient Flow [12.455057637445174]
We propose a new algorithm to compute the nonparametric maximum likelihood estimator (NPMLE) in a Gaussian mixture model.
Our method is based on gradient descent over the space of probability measures equipped with the Wasserstein-Fisher-Rao geometry.
We conduct extensive numerical experiments to confirm the effectiveness of the proposed algorithm.
arXiv Detail & Related papers (2023-01-04T18:59:35Z) - A DeepParticle method for learning and generating aggregation patterns in multi-dimensional Keller-Segel chemotaxis systems [3.6184545598911724]
We study a regularized interacting particle method for computing aggregation patterns and near-singular solutions of a Keller-Segel (KS) chemotaxis system in two and three space dimensions.
We further develop DeepParticle (DP) method to learn and generate solutions under variations of physical parameters.
arXiv Detail & Related papers (2022-08-31T20:52:01Z) - Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z) - Differentiable Annealed Importance Sampling and the Perils of Gradient Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z) - Large-Scale Wasserstein Gradient Flows [84.73670288608025]
We introduce a scalable scheme to approximate Wasserstein gradient flows.
Our approach relies on input convex neural networks (ICNNs) to discretize the JKO steps.
As a result, we can sample from the measure at each step of the gradient flow and compute its density.
arXiv Detail & Related papers (2021-06-01T19:21:48Z) - A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models [93.24030378630175]
We propose a novel numerical scheme to optimize the gradient flows for learning energy-based models (EBMs).
We derive a second-order Wasserstein gradient flow of the global relative entropy from the Fokker-Planck equation.
Compared with existing schemes, Wasserstein gradient flow is a smoother and near-optimal numerical scheme to approximate real data densities.
arXiv Detail & Related papers (2019-10-31T02:26:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.