A note on the relations between mixture models, maximum-likelihood and entropic optimal transport
- URL: http://arxiv.org/abs/2501.12005v2
- Date: Thu, 23 Jan 2025 09:47:30 GMT
- Title: A note on the relations between mixture models, maximum-likelihood and entropic optimal transport
- Authors: Titouan Vayer, Etienne Lasalle,
- Abstract summary: We show that performing maximum-likelihood estimation for a mixture model is equivalent to minimizing over the parameters an optimal transport problem with entropic regularization.
- Score: 6.246185995463311
- License:
- Abstract: This note aims to demonstrate that performing maximum-likelihood estimation for a mixture model is equivalent to minimizing over the parameters an optimal transport problem with entropic regularization. The objective is pedagogical: we seek to present this already known result in a concise and hopefully simple manner. We give an illustration with Gaussian mixture models by showing that the standard EM algorithm is a specific block-coordinate descent on an optimal transport loss.
Related papers
- Dynamical Measure Transport and Neural PDE Solvers for Sampling [77.38204731939273]
We tackle the task of sampling from a probability density as transporting a tractable density function to the target.
We employ physics-informed neural networks (PINNs) to approximate the respective partial differential equations (PDEs) solutions.
PINNs allow for simulation- and discretization-free optimization and can be trained very efficiently.
arXiv Detail & Related papers (2024-07-10T17:39:50Z) - Differentiating Metropolis-Hastings to Optimize Intractable Densities [51.16801956665228]
We develop an algorithm for automatic differentiation of Metropolis-Hastings samplers.
We apply gradient-based optimization to objectives expressed as expectations over intractable target densities.
arXiv Detail & Related papers (2023-06-13T17:56:02Z) - Sampling with Mollified Interaction Energy Descent [57.00583139477843]
We present a new optimization-based method for sampling called mollified interaction energy descent (MIED)
MIED minimizes a new class of energies on probability measures called mollified interaction energies (MIEs)
We show experimentally that for unconstrained sampling problems our algorithm performs on par with existing particle-based algorithms like SVGD.
arXiv Detail & Related papers (2022-10-24T16:54:18Z) - An Homogeneous Unbalanced Regularized Optimal Transport model with
applications to Optimal Transport with Boundary [0.0]
We show how the introduction of the entropic regularization term in unbalanced Optimal Transport (OT) models may alter their homogeneity with respect to the input measures.
We propose to modify the entropic regularization term to retrieve an UROT model that is homogeneous while preserving most properties of the standard UROT model.
arXiv Detail & Related papers (2022-01-06T14:55:30Z) - Information Theoretic Structured Generative Modeling [13.117829542251188]
A novel generative model framework called the structured generative model (SGM) is proposed that makes straightforward optimization possible.
The implementation employs a single neural network driven by an orthonormal input to a single white noise source adapted to learn an infinite Gaussian mixture model.
Preliminary results show that SGM significantly improves MINE estimation in terms of data efficiency and variance, conventional and variational Gaussian mixture models, as well as for training adversarial networks.
arXiv Detail & Related papers (2021-10-12T07:44:18Z) - On the Existence of Optimal Transport Gradient for Learning Generative
Models [8.602553195689513]
Training of Wasserstein Generative Adversarial Networks (WGAN) relies on the calculation of the gradient of the optimal transport cost.
We first demonstrate that such gradient may not be defined, which can result in numerical instabilities during gradient-based optimization.
By exploiting the discrete nature of empirical data, we formulate the gradient in a semi-discrete setting and propose an algorithm for the optimization of the generative model parameters.
arXiv Detail & Related papers (2021-02-10T16:28:20Z) - Comparing Probability Distributions with Conditional Transport [63.11403041984197]
We propose conditional transport (CT) as a new divergence and approximate it with the amortized CT (ACT) cost.
ACT amortizes the computation of its conditional transport plans and comes with unbiased sample gradients that are straightforward to compute.
On a wide variety of benchmark datasets generative modeling, substituting the default statistical distance of an existing generative adversarial network with ACT is shown to consistently improve the performance.
arXiv Detail & Related papers (2020-12-28T05:14:22Z) - A Rigorous Link Between Self-Organizing Maps and Gaussian Mixture Models [78.6363825307044]
This work presents a mathematical treatment of the relation between Self-Organizing Maps (SOMs) and Gaussian Mixture Models (GMMs)
We show that energy-based SOM models can be interpreted as performing gradient descent.
This link allows to treat SOMs as generative probabilistic models, giving a formal justification for using SOMs to detect outliers, or for sampling.
arXiv Detail & Related papers (2020-09-24T14:09:04Z) - Sinkhorn EM: An Expectation-Maximization algorithm based on entropic
optimal transport [11.374487003189467]
Sinkhorn EM (sEM) is a variant of the expectation (EM) algorithm for mixtures based on entropic optimal transport.
We show theoretically and empirically that sEM has better behavior than EM.
arXiv Detail & Related papers (2020-06-30T06:03:37Z) - A Near-Optimal Gradient Flow for Learning Neural Energy-Based Models [93.24030378630175]
We propose a novel numerical scheme to optimize the gradient flows for learning energy-based models (EBMs)
We derive a second-order Wasserstein gradient flow of the global relative entropy from Fokker-Planck equation.
Compared with existing schemes, Wasserstein gradient flow is a smoother and near-optimal numerical scheme to approximate real data densities.
arXiv Detail & Related papers (2019-10-31T02:26:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.