Annealing in variational inference mitigates mode collapse: A theoretical study on Gaussian mixtures
- URL: http://arxiv.org/abs/2602.12923v1
- Date: Fri, 13 Feb 2026 13:28:23 GMT
- Title: Annealing in variational inference mitigates mode collapse: A theoretical study on Gaussian mixtures
- Authors: Luigi Fogliani, Bruno Loureiro, Marylou Gabrié
- Abstract summary: We provide a mathematical analysis of strategies for mitigating mode collapse in a tractable setting. Our analysis shows that an appropriately chosen annealing scheme can robustly prevent mode collapse. We present numerical evidence that these theoretical tradeoffs qualitatively extend to neural network based models, RealNVP normalizing flows.
- Score: 12.937511747845436
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Mode collapse, the failure to capture one or more modes when targeting a multimodal distribution, is a central challenge in modern variational inference. In this work, we provide a mathematical analysis of annealing-based strategies for mitigating mode collapse in a tractable setting: learning a Gaussian mixture, where mode collapse is known to arise. Leveraging a low-dimensional summary statistics description, we precisely characterize the interplay between the initial temperature and the annealing rate, and derive a sharp formula for the probability of mode collapse. Our analysis shows that an appropriately chosen annealing scheme can robustly prevent mode collapse. Finally, we present numerical evidence that these theoretical tradeoffs qualitatively extend to neural-network-based models, RealNVP normalizing flows, providing guidance for designing annealing strategies mitigating mode collapse in practical variational inference pipelines.
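As a rough illustration of the two quantities the abstract refers to (initial temperature and annealing rate), the sketch below tempers a two-mode Gaussian mixture target with an inverse temperature that grows over training steps. The linear schedule, the parameter names `beta0` and `rate`, and the specific mixture are assumptions made for illustration only; the schedule actually analyzed in the paper is not stated on this page.

```python
import math


def log_gmm(x, means=(-3.0, 3.0), weights=(0.5, 0.5), sigma=1.0):
    """Log-density of a 1D Gaussian mixture (illustrative target, not the paper's)."""
    log_norm = math.log(sigma * math.sqrt(2.0 * math.pi))
    terms = [
        math.log(w) - 0.5 * ((x - m) / sigma) ** 2 - log_norm
        for w, m in zip(weights, means)
    ]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


def inverse_temperature(step, beta0=0.1, rate=0.01):
    """Hypothetical linear schedule: start at beta0 (= 1 / initial temperature),
    increase by `rate` per step, and cap at 1 (the untempered target)."""
    return min(1.0, beta0 + rate * step)


def tempered_log_target(x, step, beta0=0.1, rate=0.01):
    """Unnormalized log-density of the annealed target p(x)**beta at a given step."""
    return inverse_temperature(step, beta0, rate) * log_gmm(x)


def mode_gap(step):
    """Log-density difference between a mode (x=3) and the valley (x=0)."""
    return tempered_log_target(3.0, step) - tempered_log_target(0.0, step)
```

Early in the schedule the gap between a mode and the valley separating the modes is scaled down by the inverse temperature, so the tempered target is flatter; once the inverse temperature reaches 1 the original mixture is recovered.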
Related papers
- Conditional Diffusion Guidance under Hard Constraint: A Stochastic Analysis Approach [7.504703549763421]
We study conditional generation in diffusion models under hard constraints, where generated samples must satisfy prescribed events with probability one. We develop a principled conditional diffusion guidance framework based on Doob's h-transform, martingale representation and quadratic variation process. We provide non-asymptotic guarantees for the resulting conditional sampler in both total variation and Wasserstein distances.
arXiv Detail & Related papers (2026-02-05T10:46:20Z) - Generative Modeling with Continuous Flows: Sample Complexity of Flow Matching [60.37045080890305]
We provide the first analysis of the sample complexity for flow-matching based generative models. We decompose the velocity field estimation error into neural-network approximation error, statistical error due to the finite sample size, and optimization error due to the finite number of optimization steps for estimating the velocity field.
arXiv Detail & Related papers (2025-12-01T05:14:25Z) - Provable Maximum Entropy Manifold Exploration via Diffusion Models [58.89696361871563]
Exploration is critical for solving real-world decision-making problems such as scientific discovery. We introduce a novel framework that casts exploration as entropy maximization over the approximate data manifold implicitly defined by a pre-trained diffusion model. We develop an algorithm based on mirror descent that solves the exploration problem as sequential fine-tuning of a pre-trained diffusion model.
arXiv Detail & Related papers (2025-06-18T11:59:15Z) - Mitigating mode collapse in normalizing flows by annealing with an adaptive schedule: Application to parameter estimation [0.6258471240250307]
We show that an adaptive schedule based on the effective sample size (ESS) can mitigate mode collapse. We demonstrate that our approach can converge the marginal likelihood for a biochemical oscillator model fit to time-series data in ten-fold less time than a widely used ensemble Markov chain Monte Carlo method.
arXiv Detail & Related papers (2025-05-06T15:58:48Z) - Robust Optimization with Diffusion Models for Green Security [49.68562792424776]
In green security, defenders must forecast adversarial behavior, such as poaching, illegal logging, and illegal fishing, to plan effective patrols. We propose a conditional diffusion model for adversary behavior modeling, leveraging its strong distribution-fitting capabilities. We introduce a mixed strategy of mixed strategies and employ a twisted Sequential Monte Carlo (SMC) sampler for accurate sampling.
arXiv Detail & Related papers (2025-02-19T05:30:46Z) - A theoretical perspective on mode collapse in variational inference [8.74105235144778]
We show that mode collapse is present even in statistically favorable scenarios, and identify two key mechanisms driving it: mean alignment and vanishing weight.
Our theoretical findings are consistent with the implementation of variational inference using normalizing flows, a class of popular generative models.
arXiv Detail & Related papers (2024-10-17T07:56:30Z) - Predicting Cascading Failures with a Hyperparametric Diffusion Model [66.89499978864741]
We study cascading failures in power grids through the lens of diffusion models.
Our model integrates viral diffusion principles with physics-based concepts.
We show that this diffusion model can be learned from traces of cascading failures.
arXiv Detail & Related papers (2024-06-12T02:34:24Z) - Theory of Multimode Squeezed Light Generation in Lossy Media [0.0]
A unified theoretical approach to describe the properties of multimode squeezed light generated in a lossy medium is presented. For an important class of Gaussian states, we derive master equations for the second-order correlation functions. Various techniques and strategies to introduce broadband modes can be considered.
arXiv Detail & Related papers (2024-03-08T12:30:34Z) - Bayesian Conditional Diffusion Models for Versatile Spatiotemporal Turbulence Generation [13.278744447861289]
We introduce a novel generative framework grounded in probabilistic diffusion models for turbulence generation.
A notable feature of our approach is the proposed method for long-span flow sequence generation, which is based on autoregressive-based conditional sampling.
We showcase the versatile turbulence generation capability of our framework through a suite of numerical experiments.
arXiv Detail & Related papers (2023-11-14T04:08:14Z) - A Geometric Perspective on Diffusion Models [57.27857591493788]
We inspect the ODE-based sampling of a popular variance-exploding SDE.
We establish a theoretical relationship between the optimal ODE-based sampling and the classic mean-shift (mode-seeking) algorithm.
arXiv Detail & Related papers (2023-05-31T15:33:16Z) - ModeRNN: Harnessing Spatiotemporal Mode Collapse in Unsupervised Predictive Learning [75.2748374360642]
We propose ModeRNN, which introduces a novel method to learn hidden structured representations between recurrent states.
Across the entire dataset, different modes result in different responses on the mixtures of slots, which enhances the ability of ModeRNN to build structured representations.
arXiv Detail & Related papers (2021-10-08T03:47:54Z)