No Trick, No Treat: Pursuits and Challenges Towards Simulation-free Training of Neural Samplers
- URL: http://arxiv.org/abs/2502.06685v1
- Date: Mon, 10 Feb 2025 17:13:11 GMT
- Title: No Trick, No Treat: Pursuits and Challenges Towards Simulation-free Training of Neural Samplers
- Authors: Jiajun He, Yuanqi Du, Francisco Vargas, Dinghuai Zhang, Shreyas Padhy, RuiKang OuYang, Carla Gomes, José Miguel Hernández-Lobato
- Abstract summary: We consider the sampling problem, where the aim is to draw samples from a distribution whose density is known only up to a normalization constant.
Recent breakthroughs in generative modeling for approximating high-dimensional data distributions have sparked significant interest in developing neural network-based methods for this challenging problem.
We propose an elegant modification to previous methods, which allows simulation-free training with the help of a time-dependent normalizing flow.
- Score: 41.867855070932706
- Abstract: We consider the sampling problem, where the aim is to draw samples from a distribution whose density is known only up to a normalization constant. Recent breakthroughs in generative modeling for approximating high-dimensional data distributions have sparked significant interest in developing neural network-based methods for this challenging problem. However, neural samplers typically incur heavy computational overhead due to simulating trajectories during training. This motivates the pursuit of simulation-free training procedures for neural samplers. In this work, we propose an elegant modification to previous methods that allows simulation-free training with the help of a time-dependent normalizing flow. However, it ultimately suffers from severe mode collapse. On closer inspection, we find that nearly all successful neural samplers rely on Langevin preconditioning to avoid mode collapse. We systematically analyze several popular methods with various objective functions and demonstrate that, in the absence of Langevin preconditioning, most of them fail to adequately cover even a simple target. Finally, we draw attention to a strong baseline obtained by combining the state-of-the-art MCMC method, Parallel Tempering (PT), with an additional generative model, to shed light on future explorations of neural samplers.
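To make the abstract's two key ingredients concrete, here is a minimal sketch (an illustration under assumed details, not the paper's implementation) of parallel tempering with unadjusted Langevin moves on a toy 1-D bimodal target: the gradient drift is the "Langevin preconditioning" signal the paper finds most neural samplers depend on, and the temperature swaps are what lets PT hop between modes.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_density(x):
    # Toy bimodal target, known only up to a normalization constant.
    return np.logaddexp(-0.5 * (x - 3.0) ** 2, -0.5 * (x + 3.0) ** 2)

def grad_log_density(x, eps=1e-5):
    # Finite-difference score; a real sampler would use autodiff.
    return (log_density(x + eps) - log_density(x - eps)) / (2 * eps)

def pt_langevin(betas=(1.0, 0.3, 0.1), n_steps=5000, step=0.05):
    x = np.zeros(len(betas))  # one chain per inverse temperature
    kept = []
    for _ in range(n_steps):
        # Unadjusted Langevin move on each tempered density pi(x)^beta.
        for i, b in enumerate(betas):
            x[i] += step * b * grad_log_density(x[i]) + np.sqrt(2 * step) * rng.standard_normal()
        # Metropolis swap between a random pair of adjacent temperatures.
        i = rng.integers(len(betas) - 1)
        log_acc = (betas[i] - betas[i + 1]) * (log_density(x[i + 1]) - log_density(x[i]))
        if np.log(rng.random()) < log_acc:
            x[i], x[i + 1] = x[i + 1], x[i]
        kept.append(x[0])  # keep samples from the beta = 1 chain
    return np.array(kept)

samples = pt_langevin()
print(f"fraction in right mode: {np.mean(samples > 0):.2f}")  # should be near 0.5
```

Running only the beta = 1 chain with no swaps shows how plain Langevin dynamics can get stuck in a single mode once the modes are far apart, which is the mode-coverage failure the paper analyzes for neural samplers trained without this preconditioning.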
Related papers
- Unsupervised textile defect detection using convolutional neural networks [0.0]
We propose a novel motif-based approach for unsupervised textile anomaly detection.
It combines the benefits of traditional convolutional neural networks with those of an unsupervised learning paradigm.
We demonstrate the effectiveness of our approach on the Patterned Fabrics benchmark dataset.
arXiv Detail & Related papers (2023-11-30T22:08:06Z)
- Diffusion-Model-Assisted Supervised Learning of Generative Models for Density Estimation [10.793646707711442]
We present a framework for training generative models for density estimation.
We use a score-based diffusion model to generate labeled data.
Once the labeled data are generated, we can train a simple fully connected neural network to learn the generative model in a supervised manner.
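A minimal sketch of the two-stage recipe this summary describes, with assumed details: `pretrained_sampler` is a hypothetical stand-in for the score-based diffusion model, used here only to produce labeled (noise, sample) pairs for the supervised fit.

```python
import torch
import torch.nn as nn

def pretrained_sampler(z):
    # Hypothetical stand-in for a reverse-diffusion solve mapping noise z to data x.
    return torch.tanh(z) * 2.0

# Stage 1: generate labeled pairs (z, x) with the (stand-in) diffusion model.
z = torch.randn(4096, 2)
with torch.no_grad():
    x = pretrained_sampler(z)

# Stage 2: fit a simple fully connected network z -> x in a supervised manner.
net = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(2000):
    idx = torch.randint(0, len(z), (256,))
    loss = ((net(z[idx]) - x[idx]) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```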
arXiv Detail & Related papers (2023-10-22T23:56:19Z)
- Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization [87.21285093582446]
Diffusion Generative Flow Samplers (DGFS) is a sampling-based framework where the learning process can be tractably broken down into short partial trajectory segments.
Our method takes inspiration from the theory developed for generative flow networks (GFlowNets).
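Given the GFlowNet connection, a hedged sketch of the kind of balance condition that makes short segments usable as training signal (the flow functions $F_t$ and forward/backward policies $P_F$, $P_B$ are assumed notation from GFlowNet theory, not necessarily DGFS's exact objective):

$$\mathcal{L}(x_{t:t+k}) = \left( \log \frac{F_t(x_t)\,\prod_{i=t}^{t+k-1} P_F(x_{i+1}\mid x_i)}{F_{t+k}(x_{t+k})\,\prod_{i=t}^{t+k-1} P_B(x_i\mid x_{i+1})} \right)^2$$

The squared log-ratio vanishes exactly when the flows are consistent on every short segment, so learning signals are available without simulating full trajectories.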
arXiv Detail & Related papers (2023-10-04T09:39:05Z)
- Continuous time recurrent neural networks: overview and application to forecasting blood glucose in the intensive care unit [56.801856519460465]
Continuous time autoregressive recurrent neural networks (CTRNNs) are deep learning models that account for irregular observations.
We demonstrate the application of these models to probabilistic forecasting of blood glucose in a critical care setting.
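One common construction in this spirit (an ODE-RNN-style sketch with assumed details, not necessarily the paper's exact model): the hidden state decays continuously between irregularly spaced observations, and a discrete GRU update fires at each observation time.

```python
import torch
import torch.nn as nn

class CTGRU(nn.Module):
    """GRU whose hidden state decays via a simple ODE between observations."""
    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.cell = nn.GRUCell(input_dim, hidden_dim)
        self.log_rate = nn.Parameter(torch.zeros(hidden_dim))  # per-unit decay rates

    def forward(self, xs, dts):
        # xs: (T, input_dim) observations; dts: (T,) gaps since the previous one.
        h = torch.zeros(1, self.cell.hidden_size)
        for x, dt in zip(xs, dts):
            h = h * torch.exp(-torch.exp(self.log_rate) * dt)  # continuous decay
            h = self.cell(x.unsqueeze(0), h)                   # update at the observation
        return h.squeeze(0)

model = CTGRU(input_dim=1, hidden_dim=16)
h = model(torch.randn(5, 1), torch.tensor([0.5, 1.2, 0.3, 2.0, 0.7]))
```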
arXiv Detail & Related papers (2023-04-14T09:39:06Z)
- Minimizing Trajectory Curvature of ODE-based Generative Models [45.89620603363946]
Recent generative models, such as diffusion models, rectified flows, and flow matching, define a generative process as a time reversal of a fixed forward process.
We present an efficient method of training the forward process to minimize the curvature of generative trajectories without any ODE/SDE simulation.
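For context, a minimal flow-matching sketch (assumed details, not the paper's exact method): straight-line interpolants already give a simulation-free regression target for the velocity field.

```python
import torch

def flow_matching_loss(v_theta, x0, x1):
    # Straight-line interpolant between a source/noise batch x0 and data x1.
    t = torch.rand(x0.shape[0], 1)
    xt = (1 - t) * x0 + t * x1   # no ODE/SDE simulation needed
    target = x1 - x0             # constant velocity of the straight path
    return ((v_theta(xt, t) - target) ** 2).mean()
```

Curvature enters through the coupling: if x0 is paired with x1 by a trained forward process rather than sampled independently, the straight conditional paths can be made mutually consistent, so the marginal generative trajectories straighten.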
arXiv Detail & Related papers (2023-01-27T21:52:03Z)
- Simple lessons from complex learning: what a neural network model learns about cosmic structure formation [7.270598539996841]
We train a neural network model to predict the full phase space evolution of cosmological N-body simulations.
Our model achieves percent-level accuracy at nonlinear scales of $k \sim 1\,\mathrm{Mpc}^{-1}\,h$, representing a significant improvement over COLA.
arXiv Detail & Related papers (2022-06-09T15:41:09Z)
- A Bayesian Perspective on Training Speed and Model Selection [51.15664724311443]
We show that a measure of a model's training speed can be used to estimate its marginal likelihood.
We verify our results in model selection tasks for linear models and for the infinite-width limit of deep neural networks.
Our results suggest a promising new direction towards explaining why neural networks trained with gradient descent are biased towards functions that generalize well.
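The connection can be anchored by the chain rule of probability: when the data are processed sequentially, the log marginal likelihood is exactly the accumulated one-step-ahead predictive log-likelihood,

$$\log p(\mathcal{D}) = \sum_{i=1}^{n} \log p(d_i \mid d_1, \ldots, d_{i-1}),$$

so a model that "trains fast", i.e. assigns high predictive probability to each new point early on, accumulates a larger sum.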
arXiv Detail & Related papers (2020-10-27T17:56:14Z)
- Automatic Recall Machines: Internal Replay, Continual Learning and the Brain [104.38824285741248]
Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity.
We present a method where these auxiliary samples are generated on the fly, given only the model that is being trained for the assessed objective.
Instead, the implicit memory of learned samples within the assessed model itself is exploited.
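One way to realize recall from the model itself (a hedged sketch with assumed details, not necessarily the paper's exact procedure): synthesize inputs that the current model confidently assigns to a previously learned class, then rehearse on them alongside new data.

```python
import torch
import torch.nn.functional as F

def recall_sample(model, target_class, input_shape, steps=50, lr=0.1):
    # Optimize an input so the current model confidently predicts target_class.
    x = torch.randn(1, *input_shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        loss = F.cross_entropy(model(x), torch.tensor([target_class]))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return x.detach()  # a synthetic "memory" to rehearse against forgetting
```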
arXiv Detail & Related papers (2020-06-22T15:07:06Z)
- Efficiently Sampling Functions from Gaussian Process Posteriors [76.94808614373609]
We propose an easy-to-use and general-purpose approach for fast posterior sampling.
We demonstrate how decoupled sample paths accurately represent Gaussian process posteriors at a fraction of the usual cost.
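A minimal sketch of pathwise posterior sampling in this decoupled spirit (a Matheron's-rule-style update with random Fourier features for the prior draw; the kernel, data, and feature count are illustrative assumptions, not the paper's exact implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (20, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(20)
noise, n_feat = 0.1 ** 2, 512

# Random Fourier features for the RBF kernel k(a, b) = exp(-|a - b|^2 / 2).
W = rng.standard_normal((n_feat, 1))
b = rng.uniform(0, 2 * np.pi, n_feat)
phi = lambda Z: np.sqrt(2.0 / n_feat) * np.cos(Z @ W.T + b)

w = rng.standard_normal(n_feat)      # weights of one approximate prior draw
f_prior = lambda Z: phi(Z) @ w       # a prior function sample, evaluable anywhere

# Data-driven pathwise correction (Matheron's rule) with the exact kernel.
K = np.exp(-0.5 * (X - X.T) ** 2)
eps = np.sqrt(noise) * rng.standard_normal(len(X))
alpha = np.linalg.solve(K + noise * np.eye(len(X)), y - f_prior(X) - eps)

def f_post(Z):
    # One posterior function sample: prior path plus kernel-weighted correction.
    return f_prior(Z) + np.exp(-0.5 * (Z - X.T) ** 2) @ alpha

print(f_post(np.linspace(-3, 3, 5)[:, None]))
```

After a single solve against the training data, each posterior function draw can be evaluated at arbitrary new inputs at linear cost per query, which is where the "fraction of the usual cost" comes from.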
arXiv Detail & Related papers (2020-02-21T14:03:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.