Designing losses for data-free training of normalizing flows on Boltzmann distributions
- URL: http://arxiv.org/abs/2301.05475v1
- Date: Fri, 13 Jan 2023 10:56:13 GMT
- Title: Designing losses for data-free training of normalizing flows on Boltzmann distributions
- Authors: Loris Felardos (TAU), Jérôme Hénin (LBT (UPR_9080), IBPC (FR_550)), Guillaume Charpiat (TAU)
- Abstract summary: We analyze the properties of standard losses based on Kullback-Leibler divergences.
We propose strategies to alleviate these issues, most importantly a new loss function well-grounded in theory.
We show on several tasks that, for the first time, imperfect pre-trained models can be further optimized in the absence of training data.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating a Boltzmann distribution in high dimension has recently been
achieved with Normalizing Flows, which enable fast and exact computation of the
generated density, and thus unbiased estimation of expectations. However,
current implementations rely on accurate training data, which typically comes
from computationally expensive simulations. There is therefore a clear
incentive to train models with incomplete or no data by relying solely on the
target density, which can be obtained from a physical energy model (up to a
constant factor). For that purpose, we analyze the properties of standard
losses based on Kullback-Leibler divergences. We showcase their limitations, in
particular a strong propensity for mode collapse during optimization on
high-dimensional distributions. We then propose strategies to alleviate these
issues, most importantly a new loss function well-grounded in theory and with
suitable optimization properties. Using as a benchmark the generation of 3D
molecular configurations, we show on several tasks that, for the first time,
imperfect pre-trained models can be further optimized in the absence of
training data.
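To make the data-free setting concrete, below is a minimal PyTorch sketch of the standard reverse Kullback-Leibler objective the abstract refers to: a small RealNVP-style flow is optimized against a toy double-well energy using only the energy function U(x) (known up to a constant), with no samples from the target. The energy, architecture, and hyperparameters are illustrative assumptions and do not come from the paper; in particular, the loss shown is the standard reverse-KL baseline whose mode-collapse issues the paper analyzes, not the new loss it proposes.

```python
# Hedged sketch: data-free (reverse-KL) training of a normalizing flow against
# a Boltzmann target p(x) ∝ exp(-U(x)), using only the energy U.
# Toy setup for illustration only; not the authors' architecture or loss.
import torch
import torch.nn as nn

def energy(x):
    # Toy double-well potential per dimension (stand-in for a physical energy model).
    return ((x**2 - 1.0)**2).sum(dim=-1)

class AffineCoupling(nn.Module):
    """RealNVP-style coupling layer: transforms half the coordinates
    conditioned on the other half, with a tractable log-Jacobian."""
    def __init__(self, dim, hidden=64, flip=False):
        super().__init__()
        self.flip = flip
        self.net = nn.Sequential(
            nn.Linear(dim // 2, hidden), nn.Tanh(),
            nn.Linear(hidden, dim),  # outputs scale and shift for the other half
        )

    def forward(self, z):
        z1, z2 = z.chunk(2, dim=-1)
        if self.flip:
            z1, z2 = z2, z1
        s, t = self.net(z1).chunk(2, dim=-1)
        s = torch.tanh(s)                  # keep scales bounded for stability
        x2 = z2 * torch.exp(s) + t
        log_det = s.sum(dim=-1)            # log |det Jacobian| of the coupling
        out = torch.cat([x2, z1] if self.flip else [z1, x2], dim=-1)
        return out, log_det

dim = 4
flow = nn.ModuleList([AffineCoupling(dim, flip=(i % 2 == 1)) for i in range(6)])
base = torch.distributions.Normal(torch.zeros(dim), torch.ones(dim))
opt = torch.optim.Adam(flow.parameters(), lr=1e-3)

for step in range(2000):
    z = base.sample((256,))
    log_q = base.log_prob(z).sum(dim=-1)   # log-density under the base
    x = z
    for layer in flow:
        x, log_det = layer(x)
        log_q = log_q - log_det            # change of variables: log q(x)
    # Reverse KL up to a constant: E_q[log q(x) + U(x)] = KL(q || p) - log Z
    loss = (log_q + energy(x)).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because the flow's density is exact, the same log q(x) values can also be used to importance-reweight generated samples and obtain unbiased estimates of expectations under the Boltzmann target, which is the property the abstract highlights.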
Related papers
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z) - Learning Sample Difficulty from Pre-trained Models for Reliable Prediction [55.77136037458667]
We propose to utilize large-scale pre-trained models to guide downstream model training with sample difficulty-aware entropy regularization.
We simultaneously improve accuracy and uncertainty calibration across challenging benchmarks.
arXiv Detail & Related papers (2023-04-20T07:29:23Z) - ManiFlow: Implicitly Representing Manifolds with Normalizing Flows [145.9820993054072]
Normalizing Flows (NFs) are flexible explicit generative models that have been shown to accurately model complex real-world data distributions.
We propose an optimization objective that recovers the most likely point on the manifold given a sample from the perturbed distribution.
Finally, we focus on 3D point clouds, for which we exploit the explicit nature of NFs: surface normals extracted from the gradient of the log-likelihood, as well as the log-likelihood itself.
arXiv Detail & Related papers (2022-08-18T16:07:59Z) - Uncertainty quantification and inverse modeling for subsurface flow in 3D heterogeneous formations using a theory-guided convolutional encoder-decoder network [5.018057056965207]
We build surrogate models for dynamic 3D subsurface single-phase flow problems with multiple vertical producing wells.
The surrogate model provides efficient pressure estimation of the entire formation at any timestep.
The well production rate or bottom hole pressure can then be determined based on Peaceman's formula.
arXiv Detail & Related papers (2021-11-14T10:11:46Z) - Resampling Base Distributions of Normalizing Flows [0.0]
We introduce a base distribution for normalizing flows based on learned rejection sampling.
We develop suitable learning algorithms based on both maximizing the log-likelihood and optimizing the reverse Kullback-Leibler divergence.
arXiv Detail & Related papers (2021-10-29T14:44:44Z) - Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the importance-guided stochastic gradient descent (IGSGD) method to train models that perform inference directly from inputs containing missing values, without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z) - Last Layer Marginal Likelihood for Invariance Learning [12.00078928875924]
We introduce a new lower bound to the marginal likelihood, which allows us to perform inference for a larger class of likelihood functions.
We work towards bringing this approach to neural networks by using an architecture with a Gaussian process in the last layer.
arXiv Detail & Related papers (2021-06-14T15:40:51Z) - Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z) - TraDE: Transformers for Density Estimation [101.20137732920718]
TraDE is a self-attention-based architecture for auto-regressive density estimation.
We present a suite of tasks such as regression using generated samples, out-of-distribution detection, and robustness to noise in the training data.
arXiv Detail & Related papers (2020-04-06T07:32:51Z) - Learning Generative Models using Denoising Density Estimators [29.068491722778827]
We introduce a new generative model based on denoising density estimators (DDEs).
Our main contribution is a novel technique to obtain generative models by minimizing the KL-divergence directly.
Experimental results demonstrate substantial improvement in density estimation and competitive performance in generative model training.
arXiv Detail & Related papers (2020-01-08T20:30:40Z)