Self-Adapting Noise-Contrastive Estimation for Energy-Based Models
- URL: http://arxiv.org/abs/2211.02650v1
- Date: Thu, 3 Nov 2022 15:17:43 GMT
- Title: Self-Adapting Noise-Contrastive Estimation for Energy-Based Models
- Authors: Nathaniel Xu
- Abstract summary: Training energy-based models with noise-contrastive estimation (NCE) is theoretically feasible but practically challenging.
Previous works have explored modelling the noise distribution as a separate generative model, and then concurrently training this noise model with the EBM.
This thesis proposes a self-adapting NCE algorithm which uses static instances of the EBM along its training trajectory as the noise distribution.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training energy-based models (EBMs) with noise-contrastive estimation (NCE)
is theoretically feasible but practically challenging. Effective learning
requires the noise distribution to be approximately similar to the target
distribution, especially in high-dimensional domains. Previous works have
explored modelling the noise distribution as a separate generative model, and
then concurrently training this noise model with the EBM. While this method
allows for more effective noise-contrastive estimation, it comes at the cost of
extra memory and training complexity. Instead, this thesis proposes a
self-adapting NCE algorithm which uses static instances of the EBM along its
training trajectory as the noise distribution. During training, these static
instances progressively converge to the target distribution, thereby
circumventing the need to simultaneously train an auxiliary noise model.
Moreover, we express this self-adapting NCE algorithm in the framework of
Bregman divergences and show that it is a generalization of maximum likelihood
learning for EBMs. The performance of our algorithm is evaluated across a range
of noise update intervals, and experimental results show that shorter update
intervals are conducive to higher synthesis quality.
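As a rough sketch of the idea in the abstract, and not the thesis's own Bregman-divergence derivation: write $p_\theta$ for the EBM density, $p_d$ for the data distribution, and $\theta_k$ for a frozen snapshot of the parameters. These symbols and the 1:1 noise-to-data ratio are assumptions made here for illustration. The standard NCE objective with the snapshot as noise, and its gradient at the moment the snapshot is refreshed, would then read:

```latex
\begin{align}
J_k(\theta) &= \mathbb{E}_{x \sim p_d}\!\left[\log \frac{p_\theta(x)}{p_\theta(x) + p_{\theta_k}(x)}\right]
  + \mathbb{E}_{y \sim p_{\theta_k}}\!\left[\log \frac{p_{\theta_k}(y)}{p_\theta(y) + p_{\theta_k}(y)}\right], \\
\nabla_\theta J_k(\theta)\Big|_{\theta = \theta_k}
  &= \tfrac{1}{2}\left(\mathbb{E}_{x \sim p_d}\!\left[\nabla_\theta \log p_\theta(x)\right]
  - \mathbb{E}_{y \sim p_\theta}\!\left[\nabla_\theta \log p_\theta(y)\right]\right)\Big|_{\theta = \theta_k}.
\end{align}
```

Under these assumptions, the gradient at a refresh point is half the maximum likelihood gradient: for $p_\theta = e^{-E_\theta}/Z_\theta$ the $\log Z_\theta$ terms cancel, leaving $\tfrac{1}{2}\left(\mathbb{E}_{p_\theta}[\nabla_\theta E_\theta] - \mathbb{E}_{p_d}[\nabla_\theta E_\theta]\right)$. This is consistent with the abstract's claim that the method generalizes maximum likelihood learning. Between refreshes the two gradients generally differ.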
Related papers
- Adv-KD: Adversarial Knowledge Distillation for Faster Diffusion Sampling [2.91204440475204]
Diffusion Probabilistic Models (DPMs) have emerged as a powerful class of deep generative models.
They rely on sequential denoising steps during sample generation.
We propose a novel method that integrates denoising phases directly into the model's architecture.
arXiv Detail & Related papers (2024-05-31T08:19:44Z)
- Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) that applies an affine transformation to the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z)
- Blue noise for diffusion models [50.99852321110366]
We introduce a novel and general class of diffusion models taking correlated noise within and across images into account.
Our framework allows introducing correlation across images within a single mini-batch to improve gradient flow.
We perform both qualitative and quantitative evaluations on a variety of datasets using our method.
arXiv Detail & Related papers (2024-02-07T14:59:25Z)
- Not All Steps are Equal: Efficient Generation with Progressive Diffusion Models [62.155612146799314]
We propose a novel two-stage training strategy termed Step-Adaptive Training.
In the initial stage, a base denoising model is trained to encompass all timesteps.
We partition the timesteps into distinct groups, fine-tuning the model within each group to achieve specialized denoising capabilities.
arXiv Detail & Related papers (2023-12-20T03:32:58Z)
- May the Noise be with you: Adversarial Training without Adversarial Examples [3.4673556247932225]
We investigate the question: Can we obtain adversarially-trained models without training on adversarial examples?
Our proposed approach incorporates inherent stochasticity by embedding Gaussian noise within the layers of the NN model at training time.
Our work contributes adversarially trained networks using a completely different approach, with empirically similar robustness to adversarial training.
arXiv Detail & Related papers (2023-12-12T08:22:28Z)
- One More Step: A Versatile Plug-and-Play Module for Rectifying Diffusion Schedule Flaws and Enhancing Low-Frequency Controls [77.42510898755037]
One More Step (OMS) is a compact network that incorporates an additional simple yet effective step during inference.
OMS elevates image fidelity and harmonizes the dichotomy between training and inference, while preserving original model parameters.
Once trained, various pre-trained diffusion models with the same latent domain can share the same OMS module.
arXiv Detail & Related papers (2023-11-27T12:02:42Z)
- Optimizing the Noise in Self-Supervised Learning: from Importance Sampling to Noise-Contrastive Estimation [80.07065346699005]
It is widely assumed that the optimal noise distribution should be made equal to the data distribution, as in Generative Adversarial Networks (GANs).
We turn to Noise-Contrastive Estimation which grounds this self-supervised task as an estimation problem of an energy-based model of the data.
We soberly conclude that the optimal noise may be hard to sample from, and the gain in efficiency can be modest compared to choosing the noise distribution equal to the data's.
arXiv Detail & Related papers (2023-01-23T19:57:58Z)
- The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from the common assumption that the noise distribution should match the data distribution can actually lead to better statistical estimators.
In particular, the optimal noise distribution is different from the data's and can even come from a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z)
- A Spectral Energy Distance for Parallel Speech Synthesis [29.14723501889278]
Speech synthesis is an important practical generative modeling problem.
We propose a new learning method that allows us to train highly parallel models of speech.
arXiv Detail & Related papers (2020-08-03T19:56:04Z)