Gradient-Guided Importance Sampling for Learning Binary Energy-Based Models
- URL: http://arxiv.org/abs/2210.05782v1
- Date: Tue, 11 Oct 2022 20:52:48 GMT
- Title: Gradient-Guided Importance Sampling for Learning Binary Energy-Based Models
- Authors: Meng Liu, Haoran Liu, Shuiwang Ji
- Abstract summary: We propose ratio matching with gradient-guided importance sampling (RMwGGIS) to learn energy-based models (EBMs) on high-dimensional data.
We perform experiments on density modeling over synthetic discrete data, graph generation, and training Ising models to evaluate our proposed method.
Our method can significantly alleviate the limitations of ratio matching, perform more effectively in practice, and scale to high-dimensional problems.
- Score: 46.87187776084161
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning energy-based models (EBMs) is known to be difficult especially on
discrete data where gradient-based learning strategies cannot be applied
directly. Although ratio matching is a sound method to learn discrete EBMs, it
suffers from expensive computation and excessive memory requirement, thereby
resulting in difficulties for learning EBMs on high-dimensional data. Motivated
by these limitations, in this study, we propose ratio matching with
gradient-guided importance sampling (RMwGGIS). Particularly, we use the
gradient of the energy function w.r.t. the discrete data space to approximately
construct the provably optimal proposal distribution, which is subsequently
used by importance sampling to efficiently estimate the original ratio matching
objective. We perform experiments on density modeling over synthetic discrete
data, graph generation, and training Ising models to evaluate our proposed
method. The experimental results demonstrate that our method can significantly
alleviate the limitations of ratio matching, perform more effectively in
practice, and scale to high-dimensional problems. Our implementation is
available at https://github.com/divelab/RMwGGIS.
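For intuition, below is a minimal PyTorch-style sketch of the general idea, assuming an energy function `energy_fn` with p(x) proportional to exp(-E(x)) over binary vectors: the gradient of the energy with respect to a real-valued view of the data gives a first-order estimate of the energy change for each single-bit flip, which defines a proposal over flip dimensions, and importance sampling over a few proposed flips then estimates the ratio-matching sum over all dimensions. This is an illustrative assumption-laden sketch, not the authors' released code (see the repository above); the exact objective, proposal form, and hyperparameters such as `num_flips` may differ in RMwGGIS.

```python
import torch

def rmwggis_loss(energy_fn, x, num_flips=8):
    """Hypothetical sketch (not the authors' code): ratio matching over
    single-bit flips, with the sum over dimensions estimated by importance
    sampling from a gradient-guided proposal.

    energy_fn: maps an (N, D) float tensor with entries in {0, 1} to (N,)
               energies E(x), assuming p(x) proportional to exp(-E(x)).
    x:         (B, D) binary data batch.
    """
    x = x.float()
    B, D = x.shape

    # --- Gradient-guided proposal over which bit to flip -------------------
    with torch.enable_grad():
        x_req = x.clone().requires_grad_(True)
        grad = torch.autograd.grad(energy_fn(x_req).sum(), x_req)[0]
    # First-order estimate of E(x with bit d flipped) - E(x).
    delta = (1.0 - 2.0 * x) * grad.detach()                    # (B, D)
    # Proposal roughly proportional to the gradient-approximated per-flip term.
    score = torch.sigmoid(-delta) ** 2
    q = score / score.sum(dim=-1, keepdim=True)                # (B, D)

    # --- Importance-sampled ratio-matching objective ------------------------
    idx = torch.multinomial(q, num_flips, replacement=True)    # (B, K)
    x_flip = x.unsqueeze(1).repeat(1, num_flips, 1)            # (B, K, D)
    cur = torch.gather(x_flip, 2, idx.unsqueeze(-1))
    x_flip.scatter_(2, idx.unsqueeze(-1), 1.0 - cur)           # flip sampled bits

    e_data = energy_fn(x)                                      # (B,)
    e_flip = energy_fn(x_flip.reshape(B * num_flips, D)).reshape(B, num_flips)

    # One ratio-matching term per sampled flip, sigmoid(E(x) - E(x_flip))^2,
    # reweighted by 1 / (K * q_d) so that the sum over all D flips is
    # estimated without evaluating every flipped configuration.
    terms = torch.sigmoid(e_data.unsqueeze(1) - e_flip) ** 2   # (B, K)
    weights = 1.0 / (num_flips * torch.gather(q, 1, idx))      # (B, K)
    return (weights * terms).sum(dim=1).mean()
```

The saving illustrated here is that each example requires only `num_flips` extra energy evaluations rather than one per dimension, which is what allows the ratio-matching estimate to scale to high-dimensional binary data.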
Related papers
- Querying Easily Flip-flopped Samples for Deep Active Learning [63.62397322172216]
Active learning is a machine learning paradigm that aims to improve the performance of a model by strategically selecting and querying unlabeled data.
One effective selection strategy is to base it on the model's predictive uncertainty, which can be interpreted as a measure of how informative a sample is.
This paper proposes the least disagree metric (LDM), defined as the smallest probability of disagreement of the predicted label.
arXiv Detail & Related papers (2024-01-18T08:12:23Z) - Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning [22.410220040736235]
We present a theoretically optimal solution for addressing both coreset selection and active learning.
Our proposed method, COPS, is designed to minimize the expected loss of a model trained on subsampled data.
arXiv Detail & Related papers (2023-09-05T14:06:33Z) - Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching.
Our simple yet effective method outperforms most previous optimization-oriented methods with much fewer computational resources.
arXiv Detail & Related papers (2023-07-19T04:07:33Z) - Efficient Training of Energy-Based Models Using Jarzynski Equality [13.636994997309307]
Energy-based models (EBMs) are generative models inspired by statistical physics.
Computing the gradient of their log-likelihood with respect to the model parameters requires sampling from the model distribution.
Here we show how results for nonequilibrium thermodynamics based on Jarzynski equality can be used to perform this computation efficiently.
arXiv Detail & Related papers (2023-05-30T21:07:52Z) - Moment Matching Denoising Gibbs Sampling [14.75945343063504]
Energy-Based Models (EBMs) offer a versatile framework for modeling complex data distributions.
The widely-used Denoising Score Matching (DSM) method for scalable EBM training suffers from inconsistency issues.
We propose an efficient sampling framework: (pseudo)-Gibbs sampling with moment matching.
arXiv Detail & Related papers (2023-05-19T12:58:25Z) - Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation [151.70234052015948]
We propose a novel approach that encourages the optimization algorithm to seek a flat trajectory.
We show that weights trained on synthetic data are robust against perturbations from accumulated errors when regularized towards a flat trajectory.
Our method, called Flat Trajectory Distillation (FTD), is shown to boost the performance of gradient-matching methods by up to 4.7%.
arXiv Detail & Related papers (2022-11-20T15:49:11Z) - Score-based diffusion models for accelerated MRI [35.3148116010546]
We introduce a way to sample data from a conditional distribution given the measurements, such that the model can be readily used for solving inverse problems in imaging.
Our model requires magnitude images only for training, and yet is able to reconstruct complex-valued data, and even extends to parallel imaging.
arXiv Detail & Related papers (2021-10-08T08:42:03Z) - DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation [57.358212277226315]
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior, without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm.
arXiv Detail & Related papers (2021-03-31T23:46:32Z) - Attentional-Biased Stochastic Gradient Descent [74.49926199036481]
We present a provable method (named ABSGD) for addressing the data imbalance or label noise problem in deep learning.
Our method is a simple modification to momentum SGD where we assign an individual importance weight to each sample in the mini-batch.
ABSGD is flexible enough to combine with other robust losses without any additional cost.
arXiv Detail & Related papers (2020-12-13T03:41:52Z) - Domain Knowledge Integration By Gradient Matching For Sample-Efficient Reinforcement Learning [0.0]
We propose a gradient matching algorithm to improve sample efficiency by utilizing target slope information from the dynamics to aid the model-free learner.
We demonstrate this by presenting a technique for matching the gradient information from the model-based learner with the model-free component in an abstract low-dimensional space.
arXiv Detail & Related papers (2020-05-28T05:02:47Z)