Mode-Assisted Joint Training of Deep Boltzmann Machines
- URL: http://arxiv.org/abs/2102.08562v1
- Date: Wed, 17 Feb 2021 04:03:30 GMT
- Title: Mode-Assisted Joint Training of Deep Boltzmann Machines
- Authors: Haik Manukian and Massimiliano Di Ventra
- Abstract summary: We show that the performance gains of mode-assisted training are even more dramatic for DBMs.
DBMs jointly trained with the mode-assisted algorithm can represent the same data set with orders of magnitude fewer parameters.
- Score: 10.292439652458157
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The deep extension of the restricted Boltzmann machine (RBM), known as the
deep Boltzmann machine (DBM), is an expressive family of machine learning
models which can serve as compact representations of complex probability
distributions. However, jointly training DBMs in the unsupervised setting has
proven to be a formidable task. A recent technique we have proposed, called
mode-assisted training, has shown great success in improving the unsupervised
training of RBMs. Here, we show that the performance gains of mode-assisted
training are even more dramatic for DBMs. In fact, DBMs jointly trained with
the mode-assisted algorithm can represent the same data set with orders of
magnitude fewer total parameters than state-of-the-art training procedures
require, and even fewer than RBMs, provided a fan-in network topology is also
introduced. This substantial saving in the number of parameters also makes
this training method very appealing for hardware implementations.
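Neither the summary nor the abstract spells out the update rule, so below is a minimal NumPy sketch of the core idea on a single RBM layer pair, the building block the DBM stacks: with some probability, the negative phase of the gradient update is replaced by the mode (lowest-energy state) of the model distribution. The brute-force mode search, the `p_mode` schedule, and all hyperparameters are illustrative assumptions; the authors use a dedicated mode solver and, for DBMs, joint layerwise updates plus the fan-in topology.

```python
# Minimal sketch of a mode-assisted update for a tiny binary RBM
# (the building block of the DBM in the paper). Brute-force mode
# search stands in for the dedicated mode solver the authors use;
# all names and hyperparameters here are illustrative assumptions.
import itertools
import numpy as np

rng = np.random.default_rng(0)
nv, nh = 6, 4                      # tiny, so the mode is found exactly
W = 0.01 * rng.standard_normal((nv, nh))
a = np.zeros(nv)                   # visible biases
b = np.zeros(nh)                   # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def energy(v, h):
    return -(v @ W @ h + a @ v + b @ h)

def mode_state():
    """Exhaustively find the (v, h) pair of lowest energy (the mode)."""
    best, best_e = None, np.inf
    for v in itertools.product([0, 1], repeat=nv):
        for h in itertools.product([0, 1], repeat=nh):
            e = energy(np.array(v), np.array(h))
            if e < best_e:
                best, best_e = (np.array(v), np.array(h)), e
    return best

def cd1_negative(v0):
    """One Gibbs step for the standard CD-1 negative phase."""
    h0 = (sigmoid(v0 @ W + b) > rng.random(nh)).astype(float)
    v1 = (sigmoid(W @ h0 + a) > rng.random(nv)).astype(float)
    return v1, sigmoid(v1 @ W + b)

lr, p_mode = 0.05, 0.1             # mode-update probability (assumed schedule)
data = rng.integers(0, 2, size=(32, nv)).astype(float)

for step in range(200):
    v_pos = data[rng.integers(len(data))]
    h_pos = sigmoid(v_pos @ W + b)            # positive (data) phase
    if rng.random() < p_mode:
        v_neg, h_neg = mode_state()           # off-gradient mode update
    else:
        v_neg, h_neg = cd1_negative(v_pos)    # ordinary CD-1 update
    W += lr * (np.outer(v_pos, h_pos) - np.outer(v_neg, h_neg))
    a += lr * (v_pos - v_neg)
    b += lr * (h_pos - h_neg)
```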
Related papers
- Fast, accurate training and sampling of Restricted Boltzmann Machines [4.785158987724452]
We present an innovative method in which the principal directions of the dataset are integrated into a low-rank RBM.
This approach enables efficient sampling of the equilibrium measure via a static Monte Carlo process.
Our results show that this strategy successfully trains RBMs to capture the full diversity of data in datasets where previous methods fail.
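The summary leaves the construction abstract; the sketch below shows one plausible reading of its first sentence, seeding an RBM weight matrix with the dataset's leading principal directions so the model starts low-rank. The function name, injection scale, and the split between seeded and random columns are all assumptions; the paper's static Monte Carlo sampler is not reproduced here.

```python
# Hedged sketch: seed an RBM weight matrix with the dataset's leading
# principal directions, giving the low-rank structure the entry describes.
# The exact construction and sampler in the paper differ; names below are
# assumptions for illustration.
import numpy as np

def lowrank_rbm_init(data, n_hidden, k, scale=0.1):
    """Return an (n_visible, n_hidden) weight matrix whose first k hidden
    units are aligned with the top-k principal directions of `data`."""
    X = data - data.mean(axis=0)                  # center the data
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    W = 0.01 * np.random.default_rng(0).standard_normal((data.shape[1], n_hidden))
    W[:, :k] = scale * Vt[:k].T                   # inject principal directions
    return W

X = np.random.default_rng(1).random((500, 20))
W = lowrank_rbm_init(X, n_hidden=16, k=4)
print(W.shape)  # (20, 16)
```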
arXiv Detail & Related papers (2024-05-24T09:23:43Z)
- Monotone deep Boltzmann machines [86.50247625239406]
Deep Boltzmann machines (DBMs) are multi-layered probabilistic models governed by a pairwise energy function.
We develop a new class of restricted model, the monotone DBM, which allows for arbitrary self-connections within each layer.
We show that a particular choice of activation results in a fixed-point iteration that gives a variational mean-field solution.
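As a reference point for the last sentence, here is a plain damped mean-field fixed-point iteration for a pairwise binary model, in NumPy. Note the hedge: the paper's contribution is a monotone parameterization under which such an iteration provably converges to a unique fixed point; the generic iteration below carries no such guarantee and is only illustrative.

```python
# Hedged sketch of a mean-field fixed-point iteration for a pairwise
# Boltzmann machine. The monotone-DBM paper guarantees a unique fixed
# point under a particular monotone parameterization and activation;
# this plain damped iteration only illustrates the iteration itself.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mean_field(W, b, iters=200, damping=0.5):
    """Iterate m <- (1-d)*m + d*sigma(W m + b) toward a fixed point m*."""
    m = np.full(len(b), 0.5)
    for _ in range(iters):
        m = (1 - damping) * m + damping * sigmoid(W @ m + b)
    return m

rng = np.random.default_rng(0)
n = 10
S = rng.standard_normal((n, n))
W = 0.1 * (S + S.T)                 # symmetric pairwise couplings
np.fill_diagonal(W, 0.0)
m_star = mean_field(W, rng.standard_normal(n))
print(np.round(m_star, 3))
```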
arXiv Detail & Related papers (2023-07-11T03:02:44Z)
- End-to-end Training of Deep Boltzmann Machines by Unbiased Contrastive Divergence with Local Mode Initialization [23.008689183810695]
We address the problem of biased gradient estimation in deep Boltzmann machines (DBMs).
We propose a coupling based on the Metropolis-Hastings (MH) algorithm and initialize the state around a local mode of the target distribution.
Because of the propensity of MH to reject proposals, the coupling tends to converge in only one step with high probability, leading to high efficiency.
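To make the coupling idea concrete, here is a toy coupled MH pair on a small discrete target: both chains share the candidate and the acceptance uniform, so a joint rejection (or joint acceptance) leaves them equal, and starting near the mode makes early meetings likely. The target, proposal, and meeting-time bookkeeping are illustrative assumptions; the paper's coupling over DBM states is more involved.

```python
# Hedged sketch of a coupled Metropolis-Hastings chain pair that meets
# (coalesces) quickly. With a uniform independence proposal the MH
# acceptance ratio reduces to p(candidate)/p(current). Everything here
# is a toy stand-in for the DBM-specific construction in the paper.
import numpy as np

rng = np.random.default_rng(0)
p = np.array([1, 2, 4, 10, 4, 2, 1, 1], dtype=float)  # unnormalized target
mode = int(np.argmax(p))                              # local mode as start

def coupled_mh_meeting_time(x, y, steps=1000):
    """Run two MH chains with a shared candidate and shared uniform;
    return the step at which they meet, or None."""
    for t in range(1, steps + 1):
        z = rng.integers(len(p))          # shared candidate state
        u = rng.random()                  # shared acceptance uniform
        x = z if u < p[z] / p[x] else x   # each chain's own MH test
        y = z if u < p[z] / p[y] else y
        if x == y:
            return t                      # chains have coalesced
    return None

# Starting both chains at/near the mode, rejections keep them together,
# so they typically meet within a step or two.
print(coupled_mh_meeting_time(mode, mode - 1))
```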
arXiv Detail & Related papers (2023-05-31T09:28:02Z)
- Guiding Energy-based Models via Contrastive Latent Variables [81.68492940158436]
An energy-based model (EBM) is a popular generative framework that offers both explicit density and architectural flexibility.
However, a large gap in generation quality often exists between EBMs and other generative frameworks such as GANs.
We propose a novel and effective framework for improving EBMs via contrastive representation learning.
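A minimal PyTorch sketch of the general recipe the summary gestures at: an energy network trained with a CD-style objective (short-run Langevin negatives) alongside an encoder trained with an InfoNCE-style contrastive loss on two views of the batch. The shared optimizer, the noise "augmentation", the temperature, and the plain sum of the two losses are all assumptions; the paper's coupling between the latent space and the EBM is more sophisticated.

```python
# Hedged sketch: one joint step combining an EBM objective with an
# InfoNCE-style contrastive term. Illustrative only; not the paper's
# actual architecture or losses.
import torch
import torch.nn.functional as F

energy = torch.nn.Sequential(torch.nn.Linear(16, 64), torch.nn.SiLU(),
                             torch.nn.Linear(64, 1))
encoder = torch.nn.Sequential(torch.nn.Linear(16, 32))
opt = torch.optim.Adam(list(energy.parameters()) + list(encoder.parameters()),
                       lr=1e-4)

def langevin_negatives(x, steps=20, step_size=0.01):
    """Crude short-run Langevin sampler for the EBM negative phase."""
    x = x.clone().detach().requires_grad_(True)
    for _ in range(steps):
        g, = torch.autograd.grad(energy(x).sum(), x)
        x = (x - 0.5 * step_size * g
             + step_size ** 0.5 * torch.randn_like(x)).detach().requires_grad_(True)
    return x.detach()

x = torch.randn(32, 16)                     # stand-in data batch
x_aug = x + 0.05 * torch.randn_like(x)      # cheap "augmented view" (assumed)
x_neg = langevin_negatives(torch.randn_like(x))

opt.zero_grad()
ebm_loss = energy(x).mean() - energy(x_neg).mean()   # CD-style objective
z1 = F.normalize(encoder(x), dim=1)
z2 = F.normalize(encoder(x_aug), dim=1)
logits = z1 @ z2.t() / 0.1                           # InfoNCE logits
nce_loss = F.cross_entropy(logits, torch.arange(len(x)))
(ebm_loss + nce_loss).backward()
opt.step()
```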
arXiv Detail & Related papers (2023-03-06T10:50:25Z)
- From Cloze to Comprehension: Retrofitting Pre-trained Masked Language Model to Pre-trained Machine Reader [130.45769668885487]
Pre-trained Machine Reader (PMR) is a novel method for retrofitting masked language models (MLMs) to pre-trained machine reading comprehension (MRC) models without acquiring labeled data.
To build the proposed PMR, we constructed a large volume of general-purpose and high-quality MRC-style training data.
PMR has the potential to serve as a unified model for tackling various extraction and classification tasks in the MRC formulation.
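The entry is light on mechanics, so below is a toy illustration of the MRC formulation itself: every task becomes a query plus a context, with answers given as spans of the context. The converter and the example are assumptions for illustration; PMR's actual training data is built from Wikipedia at far larger scale.

```python
# Hedged sketch of casting tasks in an MRC format: a query plus a
# context, with answers as character spans of the context. This toy
# converter is an illustrative assumption, not PMR's pipeline.
def to_mrc_example(query: str, context: str, answer: str):
    """Return an MRC-style example with a character-level answer span."""
    start = context.find(answer)
    if start == -1:
        return None                      # answer must appear in the context
    return {"query": query, "context": context,
            "span": (start, start + len(answer))}

# An extraction (NER-style) task posed as MRC:
print(to_mrc_example("Which span names a person?",
                     "Turing proved the halting problem undecidable.",
                     "Turing"))
```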
arXiv Detail & Related papers (2022-12-09T10:21:56Z)
- MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge [72.16021611888165]
This paper proposes a novel Memory-Economic Sparse Training (MEST) framework targeting accurate and fast execution on edge devices.
The proposed MEST framework consists of enhancements by Elastic Mutation (EM) and Soft Memory Bound (&S).
Our results suggest that unforgettable examples can be identified in-situ even during the dynamic exploration of sparsity masks.
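Since the summary only names the components, here is the generic dynamic-sparsity mutation step that mechanisms like Elastic Mutation build on: prune the smallest-magnitude surviving weights and regrow the same number elsewhere, so overall sparsity is preserved. Function names, the mutation fraction, and random regrowth are assumptions; MEST's actual EM and soft memory bound logic differ.

```python
# Hedged sketch of one dynamic-sparsity "mutation" step at fixed sparsity.
# Only the generic mechanism the entry alludes to; not MEST itself.
import numpy as np

rng = np.random.default_rng(0)

def mutate_mask(W, mask, frac=0.1):
    """Drop the `frac` smallest-magnitude active weights, then regrow the
    same number of inactive positions at random; sparsity is unchanged."""
    active = np.flatnonzero(mask)
    k = max(1, int(frac * active.size))
    drop = active[np.argsort(np.abs(W.ravel()[active]))[:k]]
    mask.ravel()[drop] = 0
    grow = rng.choice(np.flatnonzero(mask.ravel() == 0), size=k, replace=False)
    mask.ravel()[grow] = 1
    return mask

W = rng.standard_normal((8, 8))
mask = (rng.random((8, 8)) < 0.3).astype(int)   # ~30% dense to start
mask = mutate_mask(W, mask, frac=0.2)
print(mask.sum(), "active weights")             # count unchanged by mutation
```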
arXiv Detail & Related papers (2021-10-26T21:15:17Z)
- Boltzmann machines as two-dimensional tensor networks [7.041258064903578]
We show that RBMs and DBMs can be exactly represented as two-dimensional tensor networks.
This representation gives an understanding of the expressive power of RBMs and DBMs.
It also provides an efficient tensor-network contraction algorithm for computing the partition function of RBMs and DBMs.
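For scale, here is the quantity being contracted, checked by brute force on a tiny RBM: the hidden units of an RBM can be traced out in closed form, leaving a sum over visible configurations. The tensor-network algorithm in the paper makes this tractable far beyond enumeration and extends to DBMs; the sketch below only pins down what is computed.

```python
# Exact partition function of a tiny RBM: trace out the hidden layer
# analytically, then enumerate visible states. Brute force is only
# feasible at toy scale; the paper's tensor-network contraction is the
# scalable version of this computation.
import itertools
import numpy as np

rng = np.random.default_rng(0)
nv, nh = 6, 5
W = 0.2 * rng.standard_normal((nv, nh))
a = 0.1 * rng.standard_normal(nv)       # visible biases
b = 0.1 * rng.standard_normal(nh)       # hidden biases

def partition_function():
    """Z = sum_v exp(a.v) * prod_j (1 + exp(b_j + (v W)_j))."""
    Z = 0.0
    for v in itertools.product([0, 1], repeat=nv):
        v = np.array(v, dtype=float)
        Z += np.exp(a @ v) * np.prod(1.0 + np.exp(b + v @ W))
    return Z

print(partition_function())
```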
arXiv Detail & Related papers (2021-05-10T06:14:49Z)
- Fast Ensemble Learning Using Adversarially-Generated Restricted Boltzmann Machines [0.0]
The restricted Boltzmann machine (RBM) has received recent attention; it relies on an energy-based structure to model data probability distributions.
This work proposes to artificially generate RBMs using Adversarial Learning, where pre-trained weight matrices serve as the GAN inputs.
Experimental results demonstrate the suitability of the proposed approach under image reconstruction and image classification tasks.
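A minimal sketch of the stated setup, under loud assumptions: flattened pre-trained RBM weight matrices play the role of GAN training data (random stand-ins below), and a vanilla generator/discriminator pair learns to emit new weight matrices. Architectures, losses, and the downstream ensemble use are not from the paper.

```python
# Hedged sketch: a vanilla GAN over flattened RBM weight matrices, so the
# generator can emit "artificial RBMs". Random tensors stand in for the
# pre-trained weights the paper uses; everything here is illustrative.
import torch

nv, nh, z_dim = 8, 4, 16
G = torch.nn.Sequential(torch.nn.Linear(z_dim, 64), torch.nn.ReLU(),
                        torch.nn.Linear(64, nv * nh))
D = torch.nn.Sequential(torch.nn.Linear(nv * nh, 64), torch.nn.ReLU(),
                        torch.nn.Linear(64, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = torch.nn.BCEWithLogitsLoss()

real = 0.1 * torch.randn(256, nv * nh)   # stand-in for pre-trained RBM weights

for step in range(100):
    w_real = real[torch.randint(len(real), (32,))]
    w_fake = G(torch.randn(32, z_dim))
    d_loss = (bce(D(w_real), torch.ones(32, 1))
              + bce(D(w_fake.detach()), torch.zeros(32, 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    g_loss = bce(D(w_fake), torch.ones(32, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

new_rbm_W = G(torch.randn(1, z_dim)).view(nv, nh)   # one generated RBM
```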
arXiv Detail & Related papers (2021-01-04T16:00:47Z)
- Self-Progressing Robust Training [146.8337017922058]
Current robust training methods, such as adversarial training, explicitly use an "attack" to generate adversarial examples.
We propose a new framework called SPROUT, self-progressing robust training.
Our results shed new light on scalable, effective and attack-independent robust training methods.
arXiv Detail & Related papers (2020-12-22T00:45:24Z)
- Exact representations of many body interactions with RBM neural networks [77.34726150561087]
We exploit the representation power of RBMs to provide an exact decomposition of many-body contact interactions into one-body operators.
This construction generalizes the well-known Hirsch transform used for the Hubbard model to more complicated theories, such as Pionless EFT in nuclear physics.
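For context, the transform being generalized can be stated compactly. A standard form of Hirsch's discrete Hubbard-Stratonovich decoupling replaces the on-site two-body term with a sum over one auxiliary Ising spin coupled to one-body operators:

```latex
% Hirsch's discrete Hubbard-Stratonovich transform: the on-site
% interaction is decoupled by a single auxiliary Ising spin s = ±1.
\[
  e^{-\Delta\tau\, U\, n_\uparrow n_\downarrow}
  = \tfrac{1}{2}\, e^{-\Delta\tau U (n_\uparrow + n_\downarrow)/2}
    \sum_{s=\pm 1} e^{\lambda s (n_\uparrow - n_\downarrow)},
  \qquad \cosh\lambda = e^{\Delta\tau U/2}.
\]
```

In the RBM construction, hidden units play the role of such auxiliary spins, extending the decoupling from this single two-body term to many-body contact interactions.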
arXiv Detail & Related papers (2020-05-07T15:59:29Z)
- Mode-Assisted Unsupervised Learning of Restricted Boltzmann Machines [7.960229223744695]
We show that properly combining standard gradient updates with an off-gradient direction improves the training of RBMs dramatically over traditional gradient methods.
This approach, which we call mode training, promotes faster training and stability, in addition to a lower converged relative entropy (KL divergence).
The mode training we suggest is quite versatile, as it can be applied in conjunction with any given gradient method, and is easily extended to more general energy-based neural network structures.
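The converged relative entropy mentioned above can be computed exactly for small models, which is how such comparisons are typically reported. A brute-force NumPy sketch, where the model, the data, and all sizes are illustrative assumptions:

```python
# Hedged sketch of the evaluation the entry refers to: for a small RBM,
# KL(data || model) is computable exactly by enumerating visible states
# and tracing out the hidden layer analytically.
import itertools
import numpy as np

rng = np.random.default_rng(0)
nv, nh = 5, 3
W = 0.2 * rng.standard_normal((nv, nh))
a, b = np.zeros(nv), np.zeros(nh)

def rbm_marginal(v):
    """Unnormalized p(v) with hidden units traced out analytically."""
    return np.exp(a @ v) * np.prod(1.0 + np.exp(b + v @ W))

states = [np.array(s, dtype=float) for s in itertools.product([0, 1], repeat=nv)]
p_model = np.array([rbm_marginal(v) for v in states])
p_model /= p_model.sum()

data = rng.integers(0, 2, size=(100, nv))
idx = data @ (2 ** np.arange(nv)[::-1])           # match enumeration order
p_data = np.bincount(idx, minlength=2 ** nv) / len(data)

mask = p_data > 0                                 # KL sums over data support
kl = np.sum(p_data[mask] * np.log(p_data[mask] / p_model[mask]))
print("KL(data || model) =", kl)
```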
arXiv Detail & Related papers (2020-01-15T21:12:44Z)