Mean-Field Assisted Deep Boltzmann Learning with Probabilistic Computers
- URL: http://arxiv.org/abs/2401.01996v1
- Date: Wed, 3 Jan 2024 22:19:57 GMT
- Authors: Shuvro Chowdhury, Shaila Niazi, Kerem Y. Camsari
- Abstract summary: We show that deep and unrestricted Boltzmann Machines can be trained using p-computers generating hundreds of billions of Markov Chain Monte Carlo samples per second.
A custom Field-Programmable Gate Array (FPGA) emulation of the p-computer architecture performs up to 45 billion flips per second.
Our algorithm can be used in other scalable Ising machines and its variants can be used to train BMs, previously thought to be intractable.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite their physics-inspired, energy-based, and generative appeal,
general Boltzmann Machines (BMs) are considered intractable to train. This
belief led to simplified models of BMs with restricted intralayer connections
or layer-by-layer training of deep BMs. Recent developments in domain-specific
hardware -- specifically probabilistic computers (p-computer) with
probabilistic bits (p-bit) -- may change established wisdom on the tractability
of deep BMs. In this paper, we show that deep and unrestricted BMs can be
trained using p-computers generating hundreds of billions of Markov Chain Monte
Carlo (MCMC) samples per second, on sparse networks developed originally for
use in D-Wave's annealers. To maximize the efficiency of learning with the
p-computer, we introduce two families of Mean-Field-Theory-assisted learning
algorithms, or xMFTs (x = Naive and Hierarchical). The xMFTs are used to
estimate the averages and correlations during the positive phase of the
contrastive divergence (CD) algorithm and our custom-designed p-computer is
used to estimate the averages and correlations in the negative phase. A custom
Field-Programmable Gate Array (FPGA) emulation of the p-computer architecture
performs up to 45 billion flips per second, allowing the implementation of CD-$n$
where $n$ can be of the order of millions, unlike RBMs where $n$ is typically 1
or 2. Experiments on the full MNIST dataset with the combined algorithm show
that the positive phase can be efficiently computed by xMFTs without much
degradation when the negative phase is computed by the p-computer. Our
algorithm can be used in other scalable Ising machines and its variants can be
used to train BMs, previously thought to be intractable.
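The hybrid scheme described in the abstract can be sketched as follows. This is a minimal, illustrative NumPy sketch, not the paper's implementation: the positive-phase statistics come from a naive mean-field (NMF) fixed point with the data units clamped, while the negative-phase statistics come from plain Gibbs sampling, standing in for the p-computer's massively parallel MCMC flips. All function and variable names are assumptions for illustration; the paper's hierarchical MFT variant is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def nmf_magnetizations(W, b, clamp_idx, clamp_val, iters=50, damping=0.5):
    """Positive phase: naive mean-field fixed point m_i = tanh(b_i + sum_j W_ij m_j),
    with the clamped (data) units held at their observed +/-1 values."""
    m = np.zeros(len(b))
    m[clamp_idx] = clamp_val
    for _ in range(iters):
        new = np.tanh(b + W @ m)
        new[clamp_idx] = clamp_val          # data stays clamped
        m = damping * m + (1 - damping) * new
    return m

def gibbs_statistics(W, b, n_sweeps=1000, burn_in=100):
    """Negative phase: free-running Gibbs sampling over +/-1 spins
    (a software stand-in for the p-computer's MCMC flips)."""
    n = len(b)
    s = rng.choice([-1.0, 1.0], size=n)
    mean_s, mean_ss = np.zeros(n), np.zeros((n, n))
    for t in range(n_sweeps):
        for i in range(n):                   # sequential single-spin updates
            p_up = 1.0 / (1.0 + np.exp(-2.0 * (b[i] + W[i] @ s)))
            s[i] = 1.0 if rng.random() < p_up else -1.0
        if t >= burn_in:
            mean_s += s
            mean_ss += np.outer(s, s)
    k = n_sweeps - burn_in
    return mean_s / k, mean_ss / k

def cd_step(W, b, data, visible_idx, lr=0.01):
    """One CD update: <s_i s_j>_data via NMF minus <s_i s_j>_model via sampling."""
    pos_ss, pos_s = np.zeros_like(W), np.zeros_like(b)
    for v in data:                           # average over the minibatch
        m = nmf_magnetizations(W, b, visible_idx, v)
        pos_ss += np.outer(m, m)
        pos_s += m
    pos_ss /= len(data)
    pos_s /= len(data)
    neg_s, neg_ss = gibbs_statistics(W, b)
    W += lr * (pos_ss - neg_ss)
    np.fill_diagonal(W, 0.0)                 # no self-coupling in a BM
    b += lr * (pos_s - neg_s)
    return W, b
```

The division of labor mirrors the paper's observation: the clamped positive phase is cheap enough for a mean-field approximation, while the free-running negative phase, where mixing is the bottleneck, is delegated to fast sampling hardware.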
Related papers
- Sample-efficient Learning of Infinite-horizon Average-reward MDPs with General Function Approximation [53.17668583030862]
We study infinite-horizon average-reward Markov decision processes (AMDPs) in the context of general function approximation.
We propose a novel algorithmic framework named Local-fitted Optimization with OPtimism (LOOP)
We show that LOOP achieves a sublinear $\tilde{\mathcal{O}}(\mathrm{poly}(d, \mathrm{sp}(V^*))\sqrt{T\beta})$ regret, where $d$ and $\beta$ correspond to the AGEC and the log-covering number of the hypothesis class, respectively.
arXiv Detail & Related papers (2024-04-19T06:24:22Z) - A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical
Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs)
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z) - CMOS + stochastic nanomagnets: heterogeneous computers for probabilistic
inference and learning [0.16365624921211983]
Extending Moore's law by augmenting complementary metal-oxide-semiconductor (CMOS) transistors with emerging nanotechnologies (X) has become increasingly important.
One important class of problems involve sampling-based Monte Carlo algorithms used in probabilistic machine learning, optimization, and quantum simulation.
Here, we combine stochastic magnetic tunnel junction (sMTJ)-based probabilistic bits (p-bits) with Field Programmable Gate Arrays (FPGA) to create an energy-efficient CMOS + X prototype.
arXiv Detail & Related papers (2023-04-12T16:18:12Z) - Non-Generative Energy Based Models [3.1447898427012473]
Energy-based models (EBM) have become increasingly popular within computer vision.
We propose a non-generative training approach, Non-Generative EBM (NG-EBM)
We show that our NG-EBM training strategy retains many of the benefits of EBM in calibration, out-of-distribution detection, and adversarial resistance.
arXiv Detail & Related papers (2023-04-03T18:47:37Z) - Training Deep Boltzmann Networks with Sparse Ising Machines [5.048818298702389]
We show a new application domain for probabilistic bit (p-bit) based Ising machines by training deep generative AI models with them.
Using sparse, asynchronous, and massively parallel Ising machines we train deep Boltzmann networks in a hybrid probabilistic-classical computing setup.
arXiv Detail & Related papers (2023-03-19T18:10:15Z) - Memristive Stochastic Computing for Deep Learning Parameter Optimization [1.6344851071810071]
Stochastic Computing (SC) is a computing paradigm that allows low-cost and low-power computation of various arithmetic operations using bit streams and digital logic.
We demonstrate that, using a 40-nm Complementary Metal Oxide Semiconductor (CMOS) process, our scalable architecture occupies 1.55 mm$^2$ and consumes approximately 167 $\mu$W when optimizing the parameters of a Convolutional Neural Network (CNN) being trained for a character recognition task, with no notable reduction in accuracy post-training.
arXiv Detail & Related papers (2021-03-11T07:10:32Z) - Efficient semidefinite-programming-based inference for binary and
multi-class MRFs [83.09715052229782]
We propose an efficient method for computing the partition function or MAP estimate in a pairwise MRF.
We extend semidefinite relaxations from the typical binary MRF to the full multi-class setting, and develop a compact semidefinite relaxation that can again be solved efficiently using the solver.
arXiv Detail & Related papers (2020-12-04T15:36:29Z) - Preparation of excited states for nuclear dynamics on a quantum computer [117.44028458220427]
We study two different methods to prepare excited states on a quantum computer.
We benchmark these techniques on emulated and real quantum devices.
These findings show that quantum techniques designed to achieve good scaling on fault tolerant devices might also provide practical benefits on devices with limited connectivity and gate fidelity.
arXiv Detail & Related papers (2020-09-28T17:21:25Z) - Predictive Coding Approximates Backprop along Arbitrary Computation
Graphs [68.8204255655161]
We develop a strategy to translate core machine learning architectures into their predictive coding equivalents.
Our models perform equivalently to backprop on challenging machine learning benchmarks.
Our method raises the potential that standard machine learning algorithms could in principle be directly implemented in neural circuitry.
arXiv Detail & Related papers (2020-06-07T15:35:47Z) - Einsum Networks: Fast and Scalable Learning of Tractable Probabilistic
Circuits [99.59941892183454]
We propose Einsum Networks (EiNets), a novel implementation design for PCs.
At their core, EiNets combine a large number of arithmetic operations in a single monolithic einsum-operation.
We show that the implementation of Expectation-Maximization (EM) can be simplified for PCs, by leveraging automatic differentiation.
arXiv Detail & Related papers (2020-04-13T23:09:15Z) - A High-Performance Implementation of Bayesian Matrix Factorization with
Limited Communication [10.639704288188767]
Matrix factorization algorithms can quantify uncertainty in their predictions and avoid over-fitting.
They have not been widely used on large-scale data because of their prohibitive computational cost.
We show that the state-of-the-art of both approaches to scalability can be combined.
arXiv Detail & Related papers (2020-04-06T11:16:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.