Unlocking the Power of Boltzmann Machines by Parallelizable Sampler and Efficient Temperature Estimation
- URL: http://arxiv.org/abs/2512.02323v1
- Date: Tue, 02 Dec 2025 01:40:50 GMT
- Title: Unlocking the Power of Boltzmann Machines by Parallelizable Sampler and Efficient Temperature Estimation
- Authors: Kentaro Kubo, Hayato Goto
- Abstract summary: We propose a new Boltzmann sampler, Langevin SB (LSB), inspired by a quantum-inspired optimization algorithm called simulated bifurcation (SB). However, LSB cannot control the inverse temperature of the output Boltzmann distribution, which hinders learning and degrades performance. We therefore also develop an efficient method for estimating the inverse temperature during the learning process, which we call conditional expectation matching (CEM).
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Boltzmann machines (BMs) are powerful energy-based generative models, but their heavy training cost has largely confined practical use to Restricted BMs (RBMs) trained with an efficient learning method called contrastive divergence. More accurate learning typically requires Markov chain Monte Carlo (MCMC) Boltzmann sampling, but it is time-consuming due to the difficulty of parallelization for more expressive models. To address this limitation, we first propose a new Boltzmann sampler inspired by a quantum-inspired combinatorial optimization algorithm called simulated bifurcation (SB). This SB-inspired approach, which we name Langevin SB (LSB), enables parallelized sampling while maintaining accuracy comparable to MCMC. Furthermore, LSB is applicable not only to RBMs but also to BMs with general couplings. However, LSB cannot control the inverse temperature of the output Boltzmann distribution, which hinders learning and degrades performance. To overcome this limitation, we also develop an efficient method for estimating the inverse temperature during the learning process, which we call conditional expectation matching (CEM). By combining LSB and CEM, we establish an efficient learning framework for BMs with greater expressive power than RBMs. We refer to this framework as sampler-adaptive learning (SAL). SAL opens new avenues for energy-based generative modeling beyond RBMs.
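The abstract names LSB and CEM but does not spell out their update rules, so the following is only a rough sketch of the two generic ingredients it describes: parallel Langevin dynamics on a continuous ("soft-spin") relaxation of an Ising/BM energy, and a fit of an effective inverse temperature by matching the exact single-spin conditional expectation E[s_i | s_-i] = tanh(beta * f_i) against the samples. The soft-spin double-well confinement, the grid-search fit, and all function names are assumptions of this sketch, not the paper's method.

```python
# Illustrative sketch only -- NOT the paper's LSB or CEM algorithms.
import numpy as np

rng = np.random.default_rng(0)

def langevin_soft_spin_samples(J, h, n_chains=512, n_steps=3000, dt=0.02, beta=1.0):
    """Overdamped Langevin dynamics on relaxed spins x in R^N.
    A double-well term keeps x near +-1 (an assumption of this sketch);
    all chains update in parallel via one matrix product per step."""
    N = h.shape[0]
    x = rng.normal(scale=0.1, size=(n_chains, N))
    for _ in range(n_steps):
        # gradient of E(x) = -x.T J x / 2 - h.x + sum((x_i^2 - 1)^2) / 4
        grad = -(x @ J + h) + x * (x**2 - 1.0)
        x += -dt * beta * grad + np.sqrt(2.0 * dt) * rng.normal(size=x.shape)
    return np.sign(x)  # discretize the relaxed spins

def estimate_beta(samples, J, h):
    """Grid-search an effective beta by matching samples to tanh(beta * field),
    the exact single-spin conditional mean of an Ising Boltzmann distribution."""
    fields = samples @ J + h  # local fields f_i = sum_j J_ij s_j + h_i
    betas = np.linspace(0.05, 3.0, 120)
    mse = [np.mean((samples - np.tanh(b * fields)) ** 2) for b in betas]
    return betas[int(np.argmin(mse))]

N = 20
J = rng.normal(scale=1.0 / np.sqrt(N), size=(N, N))
J = (J + J.T) / 2.0
np.fill_diagonal(J, 0.0)
h = rng.normal(scale=0.1, size=N)

s = langevin_soft_spin_samples(J, h)
print("effective inverse temperature ~", estimate_beta(s, J, h))
```

Note that the dynamics give no direct handle on the temperature of sign(x); that is exactly the gap the paper's CEM is meant to close, and the tanh-matching fit above is one plausible way to recover an effective beta.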
Related papers
- Self-Rewarding Sequential Monte Carlo for Masked Diffusion Language Models [58.946955321428845]
This work presents self-rewarding sequential Monte Carlo (SMC). Our algorithm stems from the observation that most existing MDLMs rely on a confidence-based sampling strategy. We introduce the trajectory-level confidence as a self-rewarding signal for assigning particle importance weights.
arXiv Detail & Related papers (2026-02-02T09:21:45Z) - FALCON: Few-step Accurate Likelihoods for Continuous Flows [78.37361800856583]
We propose Few-step Accurate Likelihoods for Continuous Flows (FALCON), which allows for few-step sampling with a likelihood accurate enough for importance sampling applications. We show FALCON outperforms state-of-the-art normalizing flow models for molecular Boltzmann sampling and is two orders of magnitude faster than the equivalently performing CNF model.
arXiv Detail & Related papers (2025-12-10T18:47:25Z) - Diabatic quantum annealing for training energy-based generative models [0.19116784879310023]
Energy-based generative models, such as restricted Boltzmann machines (RBMs), require unbiased Boltzmann samples for effective training. We address this bottleneck by applying the analytic relation between annealing schedules and effective inverse temperature. By implementing this prescription on a quantum annealer, we obtain temperature-controlled Boltzmann samples that enable RBM training with faster convergence and lower validation error.
arXiv Detail & Related papers (2025-09-11T11:47:33Z) - BoltzNCE: Learning Likelihoods for Boltzmann Generation with Stochastic Interpolants and Noise Contrastive Estimation [1.2874523233023452]
Efficient sampling from the Boltzmann distribution is a key challenge for modeling complex physical systems such as molecules. We train an energy-based model (EBM) to approximate likelihoods using both noise contrastive estimation (NCE) and score matching. Our approach also exhibits effective transfer learning, generalizing to new systems at inference time and achieving at least a $6\times$ speedup over standard MD.
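For context on the NCE objective named in the blurb above, here is a minimal, generic sketch of binary noise contrastive estimation for an unnormalized model (a textbook form, not BoltzNCE itself; the toy Gaussian example and all names are illustrative).

```python
# Generic binary NCE sketch (textbook form, not BoltzNCE itself).
# The model is an unnormalized log-density; a learnable constant log_c
# absorbs the normalizer. Data vs. noise are classified via the log-ratio.
import numpy as np

def softplus(z):
    return np.logaddexp(0.0, z)  # log(1 + e^z), numerically stable

def nce_loss(log_model_unnorm, log_noise_pdf, x_data, x_noise):
    """Binary NCE with one noise sample per data point (nu = 1):
    logistic loss on the log density ratio; data labeled 1, noise 0."""
    logit_data = log_model_unnorm(x_data) - log_noise_pdf(x_data)
    logit_noise = log_model_unnorm(x_noise) - log_noise_pdf(x_noise)
    return softplus(-logit_data).mean() + softplus(logit_noise).mean()

# Toy usage: a 1-D Gaussian "EBM" with learnable (mu, log_c) vs. N(0, 2^2) noise.
rng = np.random.default_rng(1)
x_data = rng.normal(loc=1.0, scale=1.0, size=5000)
x_noise = rng.normal(loc=0.0, scale=2.0, size=5000)
log_noise = lambda x: -0.5 * (x / 2.0) ** 2 - np.log(2.0 * np.sqrt(2.0 * np.pi))
mu, log_c = 0.0, 0.0
for _ in range(1000):  # crude finite-difference descent (illustration only)
    eps = 1e-4
    def loss(m, c):
        return nce_loss(lambda x: -0.5 * (x - m) ** 2 + c, log_noise,
                        x_data, x_noise)
    g_mu = (loss(mu + eps, log_c) - loss(mu - eps, log_c)) / (2 * eps)
    g_c = (loss(mu, log_c + eps) - loss(mu, log_c - eps)) / (2 * eps)
    mu, log_c = mu - 0.1 * g_mu, log_c - 0.1 * g_c
# expect mu near 1 and log_c near -log(sqrt(2*pi)), i.e. the true normalizer
print("recovered mean ~", mu, " log-normalizer ~", log_c)
```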
arXiv Detail & Related papers (2025-07-01T15:18:28Z) - Light and Optimal Schrödinger Bridge Matching [67.93806073192938]
We propose a novel procedure to learn Schrödinger Bridges (SB) which we call the optimal Schrödinger bridge matching.
We show that the optimal bridge matching objective coincides with the recently discovered energy-based modeling (EBM) objectives to learn EOT/SB.
We develop a light solver (which we call LightSB-M) to implement optimal matching in practice using the mixture parameterization of the Schrödinger potential.
arXiv Detail & Related papers (2024-02-05T17:17:57Z) - Learning Energy-based Model via Dual-MCMC Teaching [5.31573596283377]
This paper studies the fundamental learning problem of the energy-based model (EBM).
Learning an EBM can be achieved using maximum likelihood estimation (MLE).
arXiv Detail & Related papers (2023-12-05T03:39:54Z) - Learning Energy-Based Prior Model with Diffusion-Amortized MCMC [89.95629196907082]
The common practice of learning latent space EBMs with non-convergent short-run MCMC for prior and posterior sampling hinders the model's further progress.
We introduce a simple but effective diffusion-based amortization method for long-run MCMC sampling and develop a novel learning algorithm for the latent space EBM based on it.
arXiv Detail & Related papers (2023-10-05T00:23:34Z) - Provable Guarantees for Generative Behavior Cloning: Bridging Low-Level Stability and High-Level Behavior [51.60683890503293]
We propose a theoretical framework for studying behavior cloning of complex expert demonstrations using generative modeling.
We show that pure supervised cloning can generate trajectories matching the per-time-step distribution of arbitrary expert trajectories.
arXiv Detail & Related papers (2023-07-27T04:27:26Z) - Guiding Energy-based Models via Contrastive Latent Variables [81.68492940158436]
An energy-based model (EBM) is a popular generative framework that offers both explicit density and architectural flexibility.
There often exists a large gap between EBMs and other generative frameworks like GANs in terms of generation quality.
We propose a novel and effective framework for improving EBMs via contrastive representation learning.
arXiv Detail & Related papers (2023-03-06T10:50:25Z) - Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z) - No MCMC for me: Amortized sampling for fast and stable training of energy-based models [62.1234885852552]
Energy-Based Models (EBMs) present a flexible and appealing way to represent uncertainty.
We present a simple method for training EBMs at scale using an entropy-regularized generator to amortize the MCMC sampling.
Next, we apply our estimator to the recently proposed Joint Energy Model (JEM), where we match the original performance with faster and more stable training.
arXiv Detail & Related papers (2020-10-08T19:17:20Z) - Mode-Assisted Unsupervised Learning of Restricted Boltzmann Machines [7.960229223744695]
We show that properly combining standard gradient updates with an off-gradient direction improves their training dramatically over traditional gradient methods.
This approach, which we call mode training, promotes faster training and stability, in addition to lower converged relative entropy (KL divergence).
The mode training we suggest is quite versatile, as it can be applied in conjunction with any given gradient method, and is easily extended to more general energy-based neural network structures.
arXiv Detail & Related papers (2020-01-15T21:12:44Z) - Elastic Bulk Synchronous Parallel Model for Distributed Deep Learning [17.798727574458514]
The proposed model offers more flexibility and adaptability during the training phase, without sacrificing the accuracy of the trained model.
A thorough experimental evaluation demonstrates that our proposed ElasticBSP model converges faster and to a higher accuracy than the classic BSP.
arXiv Detail & Related papers (2020-01-06T01:05:50Z)