Performance Evaluation of Ising and QUBO Variable Encodings in Boltzmann Machine Learning
- URL: http://arxiv.org/abs/2510.13210v1
- Date: Wed, 15 Oct 2025 06:57:23 GMT
- Title: Performance Evaluation of Ising and QUBO Variable Encodings in Boltzmann Machine Learning
- Authors: Yasushi Hasegawa, Masayuki Ohzeki
- Abstract summary: QUBO induces larger cross terms between first- and second-order statistics, creating more small-eigenvalue directions in the Fisher information matrix. The Ising encoding provides more isotropic curvature and faster convergence. These results clarify how representation shapes information geometry and finite-time learning dynamics in Boltzmann machines.
- Score: 0.7734726150561088
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We compare Ising ({-1,+1}) and QUBO ({0,1}) encodings for Boltzmann machine learning under a controlled protocol that fixes the model, sampler, and step size. Exploiting the identity that the Fisher information matrix (FIM) equals the covariance of sufficient statistics, we visualize empirical moments from model samples and reveal systematic, representation-dependent differences. QUBO induces larger cross terms between first- and second-order statistics, creating more small-eigenvalue directions in the FIM and lowering spectral entropy. This ill-conditioning explains slower convergence under stochastic gradient descent (SGD). In contrast, natural gradient descent (NGD)-which rescales updates by the FIM metric-achieves similar convergence across encodings due to reparameterization invariance. Practically, for SGD-based training, the Ising encoding provides more isotropic curvature and faster convergence; for QUBO, centering/scaling or NGD-style preconditioning mitigates curvature pathologies. These results clarify how representation shapes information geometry and finite-time learning dynamics in Boltzmann machines and yield actionable guidelines for variable encoding and preprocessing.
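The abstract's key diagnostic is the exponential-family identity $F(\theta) = \mathrm{Cov}_{p_\theta}[s(x)]$: the Fisher information matrix equals the covariance of the sufficient statistics (the first-order terms $x_i$ and second-order terms $x_i x_j$). The sketch below is a minimal illustration of that identity, not the authors' code: it enumerates a small fully visible Boltzmann machine exactly under both encodings, reusing the same raw parameters for both (the size n, seed, and coupling scale are illustrative assumptions, and the paper's controlled protocol may differ), and reports the FIM's minimum eigenvalue, condition number, and spectral entropy.

```python
# Minimal sketch, not the authors' code: exact FIM = Cov[s(x)] for a small
# fully visible Boltzmann machine under Ising {-1,+1} vs QUBO {0,1} variables.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 6
h = rng.normal(scale=0.5, size=n)                                    # biases theta_i
J = np.triu(rng.normal(scale=0.5 / np.sqrt(n), size=(n, n)), k=1)    # couplings theta_ij, i < j

def suff_stats(x):
    """Sufficient statistics s(x): first-order x_i and second-order x_i x_j (i < j)."""
    pairs = [x[i] * x[j] for i in range(n) for j in range(i + 1, n)]
    return np.concatenate([x, pairs])

def fim_eigenvalues(values):
    """Exact FIM = probability-weighted covariance of s(x) over all 2^n states."""
    states = np.array(list(itertools.product(values, repeat=n)))
    logp = states @ h + np.einsum("ki,ij,kj->k", states, J, states)  # theta . s(x)
    p = np.exp(logp - logp.max())
    p /= p.sum()
    S = np.array([suff_stats(x) for x in states])                    # shape (2^n, d)
    centered = S - p @ S
    fim = centered.T @ (centered * p[:, None])                       # Cov_p[s(x)]
    return np.linalg.eigvalsh(fim)

def spectral_entropy(eigs):
    """Shannon entropy of the normalized FIM eigenvalue distribution."""
    w = np.clip(eigs, 1e-12, None)
    w /= w.sum()
    return float(-(w * np.log(w)).sum())

for name, values in [("Ising {-1,+1}", (-1.0, 1.0)), ("QUBO {0,1}", (0.0, 1.0))]:
    eigs = fim_eigenvalues(values)
    print(f"{name}: min eig {eigs.min():.2e}, "
          f"cond {eigs.max() / max(eigs.min(), 1e-12):.2e}, "
          f"spectral entropy {spectral_entropy(eigs):.3f}")
```

A quick sanity check of the mechanism: for {0,1} variables, x_i * (x_i x_j) = x_i x_j, so the first- and second-order statistics are strongly correlated, while for {-1,+1} variables, x_i * (x_i x_j) = x_j and the cross-covariance vanishes at zero magnetization. This is consistent with the abstract's claim that QUBO produces larger cross terms and hence more small-eigenvalue FIM directions.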
Related papers
- Zero-Variance Gradients for Variational Autoencoders [32.818968022327866]
Training deep generative models like Variational Autoencoders (VAEs) is often hindered by the need to backpropagate gradients through the sampling of their latent variables. In this paper, we propose a new perspective that sidesteps this problem, which we call Silent Gradients. Instead of improving estimators, we leverage specific decoder architectures to compute the expected ELBO analytically, yielding a gradient with zero variance.
arXiv Detail & Related papers (2025-08-05T15:54:21Z)
- Inference-Time Scaling of Diffusion Language Models with Particle Gibbs Sampling [70.8832906871441]
We study how to steer generation toward desired rewards without retraining the models. Prior methods typically resample or filter within a single denoising trajectory, optimizing rewards step-by-step without trajectory-level refinement. We introduce particle Gibbs sampling for diffusion language models (PG-DLM), a novel inference-time algorithm enabling trajectory-level refinement while preserving generation perplexity.
arXiv Detail & Related papers (2025-07-11T08:00:47Z)
- Learning with Restricted Boltzmann Machines: Asymptotics of AMP and GD in High Dimensions [31.75902683077129]
The Restricted Boltzmann Machine (RBM) is one of the simplest generative neural networks capable of learning input distributions. We simplify the standard RBM training objective into a form that is equivalent to the multi-index model with non-separable regularization. We show in particular that RBM reaches the optimal computational weak recovery threshold, aligning with the BBP transition.
arXiv Detail & Related papers (2025-05-23T15:51:46Z)
- Feynman-Kac Correctors in Diffusion: Annealing, Guidance, and Product of Experts [64.34482582690927]
We provide an efficient and principled method for sampling from a sequence of annealed, geometric-averaged, or product distributions derived from pretrained score-based models. We propose Sequential Monte Carlo (SMC) resampling algorithms that leverage inference-time scaling to improve sampling quality; a minimal sketch of the resampling step such methods share appears after this list.
arXiv Detail & Related papers (2025-03-04T17:46:51Z)
- Rao-Blackwell Gradient Estimators for Equivariant Denoising Diffusion [55.95767828747407]
In domains such as molecular and protein generation, physical systems exhibit inherent symmetries that are critical to model. We present a framework that reduces training variance and provides a provably lower-variance gradient estimator. We also present a practical implementation of this estimator incorporating the loss and sampling procedure through a method we call Orbit Diffusion.
arXiv Detail & Related papers (2025-02-14T03:26:57Z)
- Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC performs both parameter estimation and particle proposal adaptation efficiently and entirely on the fly.
arXiv Detail & Related papers (2023-12-19T21:45:38Z)
- Assessing Neural Network Representations During Training Using Noise-Resilient Diffusion Spectral Entropy [55.014926694758195]
Entropy and mutual information in neural networks provide rich information on the learning process.
We leverage data geometry to access the underlying manifold and reliably compute these information-theoretic measures.
We show that they form noise-resistant measures of intrinsic dimensionality and relationship strength in high-dimensional simulated data.
arXiv Detail & Related papers (2023-12-04T01:32:42Z)
- NAG-GS: Semi-Implicit, Accelerated and Robust Stochastic Optimizer [45.47667026025716]
We propose a novel, robust and accelerated iteration that relies on two key elements.
The convergence and stability of the obtained method, referred to as NAG-GS, are first studied extensively.
We show that NAG-GS is competitive with state-of-the-art methods such as momentum SGD with weight decay and AdamW for the training of machine learning models.
arXiv Detail & Related papers (2022-09-29T16:54:53Z)
- Emulating Spatio-Temporal Realizations of Three-Dimensional Isotropic Turbulence via Deep Sequence Learning Models [24.025975236316842]
We use a data-driven approach to model a three-dimensional turbulent flow using cutting-edge Deep Learning techniques.
The accuracy of the model is assessed using statistical and physics-based metrics.
arXiv Detail & Related papers (2021-12-07T03:33:39Z)
- A Fast Parallel Tensor Decomposition with Optimal Stochastic Gradient Descent: an Application in Structural Damage Identification [1.536989504296526]
We propose a novel algorithm, FP-CPD, to parallelize the CANDECOMP/PARAFAC (CP) decomposition of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times \dots \times I_N}$.
arXiv Detail & Related papers (2021-11-04T05:17:07Z)
- Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM), where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z)
- An adaptive Hessian approximated stochastic gradient MCMC method [12.93317525451798]
We present an adaptive Hessian-approximated stochastic gradient MCMC method to incorporate local geometric information while sampling from the posterior.
We adopt a magnitude-based weight pruning method to enforce the sparsity of the network.
arXiv Detail & Related papers (2020-10-03T16:22:15Z)
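Several of the entries above (the particle Gibbs, Feynman-Kac correctors, and online VSMC papers) rest on the Sequential Monte Carlo resampling step. As a generic point of reference, and explicitly not code from any listed paper, here is a minimal sketch of systematic resampling, a standard low-variance choice for that step.

```python
# Generic sketch of systematic resampling, a standard low-variance SMC step;
# illustrative background, not code from any of the papers listed above.
import numpy as np

def systematic_resample(weights, rng):
    """Return N ancestor indices via systematic resampling.

    One shared uniform offset is spread across N equal strata, which keeps
    the resampled particle set unbiased while cutting variance relative to
    multinomial resampling.
    """
    n = len(weights)
    positions = (rng.random() + np.arange(n)) / n   # one draw, n strata
    cumulative = np.cumsum(weights)
    cumulative[-1] = 1.0                            # guard against rounding
    return np.searchsorted(cumulative, positions)

rng = np.random.default_rng(1)
w = rng.random(8)
w /= w.sum()
print(systematic_resample(w, rng))                  # eight non-decreasing ancestor indices
```

After resampling, the weights are reset to 1/N and particle diversity is restored with a Markov kernel; that rejuvenation step is where the particle Gibbs and Feynman-Kac machinery in the papers above enters.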
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.