Efficient quantum machine learning with inverse-probability algebraic corrections
- URL: http://arxiv.org/abs/2601.16665v1
- Date: Fri, 23 Jan 2026 11:28:53 GMT
- Title: Efficient quantum machine learning with inverse-probability algebraic corrections
- Authors: Jaemin Seo
- Abstract summary: Quantum neural networks (QNNs) provide expressive probabilistic models by leveraging quantum superposition and entanglement. Existing training approaches largely rely on gradient-based procedural optimization.
- Score: 2.7412662946127764
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Quantum neural networks (QNNs) provide expressive probabilistic models by leveraging quantum superposition and entanglement, yet their practical training remains challenging due to highly oscillatory loss landscapes and noise inherent to near-term quantum devices. Existing training approaches largely rely on gradient-based procedural optimization, which often suffers from slow convergence, sensitivity to hyperparameters, and instability near sharp minima. In this work, we propose an alternative inverse-probability algebraic learning framework for QNNs. Instead of updating parameters through incremental gradient descent, our method treats learning as a local inverse problem in probability space, directly mapping discrepancies between predicted and target Born-rule probabilities to parameter corrections via a pseudo-inverse of the Jacobian. This algebraic update is covariant, does not require learning-rate tuning, and enables rapid movement toward the vicinity of a loss minimum in a single step. We systematically compare the proposed method with gradient descent and Adam optimization in both regression and classification tasks using a teacher-student QNN benchmark. Our results show that algebraic learning converges significantly faster, escapes loss plateaus, and achieves lower final errors. Under finite-shot sampling, the method exhibits near-optimal error scaling, while remaining robust against intrinsic hardware noise such as dephasing. These findings suggest that inverse-probability algebraic learning offers a principled and practical alternative to procedural optimization for QNN training, particularly in resource-constrained near-term quantum devices.
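The core update described in the abstract admits a compact illustration: the residual between the target and predicted Born-rule probabilities is mapped to a parameter correction through the Moore-Penrose pseudo-inverse of the Jacobian dp/dtheta, i.e. delta_theta = J^+ (p_target - p(theta)). The sketch below is not the authors' implementation: the QNN is stood in for by a toy two-qubit product-state model, the Jacobian is taken by finite differences rather than an analytic or parameter-shift rule, and all function and variable names are illustrative.

```python
# Minimal sketch of an inverse-probability algebraic update (not the paper's code).
# Assumptions: the "QNN" is a toy two-qubit product state whose Born-rule
# probabilities depend on two rotation angles; the Jacobian is finite-difference.
import numpy as np

def born_probs(theta):
    """Toy stand-in for a QNN output distribution: two independent RY rotations
    measured in the computational basis -> 4 outcome probabilities."""
    p0 = np.array([np.cos(theta[0] / 2) ** 2, np.sin(theta[0] / 2) ** 2])
    p1 = np.array([np.cos(theta[1] / 2) ** 2, np.sin(theta[1] / 2) ** 2])
    return np.kron(p0, p1)  # shape (4,)

def jacobian(theta, eps=1e-6):
    """Finite-difference Jacobian d p / d theta (4 outcomes x 2 parameters)."""
    J = np.zeros((4, len(theta)))
    for k in range(len(theta)):
        dt = np.zeros_like(theta)
        dt[k] = eps
        J[:, k] = (born_probs(theta + dt) - born_probs(theta - dt)) / (2 * eps)
    return J

def algebraic_step(theta, p_target):
    """One inverse-probability update: map the probability residual to a
    parameter correction through the pseudo-inverse of the Jacobian."""
    residual = p_target - born_probs(theta)
    return theta + np.linalg.pinv(jacobian(theta)) @ residual

rng = np.random.default_rng(0)
theta_teacher = rng.uniform(0, np.pi, size=2)   # hidden "teacher" parameters
p_target = born_probs(theta_teacher)            # target Born-rule distribution

theta = rng.uniform(0, np.pi, size=2)           # random "student" initialization
for step in range(5):
    theta = algebraic_step(theta, p_target)
    err = np.abs(born_probs(theta) - p_target).sum()
    print(f"step {step}: probability residual (L1) = {err:.3e}")
```

In this toy setting the update coincides with a Gauss-Newton step on the probability residual, which is why no learning rate appears anywhere in the loop.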
Related papers
- Escaping Barren Plateaus in Variational Quantum Algorithms Using Negative Learning Rate in Quantum Internet of Things [8.98664000532717]
Variational Quantum Algorithms (VQAs) are becoming the primary computational primitive for next-generation quantum computers. Under device-constrained execution conditions, the scalability of learning is severely limited by barren plateaus. We present a novel approach for escaping barren plateaus by incorporating negative learning rates into the optimization process.
arXiv Detail & Related papers (2025-11-28T03:32:33Z) - An Evolutionary Multi-objective Optimization for Replica-Exchange-based Physics-informed Operator Learning Network [7.1950116347185995]
We propose an evolutionary multi-objective optimization framework for a replica-exchange-based physics-informed operator learning network. Our framework consistently outperforms general operator learning methods in accuracy, robustness to noise, and the ability to quantify uncertainty.
arXiv Detail & Related papers (2025-08-31T02:17:59Z) - Progressive Element-wise Gradient Estimation for Neural Network Quantization [2.1413624861650358]
Quantization-Aware Training (QAT) methods rely on the Straight-Through Estimator (STE) to address the non-differentiability of discretization functions. We propose Progressive Element-wise Gradient Estimation (PEGE) to address discretization errors between continuous and quantized values. PEGE consistently outperforms existing backpropagation methods and enables low-precision models to match or even outperform the accuracy of their full-precision counterparts.
arXiv Detail & Related papers (2025-08-27T15:59:36Z) - Decentralized Nonconvex Composite Federated Learning with Gradient Tracking and Momentum [78.27945336558987]
Decentralized federated learning (DFL) eliminates reliance on the central server of the client-server architecture. Non-smooth regularization is often incorporated into machine learning tasks. We propose a novel DNCFL algorithm to solve these problems.
arXiv Detail & Related papers (2025-04-17T08:32:25Z) - A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning [74.80956524812714]
We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning.
These problems are often formalized as Bi-Level optimizations (BLO).
We introduce a novel perspective by turning a given BLO problem into a stochastic optimization, where the inner loss function becomes a smooth probability distribution and the outer loss becomes an expected loss over the inner distribution.
arXiv Detail & Related papers (2024-10-14T12:10:06Z) - Learning rate adaptive stochastic gradient descent optimization methods: numerical simulations for deep learning methods for partial differential equations and convergence analyses [5.052293146674794]
It is known that the standard stochastic gradient descent (SGD) optimization method, as well as accelerated and adaptive SGD optimization methods such as the Adam optimizer, fail to converge if the learning rates do not converge to zero.
In this work we propose and study a learning-rate-adaptive approach for SGD optimization methods in which the learning rate is adjusted based on empirical estimates.
arXiv Detail & Related papers (2024-06-20T14:07:39Z) - Scaling Forward Gradient With Local Losses [117.22685584919756]
Forward gradient learning is a biologically plausible alternative to backprop for training deep neural networks.
We show that it is possible to substantially reduce the variance of the forward gradient by applying perturbations to activations rather than weights (a toy comparison of the two estimators is sketched after this list).
Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet.
arXiv Detail & Related papers (2022-10-07T03:52:27Z) - NIPQ: Noise proxy-based Integrated Pseudo-Quantization [9.207644534257543]
The straight-through estimator (STE) incurs unstable convergence during quantization-aware training (QAT).
We propose a novel noise proxy-based integrated pseudo-quantization (NIPQ) that enables unified support of pseudo-quantization for both activations and weights.
NIPQ outperforms existing quantization algorithms in various vision and language applications by a large margin.
arXiv Detail & Related papers (2022-06-02T01:17:40Z) - Differentiable Annealed Importance Sampling and the Perils of Gradient Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z) - Fast Distributionally Robust Learning with Variance Reduced Min-Max Optimization [85.84019017587477]
Distributionally robust supervised learning is emerging as a key paradigm for building reliable machine learning systems for real-world applications.
Existing algorithms for solving Wasserstein DRSL involve solving complex subproblems or fail to make use of gradients.
We revisit Wasserstein DRSL through the lens of min-max optimization and derive scalable and efficiently implementable extra-gradient algorithms (a toy extra-gradient sketch appears after this list).
arXiv Detail & Related papers (2021-04-27T16:56:09Z) - Neural Control Variates [71.42768823631918]
We show that a set of neural networks can tackle the challenge of finding a good approximation of the integrand.
We derive a theoretically optimal, variance-minimizing loss function, and propose an alternative, composite loss for stable online training in practice.
Specifically, we show that the learned light-field approximation is of sufficient quality for high-order bounces, allowing us to omit the error correction and thereby dramatically reduce the noise at the cost of negligible visible bias.
arXiv Detail & Related papers (2020-06-02T11:17:55Z)
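As referenced in the "Scaling Forward Gradient With Local Losses" entry above, the claimed variance reduction comes from perturbing activations rather than weights. The toy sketch below compares the two forward-gradient estimators on a single linear layer with a squared loss; the layer sizes, the finite-difference directional derivative, and the Monte Carlo comparison are illustrative choices, not that paper's setup.

```python
# Minimal sketch contrasting weight-perturbation and activation-perturbation
# forward-gradient estimates on a single linear layer with a squared loss.
import numpy as np

rng = np.random.default_rng(0)
d, m = 64, 8                        # input and output dimensions (toy choices)
x = rng.normal(size=d)
W = rng.normal(size=(m, d)) / np.sqrt(d)
t = rng.normal(size=m)              # regression target

def loss_from_z(z):
    return 0.5 * np.sum((z - t) ** 2)

z = W @ x
grad_z_true = z - t                 # exact gradient w.r.t. activations
grad_W_true = np.outer(grad_z_true, x)

def weight_perturbation_estimate(eps=1e-5):
    """Forward gradient with a random tangent on the weights (m*d directions)."""
    V = rng.normal(size=(m, d))
    dd = (loss_from_z((W + eps * V) @ x) - loss_from_z((W - eps * V) @ x)) / (2 * eps)
    return dd * V                   # (dL/dW . V) V

def activation_perturbation_estimate(eps=1e-5):
    """Forward gradient with a random tangent on the activations (only m directions),
    pushed back to the weights through the local linear map z = W x."""
    u = rng.normal(size=m)
    dd = (loss_from_z(z + eps * u) - loss_from_z(z - eps * u)) / (2 * eps)
    return np.outer(dd * u, x)      # ((dL/dz . u) u) x^T

for name, estimator in [("weight perturbation", weight_perturbation_estimate),
                        ("activation perturbation", activation_perturbation_estimate)]:
    errs = [np.linalg.norm(estimator() - grad_W_true) for _ in range(200)]
    print(f"{name:>24s}: mean estimation error = {np.mean(errs):.3f}")
```

Because the perturbed activation space (8 dimensions here) is much smaller than the weight space (512 dimensions), the activation-perturbation estimate concentrates far more tightly around the true gradient.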
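As referenced in the "Fast Distributionally Robust Learning..." entry above, extra-gradient methods stabilize min-max optimization by evaluating gradients at an extrapolated ("look-ahead") point. A minimal sketch on the bilinear saddle f(x, y) = x * y, which is a standard toy problem and not the Wasserstein DRSL objective of that paper:

```python
# Minimal sketch of the extra-gradient update on a bilinear saddle problem.
def grads(x, y):
    # f(x, y) = x * y; the min player updates x, the max player updates y.
    return y, x          # df/dx, df/dy

eta = 0.3
x, y = 1.0, 1.0

for step in range(300):
    # Extrapolation step: look ahead with the current gradients.
    gx, gy = grads(x, y)
    x_half, y_half = x - eta * gx, y + eta * gy
    # Update step: apply the gradients evaluated at the extrapolated point.
    gx, gy = grads(x_half, y_half)
    x, y = x - eta * gx, y + eta * gy

# Both coordinates decay toward the saddle point at (0, 0).
print(f"after 300 extra-gradient steps: x = {x:.2e}, y = {y:.2e}")
```

On the same toy problem, plain simultaneous gradient descent-ascent spirals away from the saddle point, while the extrapolation step makes the iterates contract toward (0, 0).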