Projected Stochastic Gradient Descent with Quantum Annealed Binary Gradients
- URL: http://arxiv.org/abs/2310.15128v2
- Date: Tue, 3 Sep 2024 19:55:22 GMT
- Title: Projected Stochastic Gradient Descent with Quantum Annealed Binary Gradients
- Authors: Maximilian Krahn, Michele Sasdelli, Fengyi Yang, Vladislav Golyanik, Juho Kannala, Tat-Jun Chin, Tolga Birdal,
- Abstract summary: We present QP-SBGD, a novel layer-wise optimiser tailored towards training neural networks with binary weights.
BNNs reduce the computational requirements and energy consumption of deep learning models with minimal loss in accuracy.
Our algorithm is implemented layer-wise, making it suitable to train larger networks on resource-limited quantum hardware.
- Score: 51.82488018573326
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present, QP-SBGD, a novel layer-wise stochastic optimiser tailored towards training neural networks with binary weights, known as binary neural networks (BNNs), on quantum hardware. BNNs reduce the computational requirements and energy consumption of deep learning models with minimal loss in accuracy. However, training them in practice remains to be an open challenge. Most known BNN-optimisers either rely on projected updates or binarise weights post-training. Instead, QP-SBGD approximately maps the gradient onto binary variables, by solving a quadratic constrained binary optimisation. Under practically reasonable assumptions, we show that this update rule converges with a rate of $\mathcal{O}(1 / \sqrt{T})$. Moreover, we show how the $\mathcal{NP}$-hard projection can be effectively executed on an adiabatic quantum annealer, harnessing recent advancements in quantum computation. We also introduce a projected version of this update rule and prove that if a fixed point exists in the binary variable space, the modified updates will converge to it. Last but not least, our algorithm is implemented layer-wise, making it suitable to train larger networks on resource-limited quantum hardware. Through extensive evaluations, we show that QP-SBGD outperforms or is on par with competitive and well-established baselines such as BinaryConnect, signSGD and ProxQuant when optimising the Rosenbrock function, training BNNs as well as binary graph neural networks.
Related papers
- Hybrid Quantum-Classical Scheduling for Accelerating Neural Network Training with Newton's Gradient Descent [37.59299233291882]
We propose Q-Newton, a hybrid quantum-classical scheduler for accelerating neural network training with Newton's GD.
Q-Newton utilizes a streamlined scheduling module that coordinates between quantum and classical linear solvers.
Our evaluation showcases the potential for Q-Newton to significantly reduce the total training time compared to commonly used quantum machines.
arXiv Detail & Related papers (2024-04-30T23:55:03Z) - Understanding Neural Network Binarization with Forward and Backward
Proximal Quantizers [26.27829662433536]
In neural network binarization, BinaryConnect (BC) and its variants are considered the standard.
We aim at shedding some light on these training tricks from the optimization perspective.
arXiv Detail & Related papers (2024-02-27T17:43:51Z) - Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
BNNs neglect the intrinsic bilinear relationship of real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
arXiv Detail & Related papers (2022-09-04T06:45:33Z) - Sub-bit Neural Networks: Learning to Compress and Accelerate Binary
Neural Networks [72.81092567651395]
Sub-bit Neural Networks (SNNs) are a new type of binary quantization design tailored to compress and accelerate BNNs.
SNNs are trained with a kernel-aware optimization framework, which exploits binary quantization in the fine-grained convolutional kernel space.
Experiments on visual recognition benchmarks and the hardware deployment on FPGA validate the great potentials of SNNs.
arXiv Detail & Related papers (2021-10-18T11:30:29Z) - A quantum algorithm for training wide and deep classical neural networks [72.2614468437919]
We show that conditions amenable to classical trainability via gradient descent coincide with those necessary for efficiently solving quantum linear systems.
We numerically demonstrate that the MNIST image dataset satisfies such conditions.
We provide empirical evidence for $O(log n)$ training of a convolutional neural network with pooling.
arXiv Detail & Related papers (2021-07-19T23:41:03Z) - Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that through careful design of the models, and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
arXiv Detail & Related papers (2020-12-31T18:48:58Z) - Toward Trainability of Quantum Neural Networks [87.04438831673063]
Quantum Neural Networks (QNNs) have been proposed as generalizations of classical neural networks to achieve the quantum speed-up.
Serious bottlenecks exist for training QNNs due to the vanishing with gradient rate exponential to the input qubit number.
We show that QNNs with tree tensor and step controlled structures for the application of binary classification. Simulations show faster convergent rates and better accuracy compared to QNNs with random structures.
arXiv Detail & Related papers (2020-11-12T08:32:04Z) - Absence of Barren Plateaus in Quantum Convolutional Neural Networks [0.0]
Quantum Convolutional Neural Networks (QCNNs) have been proposed.
We rigorously analyze the gradient scaling for the parameters in the QCNN architecture.
arXiv Detail & Related papers (2020-11-05T16:46:13Z) - FTBNN: Rethinking Non-linearity for 1-bit CNNs and Going Beyond [23.5996182207431]
We show that binarized convolution process owns an increasing linearity towards the target of minimizing such error, which in turn hampers BNN's discriminative ability.
We re-investigate and tune proper non-linear modules to fix that contradiction, leading to a strong baseline which achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-10-19T08:11:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.