Hybrid Quantum-Classical Scheduling for Accelerating Neural Network Training with Newton's Gradient Descent
- URL: http://arxiv.org/abs/2405.00252v1
- Date: Tue, 30 Apr 2024 23:55:03 GMT
- Title: Hybrid Quantum-Classical Scheduling for Accelerating Neural Network Training with Newton's Gradient Descent
- Authors: Pingzhi Li, Junyu Liu, Hanrui Wang, Tianlong Chen
- Abstract summary: We propose Q-Newton, a hybrid quantum-classical scheduler for accelerating neural network training with Newton's GD.
Q-Newton utilizes a streamlined scheduling module that coordinates between quantum and classical linear solvers.
Our evaluation showcases the potential for Q-Newton to significantly reduce the total training time compared to commonly used optimizers such as SGD.
- Score: 37.59299233291882
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optimization techniques in deep learning are predominantly led by first-order gradient methodologies, such as SGD. However, neural network training can greatly benefit from the rapid convergence characteristics of second-order optimization. Newton's GD stands out in this category by rescaling the gradient using the inverse Hessian. Nevertheless, one of its major bottlenecks is matrix inversion, which is notably time-consuming, requiring $O(N^3)$ time and scaling poorly. Matrix inversion can be translated into solving a series of linear equations. Given that quantum linear solver algorithms (QLSAs), leveraging the principles of quantum superposition and entanglement, can operate within a $\text{polylog}(N)$ time frame, they present a promising approach with exponential acceleration. Specifically, one of the most recent QLSAs demonstrates a complexity scaling of $O(d\cdot\kappa \log(N\cdot\kappa/\epsilon))$, depending on the size~$N$, condition number~$\kappa$, error tolerance~$\epsilon$, and quantum oracle sparsity~$d$ of the matrix. However, this also implies that their potential exponential advantage may be hindered by certain properties (i.e., $\kappa$ and $d$). We propose Q-Newton, a hybrid quantum-classical scheduler for accelerating neural network training with Newton's GD. Q-Newton utilizes a streamlined scheduling module that coordinates between quantum and classical linear solvers by estimating and reducing $\kappa$ and constructing $d$ for the quantum solver. Our evaluation showcases the potential for Q-Newton to significantly reduce the total training time compared to commonly used optimizers like SGD. We hypothesize a future scenario where the gate time of quantum machines is reduced, possibly realized by attosecond physics. Our evaluation establishes an ambitious and promising target for the evolution of quantum computing.
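As a rough illustration of the scheduling idea described in the abstract, the following is a minimal, hypothetical sketch (Python/NumPy) of one Newton's-GD step that routes the Hessian linear solve to a quantum or classical backend by comparing the two cost models quoted above, $O(d\cdot\kappa\log(N\cdot\kappa/\epsilon))$ versus $O(N^3)$. All helper names and constants (e.g. `QUANTUM_GATE_TIME`, `estimate_kappa`) are our own placeholders, not the authors' implementation, and the "quantum" branch simply falls back to a classical solve here.

```python
# Hypothetical sketch only: a toy cost-model router for one Newton's-GD step.
# It is NOT the authors' Q-Newton implementation; the constants and helpers
# (QUANTUM_GATE_TIME, CLASSICAL_FLOP_TIME, estimate_*) are assumptions.
import numpy as np

QUANTUM_GATE_TIME = 1e-9     # assumed per-gate time of a quantum machine, seconds
CLASSICAL_FLOP_TIME = 1e-12  # assumed effective time per floating-point op, seconds

def estimate_kappa(H):
    """Estimate the condition number kappa of the (regularized) Hessian."""
    return float(np.linalg.cond(H))

def estimate_sparsity(H, tol=1e-8):
    """Oracle sparsity d: the maximum number of non-negligible entries per row."""
    return int((np.abs(H) > tol).sum(axis=1).max())

def quantum_cost(N, kappa, d, eps=1e-3):
    """QLSA cost model ~ d * kappa * log(N * kappa / eps), scaled by gate time."""
    return d * kappa * np.log(N * kappa / eps) * QUANTUM_GATE_TIME

def classical_cost(N):
    """Dense classical solve, ~ N^3 floating-point operations."""
    return (N ** 3) * CLASSICAL_FLOP_TIME

def newton_step(params, grad, hessian, damping=1e-3, lr=1.0):
    """One Newton's-GD update: solve (H + damping*I) delta = grad, then step."""
    N = hessian.shape[0]
    H = hessian + damping * np.eye(N)        # regularization; also lowers kappa
    kappa, d = estimate_kappa(H), estimate_sparsity(H)
    if quantum_cost(N, kappa, d) < classical_cost(N):
        # A real scheduler would dispatch to a quantum linear solver (QLSA) here;
        # this sketch falls back to the same classical solve as a stand-in.
        delta = np.linalg.solve(H, grad)
    else:
        delta = np.linalg.solve(H, grad)     # classical solver path
    return params - lr * delta
```

Under these made-up constants, the router's decision depends on $\kappa$ and $d$ rather than on $N$ alone, which is the property the scheduler in the abstract is built around.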
Related papers
- End-to-end complexity for simulating the Schwinger model on quantum computers [0.6449786007855248]
We propose an efficient implementation of block-encoding of the Schwinger model Hamiltonian.
As an end-to-end application, we compute the vacuum persistence amplitude.
Our results provide insights into predictions about the performance of quantum computers in the FTQC era.
arXiv Detail & Related papers (2023-11-29T06:36:11Z) - Projected Stochastic Gradient Descent with Quantum Annealed Binary Gradients [51.82488018573326]
We present QP-SBGD, a novel layer-wise optimiser tailored towards training neural networks with binary weights.
BNNs reduce the computational requirements and energy consumption of deep learning models with minimal loss in accuracy.
Our algorithm is implemented layer-wise, making it suitable to train larger networks on resource-limited quantum hardware.
arXiv Detail & Related papers (2023-10-23T17:32:38Z) - Accelerating the training of single-layer binary neural networks using
the HHL quantum algorithm [58.720142291102135]
We show that useful information can be extracted from the quantum-mechanical implementation of the Harrow-Hassidim-Lloyd (HHL) algorithm and used to reduce the complexity of finding the solution on the classical side.
arXiv Detail & Related papers (2022-10-23T11:58:05Z) - Quantum State Preparation with Optimal Circuit Depth: Implementations
and Applications [10.436969366019015]
We show that any quantum state can be prepared with a $\Theta(n)$-depth circuit, or with a $\Theta(\log(nd))$-depth circuit using $O(nd\log d)$ ancillary qubits.
We discuss applications of the results in different quantum computing tasks, such as Hamiltonian simulation, solving linear systems of equations, and realizing quantum random access memories.
arXiv Detail & Related papers (2022-01-27T13:16:30Z) - Deterministic and Entanglement-Efficient Preparation of
Amplitude-Encoded Quantum Registers [0.533024001730262]
A classical vector $\mathbf{b}$ is encoded in the amplitudes of a quantum state.
An arbitrary state of $Q$ qubits generally requires approximately $2^Q$ entangling gates.
We present a deterministic (nonvariational) algorithm that allows one to flexibly reduce the quantum resources required for state preparation.
arXiv Detail & Related papers (2021-10-26T07:37:54Z) - A quantum algorithm for training wide and deep classical neural networks [72.2614468437919]
We show that conditions amenable to classical trainability via gradient descent coincide with those necessary for efficiently solving quantum linear systems.
We numerically demonstrate that the MNIST image dataset satisfies such conditions.
We provide empirical evidence for $O(\log n)$ training of a convolutional neural network with pooling.
arXiv Detail & Related papers (2021-07-19T23:41:03Z) - Preparation of excited states for nuclear dynamics on a quantum computer [117.44028458220427]
We study two different methods to prepare excited states on a quantum computer.
We benchmark these techniques on emulated and real quantum devices.
These findings show that quantum techniques designed to achieve good scaling on fault tolerant devices might also provide practical benefits on devices with limited connectivity and gate fidelity.
arXiv Detail & Related papers (2020-09-28T17:21:25Z) - Implementing a Fast Unbounded Quantum Fanout Gate Using Power-Law
Interactions [0.9634136878988853]
Power-law interactions with strength decaying as $1/r^\alpha$ with the distance $r$ provide an experimentally realizable resource for information processing.
We leverage the power of these interactions to implement a fast quantum fanout gate with an arbitrary number of targets.
We show that power-law systems with $\alpha \le D$ are difficult to simulate classically even for short times, under a standard assumption that factoring is classically intractable.
arXiv Detail & Related papers (2020-07-01T18:00:00Z) - Quantum Gram-Schmidt Processes and Their Application to Efficient State
Read-out for Quantum Algorithms [87.04438831673063]
We present an efficient read-out protocol that yields the classical vector form of the generated state.
Our protocol suits the case that the output state lies in the row space of the input matrix.
One of our technical tools is an efficient quantum algorithm for performing the Gram-Schmidt orthonormal procedure.
arXiv Detail & Related papers (2020-04-14T11:05:26Z)
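For reference, the Gram-Schmidt step mentioned in the last entry is, classically, the following orthonormalization procedure. This is only a classical refresher sketch (using the numerically more stable modified variant), not the paper's quantum algorithm.

```python
# Classical refresher: modified Gram-Schmidt orthonormalization of the columns
# of a matrix. The related paper above gives a *quantum* algorithm for this
# step; this snippet does not reproduce that algorithm.
import numpy as np

def modified_gram_schmidt(A, tol=1e-12):
    """Return an orthonormal basis for the column space of A."""
    A = np.asarray(A, dtype=float)
    basis = []
    for j in range(A.shape[1]):
        v = A[:, j]
        for q in basis:
            v = v - np.dot(q, v) * q   # remove components along earlier vectors
        norm = np.linalg.norm(v)
        if norm > tol:                 # skip (near-)linearly-dependent columns
            basis.append(v / norm)
    return np.column_stack(basis) if basis else np.zeros((A.shape[0], 0))
```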