Analyzing Convergence in Quantum Neural Networks: Deviations from Neural
Tangent Kernels
- URL: http://arxiv.org/abs/2303.14844v1
- Date: Sun, 26 Mar 2023 22:58:06 GMT
- Title: Analyzing Convergence in Quantum Neural Networks: Deviations from Neural
Tangent Kernels
- Authors: Xuchen You, Shouvanik Chakrabarti, Boyang Chen, Xiaodi Wu
- Abstract summary: A quantum neural network (QNN) is a parameterized mapping efficiently implementable on near-term Noisy Intermediate-Scale Quantum (NISQ) computers.
Despite the existing empirical and theoretical investigations, the convergence of QNN training is not fully understood.
- Score: 20.53302002578558
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A quantum neural network (QNN) is a parameterized mapping efficiently
implementable on near-term Noisy Intermediate-Scale Quantum (NISQ) computers.
It can be used for supervised learning when combined with classical
gradient-based optimizers. Despite the existing empirical and theoretical
investigations, the convergence of QNN training is not fully understood.
Inspired by the success of neural tangent kernels (NTKs) in probing the
dynamics of classical neural networks, a recent line of work proposes to
study over-parameterized QNNs by examining a quantum version of tangent
kernels. In this work, we study the dynamics of QNNs and show that, contrary to
popular belief, it is qualitatively different from that of any kernel
regression: due to the unitarity of quantum operations, there is a
non-negligible deviation from the tangent kernel regression derived at
random initialization. As a result of this deviation, we prove at most
sublinear convergence for QNNs with Pauli measurements, which is beyond the
explanatory power of any kernel regression dynamics. We then present the actual
dynamics of QNNs in the limit of over-parameterization. The new dynamics
capture the change of the convergence rate during training and imply that the
range of measurements is crucial to fast QNN convergence.
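To make the deviation concrete, below is a minimal, self-contained sketch (not the authors' code) of a single-qubit QNN with a Pauli-Z measurement, trained by gradient descent on one toy target. The circuit depth, rotation axes, learning rate, and target value are illustrative assumptions. The script tracks the empirical tangent kernel at the current parameters (for a single data point, just the squared gradient norm) at initialization and after training.

```python
# Minimal sketch, assuming a toy single-qubit circuit; not the paper's setup.
import numpy as np

# Pauli matrices
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
GENS = [X, Y, Z]  # rotation generators, cycled layer by layer

def circuit_output(thetas, psi0):
    """f(theta) = <psi| U(theta)^dagger Z U(theta) |psi>, a Pauli-Z expectation."""
    psi = psi0
    for k, th in enumerate(thetas):
        G = GENS[k % 3]
        U = np.cos(th / 2) * np.eye(2) - 1j * np.sin(th / 2) * G  # exp(-i th G / 2)
        psi = U @ psi
    return float(np.real(np.conj(psi) @ (Z @ psi)))

def gradient(thetas, psi0):
    """Parameter-shift rule: df/dtheta_k = [f(th_k + pi/2) - f(th_k - pi/2)] / 2."""
    g = np.zeros_like(thetas)
    for k in range(len(thetas)):
        plus, minus = thetas.copy(), thetas.copy()
        plus[k] += np.pi / 2
        minus[k] -= np.pi / 2
        g[k] = 0.5 * (circuit_output(plus, psi0) - circuit_output(minus, psi0))
    return g

rng = np.random.default_rng(0)
psi0 = np.array([1.0, 0.0], dtype=complex)   # fixed encoded state |0>
thetas = rng.uniform(0, 2 * np.pi, size=12)  # "over-parameterized" toy circuit
y_target = 0.7                               # label inside the Pauli range [-1, 1]
lr = 0.1

K0 = float(gradient(thetas, psi0) @ gradient(thetas, psi0))  # tangent kernel at init
for step in range(200):
    f = circuit_output(thetas, psi0)
    g = gradient(thetas, psi0)
    thetas -= lr * 2.0 * (f - y_target) * g   # gradient of the loss (f - y)^2
Kt = float(gradient(thetas, psi0) @ gradient(thetas, psi0))  # tangent kernel after training

print(f"final loss: {(circuit_output(thetas, psi0) - y_target) ** 2:.2e}")
print(f"tangent kernel at init: {K0:.4f}")
print(f"tangent kernel at end:  {Kt:.4f}  (compare with the value at init)")
```

In the idealized NTK picture the kernel value would stay essentially fixed throughout training; a change between the two printed values is the kind of deviation the abstract attributes to the unitarity of quantum operations.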
Related papers
- Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a
Polynomial Net Study [55.12108376616355]
The study of NTKs has been devoted to typical neural network architectures, but it is incomplete for neural networks with Hadamard products (NNs-Hp).
In this work, we derive the finite-width NTK formulation for a special class of NNs-Hp, i.e., polynomial neural networks.
We prove their equivalence to the kernel regression predictor with the associated NTK, which expands the application scope of NTK.
arXiv Detail & Related papers (2022-09-16T06:36:06Z) - Symmetric Pruning in Quantum Neural Networks [111.438286016951]
Quantum neural networks (QNNs) exert the power of modern quantum machines.
QNNs with handcrafted symmetric ansatzes generally experience better trainability than those with asymmetric ansatzes.
We propose the effective quantum neural tangent kernel (EQNTK) to quantify the convergence of QNNs towards the global optima.
arXiv Detail & Related papers (2022-08-30T08:17:55Z) - Toward Trainability of Deep Quantum Neural Networks [87.04438831673063]
Quantum Neural Networks (QNNs) with random structures have poor trainability due to the exponentially vanishing gradient as the circuit depth and the qubit number increase.
We provide the first viable solution to the vanishing gradient problem for deep QNNs with theoretical guarantees.
arXiv Detail & Related papers (2021-12-30T10:27:08Z) - Quantum-enhanced neural networks in the neural tangent kernel framework [0.4394730767364254]
We study a class of qcNN composed of a quantum data-encoder followed by a cNN.
In the NTK regime, where the number of nodes of the cNN becomes infinitely large, the output of the entire qcNN becomes a nonlinear function of the so-called projected quantum kernel.
arXiv Detail & Related papers (2021-09-08T17:16:23Z) - Chaos and Complexity from Quantum Neural Network: A study with Diffusion
Metric in Machine Learning [0.0]
We study the phenomena of quantum chaos and complexity in the machine learning dynamics of Quantum Neural Networks (QNNs).
We employ a statistical and differential geometric approach to study the learning theory of QNN.
arXiv Detail & Related papers (2020-11-16T10:41:47Z) - Toward Trainability of Quantum Neural Networks [87.04438831673063]
Quantum Neural Networks (QNNs) have been proposed as generalizations of classical neural networks to achieve the quantum speed-up.
Serious bottlenecks exist for training QNNs because the gradient vanishes at a rate exponential in the number of input qubits.
We study QNNs with tree tensor and step controlled structures for binary classification. Simulations show faster convergence rates and better accuracy compared to QNNs with random structures.
arXiv Detail & Related papers (2020-11-12T08:32:04Z) - Absence of Barren Plateaus in Quantum Convolutional Neural Networks [0.0]
Quantum Convolutional Neural Networks (QCNNs) have been proposed.
We rigorously analyze the gradient scaling for the parameters in the QCNN architecture.
arXiv Detail & Related papers (2020-11-05T16:46:13Z) - On the learnability of quantum neural networks [132.1981461292324]
We consider the learnability of the quantum neural network (QNN) built on the variational hybrid quantum-classical scheme.
We show that if a concept can be efficiently learned by a QNN, then it can also be effectively learned by a QNN even in the presence of gate noise.
arXiv Detail & Related papers (2020-07-24T06:34:34Z) - Recurrent Quantum Neural Networks [7.6146285961466]
Recurrent neural networks are the foundation of many sequence-to-sequence models in machine learning.
We construct a quantum recurrent neural network (QRNN) with demonstrable performance on non-trivial tasks.
We evaluate the QRNN on MNIST classification, both by feeding the QRNN each image pixel by pixel and by utilising modern data augmentation as a preprocessing step.
arXiv Detail & Related papers (2020-06-25T17:59:44Z) - Optimal Rates for Averaged Stochastic Gradient Descent under Neural
Tangent Kernel Regime [50.510421854168065]
We show that the averaged gradient descent can achieve the minimax optimal convergence rate.
We show that the target function specified by the NTK of a ReLU network can be learned at the optimal convergence rate.
arXiv Detail & Related papers (2020-06-22T14:31:37Z)
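Several of the entries above (and the main abstract's contrast with "kernel regression dynamics") rely on the standard classical NTK picture. The following is a textbook-style sketch in our own notation, not taken from any of the listed abstracts: under a frozen tangent kernel, gradient flow on the squared loss converges linearly, which is the behaviour the main paper shows QNNs with Pauli measurements deviate from.

```latex
% Classical NTK (frozen-kernel) training dynamics, sketched for context.
% Notation is ours; X = (x_1,\dots,x_n) are training inputs, y the labels,
% \eta the learning rate, and \theta_0 the parameters at initialization.
\[
  K_{ij} \;=\; \bigl\langle \nabla_\theta f(x_i;\theta_0),\, \nabla_\theta f(x_j;\theta_0) \bigr\rangle ,
\]
\[
  \frac{\mathrm{d}}{\mathrm{d}t}\, f_t(X) \;=\; -\,\eta\, K \bigl(f_t(X) - y\bigr)
  \quad\Longrightarrow\quad
  f_t(X) \;=\; y + e^{-\eta K t}\bigl(f_0(X) - y\bigr).
\]
% With a fixed, positive-definite K this gives linear (geometric) convergence of
% the training loss; the main paper above shows that QNNs with Pauli
% measurements deviate from this picture and converge at most sublinearly.
```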