A Unified Theory of Quantum Neural Network Loss Landscapes
- URL: http://arxiv.org/abs/2408.11901v3
- Date: Mon, 7 Oct 2024 21:58:50 GMT
- Title: A Unified Theory of Quantum Neural Network Loss Landscapes
- Authors: Eric R. Anschuetz,
- Abstract summary: Quantum neural networks (QNNs) are known to not behave as Gaussian processes when randomly.
We show that QNNs and their first two derivatives generally form what we call "Wishart processes"
Our unified framework suggests a certain simple operational definition for the "trainability" of a given QNN model.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Classical neural networks with random initialization famously behave as Gaussian processes in the limit of many neurons, which allows one to completely characterize their training and generalization behavior. No such general understanding exists for quantum neural networks (QNNs), which -- outside of certain special cases -- are known to not behave as Gaussian processes when randomly initialized. We here prove that QNNs and their first two derivatives instead generally form what we call "Wishart processes," where certain algebraic properties of the network determine the hyperparameters of the process. This Wishart process description allows us to, for the first time: give necessary and sufficient conditions for a QNN architecture to have a Gaussian process limit; calculate the full gradient distribution, generalizing previously known barren plateau results; and calculate the local minima distribution of algebraically constrained QNNs. Our unified framework suggests a certain simple operational definition for the "trainability" of a given QNN model using a newly introduced, experimentally accessible quantity we call the "degrees of freedom" of the network architecture.
Related papers
- Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z) - Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK [86.45209429863858]
We study training one-hidden-layer ReLU networks in the neural tangent kernel (NTK) regime.
We show that the neural networks possess a different limiting kernel which we call textitbias-generalized NTK
We also study various properties of the neural networks with this new kernel.
arXiv Detail & Related papers (2023-01-01T02:11:39Z) - Critical Initialization of Wide and Deep Neural Networks through Partial
Jacobians: General Theory and Applications [6.579523168465526]
We introduce emphpartial Jacobians of a network, defined as derivatives of preactivations in layer $l$ with respect to preactivations in layer $l_0leq l$.
We derive recurrence relations for the norms of partial Jacobians and utilize these relations to analyze criticality of deep fully connected neural networks with LayerNorm and/or residual connections.
arXiv Detail & Related papers (2021-11-23T20:31:42Z) - Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity
on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by analyzing the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z) - Exponentially Many Local Minima in Quantum Neural Networks [9.442139459221785]
Quantum Neural Networks (QNNs) are important quantum applications because of their similar promises as classical neural networks.
We conduct a quantitative investigation on the landscape of loss functions of QNNs and identify a class of simple yet extremely hard QNN instances for training.
We empirically confirm that our constructions can indeed be hard instances in practice with typical gradient-based circuits.
arXiv Detail & Related papers (2021-10-06T03:23:44Z) - Quantum-enhanced neural networks in the neural tangent kernel framework [0.4394730767364254]
We study a class of qcNN composed of a quantum data-encoder followed by a cNN.
In the NTK regime where the number nodes of the cNN becomes infinitely large, the output of the entire qcNN becomes a nonlinear function of the so-called projected quantum kernel.
arXiv Detail & Related papers (2021-09-08T17:16:23Z) - Toward Trainability of Quantum Neural Networks [87.04438831673063]
Quantum Neural Networks (QNNs) have been proposed as generalizations of classical neural networks to achieve the quantum speed-up.
Serious bottlenecks exist for training QNNs due to the vanishing with gradient rate exponential to the input qubit number.
We show that QNNs with tree tensor and step controlled structures for the application of binary classification. Simulations show faster convergent rates and better accuracy compared to QNNs with random structures.
arXiv Detail & Related papers (2020-11-12T08:32:04Z) - How Neural Networks Extrapolate: From Feedforward to Graph Neural
Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution.
Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z) - Neural Networks and Quantum Field Theory [0.0]
We propose a theoretical understanding of neural networks in terms of Wilsonian effective field theory.
The correspondence relies on the fact that many neural networks are drawn from Gaussian processes.
arXiv Detail & Related papers (2020-08-19T18:00:06Z) - On the learnability of quantum neural networks [132.1981461292324]
We consider the learnability of the quantum neural network (QNN) built on the variational hybrid quantum-classical scheme.
We show that if a concept can be efficiently learned by QNN, then it can also be effectively learned by QNN even with gate noise.
arXiv Detail & Related papers (2020-07-24T06:34:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.