Exploring Channel Distinguishability in Local Neighborhoods of the Model Space in Quantum Neural Networks
- URL: http://arxiv.org/abs/2410.09470v1
- Date: Sat, 12 Oct 2024 10:20:26 GMT
- Title: Exploring Channel Distinguishability in Local Neighborhoods of the Model Space in Quantum Neural Networks
- Authors: Sabrina Herbst, Sandeep Suresh Cranganore, Vincenzo De Maio, Ivona Brandic
- Abstract summary: Quantum Neural Networks (QNNs) have emerged and gained significant attention.
QNNs have been shown to be notoriously difficult to train, which we hypothesize is partially due to their architectures, called ansatzes.
- Score: 0.5277756703318045
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the increasing interest in Quantum Machine Learning, Quantum Neural Networks (QNNs) have emerged and gained significant attention. These models have, however, been shown to be notoriously difficult to train, which we hypothesize is partially due to their architectures, called ansatzes, which have hardly been studied at this point. Therefore, in this paper, we take a step back and analyze ansatzes. We initially consider their expressivity, i.e., the space of operations they are able to express, and show that closeness to being a 2-design, the primarily used measure, fails to capture this property. Hence, we look for alternative ways to characterize ansatzes by considering the local neighborhood of the model space, in particular, analyzing model distinguishability upon small perturbation of parameters. We derive an upper bound on their distinguishability, showing that QNNs with few parameters are hardly discriminable upon update. Our numerical experiments support our bounds and further indicate that there is a significant degree of variability, which stresses the need for warm-starting or clever initialization. Altogether, our work provides an ansatz-centric perspective on training dynamics and difficulties in QNNs, ultimately suggesting that iterative training of small quantum models may not be effective, which contrasts with their initial motivation.
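To make the perturbation analysis concrete, below is a minimal sketch of the kind of experiment the abstract describes: a small hardware-efficient ansatz is evaluated at a random parameter point and again after a small random parameter update, and the state fidelity between the two output states is computed. This is an illustrative example written with PennyLane, not the authors' code; the circuit layout, qubit count, and perturbation size are arbitrary choices, and state fidelity on a single fixed input is only a proxy for the channel distinguishability bounded in the paper. A fidelity near 1 means the two models are nearly indistinguishable after the update.

```python
# Minimal sketch (assumptions: PennyLane installed; hardware-efficient
# ansatz of RY rotations plus a CNOT ring; all choices are illustrative).
import numpy as np
import pennylane as qml

n_qubits, n_layers = 4, 2
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def circuit(params):
    # One layer = single-qubit RY rotations followed by a ring of CNOTs.
    for layer in range(n_layers):
        for w in range(n_qubits):
            qml.RY(params[layer, w], wires=w)
        for w in range(n_qubits):
            qml.CNOT(wires=[w, (w + 1) % n_qubits])
    return qml.state()

rng = np.random.default_rng(seed=0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=(n_layers, n_qubits))
delta = 1e-2 * rng.standard_normal(theta.shape)  # small parameter update

psi = circuit(theta)          # output state before the update
phi = circuit(theta + delta)  # output state after the update

# State fidelity |<psi|phi>|^2 as a proxy for model distinguishability:
# values close to 1 mean the perturbed model is hard to discriminate.
fidelity = np.abs(np.vdot(psi, phi)) ** 2
print(f"fidelity after perturbation: {fidelity:.8f}")
```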
Related papers
- Benign or Not-Benign Overfitting in Token Selection of Attention Mechanism [34.316270145027616]
We analyze benign overfitting in the token selection mechanism of the attention architecture.
To the best of our knowledge, this is the first study to characterize benign overfitting for the attention mechanism.
arXiv Detail & Related papers (2024-09-26T08:20:05Z)
- Coherent Feed Forward Quantum Neural Network [2.1178416840822027]
Quantum machine learning, focusing on quantum neural networks (QNNs), remains a vastly uncharted field of study.
We introduce a bona fide QNN model that seamlessly aligns with the versatility of a traditional feed-forward neural network (FFNN) in terms of its adaptable intermediate layers and nodes.
We test our proposed model on various benchmarking datasets such as the diagnostic breast cancer (Wisconsin) and credit card fraud detection datasets.
arXiv Detail & Related papers (2024-02-01T15:13:26Z)
- On the Dynamics Under the Unhinged Loss and Beyond [104.49565602940699]
We introduce the unhinged loss, a concise loss function that offers more mathematical opportunities to analyze closed-form dynamics.
The unhinged loss also allows for considering more practical techniques, such as time-varying learning rates and feature normalization.
arXiv Detail & Related papers (2023-12-13T02:11:07Z)
- Unraveling Feature Extraction Mechanisms in Neural Networks [10.13842157577026]
We propose a theoretical approach based on Neural Tangent Kernels (NTKs) to investigate such mechanisms.
We reveal how these models leverage statistical features during gradient descent and how those features are integrated into final decisions.
We find that while self-attention and CNN models may exhibit limitations in learning n-grams, multiplication-based models seem to excel in this area.
arXiv Detail & Related papers (2023-10-25T04:22:40Z)
- Momentum Diminishes the Effect of Spectral Bias in Physics-Informed Neural Networks [72.09574528342732]
Physics-informed neural network (PINN) algorithms have shown promising results in solving a wide range of problems involving partial differential equations (PDEs).
They often fail to converge to desirable solutions when the target function contains high-frequency features, due to a phenomenon known as spectral bias.
In the present work, we exploit neural tangent kernels (NTKs) to investigate the training dynamics of PINNs evolving under stochastic gradient descent with momentum (SGDM).
arXiv Detail & Related papers (2022-06-29T19:03:10Z)
- Toward Physically Realizable Quantum Neural Networks [15.018259942339446]
Current solutions for quantum neural networks (QNNs) face scalability challenges: the exponential state space of QNNs makes training procedures hard to scale.
This paper presents a new model for QNNs that relies on band-limited Fourier expansions of transfer functions of quantum perceptrons.
arXiv Detail & Related papers (2022-03-22T23:03:32Z)
- Exponentially Many Local Minima in Quantum Neural Networks [9.442139459221785]
Quantum Neural Networks (QNNs) are important quantum applications because they hold promises similar to those of classical neural networks.
We conduct a quantitative investigation of the loss landscape of QNNs and identify a class of simple yet extremely hard-to-train QNN instances.
We empirically confirm that our constructions can indeed be hard instances in practice with typical gradient-based optimizers.
arXiv Detail & Related papers (2021-10-06T03:23:44Z)
- Critical Points in Quantum Generative Models [0.0]
We study the clustering of local minima of the loss function near the global minimum.
We give the first proof of this transition in trainability, specializing to this class of quantum generative models.
arXiv Detail & Related papers (2021-09-14T20:34:28Z)
- Characterizing possible failure modes in physics-informed neural networks [55.83255669840384]
Recent work in scientific machine learning has developed so-called physics-informed neural network (PINN) models.
We demonstrate that, while existing PINN methodologies can learn good models for relatively trivial problems, they can easily fail to learn relevant physical phenomena even for simple PDEs.
We show that these possible failure modes are not due to the lack of expressivity in the NN architecture, but that the PINN's setup makes the loss landscape very hard to optimize.
arXiv Detail & Related papers (2021-09-02T16:06:45Z)
- The dilemma of quantum neural networks [63.82713636522488]
We show that quantum neural networks (QNNs) fail to provide any benefit over classical learning models.
QNNs suffer from severely limited effective model capacity, which incurs poor generalization on real-world datasets.
These results force us to rethink the role of current QNNs and to design novel protocols for solving real-world problems with quantum advantages.
arXiv Detail & Related papers (2021-06-09T10:41:47Z)
- TimeSHAP: Explaining Recurrent Models through Sequence Perturbations [3.1498833540989413]
Recurrent neural networks are a standard building block in numerous machine learning domains.
The complex decision-making in these models is seen as a black box, creating a tension between accuracy and interpretability.
In this work, we contribute to filling this gap by presenting TimeSHAP, a model-agnostic recurrent explainer.
arXiv Detail & Related papers (2020-11-30T19:48:57Z)
- Phase Detection with Neural Networks: Interpreting the Black Box [58.720142291102135]
Neural networks (NNs) usually hinder any insight into the reasoning behind their predictions.
We demonstrate how influence functions can unravel the black box of NNs trained to predict the phases of the one-dimensional extended spinless Fermi-Hubbard model at half-filling.
arXiv Detail & Related papers (2020-04-09T17:45:45Z)
- Kernel and Rich Regimes in Overparametrized Models [69.40899443842443]
We show that gradient descent on overparametrized multilayer networks can induce rich implicit biases that are not RKHS norms.
We also demonstrate this kernel-to-rich transition empirically for more complex matrix factorization models and multilayer non-linear networks.
arXiv Detail & Related papers (2020-02-20T15:43:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences arising from its use.