ResQuNNs:Towards Enabling Deep Learning in Quantum Convolution Neural Networks
- URL: http://arxiv.org/abs/2402.09146v5
- Date: Mon, 2 Sep 2024 14:38:01 GMT
- Title: ResQuNNs:Towards Enabling Deep Learning in Quantum Convolution Neural Networks
- Authors: Muhammad Kashif, Muhammad Shafique,
- Abstract summary: We present a novel framework for enhancing the performance of Quanvolutional Neural Networks (QuNNs) by introducing trainable quanvolutional layers.
Our research overcomes this limitation by enabling training within these layers, significantly increasing the flexibility and potential of QuNNs.
We propose a novel architecture, Residual Quanvolutional Neural Networks (ResQuNNs), leveraging the concept of residual learning.
- Score: 4.348591076994875
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we present a novel framework for enhancing the performance of Quanvolutional Neural Networks (QuNNs) by introducing trainable quanvolutional layers and addressing the critical challenges associated with them. Traditional quanvolutional layers, although beneficial for feature extraction, have largely been static, offering limited adaptability. Unlike state-of-the-art, our research overcomes this limitation by enabling training within these layers, significantly increasing the flexibility and potential of QuNNs. However, the introduction of multiple trainable quanvolutional layers induces complexities in gradient-based optimization, primarily due to the difficulty in accessing gradients across these layers. To resolve this, we propose a novel architecture, Residual Quanvolutional Neural Networks (ResQuNNs), leveraging the concept of residual learning, which facilitates the flow of gradients by adding skip connections between layers. By inserting residual blocks between quanvolutional layers, we ensure enhanced gradient access throughout the network, leading to improved training performance. Moreover, we provide empirical evidence on the strategic placement of these residual blocks within QuNNs. Through extensive experimentation, we identify an efficient configuration of residual blocks, which enables gradients across all the layers in the network that eventually results in efficient training. Our findings suggest that the precise location of residual blocks plays a crucial role in maximizing the performance gains in QuNNs. Our results mark a substantial step forward in the evolution of quantum deep learning, offering new avenues for both theoretical development and practical quantum computing applications.
Related papers
- Convexity in ReLU Neural Networks: beyond ICNNs? [17.01649106055384]
We show that every convex function implemented by a 1-hidden-layer ReLU network can be expressed by an ICNN with the same architecture.
We also provide a numerical procedure that allows an exact check of convexity for ReLU neural networks with a large number of affine regions.
arXiv Detail & Related papers (2025-01-06T13:53:59Z) - Residual Kolmogorov-Arnold Network for Enhanced Deep Learning [0.5852077003870417]
We introduce RKAN (Residual Kolmogorov-Arnold Network), which could be easily implemented into stages of traditional networks.
Our proposed RKAN module offers consistent improvements over the base models on various well-known benchmark datasets.
arXiv Detail & Related papers (2024-10-07T21:12:32Z) - Sign Gradient Descent-based Neuronal Dynamics: ANN-to-SNN Conversion Beyond ReLU Network [10.760652747217668]
Spiking neural network (SNN) is studied in multidisciplinary domains to simulate neuro-scientific mechanisms.
The lack of discrete theory obstructs the practical application of SNN by limiting its performance and nonlinearity support.
We present a new optimization-theoretic perspective of the discrete dynamics of spiking neurons.
arXiv Detail & Related papers (2024-07-01T02:09:20Z) - ProAct: Progressive Training for Hybrid Clipped Activation Function to Enhance Resilience of DNNs [0.4660328753262075]
State-of-the-art methods offer either neuron-wise or layer-wise clipping activation functions.
Layer-wise clipped activation functions cannot preserve DNNs resilience at high bit error rates.
We propose a hybrid clipped activation function that integrates neuron-wise and layer-by-layer methods.
arXiv Detail & Related papers (2024-06-10T14:31:38Z) - Low-Rank Learning by Design: the Role of Network Architecture and
Activation Linearity in Gradient Rank Collapse [14.817633094318253]
We study how architectural choices and structure of the data effect gradient rank bounds in deep neural networks (DNNs)
Our theoretical analysis provides these bounds for training fully-connected, recurrent, and convolutional neural networks.
We also demonstrate, both theoretically and empirically, how design choices like activation function linearity, bottleneck layer introduction, convolutional stride, and sequence truncation influence these bounds.
arXiv Detail & Related papers (2024-02-09T19:28:02Z) - Efficient and Flexible Neural Network Training through Layer-wise Feedback Propagation [49.44309457870649]
We present Layer-wise Feedback Propagation (LFP), a novel training principle for neural network-like predictors.
LFP decomposes a reward to individual neurons based on their respective contributions to solving a given task.
Our method then implements a greedy approach reinforcing helpful parts of the network and weakening harmful ones.
arXiv Detail & Related papers (2023-08-23T10:48:28Z) - Solving Large-scale Spatial Problems with Convolutional Neural Networks [88.31876586547848]
We employ transfer learning to improve training efficiency for large-scale spatial problems.
We propose that a convolutional neural network (CNN) can be trained on small windows of signals, but evaluated on arbitrarily large signals with little to no performance degradation.
arXiv Detail & Related papers (2023-06-14T01:24:42Z) - Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Multiobjective Evolutionary Pruning of Deep Neural Networks with
Transfer Learning for improving their Performance and Robustness [15.29595828816055]
This work proposes MO-EvoPruneDeepTL, a multi-objective evolutionary pruning algorithm.
We use Transfer Learning to adapt the last layers of Deep Neural Networks, by replacing them with sparse layers evolved by a genetic algorithm.
Experiments show that our proposal achieves promising results in all the objectives, and direct relation are presented.
arXiv Detail & Related papers (2023-02-20T19:33:38Z) - On Feature Learning in Neural Networks with Global Convergence
Guarantees [49.870593940818715]
We study the optimization of wide neural networks (NNs) via gradient flow (GF)
We show that when the input dimension is no less than the size of the training set, the training loss converges to zero at a linear rate under GF.
We also show empirically that, unlike in the Neural Tangent Kernel (NTK) regime, our multi-layer model exhibits feature learning and can achieve better generalization performance than its NTK counterpart.
arXiv Detail & Related papers (2022-04-22T15:56:43Z) - Statistical Mechanics of Deep Linear Neural Networks: The
Back-Propagating Renormalization Group [4.56877715768796]
We study the statistical mechanics of learning in Deep Linear Neural Networks (DLNNs) in which the input-output function of an individual unit is linear.
We solve exactly the network properties following supervised learning using an equilibrium Gibbs distribution in the weight space.
Our numerical simulations reveal that despite the nonlinearity, the predictions of our theory are largely shared by ReLU networks with modest depth.
arXiv Detail & Related papers (2020-12-07T20:08:31Z) - FactorizeNet: Progressive Depth Factorization for Efficient Network
Architecture Exploration Under Quantization Constraints [93.4221402881609]
We introduce a progressive depth factorization strategy for efficient CNN architecture exploration under quantization constraints.
By algorithmically increasing the granularity of depth factorization in a progressive manner, the proposed strategy enables a fine-grained, low-level analysis of layer-wise distributions.
Such a progressive depth factorization strategy also enables efficient identification of the optimal depth-factorized macroarchitecture design.
arXiv Detail & Related papers (2020-11-30T07:12:26Z) - Progressive Tandem Learning for Pattern Recognition with Deep Spiking
Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z) - Dynamic Hierarchical Mimicking Towards Consistent Optimization
Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
arXiv Detail & Related papers (2020-03-24T09:56:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.