QIANets: Quantum-Integrated Adaptive Networks for Reduced Latency and Improved Inference Times in CNN Models
- URL: http://arxiv.org/abs/2410.10318v1
- Date: Mon, 14 Oct 2024 09:24:48 GMT
- Title: QIANets: Quantum-Integrated Adaptive Networks for Reduced Latency and Improved Inference Times in CNN Models
- Authors: Zhumazhan Balapanov, Edward Magongo, Vanessa Matvei, Olivia Holmberg, Jonathan Pei, Kevin Zhu
- Abstract summary: Convolutional neural networks (CNNs) have made significant advances in computer vision tasks, yet their high inference times and latency limit real-world applicability.
We introduce QIANets: a novel approach to redesigning the traditional GoogLeNet, DenseNet, and ResNet-18 model architectures to process more parameters and computations whilst maintaining low inference times.
Despite experimental limitations, the method was tested and evaluated, demonstrating reductions in inference times while effectively preserving accuracy.
- Score: 2.6663666678221376
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Convolutional neural networks (CNNs) have made significant advances in computer vision tasks, yet their high inference times and latency often limit real-world applicability. While model compression techniques have gained popularity as solutions, they often overlook the critical balance between low latency and uncompromised accuracy. By harnessing quantum-inspired pruning, tensor decomposition, and annealing-based matrix factorization - three quantum-inspired concepts - we introduce QIANets: a novel approach to redesigning the traditional GoogLeNet, DenseNet, and ResNet-18 model architectures to process more parameters and computations whilst maintaining low inference times. Despite experimental limitations, the method was tested and evaluated, demonstrating reductions in inference times while effectively preserving accuracy.
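As a rough illustration of the quantum-inspired pruning component, the sketch below treats the squared, normalized weight magnitudes of each convolution as measurement probabilities and zeroes out the least probable weights of a ResNet-18. The keep ratio, the thresholding rule, and the use of torchvision's resnet18 are illustrative assumptions, not the authors' exact procedure.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

def quantum_inspired_prune(conv: nn.Conv2d, keep_ratio: float = 0.7) -> None:
    """Zero out the weights with the lowest 'measurement probability'.

    The squared, normalized weight magnitudes are read as probability
    amplitudes; only the `keep_ratio` most probable weights survive.
    (Hypothetical reading of quantum-inspired pruning, not the paper's exact rule.)
    """
    w = conv.weight.data
    probs = w.pow(2) / w.pow(2).sum()                      # "probability" of each weight
    k = max(1, int(probs.numel() * keep_ratio))            # number of weights to keep
    threshold = probs.flatten().kthvalue(probs.numel() - k + 1).values
    conv.weight.data.mul_((probs >= threshold).float())    # prune the rest

# ResNet-18 is one of the three architectures the paper redesigns.
model = resnet18()
for m in model.modules():
    if isinstance(m, nn.Conv2d):
        quantum_inspired_prune(m, keep_ratio=0.7)
```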
Related papers
- CTRQNets & LQNets: Continuous Time Recurrent and Liquid Quantum Neural Networks [76.53016529061821]
The Liquid Quantum Neural Network (LQNet) and the Continuous Time Recurrent Quantum Neural Network (CTRQNet) are developed.
LQNet and CTRQNet achieve accuracy increases as high as 40% on CIFAR-10 through binary classification.
arXiv Detail & Related papers (2024-08-28T00:56:03Z)
- Learning to Program Variational Quantum Circuits with Fast Weights [3.6881738506505988]
This paper introduces the Quantum Fast Weight Programmers (QFWP) as a solution to the temporal or sequential learning challenge.
The proposed QFWP model achieves learning of temporal dependencies without necessitating the use of quantum recurrent neural networks.
Numerical simulations conducted in this study showcase the efficacy of the proposed QFWP model in both time-series prediction and RL tasks.
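The sketch below illustrates the fast-weights idea in purely classical terms: a slow network programs additive updates to a fast weight matrix at every time step, so temporal information is carried without recurrence. In QFWP the slow programmer is a variational quantum circuit; the MLP, dimensions, and update rule here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FastWeightProgrammer(nn.Module):
    """Classical sketch of the fast-weights idea behind QFWP.

    A 'slow' network reads the current input and emits an additive update for
    the 'fast' weight matrix, which then maps the input to the output. In QFWP
    the slow programmer is a variational quantum circuit; an ordinary MLP is
    used here purely for illustration.
    """

    def __init__(self, in_dim: int, out_dim: int, hidden: int = 32):
        super().__init__()
        self.slow = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, in_dim * out_dim),
        )
        self.register_buffer("fast_w", torch.zeros(out_dim, in_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The slow network programs an additive update to the fast weights.
        delta = self.slow(x).view(-1, self.fast_w.shape[0], self.fast_w.shape[1])
        self.fast_w = self.fast_w + delta.mean(dim=0)
        return x @ self.fast_w.t()

seq = torch.randn(10, 1, 4)                 # toy length-10 sequence of 4-dim inputs
model = FastWeightProgrammer(in_dim=4, out_dim=2)
outputs = [model(step) for step in seq]     # fast weights evolve across time steps
```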
arXiv Detail & Related papers (2024-02-27T18:53:18Z)
- RefreshNet: Learning Multiscale Dynamics through Hierarchical Refreshing [0.0]
"refreshing" mechanism in RefreshNet allows coarser blocks to reset inputs of finer blocks, effectively controlling and alleviating error accumulation.
"refreshing" mechanism in RefreshNet allows coarser blocks to reset inputs of finer blocks, effectively controlling and alleviating error accumulation.
arXiv Detail & Related papers (2024-01-24T07:47:01Z)
- Accelerating Deep Neural Networks via Semi-Structured Activation Sparsity [0.0]
Exploiting sparsity in the network's feature maps is one of the ways to reduce its inference latency.
We propose a solution to induce semi-structured activation sparsity exploitable through minor runtime modifications.
Our approach yields a speed improvement of 1.25x with a minimal accuracy drop of 1.1% for the ResNet18 model on the ImageNet dataset.
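The sketch below shows the kind of n:m semi-structured pattern such runtime kernels can exploit, applied post hoc to a conv block's activations. The paper instead induces the sparsity during training, so the block, the 2:4 pattern, and the top-k rule here are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def nm_activation_sparsity(x: torch.Tensor, n: int = 2, m: int = 4) -> torch.Tensor:
    """Keep only the n largest-magnitude values in every group of m activations.

    A generic n:m semi-structured sparsification of the feature map; not the
    paper's training-time induction method. Assumes x.numel() is divisible by m.
    """
    flat = x.reshape(-1, m)                                  # group activations in blocks of m
    idx = flat.abs().topk(n, dim=1).indices                  # n largest entries per block
    mask = torch.zeros_like(flat).scatter_(1, idx, 1.0)
    return (flat * mask).reshape_as(x)

class SparseActBlock(nn.Module):
    """Conv -> ReLU -> semi-structured activation sparsity."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, 3, padding=1)

    def forward(self, x):
        return nm_activation_sparsity(F.relu(self.conv(x)))

x = torch.randn(1, 64, 32, 32)
y = SparseActBlock(64, 64)(x)        # at most 2 of every 4 activations are non-zero
```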
arXiv Detail & Related papers (2023-09-12T22:28:53Z)
- Quantum Neural Network for Quantum Neural Computing [0.0]
We propose a new quantum neural network model for quantum neural computing.
Our model circumvents the problem that the state-space size grows exponentially with the number of neurons.
We benchmark our model for handwritten digit recognition and other nonlinear classification tasks.
arXiv Detail & Related papers (2023-05-15T11:16:47Z)
- Quantization-aware Interval Bound Propagation for Training Certifiably Robust Quantized Neural Networks [58.195261590442406]
We study the problem of training and certifying adversarially robust quantized neural networks (QNNs).
Recent work has shown that floating-point neural networks that have been verified to be robust can become vulnerable to adversarial attacks after quantization.
We present quantization-aware interval bound propagation (QA-IBP), a novel method for training robust QNNs.
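The sketch below propagates an input interval through a single linear layer whose weights have been fake-quantized first, as a simplified stand-in for the QA-IBP training procedure; the symmetric 8-bit quantizer and the layer shapes are assumptions, while the interval arithmetic itself is standard IBP.

```python
import torch

def fake_quantize(w: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Symmetric uniform fake-quantization of the weights (illustrative)."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max() / qmax
    return torch.round(w / scale).clamp(-qmax, qmax) * scale

def ibp_linear(lower, upper, weight, bias, num_bits=8):
    """Propagate the interval [lower, upper] through a quantized linear layer.

    Standard interval bound propagation: the interval is represented by its
    center and radius, and the radius is pushed through |W|.
    """
    w_q = fake_quantize(weight, num_bits)
    center = (upper + lower) / 2
    radius = (upper - lower) / 2
    out_center = center @ w_q.t() + bias
    out_radius = radius @ w_q.abs().t()
    return out_center - out_radius, out_center + out_radius

x = torch.randn(1, 16)
eps = 0.1                                     # L-infinity perturbation budget
w, b = torch.randn(8, 16), torch.zeros(8)
lo, hi = ibp_linear(x - eps, x + eps, w, b)   # certified output bounds under quantization
```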
arXiv Detail & Related papers (2022-11-29T13:32:38Z)
- QuanGCN: Noise-Adaptive Training for Robust Quantum Graph Convolutional Networks [124.7972093110732]
We propose quantum graph convolutional networks (QuanGCN), which learn the local message passing among nodes with a sequence of crossing-gate quantum operations.
To mitigate the inherent noise of modern quantum devices, we apply a sparse constraint to sparsify the nodes' connections.
Our QuanGCN is functionally comparable to, or even superior to, the classical algorithms on several benchmark graph datasets.
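The sketch below models only the sparse-constraint part, in classical terms: learnable edge scores with an L1 penalty that pushes most connections toward zero. The crossing-gate quantum message passing itself is not modeled, and all names and hyperparameters are illustrative.

```python
import torch

def sparsified_adjacency(adj_logits: torch.Tensor, l1_weight: float = 1e-3):
    """Classical sketch of a sparse constraint on node connections.

    Learnable edge scores are squashed to [0, 1] and penalized with an L1 term,
    encouraging most connections to vanish so fewer (noisy) operations remain.
    """
    edge_weights = torch.sigmoid(adj_logits)           # soft adjacency in [0, 1]
    l1_penalty = l1_weight * edge_weights.abs().sum()  # sparsity-inducing regularizer
    return edge_weights, l1_penalty

# Toy usage: 5 nodes with learnable dense adjacency scores.
adj_logits = torch.nn.Parameter(torch.randn(5, 5))
edge_weights, penalty = sparsified_adjacency(adj_logits)
# `penalty` would be added to the task loss during training.
```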
arXiv Detail & Related papers (2022-11-09T21:43:16Z)
- The dilemma of quantum neural networks [63.82713636522488]
We show that quantum neural networks (QNNs) fail to provide any benefit over classical learning models.
QNNs suffer from severely limited effective model capacity, which incurs poor generalization on real-world datasets.
These results force us to rethink the role of current QNNs and to design novel protocols for solving real-world problems with quantum advantages.
arXiv Detail & Related papers (2021-06-09T10:41:47Z)
- A Survey of Quantization Methods for Efficient Neural Network Inference [75.55159744950859]
Quantization is the problem of distributing continuous real-valued numbers over a fixed discrete set of numbers to minimize the number of bits required.
It has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas.
Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16.
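A minimal sketch of the uniform affine quantization the survey covers: real-valued tensors are mapped to low-bit integer codes via a scale and zero-point, and recovered approximately by dequantization. The 4-bit setting and NumPy helper functions here are illustrative choices.

```python
import numpy as np

def affine_quantize(x: np.ndarray, num_bits: int = 4):
    """Uniform affine quantization: map real values onto 2**num_bits integer levels.

    Returns integer codes plus the (scale, zero_point) needed to dequantize.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = int(np.round(qmin - x.min() / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Approximate reconstruction of the original real values."""
    return (q.astype(np.float32) - zero_point) * scale

x = np.random.randn(1000).astype(np.float32)
q, s, zp = affine_quantize(x, num_bits=4)
x_hat = dequantize(q, s, zp)              # low-precision approximation of x
```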
arXiv Detail & Related papers (2021-03-25T06:57:11Z)
- Accelerating Deep Learning Inference via Learned Caches [11.617579969991294]
Deep Neural Networks (DNNs) are witnessing increased adoption in multiple domains owing to their high accuracy in solving real-world problems.
Current low-latency solutions trade off accuracy or fail to exploit the inherent temporal locality in prediction serving workloads.
We present the design of GATI, an end-to-end prediction serving system that incorporates learned caches for low-latency inference.
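The sketch below is a toy stand-in for a cache at an intermediate layer: past feature vectors are stored alongside their final predictions, and a sufficiently similar new request returns the cached prediction without running the remaining layers. GATI learns this lookup; the cosine-similarity threshold and class names here are assumptions.

```python
import torch
import torch.nn.functional as F

class LearnedCache:
    """Toy stand-in for a cache keyed on intermediate DNN features.

    If a new request's intermediate features are close enough to a cached
    entry, the cached prediction is returned and the rest of the network is
    skipped. GATI learns this lookup; a cosine-similarity threshold is used
    here purely for illustration.
    """

    def __init__(self, threshold: float = 0.98):
        self.threshold = threshold
        self.keys, self.values = [], []

    def lookup(self, feat: torch.Tensor):
        if not self.keys:
            return None
        sims = F.cosine_similarity(torch.stack(self.keys), feat.unsqueeze(0))
        best = int(torch.argmax(sims))
        return self.values[best] if sims[best] >= self.threshold else None

    def insert(self, feat: torch.Tensor, prediction: torch.Tensor):
        self.keys.append(feat.detach())
        self.values.append(prediction.detach())

cache = LearnedCache()
feat = torch.randn(128)                       # intermediate features of a request
hit = cache.lookup(feat)                      # None -> run the remaining layers
cache.insert(feat, torch.tensor([0.1, 0.9]))  # store the full model's prediction
```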
arXiv Detail & Related papers (2021-01-18T22:13:08Z)
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantized neural networks (QNNs) are very attractive to industry because of their extremely cheap computation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features of the original full-precision networks into high-dimensional quantization features.
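The sketch below captures the widening-and-squeezing shape of that idea: features are projected into a wider space, quantized there, and squeezed back to the original width. The 4x widening factor, 1x1 convolutions, and binary quantizer are illustrative choices, not the paper's exact settings.

```python
import torch
import torch.nn as nn

class WidenQuantizeSqueeze(nn.Module):
    """Rough sketch of the widening-and-squeezing idea.

    Full-precision features are 'widened' into a higher-dimensional space,
    quantized there (1-bit sign for simplicity), and 'squeezed' back to the
    original width.
    """

    def __init__(self, channels: int, widen: int = 4):
        super().__init__()
        self.widen = nn.Conv2d(channels, channels * widen, kernel_size=1)
        self.squeeze = nn.Conv2d(channels * widen, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        wide = self.widen(x)                 # project to the high-dimensional space
        quantized = torch.sign(wide)         # 1-bit quantization of the wide features
        return self.squeeze(quantized)       # squeeze back to the original width

x = torch.randn(1, 64, 8, 8)
y = WidenQuantizeSqueeze(64)(x)              # same shape as x, built from quantized features
```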
arXiv Detail & Related papers (2020-02-03T04:11:13Z)