DietCNN: Multiplication-free Inference for Quantized CNNs
- URL: http://arxiv.org/abs/2305.05274v2
- Date: Thu, 17 Aug 2023 13:10:41 GMT
- Title: DietCNN: Multiplication-free Inference for Quantized CNNs
- Authors: Swarnava Dey and Pallab Dasgupta and Partha P Chakrabarti
- Abstract summary: This paper proposes a new method for replacing multiplications in a CNN by table look-ups.
It is shown that the proposed multiplication-free CNN, based on a single activation codebook, can achieve 4.7x, 5.6x, and 3.5x reduction in energy per inference on MNIST-LeNet-5, CIFAR10-VGG-11, and Tiny ImageNet-ResNet-18, respectively.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rising demand for networked embedded systems with machine intelligence
has been a catalyst for sustained attempts by the research community to
implement Convolutional Neural Network (CNN) based inference on resource-limited
embedded devices. Redesigning a CNN by removing costly multiplication
operations has already shown promising results in terms of reducing inference
energy usage. This paper proposes a new method for replacing multiplications in
a CNN by table look-ups. Unlike existing methods that completely modify the CNN
operations, the proposed methodology preserves the semantics of the major CNN
operations. Conforming to the existing mechanism of the CNN layer operations
ensures that the reliability of a standard CNN is preserved. It is shown that
the proposed multiplication-free CNN, based on a single activation codebook,
can achieve 4.7x, 5.6x, and 3.5x reduction in energy per inference in an FPGA
implementation of MNIST-LeNet-5, CIFAR10-VGG-11, and Tiny ImageNet-ResNet-18
respectively. Our results show that the DietCNN approach significantly improves
the resource consumption and latency of deep inference for smaller models,
often used in embedded systems. Our code is available at:
https://github.com/swadeykgp/DietCNN
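As a rough illustration of the table look-up idea described in the abstract, the sketch below quantizes activations and weights to small codebooks and replaces each product with an indexed read from a precomputed table. The codebook sizes, the quantile-based codebook construction, and all function names are our own assumptions, not the paper's actual design.

```python
import numpy as np

# Hypothetical sketch: replace products with reads from a precomputed
# look-up table (LUT) indexed by (activation symbol, weight symbol).
# Codebook sizes and the quantile-based construction are assumptions.

def build_codebook(samples, n_symbols):
    """Pick n_symbols representative values (uniform quantiles here)."""
    return np.quantile(samples, np.linspace(0.0, 1.0, n_symbols))

def symbolize(x, codebook):
    """Map each value to the index of its nearest codebook entry."""
    return np.abs(x[..., None] - codebook).argmin(axis=-1)

# Built offline, once: lut[a, w] = act_cb[a] * wgt_cb[w]
act_cb = build_codebook(np.random.randn(10_000), 16)
wgt_cb = build_codebook(np.random.randn(10_000), 16)
lut = np.outer(act_cb, wgt_cb)                 # 16 x 16 product table

def lut_dot(act_syms, wgt_syms):
    """Dot product via table look-ups and additions only."""
    return lut[act_syms, wgt_syms].sum()

x = np.random.randn(25)                        # a flattened conv patch
w = np.random.randn(25)                        # a flattened filter
approx = lut_dot(symbolize(x, act_cb), symbolize(w, wgt_cb))
print(f"LUT: {approx:.3f}  exact: {x @ w:.3f}")
```

The paper's design keeps a single activation codebook across the network and preserves the semantics of the standard layer operations; the sketch only shows why a symbols-by-symbols table removes multiplications from the inner loop.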
Related papers
- OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation [70.17681136234202]
We reexamine the design distinctions and test the limits of what a sparse CNN can achieve.
We propose two key components, i.e., adaptive receptive fields (spatially) and adaptive relation, to bridge the gap.
This exploration led to the creation of Omni-Adaptive 3D CNNs (OA-CNNs), a family of networks that integrates a lightweight module.
arXiv Detail & Related papers (2024-03-21T14:06:38Z)
- Dynamic Semantic Compression for CNN Inference in Multi-access Edge Computing: A Graph Reinforcement Learning-based Autoencoder [82.8833476520429]
We propose a novel semantic compression method, autoencoder-based CNN architecture (AECNN) for effective semantic extraction and compression in partial offloading.
In the semantic encoder, we introduce a feature compression module based on the channel attention mechanism in CNNs, to compress intermediate data by selecting the most informative features.
In the semantic decoder, we design a lightweight decoder to reconstruct the intermediate data through learning from the received compressed data to improve accuracy.
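A minimal sketch of what a channel-attention-based feature compressor might look like at the split point; the squeeze-and-excitation-style gate, the keep ratio, and all names are our assumptions rather than the AECNN design.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of attention-guided channel selection at the
# split point; the SE-style gate and keep ratio are assumptions.
class ChannelSelector(nn.Module):
    def __init__(self, channels, keep_ratio=0.25):
        super().__init__()
        self.keep = max(1, int(channels * keep_ratio))
        # Squeeze-and-excitation style gate: one score per channel.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels), nn.Sigmoid())

    def forward(self, x):                      # x: (B, C, H, W)
        scores = self.gate(x)                  # (B, C) channel importance
        idx = scores.topk(self.keep, dim=1).indices
        batch = torch.arange(x.size(0)).unsqueeze(1)
        # Transmit only the selected channels plus their indices.
        return x[batch, idx], idx

feat = torch.randn(2, 64, 8, 8)
compressed, idx = ChannelSelector(64)(feat)
print(compressed.shape)                        # torch.Size([2, 16, 8, 8])
```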
arXiv Detail & Related papers (2024-01-19T15:19:47Z)
- An Efficient Evolutionary Deep Learning Framework Based on Multi-source Transfer Learning to Evolve Deep Convolutional Neural Networks [8.40112153818812]
Convolutional neural networks (CNNs) have steadily achieved better performance over the years by introducing more complex topologies and enlarging capacity towards deeper and wider networks.
The computational cost is still the bottleneck of automatically designing CNNs.
In this paper, inspired by transfer learning, a new evolutionary computation based framework is proposed to efficiently evolve CNNs.
arXiv Detail & Related papers (2022-12-07T20:22:58Z)
- Attention-based Feature Compression for CNN Inference Offloading in Edge Computing [93.67044879636093]
This paper studies the computational offloading of CNN inference in device-edge co-inference systems.
We propose a novel autoencoder-based CNN architecture (AECNN) for effective feature extraction at end-device.
Experiments show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss.
arXiv Detail & Related papers (2022-11-24T18:10:01Z)
- AutoDiCE: Fully Automated Distributed CNN Inference at the Edge [0.9883261192383613]
We propose a novel framework, called AutoDiCE, for automated splitting of a CNN model into a set of sub-models.
Our experimental results show that AutoDiCE can deliver distributed CNN inference with reduced energy consumption and memory usage per edge device.
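The core splitting step might look like the following sketch, where a sequential CNN is partitioned into sub-models that run on different edge devices; AutoDiCE automates the choice of split points, which we hard-code here for illustration.

```python
import torch
import torch.nn as nn

# Minimal sketch of splitting a sequential CNN into per-device
# sub-models; AutoDiCE automates the split-point choice that is
# hard-coded here for illustration.
layers = [
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10),
]
split_at = 4                                   # assumed split point
device_a = nn.Sequential(*layers[:split_at])   # runs on edge device A
device_b = nn.Sequential(*layers[split_at:])   # runs on edge device B

x = torch.randn(1, 3, 32, 32)
out = device_b(device_a(x))  # only the intermediate tensor crosses the network
print(out.shape)             # torch.Size([1, 10])
```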
arXiv Detail & Related papers (2022-07-20T15:08:52Z)
- Recursive Least Squares for Training and Pruning Convolutional Neural Networks [27.089496826735672]
Convolutional neural networks (CNNs) have succeeded in many practical applications.
High computation and storage requirements make them difficult to deploy on resource-constrained devices.
We propose a novel algorithm for training and pruning CNNs.
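For reference, the textbook recursive least squares (RLS) recursion the title refers to is sketched below for a plain linear model; how the paper embeds it into CNN training and pruning is not captured here.

```python
import numpy as np

# Textbook recursive least squares (RLS) for a linear model y ~ w.x;
# how the paper adapts this to CNN training/pruning is not shown here.
def rls_step(w, P, x, y, lam=0.99):
    """One RLS update with forgetting factor lam."""
    Px = P @ x
    k = Px / (lam + x @ Px)          # gain vector
    e = y - w @ x                    # a priori prediction error
    w = w + k * e
    P = (P - np.outer(k, Px)) / lam  # inverse-covariance recursion
    return w, P

rng = np.random.default_rng(0)
true_w = rng.standard_normal(4)
w, P = np.zeros(4), np.eye(4) * 100.0
for _ in range(200):
    x = rng.standard_normal(4)
    w, P = rls_step(w, P, x, true_w @ x)
print(np.round(w - true_w, 4))       # ~0: recovered the true weights
```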
arXiv Detail & Related papers (2022-01-13T07:14:08Z)
- Exploiting Hybrid Models of Tensor-Train Networks for Spoken Command Recognition [9.262289183808035]
This work aims to design a low complexity spoken command recognition (SCR) system.
We exploit a deep hybrid architecture of a tensor-train (TT) network to build an end-to-end SCR pipeline.
Our proposed CNN+(TT-DNN) model attains a competitive accuracy of 96.31% with 4 times fewer model parameters than the CNN model.
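A toy two-core tensor-train linear layer, sketched below, shows where the parameter savings come from; the core shapes and TT-rank are our assumptions, and the paper's hybrid CNN+(TT-DNN) pipeline is considerably deeper.

```python
import numpy as np

# Toy two-core tensor-train (TT) linear layer; shapes and TT-rank are
# assumptions, and the paper's CNN+(TT-DNN) pipeline is deeper.
m1, m2, n1, n2, r = 8, 8, 4, 4, 3      # input 64 -> output 16, rank 3
G1 = np.random.randn(m1, n1, r) * 0.1  # first TT core
G2 = np.random.randn(r, m2, n2) * 0.1  # second TT core

def tt_linear(x):
    """Apply the TT-factored weight to a length m1*m2 input vector."""
    X = x.reshape(m1, m2)
    # Contract with both cores instead of a dense (64, 16) matrix.
    return np.einsum("ab,acr,rbd->cd", X, G1, G2).reshape(n1 * n2)

print("TT params:", G1.size + G2.size,           # 192
      "dense params:", (m1 * m2) * (n1 * n2))    # 1024
print(tt_linear(np.random.randn(m1 * m2)).shape)  # (16,)
```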
arXiv Detail & Related papers (2022-01-11T05:57:38Z)
- Continual 3D Convolutional Neural Networks for Real-time Processing of Videos [93.73198973454944]
We introduce Continual 3D Convolutional Neural Networks (Co3D CNNs).
Co3D CNNs process videos frame-by-frame rather than clip-by-clip.
We show that Co3D CNNs initialised on the weights from preexisting state-of-the-art video recognition models reduce floating point operations for frame-wise computations by 10.0-12.4x while improving accuracy on Kinetics-400 by 2.3-3.8%.
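The frame-by-frame idea can be sketched with a one-dimensional temporal convolution that caches the last k frames instead of recomputing whole clips; the kernel size and frame shapes here are placeholders, not the Co3D design.

```python
import numpy as np
from collections import deque

# Sketch of the continual idea: a temporal convolution computed
# frame-by-frame by caching the last k frames instead of recomputing
# whole clips. Kernel size and frame shapes are placeholders.
k = 3
kernel = np.random.randn(k)                  # temporal kernel weights

class ContinualTemporalConv:
    def __init__(self, kernel):
        self.kernel = kernel
        self.buf = deque(maxlen=len(kernel))  # cache of recent frames

    def step(self, frame):
        """Push one frame; emit an output once the buffer is full."""
        self.buf.append(frame)
        if len(self.buf) < len(self.kernel):
            return None                       # warm-up period
        return sum(w * f for w, f in zip(self.kernel, self.buf))

conv = ContinualTemporalConv(kernel)
frames = [np.random.randn(4, 4) for _ in range(6)]
clip = np.stack(frames)                       # clip-based reference
for t, f in enumerate(frames):
    y = conv.step(f)
    if y is not None:                         # compare with clip conv
        ref = np.tensordot(kernel, clip[t - k + 1 : t + 1], axes=1)
        assert np.allclose(y, ref)
print("frame-wise outputs match the clip-based temporal convolution")
```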
arXiv Detail & Related papers (2021-05-31T18:30:52Z)
- BreakingBED -- Breaking Binary and Efficient Deep Neural Networks by Adversarial Attacks [65.2021953284622]
We study robustness of CNNs against white-box and black-box adversarial attacks.
Results are shown for distilled CNNs, agent-based state-of-the-art pruned models, and binarized neural networks.
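As one example of the white-box attacks such robustness studies typically include, the fast gradient sign method (FGSM) is sketched below; the toy model and epsilon are placeholders, and we do not claim this matches the paper's exact attack suite.

```python
import torch
import torch.nn as nn

# Fast gradient sign method (FGSM), a standard white-box attack; the
# toy model and epsilon are placeholders, not the paper's exact setup.
def fgsm(model, x, y, eps=0.03):
    """Perturb x by eps in the direction that increases the loss."""
    x = x.clone().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).detach().clamp(0, 1)

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x = torch.rand(4, 3, 32, 32)                 # toy batch of images
y = torch.randint(0, 10, (4,))
x_adv = fgsm(model, x, y)
print((x_adv - x).abs().max())               # perturbation bounded by eps
```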
arXiv Detail & Related papers (2021-03-14T20:43:19Z)
- MGIC: Multigrid-in-Channels Neural Network Architectures [8.459177309094688]
We present a multigrid-in-channels approach that tackles the quadratic growth of the number of parameters with respect to the number of channels in standard convolutional neural networks (CNNs).
Our approach addresses the redundancy in CNNs that is also exposed by the recent success of lightweight CNNs.
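The quadratic growth is easy to see by counting weights: a standard convolution couples every input channel to every output channel. Grouping, one simple way to break that coupling (related in spirit to, but much simpler than, the multigrid-in-channels hierarchy), cuts the count by the group factor, as the sketch below shows.

```python
import torch.nn as nn

# Parameter counting: a standard conv couples every input channel to
# every output channel (quadratic in channels); grouping cuts that by
# the group count. The multigrid-in-channels hierarchy is richer than
# this single grouped layer.
c, k = 256, 3
dense = nn.Conv2d(c, c, k, padding=1)               # c*c*k*k weights
grouped = nn.Conv2d(c, c, k, padding=1, groups=8)   # c*(c/8)*k*k weights

def n_params(m):
    return sum(p.numel() for p in m.parameters())

print(n_params(dense), n_params(grouped))  # 590080 vs 73984 (incl. biases)
```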
arXiv Detail & Related papers (2020-11-17T11:29:10Z)
- Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
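For context, the classical minimax rate for estimating a β-Hölder function in d dimensions under squared error, which results of this type match up to possible log factors, is the following; whether the paper's exact statement carries additional factors or constraints is not captured here.

```latex
% Classical minimax rate for estimating a \beta-Hölder function on
% [0,1]^d from n samples under squared L2 loss (Stone-type rate);
% the paper's exact statement may add log factors or constraints.
\inf_{\hat f}\ \sup_{f \in \mathcal{H}^{\beta}([0,1]^d)}
  \mathbb{E}\,\lVert \hat f - f \rVert_2^2 \;\asymp\; n^{-\frac{2\beta}{2\beta+d}}
```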
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.