Reducing Inference Energy Consumption Using Dual Complementary CNNs
- URL: http://arxiv.org/abs/2412.01039v2
- Date: Wed, 11 Dec 2024 06:22:34 GMT
- Title: Reducing Inference Energy Consumption Using Dual Complementary CNNs
- Authors: Michail Kinnas, John Violos, Ioannis Kompatsiaris, Symeon Papadopoulos
- Abstract summary: We propose a novel approach to reducing the energy requirements of CNN inference.
We employ two small Complementary CNNs that collaborate with each other by covering each other's "weaknesses" in predictions.
Our experiments on a Jetson Nano computer demonstrate an energy reduction of up to 85.8% achieved on modified datasets where each sample was duplicated once.
- Score: 13.783950035836593
- Abstract: Energy efficiency of Convolutional Neural Networks (CNNs) has become an important area of research, with various strategies being developed to minimize the power consumption of these models. Previous efforts, including techniques like model pruning, quantization, and hardware optimization, have made significant strides in this direction. However, there remains a need for more effective on-device AI solutions that balance energy efficiency with model performance. In this paper, we propose a novel approach to reducing the energy requirements of CNN inference. Our methodology employs two small Complementary CNNs that collaborate by covering each other's "weaknesses" in predictions. If the first CNN's confidence in a prediction is low, the second CNN is invoked with the aim of producing a higher-confidence prediction. This dual-CNN setup significantly reduces energy consumption compared to using a single large deep CNN. Additionally, we propose a memory component that retains previous classifications for identical inputs, bypassing the need to re-invoke the CNNs for the same input and further saving energy. Our experiments on a Jetson Nano computer demonstrate an energy reduction of up to 85.8% on modified datasets where each sample was duplicated once. These findings indicate that leveraging a complementary CNN pair along with a memory component effectively reduces inference energy while maintaining high accuracy.
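The abstract's pipeline can be sketched as follows. This is a minimal illustration of the idea, not the authors' implementation: the two model callables, the confidence threshold, and the byte-hashing scheme for the memory component are all illustrative assumptions.

```python
import hashlib


class DualComplementaryClassifier:
    """Confidence-gated cascade of two small CNNs with a memoization cache.

    Illustrative sketch of the paper's idea: the first (cheap) model runs
    on every new input; the second model is invoked only when the first
    model's confidence falls below a threshold; repeated inputs are
    answered from memory without running either model.
    """

    def __init__(self, first_cnn, complementary_cnn, threshold=0.8):
        self.first_cnn = first_cnn                # cheap first-stage classifier
        self.complementary_cnn = complementary_cnn  # covers the first model's weaknesses
        self.threshold = threshold                # confidence cutoff (assumed value)
        self.memory = {}                          # input hash -> cached label

    def _key(self, raw_input: bytes) -> str:
        # Identify repeated inputs by hashing their raw bytes.
        return hashlib.sha256(raw_input).hexdigest()

    def predict(self, raw_input: bytes):
        key = self._key(raw_input)
        if key in self.memory:
            # Memory component: identical input seen before, skip both CNNs.
            return self.memory[key]
        label, conf = self.first_cnn(raw_input)
        if conf < self.threshold:
            # Low confidence: consult the complementary CNN and keep the
            # more confident of the two predictions.
            label2, conf2 = self.complementary_cnn(raw_input)
            if conf2 > conf:
                label = label2
        self.memory[key] = label
        return label
```

Each model callable is assumed to return a `(label, confidence)` pair; in practice these would wrap small CNNs whose softmax maximum serves as the confidence score. The energy saving comes from the cheap model handling most inputs alone and the cache handling exact duplicates for free.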
Related papers
- OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation [70.17681136234202]
We reexamine the design distinctions and test the limits of what a sparse CNN can achieve.
We propose two key components, i.e., adaptive receptive fields (spatially) and adaptive relation, to bridge the gap.
This exploration led to the creation of Omni-Adaptive 3D CNNs (OA-CNNs), a family of networks that integrates a lightweight module.
arXiv Detail & Related papers (2024-03-21T14:06:38Z)
- Deep Convolutional Neural Networks for Short-Term Multi-Energy Demand Prediction of Integrated Energy Systems [0.0]
This paper develops six novel prediction models based on Convolutional Neural Networks (CNNs) for forecasting multi-energy power consumptions.
The models are applied in a comprehensive manner on a novel integrated electrical, heat and gas network system.
arXiv Detail & Related papers (2023-12-24T14:56:23Z)
- An automated approach for improving the inference latency and energy efficiency of pretrained CNNs by removing irrelevant pixels with focused convolutions [0.8706730566331037]
We propose a novel, automated method to make a pretrained CNN more energy-efficient without re-training.
Our modified focused convolution operation saves inference latency (by up to 25%) and energy costs (by up to 22%) on various popular pretrained CNNs.
arXiv Detail & Related papers (2023-10-11T18:07:37Z)
- The Effects of Partitioning Strategies on Energy Consumption in Distributed CNN Inference at The Edge [0.0]
Many AI applications require Convolutional Neural Network (CNN) inference on a distributed system at the edge.
There are four main partitioning strategies that can be utilized to partition a large CNN model and perform distributed CNN inference on multiple devices at the edge.
In this paper, we investigate and compare the per-device energy consumption of CNN model inference at the edge on a distributed system when the four partitioning strategies are utilized.
arXiv Detail & Related papers (2022-10-15T22:54:02Z)
- BreakingBED -- Breaking Binary and Efficient Deep Neural Networks by Adversarial Attacks [65.2021953284622]
We study robustness of CNNs against white-box and black-box adversarial attacks.
Results are shown for distilled CNNs, agent-based state-of-the-art pruned models, and binarized neural networks.
arXiv Detail & Related papers (2021-03-14T20:43:19Z)
- Rescaling CNN through Learnable Repetition of Network Parameters [2.137666194897132]
We present a novel rescaling strategy for CNNs based on learnable repetition of its parameters.
We show that small base networks when rescaled, can provide performance comparable to deeper networks with as low as 6% of optimization parameters of the deeper one.
arXiv Detail & Related papers (2021-01-14T15:03:25Z)
- Fusion of CNNs and statistical indicators to improve image classification [65.51757376525798]
Convolutional Networks have dominated the field of computer vision for the last ten years.
The main strategy for prolonging this trend relies on further scaling up network size.
We hypothesise that adding heterogeneous sources of information may be more cost-effective for a CNN than building a bigger network.
arXiv Detail & Related papers (2020-12-20T23:24:31Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs now maintain performance with dramatic reduction in parameters and computations.
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
- Exploring Deep Hybrid Tensor-to-Vector Network Architectures for Regression Based Speech Enhancement [53.47564132861866]
We find that a hybrid architecture, namely CNN-TT, is capable of maintaining a good quality performance with a reduced model parameter size.
CNN-TT is composed of several convolutional layers at the bottom for feature extraction to improve speech quality.
arXiv Detail & Related papers (2020-07-25T22:21:05Z)
- PENNI: Pruned Kernel Sharing for Efficient CNN Inference [41.050335599000036]
State-of-the-art (SOTA) CNNs achieve outstanding performance on various tasks.
Their high computation demand and massive number of parameters make it difficult to deploy these SOTA CNNs onto resource-constrained devices.
We propose PENNI, a CNN model compression framework that is able to achieve model compactness and hardware efficiency simultaneously.
arXiv Detail & Related papers (2020-05-14T16:57:41Z)
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantization neural networks (QNNs) are very attractive to the industry because of their extremely cheap calculation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most of existing methods aim to enhance performance of QNNs especially binary neural networks by exploiting more effective training techniques.
We address this problem by projecting features in original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.