Multi-Complexity-Loss DNAS for Energy-Efficient and Memory-Constrained
Deep Neural Networks
- URL: http://arxiv.org/abs/2206.00302v1
- Date: Wed, 1 Jun 2022 08:04:50 GMT
- Title: Multi-Complexity-Loss DNAS for Energy-Efficient and Memory-Constrained
Deep Neural Networks
- Authors: Matteo Risso, Alessio Burrello, Luca Benini, Enrico Macii, Massimo
Poncino, Daniele Jahier Pagliari
- Abstract summary: Energy and memory are rarely considered simultaneously, in particular by low-search-cost Differentiable NAS (DNAS) solutions.
We propose the first DNAS that directly addresses the most realistic scenario from a designer's perspective.
Our networks span a range of 2.18x in energy consumption and 4.04% in accuracy for the same memory constraint, and reduce energy by up to 2.2x with negligible accuracy drop with respect to the baseline.
- Score: 22.40937602825472
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural Architecture Search (NAS) is increasingly popular to automatically
explore the accuracy versus computational complexity trade-off of Deep Learning
(DL) architectures. When targeting tiny edge devices, the main challenge for DL
deployment is matching the tight memory constraints, hence most NAS algorithms
consider model size as the complexity metric. Other methods reduce the energy
or latency of DL models by trading off accuracy and number of inference
operations. Energy and memory are rarely considered simultaneously, in
particular by low-search-cost Differentiable NAS (DNAS) solutions. We overcome
this limitation by proposing the first DNAS that directly addresses the most
realistic scenario from a designer's perspective: the co-optimization of
accuracy and energy (or latency) under a memory constraint, determined by the
target HW. We do so by combining two complexity-dependent loss functions during
training, with independent strengths. Testing on three edge-relevant tasks from
the MLPerf Tiny benchmark suite, we obtain rich Pareto sets of architectures in
the energy vs. accuracy space, with memory footprint constraints spanning from
75% to 6.25% of the baseline networks. When deployed on a commercial edge
device, the STM NUCLEO-H743ZI2, our networks span a range of 2.18x in energy
consumption and 4.04% in accuracy for the same memory constraint, and reduce
energy by up to 2.2x with negligible accuracy drop with respect to the
baseline.
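The core idea is easiest to see as a training objective: the task loss is combined with an energy (or latency) proxy whose strength can be tuned freely, plus a memory term that only penalizes architectures exceeding the target HW budget. Below is a minimal PyTorch-style sketch of that idea; the gate-based layer, the proxies, and all names (SearchableConv, ENERGY_STRENGTH, MEMORY_BUDGET) are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SearchableConv(nn.Module):
    """Conv1d whose output channels are softly masked by trainable NAS gates,
    so model size and ops become differentiable functions of the gates."""
    def __init__(self, c_in, c_out, k=3):
        super().__init__()
        self.conv = nn.Conv1d(c_in, c_out, k, padding=k // 2)
        self.alpha = nn.Parameter(torch.zeros(c_out))  # one gate per channel

    def mask(self):
        return torch.sigmoid(self.alpha)               # soft keep-probability

    def forward(self, x):
        return self.conv(x) * self.mask().view(1, -1, 1)

    def size_proxy(self):                              # ~ weight memory kept
        return self.mask().sum() * self.conv.weight[0].numel()

    def ops_proxy(self, seq_len):                      # ~ MACs, energy stand-in
        return self.size_proxy() * seq_len

layer, head = SearchableConv(8, 32), nn.Linear(32, 10)
opt = torch.optim.Adam(list(layer.parameters()) + list(head.parameters()))

x, y = torch.randn(16, 8, 64), torch.randint(0, 10, (16,))
task_loss = F.cross_entropy(head(layer(x).mean(dim=-1)), y)

ENERGY_STRENGTH = 1e-6                    # independent strength no. 1
MEMORY_STRENGTH = 1e-4                    # independent strength no. 2
MEMORY_BUDGET = 0.25 * 32 * (8 * 3)       # e.g. 25% of the full weight count

energy_loss = layer.ops_proxy(seq_len=64)                 # push ops/energy down
memory_loss = F.relu(layer.size_proxy() - MEMORY_BUDGET)  # only above budget

loss = task_loss + ENERGY_STRENGTH * energy_loss + MEMORY_STRENGTH * memory_loss
opt.zero_grad(); loss.backward(); opt.step()
```

Because the two strengths are independent, sweeping ENERGY_STRENGTH while holding the memory term fixed traces a Pareto set in the energy vs. accuracy space under a constant footprint cap, which matches the experimental setup the abstract describes.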
Related papers
- DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions [121.05720140641189]
We develop a family of models with distilling neural architecture (DNA) techniques.
Our proposed DNA models can rate all architecture candidates, as opposed to previous works that can only access a sub-search space using heuristic algorithms.
Our models achieve state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile convolutional network and a small vision transformer, respectively.
arXiv Detail & Related papers (2024-03-02T22:16:47Z)
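The block-wise supervision behind the DNA family can be sketched concisely: a pretrained teacher is split into blocks, and each student block is trained (and later rated) by how well it reproduces the corresponding teacher block's output given the teacher's input features. A hedged PyTorch illustration, with toy blocks standing in for the real search space:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Teacher split into blocks; student candidates mimic each block independently.
teacher_blocks = nn.ModuleList([nn.Conv2d(3, 16, 3, padding=1),
                                nn.Conv2d(16, 32, 3, padding=1)])
student_blocks = nn.ModuleList([nn.Conv2d(3, 16, 5, padding=2),   # candidate op
                                nn.Conv2d(16, 32, 5, padding=2)])

x = torch.randn(4, 3, 32, 32)
distill_loss, inp = 0.0, x
for t_blk, s_blk in zip(teacher_blocks, student_blocks):
    with torch.no_grad():
        target = t_blk(inp)            # teacher supervises this block only
    pred = s_blk(inp)                  # student starts from teacher features too
    distill_loss = distill_loss + F.mse_loss(pred, target)
    inp = target                       # next block continues from the teacher
distill_loss.backward()
```

The per-block MSE doubles as the rating signal: candidates with low mimicking error in every block are ranked highly without training each full network from scratch.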
- Enhancing Neural Architecture Search with Multiple Hardware Constraints for Deep Learning Model Deployment on Tiny IoT Devices [17.919425885740793]
We propose a novel approach to incorporate multiple constraints into so-called Differentiable NAS optimization methods.
We show that, with a single search, it is possible to reduce memory and latency by 87.4% and 54.2%, respectively.
arXiv Detail & Related papers (2023-10-11T06:09:14Z)
- Lightweight Neural Architecture Search for Temporal Convolutional Networks at the Edge [21.72253397805102]
This work focuses in particular on Temporal Convolutional Networks (TCNs), a convolutional model for time-series processing.
We propose the first NAS tool that explicitly targets the optimization of the most peculiar architectural parameters of TCNs.
We test the proposed NAS on four real-world, edge-relevant tasks, involving audio and bio-signals.
arXiv Detail & Related papers (2023-01-24T19:47:40Z)
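For the TCN entry above, the "peculiar" architectural parameters are knobs such as dilation and receptive field, which generic NAS spaces ignore. A minimal causal dilated block showing what such a searched parameter looks like; the class and its defaults are illustrative assumptions, not the tool's code:

```python
import torch
import torch.nn as nn

class CausalDilatedBlock(nn.Module):
    """One TCN block; `dilation` and `kernel_size` are the kind of
    TCN-specific hyper-parameters such a NAS would search over."""
    def __init__(self, channels, kernel_size=3, dilation=2):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation      # left-pad: keeps causality
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x):                            # x: (batch, channels, time)
        x_padded = nn.functional.pad(x, (self.pad, 0))
        return torch.relu(self.conv(x_padded)) + x   # residual connection

y = CausalDilatedBlock(16, kernel_size=3, dilation=4)(torch.randn(2, 16, 128))
print(y.shape)  # torch.Size([2, 16, 128]): length preserved, strictly causal
```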
- Channel-wise Mixed-precision Assignment for DNN Inference on Constrained Edge Nodes [22.40937602825472]
State-of-the-art mixed-precision works layer-wise, i.e., it uses different bit-widths for the weight and activation tensors of each network layer.
We propose a novel NAS that selects the bit-width of each weight tensor channel independently.
Our networks reduce the memory and energy for inference by up to 63% and 27%, respectively.
arXiv Detail & Related papers (2022-06-17T15:51:49Z)
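The channel-wise mixed-precision entry assigns one bit-width per weight channel instead of per layer. A generic fake-quantization sketch of what that assignment means; the helper and the hard-coded bit choices are illustrative, and the paper selects them with a NAS rather than by hand:

```python
import torch

def fake_quant(w, bits):
    """Uniform symmetric fake-quantization of a tensor to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    return torch.round(w / scale).clamp(-qmax, qmax) * scale

# One bit-width per output channel, as in channel-wise mixed precision.
weight = torch.randn(8, 16, 3, 3)               # (out_ch, in_ch, k, k)
bit_choice = [2, 4, 8, 8, 4, 2, 8, 4]           # e.g. picked by the NAS
w_q = torch.stack([fake_quant(weight[c], b)
                   for c, b in enumerate(bit_choice)])

# Memory cost now depends on the per-channel choices, not one layer-wide width:
bits_total = sum(b * weight[0].numel() for b in bit_choice)
print(f"{bits_total / 8} bytes vs {weight.numel()} bytes at 8-bit uniform")
```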
- UDC: Unified DNAS for Compressible TinyML Models [10.67922101024593]
This work bridges the gap between NPU HW capability and NN model design by proposing a neural architecture search (NAS) algorithm.
We demonstrate Unified DNAS for Compressible models (UDC) on CIFAR100, ImageNet, and DIV2K super resolution tasks.
On ImageNet, we find dominant compressible models, which are 1.9x smaller or 5.76% more accurate.
arXiv Detail & Related papers (2022-01-15T12:35:26Z)
- MCUNetV2: Memory-Efficient Patch-based Inference for Tiny Deep Learning [72.80896338009579]
We find that the memory bottleneck is due to the imbalanced memory distribution in convolutional neural network (CNN) designs.
We propose a generic patch-by-patch inference scheduling, which significantly cuts down the peak memory.
We automate the process with neural architecture search to jointly optimize the neural architecture and inference scheduling, leading to MCUNetV2.
arXiv Detail & Related papers (2021-10-28T17:58:45Z)
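MCUNetV2's patch-by-patch scheduling can be illustrated in a few lines: the memory-heavy first stage runs on spatial tiles, so the full-resolution activation map never exists in memory at once. The sketch below ignores the receptive-field overlap (halos) that a real scheduler must handle, and all names are assumptions:

```python
import torch
import torch.nn as nn

stage1 = nn.Sequential(nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU())

def patch_inference(x, n=2):
    """Run stage1 on an n x n grid of patches so the peak activation is
    roughly 1/n^2 of whole-image execution."""
    rows = []
    for xr in x.chunk(n, dim=2):                 # split height-wise
        rows.append(torch.cat([stage1(xp) for xp in xr.chunk(n, dim=3)], dim=3))
    return torch.cat(rows, dim=2)

x = torch.randn(1, 3, 64, 64)
out = patch_inference(x)
print(out.shape)   # same output shape as stage1(x), but lower peak memory
```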
- Neural Architecture Search for Efficient Uncalibrated Deep Photometric Stereo [105.05232615226602]
We leverage differentiable neural architecture search (NAS) strategy to find uncalibrated PS architecture automatically.
Experiments on the DiLiGenT dataset show that the performance of the automatically searched neural architectures compares favorably with state-of-the-art uncalibrated PS methods.
arXiv Detail & Related papers (2021-10-11T21:22:17Z)
- NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search [100.71365025972258]
We propose NAS-BERT, an efficient method for BERT compression.
NAS-BERT trains a big supernet on a search space and outputs multiple compressed models with adaptive sizes and latency.
Experiments on GLUE and SQuAD benchmark datasets demonstrate that NAS-BERT can find lightweight models with better accuracy than previous approaches.
arXiv Detail & Related papers (2021-05-30T07:20:27Z)
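NAS-BERT's "train one supernet, export many sizes" workflow rests on weight sharing. The toy sketch below shows the general mechanism, a smaller model reusing a slice of a larger layer's weights; it illustrates weight-sharing in general, not NAS-BERT's actual extraction code:

```python
import torch
import torch.nn as nn

# A supernet layer at full width; smaller sub-models share (slice) its weights.
full = nn.Linear(512, 512)

def sub_layer(width):
    """Extract a narrower layer that reuses the first `width` units."""
    sub = nn.Linear(width, width)
    sub.weight.data = full.weight.data[:width, :width].clone()
    sub.bias.data = full.bias.data[:width].clone()
    return sub

small = sub_layer(128)     # one of many sizes exported after a single training
print(small(torch.randn(2, 128)).shape)   # torch.Size([2, 128])
```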
- SmartDeal: Re-Modeling Deep Network Weights for Efficient Inference and Training [82.35376405568975]
Deep neural networks (DNNs) come with heavy parameterization, requiring external dynamic random-access memory (DRAM) for storage.
We present SmartDeal (SD), an algorithm framework to trade higher-cost memory storage/access for lower-cost computation.
We show that SD leads to 10.56x and 4.48x reduction in the storage and training energy, with negligible accuracy loss compared to state-of-the-art training baselines.
arXiv Detail & Related papers (2021-01-04T18:54:07Z)
- MS-RANAS: Multi-Scale Resource-Aware Neural Architecture Search [94.80212602202518]
We propose Multi-Scale Resource-Aware Neural Architecture Search (MS-RANAS)
We employ a one-shot architecture search approach to reduce the search cost.
We achieve state-of-the-art results in terms of accuracy-speed trade-off.
arXiv Detail & Related papers (2020-09-29T11:56:01Z)