UDC: Unified DNAS for Compressible TinyML Models
- URL: http://arxiv.org/abs/2201.05842v1
- Date: Sat, 15 Jan 2022 12:35:26 GMT
- Title: UDC: Unified DNAS for Compressible TinyML Models
- Authors: Igor Fedorov, Ramon Matas, Hokchhay Tann, Chuteng Zhou, Matthew
Mattina, Paul Whatmough
- Abstract summary: This work bridges the gap between NPU HW capability and NN model design by proposing a neural architecture search (NAS) algorithm.
We demonstrate Unified DNAS for Compressible models (UDC) on CIFAR100, ImageNet, and DIV2K super resolution tasks.
On ImageNet, we find Pareto dominant compressible models, which are 1.9x smaller or 5.76% more accurate.
- Score: 10.67922101024593
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Emerging Internet-of-things (IoT) applications are driving deployment of
neural networks (NNs) on heavily constrained low-cost hardware (HW) platforms,
where accuracy is typically limited by memory capacity. To address this TinyML
challenge, new HW platforms like neural processing units (NPUs) have support
for model compression, which exploits aggressive network quantization and
unstructured pruning optimizations. The combination of NPUs with HW compression
and compressible models allows more expressive models in the same memory
footprint.
However, adding optimizations for compressibility on top of conventional NN
architecture choices expands the design space across which we must make
balanced trade-offs. This work bridges the gap between NPU HW capability and NN
model design by proposing a neural architecture search (NAS) algorithm to
efficiently search a large design space, including: network depth, operator
type, layer width, bitwidth, sparsity, and more. Building on differentiable NAS
(DNAS) with several key improvements, we demonstrate Unified DNAS for
Compressible models (UDC) on CIFAR100, ImageNet, and DIV2K super resolution
tasks. On ImageNet, we find Pareto dominant compressible models, which are 1.9x
smaller or 5.76% more accurate.
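To make the abstract's idea concrete, the sketch below shows the kind of differentiable relaxation a unified DNAS uses: each candidate bitwidth gets an architecture logit, a Gumbel-softmax mixes fake-quantized versions of the weights, and a differentiable model-size estimate is added to the task loss. This is a minimal illustration under stated assumptions, not the authors' implementation; the names CandidateBitwidthLinear and LAMBDA are invented for the example.

```python
# Hedged sketch of a DNAS-style layer whose weight bitwidth is a
# differentiable choice; all names here are illustrative, not UDC's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CandidateBitwidthLinear(nn.Module):
    """Linear layer that relaxes the bitwidth choice with Gumbel-softmax."""
    def __init__(self, in_f, out_f, bitwidths=(2, 4, 8)):
        super().__init__()
        self.linear = nn.Linear(in_f, out_f)
        self.bitwidths = bitwidths
        # One architecture logit per candidate bitwidth.
        self.alpha = nn.Parameter(torch.zeros(len(bitwidths)))

    def fake_quant(self, w, bits):
        # Uniform symmetric fake quantization with a straight-through estimator.
        scale = w.abs().max() / (2 ** (bits - 1) - 1) + 1e-8
        q = torch.round(w / scale).clamp(-(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
        return w + (q * scale - w).detach()

    def forward(self, x, tau=1.0):
        probs = F.gumbel_softmax(self.alpha, tau=tau)  # soft one-hot choice
        w = sum(p * self.fake_quant(self.linear.weight, b)
                for p, b in zip(probs, self.bitwidths))
        return F.linear(x, w, self.linear.bias)

    def expected_bits(self):
        # Differentiable expectation of the per-weight storage cost.
        probs = F.softmax(self.alpha, dim=0)
        return sum(p * b for p, b in zip(probs, self.bitwidths)) * self.linear.weight.numel()

layer = CandidateBitwidthLinear(64, 10)
x, y = torch.randn(32, 64), torch.randint(0, 10, (32,))
LAMBDA = 1e-7  # strength of the model-size penalty (illustrative value)
loss = F.cross_entropy(layer(x), y) + LAMBDA * layer.expected_bits()
loss.backward()  # gradients flow to both weights and architecture logits
```

Extending the same pattern to layer width, sparsity, or operator type amounts to adding further architecture logits and cost terms, which is what makes the unified search space tractable within one differentiable objective.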
Related papers
- DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions [121.05720140641189]
We develop a family of models with the distilling neural architecture (DNA) techniques.
Our proposed DNA models can rate all architecture candidates, as opposed to previous works that can only access a sub-search space using heuristic algorithms.
Our models achieve state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile convolutional network and a small vision transformer, respectively.
arXiv Detail & Related papers (2024-03-02T22:16:47Z)
- LitE-SNN: Designing Lightweight and Efficient Spiking Neural Network through Spatial-Temporal Compressive Network Search and Joint Optimization [48.41286573672824]
Spiking Neural Networks (SNNs) mimic the information-processing mechanisms of the human brain and are highly energy-efficient.
We propose a new approach named LitE-SNN that incorporates both spatial and temporal compression into the automated network design process.
arXiv Detail & Related papers (2024-01-26T05:23:11Z)
- Enhancing Neural Architecture Search with Multiple Hardware Constraints for Deep Learning Model Deployment on Tiny IoT Devices [17.919425885740793]
We propose a novel approach to incorporate multiple constraints into so-called Differentiable NAS optimization methods.
We show that, with a single search, it is possible to reduce memory and latency by 87.4% and 54.2%, respectively.
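A hedged sketch of the core idea, several hardware constraints folded into one differentiable loss, might look like the following; the hinge-penalty form, the weights, and the helper name multi_constraint_loss are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch: task loss plus budget penalties for a multi-constraint DNAS.
import torch
import torch.nn.functional as F

def multi_constraint_loss(logits, targets, est_memory, est_latency,
                          mem_budget, lat_budget, w_mem=1.0, w_lat=1.0):
    """est_memory and est_latency must be differentiable scalar tensors
    predicted from the architecture parameters of the supernet."""
    task = F.cross_entropy(logits, targets)
    mem_pen = F.relu(est_memory / mem_budget - 1.0)  # zero when under budget
    lat_pen = F.relu(est_latency / lat_budget - 1.0)
    return task + w_mem * mem_pen + w_lat * lat_pen

# Stand-ins for differentiable estimates produced by a supernet.
est_mem = torch.tensor(2.5e5, requires_grad=True)
est_lat = torch.tensor(12.0, requires_grad=True)
logits = torch.randn(8, 10, requires_grad=True)
targets = torch.randint(0, 10, (8,))
loss = multi_constraint_loss(logits, targets, est_mem, est_lat,
                             mem_budget=2.0e5, lat_budget=10.0)
loss.backward()
```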
arXiv Detail & Related papers (2023-10-11T06:09:14Z)
- Lightweight Neural Architecture Search for Temporal Convolutional Networks at the Edge [21.72253397805102]
This work focuses in particular on Temporal Convolutional Networks (TCNs), a convolutional model for time-series processing.
We propose the first NAS tool that explicitly targets the optimization of the most peculiar architectural parameters of TCNs.
We test the proposed NAS on four real-world, edge-relevant tasks, involving audio and bio-signals.
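The TCN-specific knobs such a search typically exposes are kernel size, dilation, and channel width; the small sketch below enumerates an illustrative grid (an assumption, not the paper's actual search space) and computes the temporal receptive field each configuration yields.

```python
# Hedged sketch of a TCN layer search space; the grid values are illustrative.
from itertools import product

SEARCH_SPACE = {
    "kernel_size": [3, 5, 7],
    "dilation": [1, 2, 4, 8],   # controls the temporal receptive field
    "channels": [16, 32, 64],
}

def receptive_field(kernel_size, dilation):
    # Receptive field of a single dilated causal convolution layer.
    return (kernel_size - 1) * dilation + 1

candidates = [dict(zip(SEARCH_SPACE, v)) for v in product(*SEARCH_SPACE.values())]
print(len(candidates), "candidate layer configurations")
for c in candidates[:3]:
    print(c, "-> receptive field", receptive_field(c["kernel_size"], c["dilation"]))
```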
arXiv Detail & Related papers (2023-01-24T19:47:40Z)
- HQNAS: Auto CNN deployment framework for joint quantization and architecture search [30.45926484863791]
We propose a novel neural network design framework called Hardware-aware Quantized Neural Architecture Search (HQNAS).
It takes only 4 GPU hours to discover an outstanding NN policy on CIFAR10.
It also takes only 10% of the GPU time to generate a comparable model on ImageNet.
arXiv Detail & Related papers (2022-10-16T08:32:18Z)
- SpiderNet: Hybrid Differentiable-Evolutionary Architecture Search via Train-Free Metrics [0.0]
Neural Architecture Search (NAS) algorithms are intended to remove the burden of manual neural network design.
NAS algorithms require a variety of design parameters in the form of user configuration or hard-coded decisions which limit the variety of networks that can be discovered.
We present SpiderNet, a hybrid differentiable-evolutionary and hardware-aware algorithm that rapidly and efficiently produces state-of-the-art networks.
arXiv Detail & Related papers (2022-04-20T08:55:01Z)
- An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) approach, Soft Actor-Critic for discrete (SAC-d), which generates the exit point and compressing bits by soft policy iterations.
Based on a latency- and accuracy-aware reward design, such a computation can adapt well to complex environments like dynamic wireless channels and arbitrary processing, and is capable of supporting 5G URLLC.
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
- NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search [100.71365025972258]
We propose NAS-BERT, an efficient method for BERT compression.
NAS-BERT trains a big supernet on a search space and outputs multiple compressed models with adaptive sizes and latency.
Experiments on GLUE and SQuAD benchmark datasets demonstrate that NAS-BERT can find lightweight models with better accuracy than previous approaches.
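The weight-sharing idea behind a supernet that emits multiple model sizes can be sketched as below: smaller sub-models reuse leading slices of a shared weight matrix, in the style of slimmable networks. This is an assumption-laden illustration, not NAS-BERT's actual training recipe.

```python
# Hedged sketch of weight sharing across model sizes; names are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SliceableLinear(nn.Module):
    def __init__(self, in_f, out_f):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_f, in_f) * 0.02)
        self.bias = nn.Parameter(torch.zeros(out_f))

    def forward(self, x, width_ratio=1.0):
        # Use only the first `width_ratio` fraction of output units.
        k = max(1, int(self.weight.size(0) * width_ratio))
        return F.linear(x, self.weight[:k], self.bias[:k])

layer = SliceableLinear(768, 768)
x = torch.randn(4, 768)
full, half = layer(x, 1.0), layer(x, 0.5)  # two sizes, one set of weights
print(full.shape, half.shape)  # torch.Size([4, 768]) torch.Size([4, 384])
```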
arXiv Detail & Related papers (2021-05-30T07:20:27Z)
- Searching Efficient Model-guided Deep Network for Image Denoising [61.65776576769698]
We present a novel approach by connecting model-guided design with NAS (MoD-NAS).
MoD-NAS employs a highly reusable width search strategy and a densely connected search block to automatically select the operations of each layer.
Experimental results on several popular datasets show that our MoD-NAS has achieved even better PSNR performance than current state-of-the-art methods.
arXiv Detail & Related papers (2021-04-06T14:03:01Z)
- Neural Architecture Search of SPD Manifold Networks [79.45110063435617]
We propose a new neural architecture search (NAS) problem of Symmetric Positive Definite (SPD) manifold networks.
We first introduce a geometrically rich and diverse SPD neural architecture search space for an efficient SPD cell design.
We exploit a differentiable NAS algorithm on our relaxed continuous search space for SPD neural architecture search.
arXiv Detail & Related papers (2020-10-27T18:08:57Z)