Latency-Aware Differentiable Neural Architecture Search
- URL: http://arxiv.org/abs/2001.06392v2
- Date: Thu, 26 Mar 2020 02:20:32 GMT
- Title: Latency-Aware Differentiable Neural Architecture Search
- Authors: Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Bowen Shi, Qi Tian,
Hongkai Xiong
- Abstract summary: Differentiable neural architecture search methods became popular in recent years, mainly due to their low search costs and flexibility in designing the search space.
However, these methods struggle to optimize the network under hardware constraints, so the searched network is often hardware-unfriendly.
This paper addresses the problem by adding a differentiable latency loss term to the optimization, so that the search process can trade off accuracy against latency with a balancing coefficient.
- Score: 113.35689580508343
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differentiable neural architecture search methods became popular in recent
years, mainly due to their low search costs and flexibility in designing the
search space. However, these methods struggle to optimize the network under
hardware constraints, so the searched network is often hardware-unfriendly. This
paper addresses the problem by adding a differentiable latency loss term to the
optimization, so that the search process can trade off accuracy against latency
with a balancing coefficient. The core of latency prediction is to encode each
network architecture and feed it into a multi-layer regressor, with training
data that can be easily collected by randomly sampling a number of
architectures and evaluating them on the hardware. We evaluate our approach on
NVIDIA Tesla-P100 GPUs. With 100K sampled architectures (requiring a few
hours), the latency prediction module achieves a relative error below 10%.
Equipped with this module, the search method can reduce latency by 20% while
preserving accuracy. Our approach can also be transplanted to a wide range of
hardware platforms with little effort, or used to optimize other
non-differentiable factors such as power consumption.
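The latency-prediction recipe in the abstract (encode an architecture, feed it to a regressor trained on sampled measurements) can be sketched in miniature. Everything below is illustrative: the encoding length, the synthetic `measure` function standing in for real on-device timing, and the tiny two-layer network (the paper's regressor is a deeper MLP) are assumptions for the sketch, not the authors' implementation.

```python
import math
import random

random.seed(0)

DIM, HIDDEN = 6, 12   # encoding length and hidden width (hypothetical sizes)

# Synthetic stand-in for on-device measurement: in the paper, latencies are
# collected by running randomly sampled architectures on the target GPU.
def measure(x):
    s = sum((i + 1) * xi for i, xi in enumerate(x)) / DIM
    return 1.0 + s + 0.3 * math.sin(s)

# "Randomly sampled architectures": random fixed-length encodings.
data = [[random.random() for _ in range(DIM)] for _ in range(400)]
targets = [measure(x) for x in data]

# Tiny two-layer regressor: ReLU hidden layer, linear output.
W1 = [[random.gauss(0.0, 0.3) for _ in range(HIDDEN)] for _ in range(DIM)]
b1 = [0.0] * HIDDEN
W2 = [random.gauss(0.0, 0.3) for _ in range(HIDDEN)]
b2 = 0.0

def forward(x):
    h = [max(0.0, sum(x[i] * W1[i][j] for i in range(DIM)) + b1[j])
         for j in range(HIDDEN)]
    return h, sum(hj * wj for hj, wj in zip(h, W2)) + b2

def rel_error():
    return sum(abs(forward(x)[1] - y) / y
               for x, y in zip(data, targets)) / len(data)

initial_err = rel_error()
lr = 0.02
for _ in range(200):                      # SGD epochs over the sampled set
    for x, y in zip(data, targets):
        h, pred = forward(x)
        err = pred - y                    # gradient of 0.5 * err**2 w.r.t. pred
        gh = [err * wj for wj in W2]      # backprop into the hidden layer
        for j in range(HIDDEN):
            if h[j] > 0.0:                # ReLU gate
                for i in range(DIM):
                    W1[i][j] -= lr * gh[j] * x[i]
                b1[j] -= lr * gh[j]
            W2[j] -= lr * err * h[j]
        b2 -= lr * err

final_err = rel_error()
```

In the paper's setup, `measure` would be replaced by actual timing runs on the target GPU, and the trained predictor would be plugged into the search objective as a differentiable latency term.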
Related papers
- On Latency Predictors for Neural Architecture Search [8.564763702766776]
We introduce a comprehensive suite of latency prediction tasks obtained in a principled way through automated partitioning of hardware device sets.
We then design a general latency predictor to comprehensively study (1) the predictor architecture, (2) NN sample selection methods, (3) hardware device representations, and (4) NN operation encoding schemes.
Building on conclusions from our study, we present an end-to-end latency predictor training strategy.
arXiv Detail & Related papers (2024-03-04T19:59:32Z)
- Latency-aware Unified Dynamic Networks for Efficient Image Recognition [72.8951331472913]
LAUDNet is a framework to bridge the theoretical and practical efficiency gap in dynamic networks.
It integrates three primary dynamic paradigms: spatially adaptive computation, dynamic layer skipping, and dynamic channel skipping.
It can notably reduce the latency of models like ResNet by over 50% on platforms such as V100, 3090, and TX2 GPUs.
arXiv Detail & Related papers (2023-08-30T10:57:41Z)
- Latency-aware Spatial-wise Dynamic Networks [33.88843632160247]
We propose a latency-aware spatial-wise dynamic network (LASNet) for deep networks.
LASNet performs coarse-grained spatially adaptive inference under the guidance of a novel latency prediction model.
Experiments on image classification, object detection and instance segmentation demonstrate that the proposed framework significantly improves the practical inference efficiency of deep networks.
arXiv Detail & Related papers (2022-10-12T14:09:27Z)
- MAPLE: Microprocessor A Priori for Latency Estimation [81.91509153539566]
Modern deep neural networks must demonstrate state-of-the-art accuracy while exhibiting low latency and energy consumption.
Measuring the latency of every evaluated architecture adds a significant amount of time to the NAS process.
We propose Microprocessor A Priori for Latency Estimation (MAPLE), which does not rely on transfer learning or domain adaptation.
arXiv Detail & Related papers (2021-11-30T03:52:15Z)
- D-DARTS: Distributed Differentiable Architecture Search [75.12821786565318]
Differentiable ARchiTecture Search (DARTS) is one of the most trending Neural Architecture Search (NAS) methods.
We propose D-DARTS, a novel solution that addresses this problem by nesting several neural networks at the cell level.
arXiv Detail & Related papers (2021-08-20T09:07:01Z)
- Towards Improving the Consistency, Efficiency, and Flexibility of Differentiable Neural Architecture Search [84.4140192638394]
Most differentiable neural architecture search methods construct a super-net for search and derive a target-net as its sub-graph for evaluation.
In this paper, we introduce EnTranNAS that is composed of Engine-cells and Transit-cells.
Our method also saves much memory and computation, which speeds up the search process.
arXiv Detail & Related papers (2021-01-27T12:16:47Z)
- ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding [86.40042104698792]
We formulate neural architecture search as a sparse coding problem.
In experiments, our two-stage method on CIFAR-10 requires only 0.05 GPU-day for search.
Our one-stage method produces state-of-the-art performances on both CIFAR-10 and ImageNet at the cost of only evaluation time.
arXiv Detail & Related papers (2020-10-13T04:34:24Z)
- DANCE: Differentiable Accelerator/Network Co-Exploration [8.540518473228078]
This work presents a differentiable approach towards the co-exploration of the hardware accelerator and network architecture design.
By modeling the hardware evaluation software with a neural network, the relation between the accelerator architecture and the hardware metrics becomes differentiable.
Compared to naive existing approaches, our method performs co-exploration in significantly less time, while achieving superior accuracy and hardware cost metrics.
arXiv Detail & Related papers (2020-09-14T07:43:27Z)
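Several entries above, like the main paper and DANCE, hinge on the same mechanism: once a hardware metric is exposed through a differentiable model, a balancing coefficient can steer a relaxed architecture choice. A toy sketch of that trade-off follows; the per-operator latencies and accuracy "utilities" are made up for illustration and do not come from any of the papers.

```python
import math

# Hypothetical per-operator latencies (e.g. as measured on one device)
# and accuracy utilities (a stand-in for the supernet's task signal).
LAT = {"skip": 0.1, "conv3x3": 1.0, "conv5x5": 2.5}
UTIL = {"skip": 0.2, "conv3x3": 1.0, "conv5x5": 1.2}
OPS = list(LAT)

def softmax(a):
    m = max(a)
    e = [math.exp(x - m) for x in a]
    s = sum(e)
    return [x / s for x in e]

def search(lam, steps=500, lr=0.5):
    """Gradient ascent on expected utility minus lam * expected latency."""
    alpha = [0.0] * len(OPS)           # relaxed architecture parameters
    for _ in range(steps):
        p = softmax(alpha)
        v = [UTIL[o] - lam * LAT[o] for o in OPS]
        ev = sum(pi * vi for pi, vi in zip(p, v))
        # d/d alpha_i of sum_j p_j v_j = p_i * (v_i - sum_j p_j v_j)
        alpha = [a + lr * pi * (vi - ev)
                 for a, pi, vi in zip(alpha, p, v)]
    p = softmax(alpha)
    return OPS[p.index(max(p))]        # discretize: pick the dominant op
```

Raising `lam` progressively pushes the softmax-relaxed choice from the most accurate operator toward the cheapest one, which is exactly the accuracy/latency knob the balancing coefficient provides.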
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.