Best of Both Worlds: AutoML Codesign of a CNN and its Hardware
Accelerator
- URL: http://arxiv.org/abs/2002.05022v2
- Date: Fri, 6 Mar 2020 10:28:29 GMT
- Title: Best of Both Worlds: AutoML Codesign of a CNN and its Hardware
Accelerator
- Authors: Mohamed S. Abdelfattah, Łukasz Dudziak, Thomas Chau, Royson Lee,
Hyeji Kim, Nicholas D. Lane
- Abstract summary: We automate HW-CNN codesign using NAS by including parameters from both the CNN model and the HW accelerator.
We jointly search for the best model-accelerator pair that boosts accuracy and efficiency.
- Score: 21.765796576990137
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural architecture search (NAS) has been very successful at outperforming
human-designed convolutional neural networks (CNN) in accuracy, and when
hardware information is present, latency as well. However, NAS-designed CNNs
typically have a complicated topology; therefore, it may be difficult to design
a custom hardware (HW) accelerator for such CNNs. We automate HW-CNN codesign
using NAS by including parameters from both the CNN model and the HW
accelerator, and we jointly search for the best model-accelerator pair that
boosts accuracy and efficiency. We call this Codesign-NAS. In this paper we
focus on defining the Codesign-NAS multiobjective optimization problem,
demonstrating its effectiveness, and exploring different ways of navigating the
codesign search space. For CIFAR-10 image classification, we enumerate close to
4 billion model-accelerator pairs, and find the Pareto frontier within that
large search space. This allows us to evaluate three different
reinforcement-learning-based search strategies. Finally, compared to ResNet on
its most optimal HW accelerator from within our HW design space, we improve on
CIFAR-100 classification accuracy by 1.3% while simultaneously increasing
performance/area by 41% in just ~1000 GPU-hours of running Codesign-NAS.
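The central step the abstract describes, enumerating model-accelerator pairs and finding the Pareto frontier within them, can be illustrated with a minimal sketch. The pair names, accuracy, and performance-per-area numbers below are made up; the paper does this over close to 4 billion real pairs.

```python
# Minimal sketch of extracting a Pareto frontier from enumerated
# model-accelerator pairs (accuracy and perf/area values are made up).

def pareto_frontier(pairs):
    """Keep every pair not dominated in both accuracy and perf/area."""
    frontier = []
    for p in pairs:
        dominated = any(
            q["acc"] >= p["acc"] and q["ppa"] >= p["ppa"]
            and (q["acc"] > p["acc"] or q["ppa"] > p["ppa"])
            for q in pairs
        )
        if not dominated:
            frontier.append(p)
    return frontier  # O(n^2); a real 4e9-pair sweep needs a smarter scan

pairs = [
    {"name": "cnn_a+acc_1", "acc": 0.92, "ppa": 1.0},
    {"name": "cnn_a+acc_2", "acc": 0.90, "ppa": 1.6},
    {"name": "cnn_b+acc_1", "acc": 0.93, "ppa": 0.7},
    {"name": "cnn_b+acc_2", "acc": 0.89, "ppa": 1.2},  # dominated by cnn_a+acc_2
]
for p in pareto_frontier(pairs):
    print(p["name"], p["acc"], p["ppa"])
```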
Related papers
- DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions [121.05720140641189]
We develop a family of models with the distilling neural architecture (DNA) techniques.
Our proposed DNA models can rate all architecture candidates, as opposed to previous works that can only access a sub-search space using heuristic algorithms.
Our models achieve state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile convolutional network and a small vision transformer, respectively.
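A rough sketch of that rating idea, assuming each candidate operator's block-wise distillation loss against a teacher has already been measured; the operator names and loss values below are invented for illustration.

```python
import itertools

# Block-wise rating in the spirit of DNA: once per-block distillation
# losses are measured, every full architecture can be rated by summing
# its blocks' losses, without training each network end to end.

block_loss = {  # block index -> {candidate op: measured distillation loss}
    0: {"conv3x3": 0.12, "conv5x5": 0.10, "mbconv": 0.08},
    1: {"conv3x3": 0.20, "conv5x5": 0.15, "mbconv": 0.18},
    2: {"conv3x3": 0.09, "conv5x5": 0.11, "mbconv": 0.07},
}

def rate(arch):
    """Lower total distillation loss = higher predicted quality."""
    return sum(block_loss[i][op] for i, op in enumerate(arch))

candidates = itertools.product(*(block_loss[i] for i in sorted(block_loss)))
best = min(candidates, key=rate)
print(best, round(rate(best), 2))
```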
arXiv Detail & Related papers (2024-03-02T22:16:47Z)
- DCP-NAS: Discrepant Child-Parent Neural Architecture Search for 1-bit CNNs [53.82853297675979]
1-bit convolutional neural networks (CNNs) with binary weights and activations show their potential for resource-limited embedded devices.
One natural approach is to use 1-bit CNNs to reduce the computation and memory cost of NAS.
We introduce Discrepant Child-Parent Neural Architecture Search (DCP-NAS) to efficiently search 1-bit CNNs.
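As background for why 1-bit CNNs are cheap, here is one common weight-binarization scheme (XNOR-Net-style sign plus a scaling factor); it is not necessarily the exact formulation DCP-NAS uses.

```python
import numpy as np

# Approximate real-valued weights by alpha * sign(w): weights become
# 1-bit, and the single scaling factor preserves magnitude.

def binarize(w):
    """Approximate real-valued weights w by alpha * sign(w)."""
    alpha = np.abs(w).mean()   # per-tensor scaling factor
    return alpha, np.sign(w)

rng = np.random.default_rng(0)
w = rng.normal(size=(3, 3))
alpha, w_bin = binarize(w)
print("mean reconstruction error:", float(np.abs(w - alpha * w_bin).mean()))
```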
arXiv Detail & Related papers (2023-06-27T11:28:29Z)
- RoHNAS: A Neural Architecture Search Framework with Conjoint Optimization for Adversarial Robustness and Hardware Efficiency of Convolutional and Capsule Networks [10.946374356026679]
RoHNAS is a novel framework that jointly optimizes for adversarial robustness and hardware efficiency of Deep Neural Networks (DNNs).
To reduce exploration time, RoHNAS analyzes and selects appropriate values of adversarial perturbation for each dataset to employ in the NAS flow.
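A hedged sketch of that selection step: sweep candidate epsilon values, observe how robust accuracy degrades, and keep an epsilon that is actually discriminative. The robust_accuracy function and its degradation curve are hypothetical stand-ins; RoHNAS's actual criterion may differ.

```python
# Pick a per-dataset adversarial perturbation strength before NAS.

def robust_accuracy(eps):
    """Stand-in for evaluating a reference model under perturbation eps."""
    return max(0.0, 0.90 - 8.0 * eps)  # made-up degradation curve

candidates = [0.001, 0.005, 0.01, 0.03, 0.1]
# Keep the largest eps that still leaves a usable accuracy signal.
chosen = max(e for e in candidates if robust_accuracy(e) > 0.5)
print("epsilon used in the NAS flow:", chosen)
```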
arXiv Detail & Related papers (2022-10-11T09:14:56Z)
- Algorithm and Hardware Co-design for Reconfigurable CNN Accelerator [3.1431240233552007]
Recent advances in algorithm-hardware co-design for deep neural networks (DNNs) have demonstrated their potential in automatically designing neural architectures and hardware designs.
However, it is still a challenging optimization problem due to the expensive training cost and the time-consuming hardware implementation.
We propose a novel three-phase co-design framework with several new features.
The network and hardware configurations we find achieve 2%-6% higher accuracy, 2x-26x smaller latency, and 8.5x higher energy efficiency.
arXiv Detail & Related papers (2021-11-24T20:37:50Z)
- FLASH: Fast Neural Architecture Search with Hardware Optimization [7.263481020106725]
Neural architecture search (NAS) is a promising technique to design efficient and high-performance deep neural networks (DNNs).
This paper proposes FLASH, a very fast NAS methodology that co-optimizes the DNN accuracy and performance on a real hardware platform.
arXiv Detail & Related papers (2021-08-01T23:46:48Z)
- RHNAS: Realizable Hardware and Neural Architecture Search [3.5694949627557846]
RHNAS is a method that combines reinforcement learning for hardware optimization with differentiable neural architecture search.
RHNAS discovers realizable NN-HW designs with 1.84x lower latency and 1.86x lower energy-delay product (EDP) on ImageNet, and 2.81x lower latency and 3.30x lower EDP on CIFAR-10.
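A minimal sketch of the reinforcement-learning half of such a search: a categorical policy over discrete accelerator configurations updated with REINFORCE. The reward table is hypothetical; in RHNAS the reward would come from realizability and latency models, while the network half is handled by differentiable NAS.

```python
import numpy as np

# REINFORCE over a small set of discrete hardware configurations.
rng = np.random.default_rng(0)
reward = np.array([0.2, 0.9, 0.5])   # hypothetical reward per HW config
logits = np.zeros(3)
lr = 0.5

for _ in range(200):
    probs = np.exp(logits) / np.exp(logits).sum()
    a = rng.choice(3, p=probs)
    r = reward[a] + rng.normal(scale=0.05)  # noisy evaluation
    grad = -probs
    grad[a] += 1.0                          # d log pi(a) / d logits
    logits += lr * r * grad                 # REINFORCE update

print("preferred HW config:", int(probs.argmax()))
```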
arXiv Detail & Related papers (2021-06-17T00:15:42Z)
- BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search [100.28980854978768]
We present Block-wisely Self-supervised Neural Architecture Search (BossNAS).
We factorize the search space into blocks and utilize a novel self-supervised training scheme, named ensemble bootstrapping, to train each block separately.
We also present HyTra search space, a fabric-like hybrid CNN-transformer search space with searchable down-sampling positions.
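A very loose sketch of the bootstrapping idea: sampled candidate blocks are trained, without labels, to predict a target produced by an ensemble of slow-moving copies of all candidates. Real BossNAS operates on deep vision blocks with a different loss; the linear "blocks" here are purely illustrative.

```python
import numpy as np

# Candidates regress toward an ensemble of slow-moving target copies.
rng = np.random.default_rng(0)
x = rng.normal(size=(64, 8))                          # a batch of inputs
online = [rng.normal(size=(8, 8)) for _ in range(3)]  # candidate blocks
target = [w.copy() for w in online]                   # slow-moving copies
ema = 0.99

for step in range(200):
    t = np.mean([x @ w for w in target], axis=0)  # ensemble target
    i = rng.integers(3)                           # sample one candidate
    grad = x.T @ (x @ online[i] - t) / len(x)     # grad of mean 0.5*||y-t||^2
    online[i] -= 0.1 * grad
    # Bootstrap: drift the sampled candidate's target copy toward it.
    target[i] = ema * target[i] + (1 - ema) * online[i]

print("final mismatch per candidate:",
      [round(float(np.abs(x @ w - t).mean()), 3) for w in online])
```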
arXiv Detail & Related papers (2021-03-23T10:05:58Z)
- Searching for Fast Model Families on Datacenter Accelerators [33.28421782921072]
We search for fast and accurate CNN model families for efficient inference on DC accelerators.
We propose a latency-aware compound scaling (LACS) method optimizing both accuracy and latency.
Our LACS discovers that network depth should grow much faster than image size and network width.
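Compound scaling in the EfficientNet style, which LACS makes latency-aware, grows depth, width, and resolution by per-dimension bases raised to a shared coefficient. The base coefficients below are invented solely to mirror the finding that depth should grow much faster than image size and width.

```python
# Compound scaling of a base network by coefficient phi.
alpha, beta, gamma = 1.8, 1.10, 1.05   # depth, width, resolution bases

def scale(phi, base_depth=18, base_width=64, base_res=224):
    """Scale a base network by compound coefficient phi."""
    return {
        "depth": round(base_depth * alpha ** phi),
        "width": round(base_width * beta ** phi),
        "resolution": round(base_res * gamma ** phi),
    }

for phi in range(4):
    print(phi, scale(phi))
```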
arXiv Detail & Related papers (2021-02-10T18:15:40Z)
- PV-NAS: Practical Neural Architecture Search for Video Recognition [83.77236063613579]
Deep neural networks for video tasks are highly customized, and designing such networks requires domain experts and costly trial-and-error tests.
Recent advances in network architecture search have boosted image recognition performance by a large margin.
In this study, we propose a practical solution, namely Practical Video Neural Architecture Search (PV-NAS).
arXiv Detail & Related papers (2020-11-02T08:50:23Z)
- Neural Architecture Search of SPD Manifold Networks [79.45110063435617]
We propose a new neural architecture search (NAS) problem of Symmetric Positive Definite (SPD) manifold networks.
We first introduce a geometrically rich and diverse SPD neural architecture search space for an efficient SPD cell design.
We exploit a differentiable NAS algorithm on our relaxed continuous search space for SPD neural architecture search.
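The continuous relaxation referred to here is the standard differentiable-NAS trick: each cell edge computes a softmax-weighted mixture of candidate operations, so architecture parameters can be optimized by gradient descent. A generic numpy sketch follows; the SPD version constrains operations and mixtures to the manifold, which is out of scope here.

```python
import numpy as np

# DARTS-style mixed operation over toy candidate ops.
def mixed_op(x, alphas, ops):
    w = np.exp(alphas) / np.exp(alphas).sum()  # softmax over candidates
    return sum(wi * op(x) for wi, op in zip(w, ops))

ops = [lambda x: x, lambda x: 2 * x, lambda x: x ** 2]  # toy candidates
alphas = np.array([0.1, 0.5, -0.2])  # learnable architecture parameters
x = np.array([1.0, 2.0])
print(mixed_op(x, alphas, ops))
# After search, each edge is discretized to its largest-alpha operation.
```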
arXiv Detail & Related papers (2020-10-27T18:08:57Z)
- DDPNAS: Efficient Neural Architecture Search via Dynamic Distribution Pruning [135.27931587381596]
We propose an efficient and unified NAS framework termed DDPNAS via dynamic distribution pruning.
In particular, we first sample architectures from a joint categorical distribution. Then the search space is dynamically pruned and its distribution is updated every few epochs.
With the proposed efficient network generation method, we directly obtain the optimal neural architectures on given constraints.
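A toy version of that loop: sample from a categorical distribution over architecture choices, update the distribution from observed performance, and prune the weakest choice every few epochs. Here evaluate() stands in for a few epochs of weight-shared training and validation, and the quality numbers are made up.

```python
import numpy as np

# Dynamic distribution pruning over five architecture choices.
rng = np.random.default_rng(0)
true_quality = np.array([0.3, 0.8, 0.5, 0.6, 0.2])   # hypothetical
probs = np.full(5, 0.2)
alive = np.ones(5, dtype=bool)

def evaluate(a):
    """Stand-in for weight-shared training/validation of choice a."""
    return true_quality[a] + rng.normal(scale=0.05)

for epoch in range(30):
    a = rng.choice(5, p=probs)                 # sample an architecture
    probs[a] += 0.05 * max(evaluate(a), 0.0)   # reward-weighted update
    probs /= probs.sum()
    if epoch % 10 == 9 and alive.sum() > 1:    # prune every few epochs
        worst = np.where(alive)[0][probs[alive].argmin()]
        alive[worst] = False
        probs[worst] = 0.0
        probs /= probs.sum()

print("surviving architecture choice:", int(probs.argmax()))
```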
arXiv Detail & Related papers (2019-05-28T06:35:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site makes no guarantees about the quality of the information presented and accepts no responsibility for any consequences of its use.