DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions
- URL: http://arxiv.org/abs/2403.01326v1
- Date: Sat, 2 Mar 2024 22:16:47 GMT
- Title: DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions
- Authors: Guangrun Wang, Changlin Li, Liuchun Yuan, Jiefeng Peng, Xiaoyu Xian,
Xiaodan Liang, Xiaojun Chang, and Liang Lin
- Abstract summary: We develop a family of models with the distilling neural architecture (DNA) techniques.
Our proposed DNA models can rate all architecture candidates, whereas previous works can only access a sub-search space using heuristic algorithms.
Our models achieve state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile convolutional network and a small vision transformer, respectively.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural Architecture Search (NAS), aiming at automatically designing neural
architectures by machines, has been considered a key step toward automatic
machine learning. One notable NAS branch is the weight-sharing NAS, which
significantly improves search efficiency and allows NAS algorithms to run on
ordinary computers. Despite receiving high expectations, this category of
methods suffers from low search effectiveness. By employing a generalization
boundedness tool, we demonstrate that the devil behind this drawback is the
untrustworthy architecture rating with the oversized search space of the
possible architectures. Addressing this problem, we modularize a large search
space into blocks with small search spaces and develop a family of models with
the distilling neural architecture (DNA) techniques. These proposed models,
namely a DNA family, are capable of resolving multiple dilemmas of the
weight-sharing NAS, such as scalability, efficiency, and multi-modal
compatibility. Our proposed DNA models can rate all architecture candidates, as
opposed to previous works that can only access a sub-search space using
heuristic algorithms. Moreover, under a certain computational complexity
constraint, our method can seek architectures with different depths and widths.
Extensive experimental evaluations show that our models achieve
state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile
convolutional network and a small vision transformer, respectively.
Additionally, we provide in-depth empirical analysis and insights into neural
architecture ratings. Code is available at https://github.com/changlin31/DNA.
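The abstract's central argument, that splitting one oversized search space into blocks with small search spaces makes it feasible to rate every candidate, can be illustrated with a small counting sketch. The following is a minimal illustration only and is not taken from the DNA code base. The block count, layer count, operation names, and the toy `loss_fn` (a stand-in for a block-wise distillation loss against a teacher) are all hypothetical, and it assumes each block can be rated independently of the others.

```python
from itertools import product


def search_space_sizes(num_blocks, layers_per_block, ops_per_layer):
    """Count candidates per block, in the full space, and block-wise ratings needed."""
    per_block = ops_per_layer ** layers_per_block
    full = per_block ** num_blocks          # every combination of block choices
    blockwise = num_blocks * per_block      # rate each block's candidates once
    return per_block, full, blockwise


def enumerate_block(ops, layers_per_block):
    """List every candidate of one block as a tuple of operation names."""
    return list(product(ops, repeat=layers_per_block))


def rate_block(candidates, loss_fn):
    """Rate all candidates of one block; loss_fn is a hypothetical stand-in
    for a block-wise supervision loss (lower is better)."""
    return {cand: loss_fn(cand) for cand in candidates}


def select_best(block_ratings):
    """Pick the lowest-loss candidate per block (assumes blocks are independent
    and ignores any computational complexity constraint)."""
    return [min(ratings, key=ratings.get) for ratings in block_ratings]


if __name__ == "__main__":
    # Hypothetical sizes: 6 blocks, 4 layers per block, 7 operations per layer.
    per_block, full, blockwise = search_space_sizes(6, 4, 7)
    print(f"candidates per block: {per_block}")
    print(f"full search space:    {full:.3e}")
    print(f"block-wise ratings:   {blockwise}")

    # Toy rating: each op has a made-up cost, each block a made-up target cost.
    cost = {"conv3": 1, "conv5": 2, "skip": 0}
    targets = [1, 2]
    ratings = [
        rate_block(
            enumerate_block(list(cost), 2),
            lambda cand, t=t: sum(abs(cost[op] - t) for op in cand),
        )
        for t in targets
    ]
    print(select_best(ratings))
```

Under these assumed sizes, exhaustively rating each block is a matter of thousands of evaluations, while the combined space is astronomically larger. The selection step here ignores the complexity constraint the abstract mentions; a constrained search would need to combine block choices rather than pick each block's minimum in isolation.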
Related papers
- Delta-NAS: Difference of Architecture Encoding for Predictor-based Evolutionary Neural Architecture Search [5.1331676121360985]
We craft an algorithm with the capability to perform fine-grain NAS at a low cost.
We propose projecting the problem to a lower dimensional space through predicting the difference in accuracy of a pair of similar networks.
arXiv Detail & Related papers (2024-11-21T02:43:32Z)
- A Pairwise Comparison Relation-assisted Multi-objective Evolutionary Neural Architecture Search Method with Multi-population Mechanism [58.855741970337675]
Neural architecture search (NAS) enables researchers to automatically explore vast search spaces and find efficient neural networks.
NAS suffers from a key bottleneck, i.e., numerous architectures need to be evaluated during the search process.
We propose SMEM-NAS, a pairwise comparison relation-assisted multi-objective evolutionary algorithm based on a multi-population mechanism.
arXiv Detail & Related papers (2024-07-22T12:46:22Z)
- DCP-NAS: Discrepant Child-Parent Neural Architecture Search for 1-bit CNNs [53.82853297675979]
1-bit convolutional neural networks (CNNs) with binary weights and activations show their potential for resource-limited embedded devices.
One natural approach is to use 1-bit CNNs to reduce the computation and memory cost of NAS.
We introduce Discrepant Child-Parent Neural Architecture Search (DCP-NAS) to efficiently search 1-bit CNNs.
arXiv Detail & Related papers (2023-06-27T11:28:29Z)
- GeNAS: Neural Architecture Search with Better Generalization [14.92869716323226]
Recent neural architecture search (NAS) approaches rely on validation loss or accuracy to find the superior network for the target data.
In this paper, we investigate a new neural architecture search measure for excavating architectures with better generalization.
arXiv Detail & Related papers (2023-05-15T12:44:54Z)
- BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search [100.28980854978768]
We present Block-wisely Self-supervised Neural Architecture Search (BossNAS).
We factorize the search space into blocks and utilize a novel self-supervised training scheme, named ensemble bootstrapping, to train each block separately.
We also present HyTra search space, a fabric-like hybrid CNN-transformer search space with searchable down-sampling positions.
arXiv Detail & Related papers (2021-03-23T10:05:58Z)
- Hierarchical Neural Architecture Search for Deep Stereo Matching [131.94481111956853]
We propose the first end-to-end hierarchical NAS framework for deep stereo matching.
Our framework incorporates task-specific human knowledge into the neural architecture search framework.
It ranks first in accuracy on the KITTI Stereo 2012, KITTI Stereo 2015, and Middlebury benchmarks, and first on the SceneFlow dataset.
arXiv Detail & Related papers (2020-10-26T11:57:37Z)
- DDPNAS: Efficient Neural Architecture Search via Dynamic Distribution Pruning [135.27931587381596]
We propose an efficient and unified NAS framework termed DDPNAS via dynamic distribution pruning.
In particular, we first sample architectures from a joint categorical distribution. Then the search space is dynamically pruned and its distribution is updated every few epochs.
With the proposed efficient network generation method, we directly obtain the optimal neural architectures under given constraints.
arXiv Detail & Related papers (2019-05-28T06:35:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.