Zero-Shot NAS via the Suppression of Local Entropy Decrease
- URL: http://arxiv.org/abs/2411.06236v2
- Date: Tue, 12 Nov 2024 08:51:40 GMT
- Title: Zero-Shot NAS via the Suppression of Local Entropy Decrease
- Authors: Ning Wu, Han Huang, Yueting Xu, Zhifeng Hao
- Abstract summary: Architecture performance evaluation is the most time-consuming part of neural architecture search (NAS).
Zero-Shot NAS accelerates the evaluation by utilizing zero-cost proxies instead of training.
Architecture topologies are used to evaluate the performance of networks in this study.
- Score: 21.100745856699277
- License:
- Abstract: Architecture performance evaluation is the most time-consuming part of neural architecture search (NAS). Zero-Shot NAS accelerates the evaluation by utilizing zero-cost proxies instead of training. Though effective, existing zero-cost proxies require invoking backpropagation or running networks on input data, making it difficult to further accelerate the computation of proxies. To alleviate this issue, architecture topologies are used to evaluate the performance of networks in this study. We prove that particular architectural topologies decrease the local entropy of feature maps, which degrades specific features to a bias, thereby reducing network performance. Based on this proof, architectural topologies are utilized to quantify the suppression of local entropy decrease (SED) as a data-free and running-free proxy. Experimental results show that SED outperforms most state-of-the-art proxies in terms of architecture selection on five benchmarks, with computation time reduced by three orders of magnitude. We further compare SED-based NAS with state-of-the-art proxies. SED-based NAS selects the architecture with higher accuracy and fewer parameters in only one second. The theoretical analyses of local entropy and the experimental results demonstrate that the suppression of local entropy decrease facilitates selecting optimal architectures in Zero-Shot NAS.
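The abstract gives only the high-level idea; as a rough, hypothetical illustration of what a data-free, running-free topology proxy can look like, the sketch below scores a NAS-Bench-201-style cell purely from its operation graph. The path-weighting rule and the ENTROPY_SAFE table are invented for illustration and are not the paper's actual SED computation.

```python
# Hypothetical sketch of a data-free, running-free topology proxy. It does
# NOT reproduce the paper's SED formula; it only shows how an architecture
# can be scored from its operation graph alone, with no data or forward pass.

# Assumed penalties: ops presumed to suppress local feature entropy get low
# weight; information-preserving ops get high weight (illustrative values).
ENTROPY_SAFE = {"skip_connect": 1.0, "nor_conv_1x1": 1.0,
                "nor_conv_3x3": 1.0, "avg_pool_3x3": 0.5, "none": 0.0}

# All input->output paths in a 4-node NAS-Bench-201 cell, as edge lists.
PATHS = [[(0, 1), (1, 2), (2, 3)], [(0, 1), (1, 3)],
         [(0, 2), (2, 3)], [(0, 3)]]

def toy_topology_score(cell):
    """cell: dict mapping edge (i, j) -> operation name, with i < j."""
    score = 0.0
    for path in PATHS:
        w = 1.0
        for edge in path:  # multiplicatively, a path is only as strong
            w *= ENTROPY_SAFE[cell[edge]]  # as its weakest operation
        score += w         # sum path weights over all paths
    return score

cell = {(0, 1): "nor_conv_3x3", (0, 2): "skip_connect", (0, 3): "none",
        (1, 2): "nor_conv_1x1", (1, 3): "avg_pool_3x3", (2, 3): "nor_conv_3x3"}
print(toy_topology_score(cell))  # higher = fewer entropy-suppressing paths
```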
Related papers
- TG-NAS: Leveraging Zero-Cost Proxies with Transformer and Graph Convolution Networks for Efficient Neural Architecture Search [1.30891455653235]
TG-NAS aims to create training-free proxies for architecture performance prediction.
We introduce TG-NAS, a novel model-based universal proxy that leverages a transformer-based operator embedding generator and a graph convolution network (GCN) to predict architecture performance.
TG-NAS achieves up to 300X improvements in search efficiency compared to previous state-of-the-art zero-cost (ZC) proxy methods.
arXiv Detail & Related papers (2024-03-30T07:25:30Z)
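A minimal sketch of the model-based-proxy idea behind TG-NAS (above): embed each operator and run a graph convolution over the cell graph to regress a score. The embeddings, wiring, and random weights here are placeholders; TG-NAS trains its transformer-based embedding generator and GCN on benchmark data.

```python
# Minimal sketch of a model-based proxy in the spirit of TG-NAS: embed each
# operator, run one graph convolution over the cell graph, and regress a
# score. Weights here are random stand-ins, not a trained predictor.
import numpy as np

rng = np.random.default_rng(0)
OPS = ["none", "skip_connect", "nor_conv_1x1", "nor_conv_3x3", "avg_pool_3x3"]
EMB = {op: rng.normal(size=8) for op in OPS}  # stand-in operator embeddings

def gcn_score(node_ops, adj):
    """node_ops: op name per graph node; adj: (n, n) adjacency matrix."""
    x = np.stack([EMB[o] for o in node_ops])   # node features
    a = adj + np.eye(len(node_ops))            # add self-loops
    d = np.diag(1.0 / np.sqrt(a.sum(1)))       # symmetric normalization
    w1, w2 = rng.normal(size=(8, 8)), rng.normal(size=8)
    h = np.maximum(d @ a @ d @ x @ w1, 0.0)    # one GCN layer + ReLU
    return float(h.mean(0) @ w2)               # mean readout -> scalar

# Treat each edge of a NAS-Bench-201 cell as a graph node ("line graph").
ops = ["nor_conv_3x3", "skip_connect", "none",
       "nor_conv_1x1", "avg_pool_3x3", "nor_conv_3x3"]
adj = (rng.random((6, 6)) < 0.3).astype(float)  # placeholder wiring
print(gcn_score(ops, adj))
```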
- AZ-NAS: Assembling Zero-Cost Proxies for Network Architecture Search [30.64117903216323]
Training-free network architecture search (NAS) aims to discover high-performing networks with zero-cost proxies.
We propose AZ-NAS, a novel approach that leverages an ensemble of various zero-cost proxies to enhance the correlation between the predicted ranking of networks and the ground truth.
Results conclusively demonstrate the efficacy and efficiency of AZ-NAS, outperforming state-of-the-art methods on standard benchmarks.
arXiv Detail & Related papers (2024-03-28T08:44:36Z)
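A hedged sketch of the proxy-ensembling idea behind AZ-NAS (above): convert each zero-cost proxy's scores to ranks and aggregate them, so candidates ranked highly by all proxies rise to the top. The aggregation rule here (summing ranks) is a simplification, not AZ-NAS's actual scheme.

```python
# Illustrative rank aggregation over zero-cost proxies (simplified; AZ-NAS's
# actual proxies and aggregation differ).
import numpy as np

def rank(scores):
    """Convert raw proxy scores to ranks (0 = worst)."""
    order = np.argsort(scores)
    r = np.empty_like(order)
    r[order] = np.arange(len(scores))
    return r

def ensemble_rank(proxy_scores):
    """proxy_scores: dict proxy_name -> per-candidate scores. Summing
    per-proxy ranks rewards networks ranked highly by every proxy."""
    total = sum(rank(np.asarray(s)) for s in proxy_scores.values())
    return np.argsort(-total)  # candidate indices, best first

scores = {"proxy_a": [0.2, 0.9, 0.5], "proxy_b": [10.0, 30.0, 20.0]}
print(ensemble_rank(scores))  # candidate 1 wins under both proxies
```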
- SiGeo: Sub-One-Shot NAS via Information Theory and Geometry of Loss Landscape [14.550053893504764]
We introduce a "sub-one-shot" paradigm that serves as a bridge between zero-shot and one-shot NAS.
In sub-one-shot NAS, the supernet is trained using only a small subset of the training data, a phase we refer to as "warm-up".
We present SiGeo, a proxy founded on a novel theoretical framework that connects the supernet warm-up with the efficacy of the proxy.
arXiv Detail & Related papers (2023-11-22T05:25:24Z) - Extensible Proxy for Efficient NAS [38.124755703499886]
Neural Architecture Search (NAS) is an approach to automatically designing deep neural networks (DNNs).
NAS proxies are proposed to address the demanding computational cost of NAS, such that each candidate architecture requires only one iteration of backpropagation.
Our experiments confirm the effectiveness of both Eproxy and Eproxy+DPS.
arXiv Detail & Related papers (2022-10-17T22:18:22Z)
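To make the one-backpropagation-per-candidate idea concrete, here is a generic single-backprop zero-cost score (a SNIP-style saliency) as a stand-in. Eproxy's actual proxy involves a learned few-shot task, so this is illustrative only.

```python
# Sketch of the single-backprop proxy pattern that Eproxy's summary refers
# to; this SNIP-style saliency is a generic stand-in, not Eproxy itself.
import torch
import torch.nn as nn

def one_backprop_score(net, in_shape=(8, 3, 32, 32)):
    """One forward/backward pass on random data, then sum |weight * grad|
    over all parameters as a cheap trainability signal."""
    x = torch.randn(in_shape)
    y = torch.randint(0, 10, (in_shape[0],))
    loss = nn.functional.cross_entropy(net(x), y)
    loss.backward()  # the single backpropagation per candidate
    return sum((p * p.grad).abs().sum().item()
               for p in net.parameters() if p.grad is not None)

candidate = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                          nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                          nn.Linear(16, 10))
print(one_backprop_score(candidate))
```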
- BaLeNAS: Differentiable Architecture Search via the Bayesian Learning Rule [95.56873042777316]
Differentiable Architecture Search (DARTS) has received massive attention in recent years, mainly because it significantly reduces the computational cost.
This paper formulates the neural architecture search as a distribution learning problem through relaxing the architecture weights into Gaussian distributions.
We demonstrate how the differentiable NAS benefits from Bayesian principles, enhancing exploration and improving stability.
arXiv Detail & Related papers (2021-11-25T18:13:42Z)
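The core relaxation in BaLeNAS (above) can be illustrated in a few lines: architecture weights become Gaussian random variables sampled via the reparameterization trick, rather than fixed logits. Only the sampling step is shown here; the Bayesian-learning-rule update itself is omitted.

```python
# Toy illustration of relaxing DARTS-style architecture weights into
# Gaussians (sampling step only; not BaLeNAS's full optimizer).
import torch

n_edges, n_ops = 6, 5
mu = torch.zeros(n_edges, n_ops, requires_grad=True)            # means
log_sigma = torch.full((n_edges, n_ops), -2.0, requires_grad=True)

def sample_arch_weights():
    """Reparameterization trick: theta = mu + sigma * eps; a per-edge
    softmax then yields stochastic operation mixing weights."""
    eps = torch.randn_like(mu)
    theta = mu + log_sigma.exp() * eps
    return torch.softmax(theta, dim=-1)

alphas = sample_arch_weights()  # differentiable w.r.t. mu and log_sigma
print(alphas.sum(dim=-1))       # each edge's mixing weights sum to 1
```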
- Weak NAS Predictors Are All You Need [91.11570424233709]
Recent predictor-based NAS approaches attempt to solve the problem with two key steps: sampling some architecture-performance pairs and fitting a proxy accuracy predictor.
We shift the paradigm from finding a complicated predictor that covers the whole architecture space to a set of weaker predictors that progressively move towards the high-performance sub-space.
Our method requires fewer samples to find the top-performing architectures on NAS-Bench-101 and NAS-Bench-201, and it achieves state-of-the-art ImageNet performance on the NASNet search space.
arXiv Detail & Related papers (2021-02-21T01:58:43Z)
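A toy version of the progressive weak-predictor loop summarized above: repeatedly sample and evaluate a few candidates, fit a weak predictor on everything seen so far, and shrink the candidate pool toward the predicted high-performance sub-space. The linear predictor and synthetic objective are stand-ins for the paper's learned predictors and real search spaces.

```python
# Simplified progressive weak-predictor search on a synthetic space.
import numpy as np

rng = np.random.default_rng(0)
space = rng.random((1000, 4))                        # toy "architectures"
true_acc = lambda a: 1.0 - ((a - 0.7) ** 2).sum(1)   # hidden ground truth

pool = np.arange(len(space))
samples, labels = [], []
for step in range(4):
    # Sample a few architectures from the current pool and "train" them.
    picks = rng.choice(pool, size=20, replace=False)
    samples.extend(picks)
    labels.extend(true_acc(space[picks]))
    # Fit a weak (linear least-squares) predictor on everything seen.
    X = np.c_[space[samples], np.ones(len(samples))]
    w, *_ = np.linalg.lstsq(X, np.array(labels), rcond=None)
    # Shrink the pool toward the predicted high-performance sub-space.
    pred = np.c_[space[pool], np.ones(len(pool))] @ w
    pool = pool[np.argsort(-pred)[: max(50, len(pool) // 4)]]

best = pool[np.argmax(true_acc(space[pool]))]
print("best found:", best, true_acc(space[[best]])[0])
```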
- AdvantageNAS: Efficient Neural Architecture Search with Credit Assignment [23.988393741948485]
We propose a novel search strategy for one-shot and sparse propagation NAS, namely AdvantageNAS.
AdvantageNAS is a gradient-based approach that improves the search efficiency by introducing credit assignment in gradient estimation for architecture updates.
Experiments on the NAS-Bench-201 and PTB datasets show that AdvantageNAS discovers an architecture with higher performance under a limited time budget.
arXiv Detail & Related papers (2020-12-11T05:45:03Z)
- Binarized Neural Architecture Search for Efficient Object Recognition [120.23378346337311]
Binarized neural architecture search (BNAS) produces extremely compressed models to reduce huge computational cost on embedded devices for edge computing.
An accuracy of 96.53% vs. 97.22% is achieved on the CIFAR-10 dataset, but with a significantly compressed model, and the search is 40% faster than the state-of-the-art PC-DARTS.
arXiv Detail & Related papers (2020-09-08T15:51:23Z)
- DA-NAS: Data Adapted Pruning for Efficient Neural Architecture Search [76.9225014200746]
Efficient search is a core issue in Neural Architecture Search (NAS).
We present DA-NAS, which can directly search the architecture for large-scale target tasks while allowing a large candidate set in a more efficient manner.
It is 2x faster than previous methods while its accuracy is currently state-of-the-art, at 76.2% under a small FLOPs constraint.
arXiv Detail & Related papers (2020-03-27T17:55:21Z)
- EcoNAS: Finding Proxies for Economical Neural Architecture Search [130.59673917196994]
In this paper, we observe that most existing proxies exhibit different behaviors in maintaining the rank consistency among network candidates.
Inspired by these observations, we present a reliable proxy and further formulate a hierarchical proxy strategy.
The strategy spends more computation on candidate networks that are potentially more accurate, while discarding unpromising ones at an early stage with a fast proxy.
arXiv Detail & Related papers (2020-01-05T13:29:02Z)
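A hypothetical sketch of the hierarchical proxy strategy described above: a cheap, noisy proxy filters the candidate set early, and a slower, more reliable proxy is spent only on the survivors. Both proxies here are synthetic stand-ins, not EcoNAS's actual reduced-training settings.

```python
# Hierarchical proxy strategy, illustrated with synthetic stand-in proxies.
import random

random.seed(0)
candidates = [{"id": i, "quality": random.random()} for i in range(256)]

def fast_proxy(arch):   # e.g., few epochs at low input resolution
    return arch["quality"] + random.gauss(0, 0.2)   # noisy but cheap

def slow_proxy(arch):   # e.g., more epochs at full resolution
    return arch["quality"] + random.gauss(0, 0.02)  # accurate but costly

# Discard unpromising candidates early; refine only the survivors.
survivors = sorted(candidates, key=fast_proxy, reverse=True)[:32]
best = max(survivors, key=slow_proxy)
print("selected:", best["id"], "true quality:", round(best["quality"], 3))
```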
- DDPNAS: Efficient Neural Architecture Search via Dynamic Distribution Pruning [135.27931587381596]
We propose an efficient and unified NAS framework termed DDPNAS via dynamic distribution pruning.
In particular, we first sample architectures from a joint categorical distribution. Then the search space is dynamically pruned and its distribution is updated every few epochs.
With the proposed efficient network generation method, we directly obtain the optimal neural architectures under given constraints.
arXiv Detail & Related papers (2019-05-28T06:35:52Z)
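The DDPNAS summary above describes sampling from a joint categorical distribution that is periodically pruned. The toy loop below mimics that pattern with a synthetic reward; the reinforcement rule and pruning threshold are invented for illustration.

```python
# Toy dynamic distribution pruning: sample architectures from per-layer
# categorical distributions, reinforce choices by reward, prune collapsed
# operations every few epochs. The reward is synthetic.
import numpy as np

rng = np.random.default_rng(0)
n_layers, n_ops = 4, 5
scores = np.ones((n_layers, n_ops))           # unnormalized preferences
best_op = rng.integers(n_ops, size=n_layers)  # hidden optimum (toy reward)

for epoch in range(30):
    probs = scores / scores.sum(axis=1, keepdims=True)
    # Sample a batch of architectures from the joint categorical dist.
    for _ in range(16):
        arch = np.array([rng.choice(n_ops, p=p) for p in probs])
        reward = (arch == best_op).mean()            # stand-in for accuracy
        scores[np.arange(n_layers), arch] += reward  # reinforce choices
    # Every few epochs, prune operations whose probability collapsed.
    if epoch % 5 == 4:
        scores[probs < 0.05] = 0.0

print("chosen ops:", scores.argmax(axis=1), "target:", best_op)
```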
This list is automatically generated from the titles and abstracts of the papers on this site.