Transferrable Surrogates in Expressive Neural Architecture Search Spaces
- URL: http://arxiv.org/abs/2504.12971v2
- Date: Fri, 18 Apr 2025 17:49:14 GMT
- Title: Transferrable Surrogates in Expressive Neural Architecture Search Spaces
- Authors: Shiwen Qin, Gabriela Kadlecová, Martin Pilát, Shay B. Cohen, Roman Neruda, Elliot J. Crowley, Jovita Lukasik, Linus Ericsson
- Abstract summary: We investigate surrogate model training for improving search in expressive NAS search spaces based on context-free grammars. We show that i) surrogate models trained either using zero-cost-proxy metrics and neural graph features (GRAF) or by fine-tuning an off-the-shelf LM have high predictive power for the performance of architectures both within and across datasets.
- Score: 20.539222762754054
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Neural architecture search (NAS) faces a challenge in balancing the exploration of expressive, broad search spaces that enable architectural innovation with the need for efficient evaluation of architectures to effectively search such spaces. We investigate surrogate model training for improving search in highly expressive NAS search spaces based on context-free grammars. We show that i) surrogate models trained either using zero-cost-proxy metrics and neural graph features (GRAF) or by fine-tuning an off-the-shelf LM have high predictive power for the performance of architectures both within and across datasets, ii) these surrogates can be used to filter out bad architectures when searching on novel datasets, thereby significantly speeding up search and achieving better final performances, and iii) the surrogates can be further used directly as the search objective for huge speed-ups.
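The abstract describes two ways the surrogate is used: as a filter that discards weak candidates before any expensive training, and directly as the search objective. As an illustration of the filtering pattern only, here is a minimal sketch using a generic gradient-boosted regressor from scikit-learn; the architecture sampler, the feature extractor standing in for zero-cost proxies and GRAF features, and the accuracy oracle are all hypothetical placeholders, not the paper's implementation.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

def sample_architecture():
    # Placeholder: the paper samples architectures from a context-free-grammar
    # search space; here an architecture is just a random integer id.
    return int(rng.integers(0, 1_000_000))

def architecture_features(arch):
    # Placeholder for zero-cost-proxy metrics and GRAF-style graph features.
    return rng.normal(size=8)

def full_training_accuracy(arch):
    # Placeholder for the expensive step: fully training `arch` on the target dataset.
    return float(rng.uniform(0.5, 0.95))

# 1) Fit the surrogate on architectures already evaluated (e.g. on source datasets).
train_archs = [sample_architecture() for _ in range(200)]
X_train = np.stack([architecture_features(a) for a in train_archs])
y_train = np.array([full_training_accuracy(a) for a in train_archs])
surrogate = GradientBoostingRegressor().fit(X_train, y_train)

# 2) On a novel dataset, rank a large pool of candidates with the cheap surrogate
#    and only pay the full training cost for the most promising few.
candidates = [sample_architecture() for _ in range(1000)]
X_cand = np.stack([architecture_features(a) for a in candidates])
predicted = surrogate.predict(X_cand)
top_k = [candidates[i] for i in np.argsort(predicted)[-10:]]
results = {a: full_training_accuracy(a) for a in top_k}  # expensive evaluations, heavily filtered
print(max(results.values()))
```

In the extreme case described in the abstract, the surrogate's predictions replace the expensive evaluation entirely and serve as the search objective, which is where the largest speed-ups are reported.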
Related papers
- ZeroLM: Data-Free Transformer Architecture Search for Language Models [54.83882149157548]
Current automated proxy discovery approaches suffer from extended search times, susceptibility to data overfitting, and structural complexity.
This paper introduces a novel zero-cost proxy methodology that quantifies model capacity through efficient weight statistics.
Our evaluation demonstrates the superiority of this approach, achieving a Spearman's rho of 0.76 and Kendall's tau of 0.53 on the FlexiBERT benchmark.
arXiv Detail & Related papers (2025-03-24T13:11:22Z) - Masked Autoencoders Are Robust Neural Architecture Search Learners [14.965550562292476]
We propose a novel NAS framework based on Masked Autoencoders (MAE) that eliminates the need for labeled data during the search process.
By replacing the supervised learning objective with an image reconstruction task, our approach enables the robust discovery of network architectures.
arXiv Detail & Related papers (2023-11-20T13:45:21Z) - Construction of Hierarchical Neural Architecture Search Spaces based on Context-free Grammars [66.05096551112932]
We introduce a unifying search space design framework based on context-free grammars.
By leveraging the properties of these grammars, we effectively enable search over the complete architecture.
We show that our search strategy can be superior to existing Neural Architecture Search approaches.
arXiv Detail & Related papers (2022-11-03T14:23:00Z) - Searching a High-Performance Feature Extractor for Text Recognition Network [92.12492627169108]
We design a domain-specific search space by exploring principles for having good feature extractors.
As the space is huge and has a complex structure, no existing NAS algorithms can be applied.
We propose a two-stage algorithm to effectively search in the space.
arXiv Detail & Related papers (2022-09-27T03:49:04Z) - Search Space Adaptation for Differentiable Neural Architecture Search in Image Classification [15.641353388251465]
Differentiable neural architecture search (NAS) has had a great impact by reducing the search cost to the level of training a single network.
In this paper, we propose a search space adaptation scheme that introduces a search scope.
The effectiveness of the proposed method is demonstrated with ProxylessNAS for the image classification task.
arXiv Detail & Related papers (2022-06-05T05:27:12Z) - Learning Where To Look -- Generative NAS is Surprisingly Efficient [11.83842808044211]
We propose a generative model, paired with a surrogate predictor, that iteratively learns to generate samples from increasingly promising latent subspaces.
This approach leads to very effective and efficient architecture search while keeping the number of queries low.
arXiv Detail & Related papers (2022-03-16T16:27:11Z) - $\beta$-DARTS: Beta-Decay Regularization for Differentiable Architecture Search [85.84110365657455]
We propose a simple-but-efficient regularization method, termed Beta-Decay, to regularize the DARTS-based NAS search process.
Experimental results on NAS-Bench-201 show that our proposed method helps stabilize the search process and makes the searched networks more transferable across different datasets.
arXiv Detail & Related papers (2022-03-03T11:47:14Z) - Edge-featured Graph Neural Architecture Search [131.4361207769865]
We propose Edge-featured Graph Neural Architecture Search to find the optimal GNN architecture.
Specifically, we design rich entity and edge updating operations to learn high-order representations.
We show EGNAS can find better GNNs with higher performance than current state-of-the-art human-designed and search-based GNNs.
arXiv Detail & Related papers (2021-09-03T07:53:18Z) - Evolving Search Space for Neural Architecture Search [70.71153433676024]
We present a Neural Search-space Evolution (NSE) scheme that amplifies the results from the previous effort by maintaining an optimized search space subset.
We achieve 77.3% top-1 retrain accuracy on ImageNet with 333M FLOPs, a state-of-the-art result.
When a latency constraint is adopted, our method also outperforms the previous best-performing mobile models, with a 77.9% top-1 retrain accuracy.
arXiv Detail & Related papers (2020-11-22T01:11:19Z) - Interpretable Neural Architecture Search via Bayesian Optimisation with Weisfeiler-Lehman Kernels [17.945881805452288]
Current neural architecture search (NAS) strategies focus on finding a single, good architecture.
We propose a Bayesian optimisation approach for NAS that combines the Weisfeiler-Lehman graph kernel with a Gaussian process surrogate.
Our method affords interpretability by discovering useful network features and their corresponding impact on the network performance.
arXiv Detail & Related papers (2020-06-13T04:10:34Z) - Neural Architecture Generator Optimization [9.082931889304723]
We are the first to investigate casting NAS as the problem of finding an optimal network generator.
We propose a new, hierarchical and graph-based search space capable of representing an extremely large variety of network types.
arXiv Detail & Related papers (2020-04-03T06:38:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.