Weight-Entanglement Meets Gradient-Based Neural Architecture Search
- URL: http://arxiv.org/abs/2312.10440v1
- Date: Sat, 16 Dec 2023 13:15:44 GMT
- Title: Weight-Entanglement Meets Gradient-Based Neural Architecture Search
- Authors: Rhea Sanjay Sukthanker, Arjun Krishnakumar, Mahmoud Safari, Frank Hutter
- Abstract summary: Weight sharing is a fundamental concept in neural architecture search (NAS).
Weight entanglement has emerged as a technique for intricate parameter sharing among architectures within macro-level search spaces.
Blackbox optimization methods have been commonly employed, particularly in conjunction with supernet training, to maintain search efficiency.
This paper proposes a novel scheme to adapt gradient-based methods for weight-entangled spaces.
- Score: 44.655931666517645
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Weight sharing is a fundamental concept in neural architecture search (NAS),
enabling gradient-based methods to explore cell-based architecture spaces
significantly faster than traditional blackbox approaches. In parallel, weight
entanglement has emerged as a technique for intricate parameter sharing among
architectures within macro-level search spaces. Since weight-entanglement poses
compatibility challenges for gradient-based NAS methods, these two paradigms
have largely developed independently in parallel sub-communities. This paper
aims to bridge the gap between these sub-communities by proposing a novel
scheme to adapt gradient-based methods for weight-entangled spaces. This
enables us to conduct an in-depth comparative assessment and analysis of the
performance of gradient-based NAS in weight-entangled search spaces. Our
findings reveal that this integration of weight-entanglement and gradient-based
NAS brings forth the various benefits of gradient-based methods (enhanced
performance, improved supernet training properties and superior any-time
performance), while preserving the memory efficiency of weight-entangled
spaces. The code for our work is openly accessible at
https://anonymous.4open.science/r/TangleNAS-527C
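To make the combination described in the abstract more concrete, below is a minimal, illustrative sketch (not the paper's actual TangleNAS implementation; all class and variable names are invented for this example) of how weight entanglement and gradient-based search can coexist: every candidate hidden width of a toy MLP block is a slice of one shared maximal weight tensor, and a DARTS-style softmax over architecture parameters mixes the candidates, so both the entangled weights and the architecture parameters receive ordinary gradients.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EntangledMLP(nn.Module):
    """Toy MLP block with a searchable hidden width.

    All candidate widths share ("are entangled in") one maximal weight
    tensor: a smaller candidate uses the leading slice of it. A softmax
    over architecture logits `alpha` mixes the candidate outputs, so the
    search is gradient-based while memory stays that of a single block.
    """

    def __init__(self, in_dim=32, out_dim=32, candidate_widths=(16, 32, 64)):
        super().__init__()
        self.widths = candidate_widths
        max_w = max(candidate_widths)
        # Shared (entangled) parameters of maximal size.
        self.w1 = nn.Parameter(torch.randn(max_w, in_dim) * 0.02)
        self.w2 = nn.Parameter(torch.randn(out_dim, max_w) * 0.02)
        # One architecture logit per candidate width.
        self.alpha = nn.Parameter(torch.zeros(len(candidate_widths)))

    def forward(self, x):
        mix = F.softmax(self.alpha, dim=0)
        out = 0.0
        for weight, h in zip(mix, self.widths):
            hidden = F.relu(F.linear(x, self.w1[:h, :]))            # slice rows
            out = out + weight * F.linear(hidden, self.w2[:, :h])   # slice cols
        return out

# Supernet weights and architecture parameters train jointly by gradient descent.
block = EntangledMLP()
x, y = torch.randn(8, 32), torch.randn(8, 32)
loss = F.mse_loss(block(x), y)
loss.backward()
print(block.alpha.grad)  # the architecture logits receive gradients too
```

Summing softmax-weighted candidate outputs is what keeps the scheme differentiable, while no candidate owns separate weights, which preserves the memory footprint of a weight-entangled supernet.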
Related papers
- Generalizing Few-Shot NAS with Gradient Matching [165.5690495295074]
One-Shot methods train one supernet to approximate the performance of every architecture in the search space via weight-sharing.
Few-Shot NAS reduces the level of weight-sharing by splitting the One-Shot supernet into multiple separated sub-supernets.
The proposed gradient-matching criterion for deciding these splits significantly outperforms its Few-Shot counterparts while surpassing previous comparable methods in terms of the accuracy of derived architectures.
arXiv Detail & Related papers (2022-03-29T03:06:16Z)
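As an illustration of the few-shot splitting idea summarized in the entry above, here is a generic sketch (not the paper's gradient-matching implementation): each choice at one selected edge induces its own sub-supernet, so architectures that disagree on that edge no longer share weights. The toy search space, the `split_supernet` helper and the stand-in supernet are hypothetical.

```python
import copy
import itertools

# Toy search space: 3 edges, each choosing one of 3 ops -> 27 architectures.
OPS = ("conv3x3", "conv5x5", "skip")
EDGES = ("e0", "e1", "e2")
all_archs = list(itertools.product(OPS, repeat=len(EDGES)))

def split_supernet(supernet, split_edge=0):
    """Few-shot style split: one sub-supernet per choice at `split_edge`.

    Architectures that agree on the split edge keep sharing weights inside
    the same sub-supernet; architectures that differ no longer share,
    which reduces the level of weight sharing and its interference.
    """
    sub_supernets = {}
    for op in OPS:
        archs = [a for a in all_archs if a[split_edge] == op]
        sub_supernets[op] = (copy.deepcopy(supernet), archs)
    return sub_supernets

# In practice the starting point would be a trained one-shot supernet;
# a placeholder dict stands in for it here.
toy_supernet = {"shared_weights": "placeholder"}
subs = split_supernet(toy_supernet, split_edge=0)
for op, (_, archs) in subs.items():
    print(op, len(archs))  # 9 architectures per sub-supernet
```

The paper's contribution, per its title, is the criterion for choosing where to split (via gradient matching), which this sketch does not attempt to reproduce.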
- iDARTS: Differentiable Architecture Search with Stochastic Implicit Gradients [75.41173109807735]
Differentiable ARchiTecture Search (DARTS) has recently become a mainstream approach to neural architecture search (NAS).
We tackle the hypergradient computation in DARTS based on the implicit function theorem.
We show that the architecture optimisation with the proposed method, named iDARTS, is expected to converge to a stationary point.
arXiv Detail & Related papers (2021-06-21T00:44:11Z)
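For reference, the hypergradient targeted by implicit-gradient methods such as the entry above follows from applying the implicit function theorem to the DARTS bilevel problem; the statement below uses our own notation and is a standard derivation, not quoted from the iDARTS paper.

```latex
% Bilevel formulation: architecture parameters \alpha, network weights w.
\min_{\alpha}\ \mathcal{L}_{\mathrm{val}}\bigl(w^{*}(\alpha), \alpha\bigr)
\qquad \text{s.t.} \qquad
w^{*}(\alpha) = \arg\min_{w}\ \mathcal{L}_{\mathrm{train}}(w, \alpha)

% At the inner optimum, the implicit function theorem gives the hypergradient
\nabla_{\alpha}\mathcal{L}_{\mathrm{val}}
  = \frac{\partial \mathcal{L}_{\mathrm{val}}}{\partial \alpha}
  - \frac{\partial \mathcal{L}_{\mathrm{val}}}{\partial w}
    \left(\frac{\partial^{2}\mathcal{L}_{\mathrm{train}}}{\partial w\,\partial w^{\top}}\right)^{-1}
    \frac{\partial^{2}\mathcal{L}_{\mathrm{train}}}{\partial w\,\partial \alpha^{\top}}
```

iDARTS makes this practical by replacing the exact inverse-Hessian term with a stochastic approximation, which is what the convergence statement in the summary refers to.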
- Landmark Regularization: Ranking Guided Super-Net Training in Neural Architecture Search [70.57382341642418]
Weight sharing has become a de facto standard in neural architecture search because it enables the search to be done on commodity hardware.
Recent works have empirically shown a ranking disorder between the performance of stand-alone architectures and that of the corresponding shared-weight networks.
We propose a regularization term that aims to maximize the correlation between the performance rankings of the shared-weight network and that of the standalone architectures.
arXiv Detail & Related papers (2021-04-12T09:32:33Z)
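The regularization idea in the entry above can be illustrated with a minimal pairwise-hinge sketch: one possible instantiation, assuming a small set of "landmark" architectures with known stand-alone accuracies. This is not the paper's exact loss, and all names are illustrative.

```python
import torch

def ranking_regularizer(supernet_scores, standalone_scores, margin=0.0):
    """Pairwise hinge penalty: whenever the stand-alone evaluation ranks
    architecture i above architecture j, push the supernet to score i
    above j as well. Both inputs are 1-D tensors over the same set of
    landmark architectures (higher = better)."""
    penalty = supernet_scores.new_zeros(())
    n = len(standalone_scores)
    for i in range(n):
        for j in range(n):
            if standalone_scores[i] > standalone_scores[j]:
                # Violated (or barely satisfied) orderings contribute a hinge term.
                penalty = penalty + torch.relu(
                    margin - (supernet_scores[i] - supernet_scores[j])
                )
    return penalty / (n * (n - 1))

# Example: add the penalty to the usual supernet training loss.
sup = torch.tensor([0.71, 0.68, 0.74], requires_grad=True)  # supernet proxy scores
alone = torch.tensor([0.93, 0.90, 0.91])                     # stand-alone accuracies
reg = ranking_regularizer(sup, alone, margin=0.01)
reg.backward()
```

Adding such a term to supernet training penalizes pairs whose shared-weight ordering contradicts the stand-alone ordering, which is one way to maximize the correlation between the two performance rankings.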
- Smooth Variational Graph Embeddings for Efficient Neural Architecture Search [41.62970837629573]
We propose a two-sided variational graph autoencoder, which allows us to smoothly encode and accurately reconstruct neural architectures from various search spaces.
We evaluate the proposed approach on neural architectures defined by the ENAS approach, the NAS-Bench-101 and the NAS-Bench-201 search spaces.
arXiv Detail & Related papers (2020-10-09T17:05:41Z)
- Weight-Sharing Neural Architecture Search: A Battle to Shrink the Optimization Gap [90.93522795555724]
Neural architecture search (NAS) has attracted increasing attention in both academia and industry.
Weight-sharing methods were proposed in which exponentially many architectures share weights in the same super-network.
This paper provides a literature review on NAS, in particular the weight-sharing methods.
arXiv Detail & Related papers (2020-08-04T11:57:03Z)
- DrNAS: Dirichlet Neural Architecture Search [88.56953713817545]
We treat the continuously relaxed architecture mixing weights as random variables, modeled by a Dirichlet distribution.
With recently developed pathwise derivatives, the Dirichlet parameters can be easily optimized with gradient-based optimizers.
To alleviate the large memory consumption of differentiable NAS, we propose a simple yet effective progressive learning scheme.
arXiv Detail & Related papers (2020-06-18T08:23:02Z)
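To illustrate the distributional relaxation described in the entry above, here is a small toy sketch (not the DrNAS code; the op list and parameter names are invented) that samples the op-mixing weights of a single mixed edge from a learnable Dirichlet distribution and backpropagates through the sample via PyTorch's reparameterized `rsample`, one concrete form of the pathwise derivatives mentioned in the summary.

```python
import torch
import torch.nn.functional as F
from torch.distributions import Dirichlet

# Toy mixed edge with 3 candidate ops acting on an 8-dim feature.
ops = torch.nn.ModuleList([
    torch.nn.Linear(8, 8),   # stand-ins for conv / pooling / skip candidates
    torch.nn.Linear(8, 8),
    torch.nn.Identity(),
])

# Learnable Dirichlet concentration parameters (kept positive via softplus).
beta = torch.nn.Parameter(torch.zeros(len(ops)))

x = torch.randn(4, 8)
concentration = F.softplus(beta) + 1e-3
# Pathwise (reparameterized) sample of the op-mixing weights: gradients
# flow from the loss back into `concentration` and hence into `beta`.
weights = Dirichlet(concentration).rsample()

out = sum(w * op(x) for w, op in zip(weights, ops))
loss = out.pow(2).mean()
loss.backward()
print(beta.grad)  # the Dirichlet parameters receive ordinary gradients
```

Because the sample is reparameterized, the concentration parameters are updated like any other weights, which is what allows the end-to-end, gradient-based optimization the summary describes.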