Towards Self-supervised and Weight-preserving Neural Architecture Search
- URL: http://arxiv.org/abs/2206.04125v1
- Date: Wed, 8 Jun 2022 18:48:05 GMT
- Title: Towards Self-supervised and Weight-preserving Neural Architecture Search
- Authors: Zhuowei Li, Yibo Gao, Zhenzhou Zha, Zhiqiang HU, Qing Xia, Shaoting
Zhang, Dimitris N. Metaxas
- Abstract summary: We propose the self-supervised and weight-preserving neural architecture search (SSWP-NAS) as an extension of the current NAS framework.
Experiments show that the architectures searched by the proposed framework achieve state-of-the-art accuracy on CIFAR-10, CIFAR-100, and ImageNet datasets.
- Score: 38.497608743382145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural architecture search (NAS) algorithms save tremendous labor from human
experts. Recent advancements further reduce the computational overhead to an
affordable level. However, it is still cumbersome to deploy the NAS techniques
in real-world applications due to the fussy procedures and the supervised
learning paradigm. In this work, we propose the self-supervised and
weight-preserving neural architecture search (SSWP-NAS) as an extension of the
current NAS framework by allowing the self-supervision and retaining the
concomitant weights discovered during the search stage. As such, we simplify
the workflow of NAS to a one-stage and proxy-free procedure. Experiments show
that the architectures searched by the proposed framework achieve
state-of-the-art accuracy on CIFAR-10, CIFAR-100, and ImageNet datasets without
using manual labels. Moreover, we show that employing the concomitant weights
as initialization consistently outperforms the random initialization and the
two-stage weight pre-training method by a clear margin under semi-supervised
learning scenarios. Codes are publicly available at
https://github.com/LzVv123456/SSWP-NAS.
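The central claim of the abstract is that the weights discovered alongside the architecture during the label-free search (the "concomitant" weights) can be retained and reused as initialization, rather than re-initializing the derived network before fine-tuning. Below is a minimal, hypothetical PyTorch sketch of that reuse pattern only; the network, the toy self-supervised surrogate objective, and all names are illustrative stand-ins, not the SSWP-NAS implementation.

```python
# Hypothetical sketch: reuse the "concomitant" weights found during a
# self-supervised search phase as initialization for supervised fine-tuning,
# instead of re-initializing the derived network from scratch.
# All modules, functions, and objectives here are illustrative, not the authors' API.
import copy
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Stand-in for the architecture derived by the search stage."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, num_classes)

    def forward(self, x):
        return self.head(self.features(x))

def self_supervised_search(steps: int = 5) -> TinyNet:
    """Placeholder for the one-stage, label-free search: here we just train
    the feature extractor on unlabeled data with a toy objective so that it
    ends up with non-random ("concomitant") weights."""
    net = TinyNet()
    opt = torch.optim.SGD(net.features.parameters(), lr=0.01)
    for _ in range(steps):
        x = torch.randn(8, 3, 32, 32)           # unlabeled data
        loss = (net.features(x) ** 2).mean()    # toy self-supervised surrogate
        opt.zero_grad(); loss.backward(); opt.step()
    return net

searched = self_supervised_search()

# Option A (the paper's claim): keep the concomitant weights as initialization.
finetune_net = copy.deepcopy(searched)

# Option B (baseline): same architecture, random initialization.
baseline_net = TinyNet()

# Both would now be fine-tuned on the (possibly small) labeled set.
print(sum(p.numel() for p in finetune_net.parameters()), "parameters ready for fine-tuning")
```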
Related papers
- DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions [121.05720140641189]
We develop a family of models with the distilling neural architecture (DNA) techniques.
Our proposed DNA models can rate all architecture candidates, as opposed to previous works that can only access a sub-search space using heuristic algorithms.
Our models achieve state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile convolutional network and a small vision transformer, respectively.
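A minimal sketch of the block-wise rating idea described above, assuming candidates are scored by how well each block can mimic a teacher's corresponding block features; the blocks, shapes, and names are toy stand-ins, not the DNA codebase.

```python
# Hedged sketch of block-wise rating by distillation: a candidate block is
# scored by how closely its output matches the corresponding teacher block's
# features (lower mimicking error = better rating).
import torch
import torch.nn as nn

teacher_block = nn.Sequential(nn.Conv2d(16, 16, 3, padding=1), nn.ReLU())

def block_distill_score(candidate_block: nn.Module, steps: int = 20) -> float:
    """Train the candidate block to mimic the (frozen) teacher block and
    return its final mimicking error."""
    opt = torch.optim.Adam(candidate_block.parameters(), lr=1e-3)
    loss = torch.tensor(0.0)
    for _ in range(steps):
        x = torch.randn(8, 16, 8, 8)
        with torch.no_grad():
            target = teacher_block(x)
        loss = nn.functional.mse_loss(candidate_block(x), target)
        opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

candidates = {
    "conv3x3": nn.Conv2d(16, 16, 3, padding=1),
    "conv5x5": nn.Conv2d(16, 16, 5, padding=2),
    "conv1x1": nn.Conv2d(16, 16, 1),
}
ratings = {name: block_distill_score(op) for name, op in candidates.items()}
print(sorted(ratings.items(), key=lambda kv: kv[1]))  # every candidate gets a rating
```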
arXiv Detail & Related papers (2024-03-02T22:16:47Z) - SiGeo: Sub-One-Shot NAS via Information Theory and Geometry of Loss
Landscape [14.550053893504764]
We introduce a "sub-one-shot" paradigm that serves as a bridge between zero-shot and one-shot NAS.
In sub-one-shot NAS, the supernet is trained using only a small subset of the training data, a phase we refer to as "warm-up".
We present SiGeo, a proxy founded on a novel theoretical framework that connects the supernet warm-up with the efficacy of the proxy.
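The following toy sketch illustrates only the "warm-up" phase described above: a candidate is briefly trained on a small subset of the data before being scored. The proxy used here (the loss after warm-up) is a generic stand-in, not SiGeo's information-theoretic score, and every name is hypothetical.

```python
# Hedged sketch of the "sub-one-shot" idea: warm up a (toy) model on a small
# subset of the data, then score it with a cheap proxy.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset, Subset

full_data = TensorDataset(torch.randn(1000, 16), torch.randint(0, 10, (1000,)))
warmup_data = Subset(full_data, range(50))             # small "warm-up" subset
loader = DataLoader(warmup_data, batch_size=10)

def warm_up_and_score(model: nn.Module, epochs: int = 2) -> float:
    opt = torch.optim.SGD(model.parameters(), lr=0.05)
    loss = torch.tensor(0.0)
    for _ in range(epochs):
        for x, y in loader:
            loss = nn.functional.cross_entropy(model(x), y)
            opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()  # stand-in proxy score after the brief warm-up

candidate_a = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 10))
candidate_b = nn.Sequential(nn.Linear(16, 10))
print({"a": warm_up_and_score(candidate_a), "b": warm_up_and_score(candidate_b)})
```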
arXiv Detail & Related papers (2023-11-22T05:25:24Z) - DCP-NAS: Discrepant Child-Parent Neural Architecture Search for 1-bit
CNNs [53.82853297675979]
1-bit convolutional neural networks (CNNs) with binary weights and activations show their potential for resource-limited embedded devices.
One natural approach is to use 1-bit CNNs to reduce the computation and memory cost of NAS.
We introduce Discrepant Child-Parent Neural Architecture Search (DCP-NAS) to efficiently search 1-bit CNNs.
arXiv Detail & Related papers (2023-06-27T11:28:29Z) - NASiam: Efficient Representation Learning using Neural Architecture
Search for Siamese Networks [76.8112416450677]
Siamese networks are among the most popular methods for self-supervised visual representation learning (SSL).
NASiam is a novel approach that, for the first time, uses differentiable NAS to improve the multilayer perceptron projector and predictor (encoder/predictor pair).
NASiam reaches competitive performance on both small-scale (i.e., CIFAR-10/CIFAR-100) and large-scale (i.e., ImageNet) image classification datasets while costing only a few GPU hours.
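Differentiable NAS over an MLP projector can be pictured as a softmax-weighted mixture of candidate layers with learnable architecture parameters. The sketch below shows that generic mixed-op construction; the candidate set and dimensions are illustrative assumptions, not NASiam's actual search space.

```python
# Hedged sketch: a DARTS-style "mixed op" for one projector layer, i.e. a
# softmax-weighted sum over candidate operations with learnable architecture
# parameters (alpha).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedLayer(nn.Module):
    def __init__(self, dim: int = 128):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, dim), nn.BatchNorm1d(dim), nn.ReLU()),
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU()),
            nn.Identity(),
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))  # architecture parameters

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

projector = nn.Sequential(MixedLayer(128), nn.Linear(128, 64))
x = torch.randn(32, 128)                       # stand-in encoder features
print(projector(x).shape)                      # torch.Size([32, 64])
# After optimizing alpha and weights jointly, the op with the largest alpha
# would be kept to form the final projector.
```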
arXiv Detail & Related papers (2023-01-31T19:48:37Z) - Generalization Properties of NAS under Activation and Skip Connection
Search [66.8386847112332]
We study the generalization properties of Neural Architecture Search (NAS) under a unifying framework.
We derive the lower (and upper) bounds of the minimum eigenvalue of the Neural Tangent Kernel (NTK) under the (in)finite-width regime.
We show how the derived results can guide NAS to select the top-performing architectures, even in the case without training.
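The quantity at the center of this analysis is the minimum eigenvalue of the NTK Gram matrix. Below is a generic PyTorch sketch of computing it empirically for a toy scalar-output network at initialization; the bounds themselves and their use for training-free architecture selection are derived in the paper, and nothing here reflects the authors' code.

```python
# Hedged sketch: compute an empirical NTK Gram matrix
# K_ij = <grad_theta f(x_i), grad_theta f(x_j)> for a tiny network at
# initialization, and inspect its minimum eigenvalue.
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
xs = torch.randn(16, 8)        # a small batch of inputs

def flat_grad(x: torch.Tensor) -> torch.Tensor:
    """Gradient of the scalar output w.r.t. all parameters, flattened."""
    out = net(x.unsqueeze(0)).squeeze()
    grads = torch.autograd.grad(out, list(net.parameters()))
    return torch.cat([g.reshape(-1) for g in grads])

J = torch.stack([flat_grad(x) for x in xs])      # (n, num_params) Jacobian
K = J @ J.T                                      # empirical NTK Gram matrix
lambda_min = torch.linalg.eigvalsh(K).min()
print(f"min NTK eigenvalue at init: {lambda_min.item():.4e}")
```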
arXiv Detail & Related papers (2022-09-15T12:11:41Z) - PRE-NAS: Predictor-assisted Evolutionary Neural Architecture Search [34.06028035262884]
We propose a novel evolutionary-based NAS strategy, Predictor-assisted E-NAS (PRE-NAS).
PRE-NAS leverages new evolutionary search strategies and integrates high-fidelity weight inheritance over generations.
Experiments on NAS-Bench-201 and DARTS search spaces show that PRE-NAS can outperform state-of-the-art NAS methods.
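One evolutionary step with predictor assistance and weight inheritance can be sketched as: mutate a parent architecture, let a surrogate predictor rank the children, and let the selected child inherit the parent parameters whose shapes still match. The encoding, the predictor, and the "architecture" below are toy stand-ins, not PRE-NAS's actual components.

```python
# Hedged sketch of a predictor-assisted evolutionary step with weight inheritance.
import random
import torch
import torch.nn as nn

def build(widths):
    layers, dims = [], [16] + list(widths) + [10]
    for i in range(len(dims) - 1):
        layers += [nn.Linear(dims[i], dims[i + 1]), nn.ReLU()]
    return nn.Sequential(*layers[:-1])            # drop the trailing ReLU

def surrogate_predictor(widths):
    # Toy stand-in for a learned accuracy predictor: favors wider nets.
    return sum(widths) / (100.0 * len(widths))

def mutate(widths):
    child = list(widths)
    child[random.randrange(len(child))] = random.choice([16, 32, 64])
    return tuple(child)

parent_widths = (32, 32)
parent_net = build(parent_widths)

children = [mutate(parent_widths) for _ in range(8)]
best_child = max(children, key=surrogate_predictor)   # predictor pre-screens offspring
child_net = build(best_child)

# Weight inheritance: copy parameters whose shapes still match the parent.
child_state = child_net.state_dict()
inherited = {k: v for k, v in parent_net.state_dict().items()
             if k in child_state and v.shape == child_state[k].shape}
child_net.load_state_dict(inherited, strict=False)
print(f"child {best_child} inherited {len(inherited)} tensors from parent")
```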
arXiv Detail & Related papers (2022-04-27T06:40:39Z) - AceNAS: Learning to Rank Ace Neural Architectures with Weak Supervision
of Weight Sharing [6.171090327531059]
We introduce Learning to Rank methods to select the best (ace) architectures from a search space.
We also propose to leverage weak supervision from weight sharing by pretraining architecture representation on weak labels obtained from the super-net.
Experiments on NAS benchmarks and large-scale search spaces demonstrate that our approach outperforms SOTA with a significantly reduced search cost.
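The summary's learning-to-rank idea can be illustrated with a pairwise (RankNet-style) loss trained on weak labels such as super-net accuracies. The sketch below assumes random architecture encodings and random weak labels purely for illustration; it is not the AceNAS model or training pipeline.

```python
# Hedged sketch of learning-to-rank on weak labels: a small scorer is trained
# with a pairwise loss so that architectures with higher weak super-net
# accuracy receive higher scores.
import torch
import torch.nn as nn

torch.manual_seed(0)
arch_encodings = torch.randn(64, 12)                   # 64 architectures, 12-d encodings
weak_acc = torch.rand(64)                              # weak labels from the super-net

scorer = nn.Sequential(nn.Linear(12, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(scorer.parameters(), lr=1e-2)

for step in range(200):
    i, j = torch.randint(0, 64, (2, 16))               # random architecture pairs
    s_i = scorer(arch_encodings[i]).squeeze(-1)
    s_j = scorer(arch_encodings[j]).squeeze(-1)
    target = (weak_acc[i] > weak_acc[j]).float()       # 1 if i should outrank j
    loss = nn.functional.binary_cross_entropy_with_logits(s_i - s_j, target)
    opt.zero_grad(); loss.backward(); opt.step()

# Rank architectures by predicted score; the top ("ace") ones would then be
# evaluated more carefully.
print(scorer(arch_encodings).squeeze(-1).topk(5).indices.tolist())
```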
arXiv Detail & Related papers (2021-08-06T08:31:42Z) - Pretraining Neural Architecture Search Controllers with Locality-based
Self-Supervised Learning [0.0]
We propose a pretraining scheme that can be applied to controller-based NAS.
Our method, a locality-based self-supervised classification task, leverages the structural similarity of network architectures to obtain good architecture representations.
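One way to picture a locality-based pretext task, under loose assumptions about what "structural similarity" means here, is to train an architecture encoder to classify whether two encodings are near neighbors (differ by a single edit) or unrelated. The encoding, edit rule, and networks below are purely illustrative, not the paper's task definition.

```python
# Hedged sketch of a locality-based pretext task: an embedding network is
# trained to tell single-edit neighbors apart from unrelated architectures,
# so structurally similar architectures get similar representations.
import random
import torch
import torch.nn as nn

NUM_OPS, LENGTH = 5, 8                                  # toy operation vocabulary

def random_arch():
    return [random.randrange(NUM_OPS) for _ in range(LENGTH)]

def local_edit(arch):
    neighbor = list(arch)
    neighbor[random.randrange(LENGTH)] = random.randrange(NUM_OPS)
    return neighbor

def encode(arch):
    return nn.functional.one_hot(torch.tensor(arch), NUM_OPS).float().flatten()

embed = nn.Sequential(nn.Linear(NUM_OPS * LENGTH, 64), nn.ReLU(), nn.Linear(64, 32))
classifier = nn.Linear(64, 1)                           # takes a concatenated embedding pair
opt = torch.optim.Adam(list(embed.parameters()) + list(classifier.parameters()), lr=1e-3)

for step in range(300):
    a = random_arch()
    pos, neg = local_edit(a), random_arch()
    pairs = torch.stack([torch.cat([embed(encode(a)), embed(encode(b))]) for b in (pos, neg)])
    labels = torch.tensor([1.0, 0.0])
    loss = nn.functional.binary_cross_entropy_with_logits(classifier(pairs).squeeze(-1), labels)
    opt.zero_grad(); loss.backward(); opt.step()
# The pretrained `embed` network would then initialize the controller's
# architecture encoder.
```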
arXiv Detail & Related papers (2021-03-15T06:30:36Z) - CP-NAS: Child-Parent Neural Architecture Search for Binary Neural
Networks [27.867108193391633]
We propose a 1-bit convolutional neural network (CNN) to reduce the computation and memory cost of Neural Architecture Search (NAS).
A Child-Parent (CP) model is introduced into differentiable NAS to search the binarized architecture (Child) under the supervision of a full-precision model (Parent).
It achieves an accuracy of 95.27% on CIFAR-10 and 64.3% on ImageNet with binarized weights and activations, and a 30% faster search than prior arts.
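The Child-Parent supervision scheme can be sketched as training a weight-binarized "child" to match a frozen full-precision "parent", with a straight-through estimator for the sign function. This illustrates only the supervision idea from the summary, not CP-NAS's differentiable search procedure; all modules and losses are illustrative choices.

```python
# Hedged sketch: a weight-binarized "child" is trained under the supervision
# of a full-precision "parent" via a simple distillation loss.
import torch
import torch.nn as nn

class BinaryLinear(nn.Linear):
    def forward(self, x):
        # Forward uses sign(w); the straight-through estimator passes the
        # gradient to the latent real-valued weights.
        w_bin = self.weight + (torch.sign(self.weight) - self.weight).detach()
        return nn.functional.linear(x, w_bin, self.bias)

parent = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 10))      # full precision
child = nn.Sequential(BinaryLinear(16, 32), nn.ReLU(), BinaryLinear(32, 10))
opt = torch.optim.Adam(child.parameters(), lr=1e-3)

loss = torch.tensor(0.0)
for step in range(100):
    x = torch.randn(32, 16)
    with torch.no_grad():
        teacher_logits = parent(x)
    loss = nn.functional.mse_loss(child(x), teacher_logits)   # parent supervises child
    opt.zero_grad(); loss.backward(); opt.step()
print("final child-parent discrepancy:", loss.item())
```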
arXiv Detail & Related papers (2020-04-30T19:09:55Z) - Angle-based Search Space Shrinking for Neural Architecture Search [78.49722661000442]
We propose Angle-Based search space Shrinking (ABS) for Neural Architecture Search (NAS).
Our approach progressively simplifies the original search space by dropping unpromising candidates.
ABS can dramatically enhance existing NAS approaches by providing a promising shrunk search space.
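A generic illustration of an angle-based metric is sketched below: for each candidate operator, measure the angle between its flattened weights at initialization and after some supernet training, then drop the candidates judged unpromising. The exact metric definition, the dropping direction, and the shrinking schedule are in the paper; the operators, training stand-in, and the "drop the smallest angle" rule here are assumptions for illustration only.

```python
# Hedged sketch of an angle-based metric over candidate operators.
import torch
import torch.nn as nn

def flatten_params(module: nn.Module) -> torch.Tensor:
    return torch.cat([p.detach().reshape(-1) for p in module.parameters()])

candidates = {name: nn.Conv2d(8, 8, k, padding=k // 2)
              for name, k in [("conv1x1", 1), ("conv3x3", 3), ("conv5x5", 5)]}
init_weights = {name: flatten_params(op).clone() for name, op in candidates.items()}

# Stand-in for supernet training: a few gradient steps on a toy objective.
opt = torch.optim.SGD([p for op in candidates.values() for p in op.parameters()], lr=0.1)
for _ in range(20):
    x = torch.randn(4, 8, 16, 16)
    loss = sum(op(x).abs().mean() for op in candidates.values())
    opt.zero_grad(); loss.backward(); opt.step()

angles = {}
for name, op in candidates.items():
    w0, w1 = init_weights[name], flatten_params(op)
    cos = torch.dot(w0, w1) / (w0.norm() * w1.norm())
    angles[name] = torch.arccos(cos.clamp(-1.0, 1.0)).item()

# Illustrative assumption: treat the candidate whose weights rotated least
# as unpromising and drop it from the search space.
print("angles:", angles, "-> drop:", min(angles, key=angles.get))
```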
arXiv Detail & Related papers (2020-04-28T11:26:46Z)