NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction
- URL: http://arxiv.org/abs/2211.08024v3
- Date: Thu, 23 Mar 2023 03:03:56 GMT
- Title: NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction
- Authors: Yun Yi, Haokui Zhang, Wenze Hu, Nannan Wang, Xiaoyu Wang
- Abstract summary: We propose a neural architecture representation model that can be used to estimate attributes holistically.
Experiment results show that our proposed framework can be used to predict the latency and accuracy attributes of both cell architectures and whole deep neural networks.
- Score: 37.357949900603295
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the wide and deep adoption of deep learning models in real applications,
there is an increasing need to model and learn the representations of the
neural networks themselves. These models can be used to estimate attributes of
different neural network architectures such as the accuracy and latency,
without running the actual training or inference tasks. In this paper, we
propose a neural architecture representation model that can be used to estimate
these attributes holistically. Specifically, we first propose a simple and
effective tokenizer to encode both the operation and topology information of a
neural network into a single sequence. Then, we design a multi-stage fusion
transformer to build a compact vector representation from the converted
sequence. For efficient model training, we further propose an information flow
consistency augmentation and correspondingly design an architecture consistency
loss, which brings more benefits with fewer augmentation samples compared with
previous random augmentation strategies. Experiment results on NAS-Bench-101,
NAS-Bench-201, DARTS search space and NNLQP show that our proposed framework
can be used to predict the aforementioned latency and accuracy attributes of
both cell architectures and whole deep neural networks, and achieves promising
performance. Code is available at https://github.com/yuny220/NAR-Former.
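For intuition, below is a minimal sketch of encoding a cell's operations and topology into a single token sequence, in the spirit of the tokenizer the abstract describes. The op vocabulary, the fixed-length token layout, and the encode_cell helper are illustrative assumptions, not the exact NAR-Former encoding; a multi-stage fusion transformer would then consume such a sequence and pool it into the compact vector used for attribute prediction.

# Illustrative tokenizer sketch (assumed names and layout, not the paper's exact scheme).
import numpy as np

op_vocab = {"input": 0, "conv3x3": 1, "conv1x1": 2, "maxpool3x3": 3, "output": 4}

def encode_cell(ops, adjacency, max_nodes=7):
    """ops: list of op names, one per node (topologically ordered).
    adjacency: 0/1 matrix, adjacency[i][j] = 1 if node i feeds node j.
    Returns a (max_nodes, token_dim) array: one token per node."""
    adjacency = np.asarray(adjacency)
    token_dim = len(op_vocab) + 1 + max_nodes       # op one-hot + self index + sources
    tokens = np.zeros((max_nodes, token_dim), dtype=np.float32)
    for i, op in enumerate(ops):
        tokens[i, op_vocab[op]] = 1.0               # operation identity
        tokens[i, len(op_vocab)] = i / max_nodes    # normalized node position
        srcs = np.flatnonzero(adjacency[:, i])      # topology: predecessor nodes
        tokens[i, len(op_vocab) + 1 + srcs] = 1.0   # multi-hot source indices
    return tokens

# Example: a 4-node cell  input -> conv3x3 -> conv1x1 -> output
ops = ["input", "conv3x3", "conv1x1", "output"]
adj = [[0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1],
       [0, 0, 0, 0]]
print(encode_cell(ops, adj).shape)   # (7, 13): a single fixed-length sequence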
Related papers
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
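A hedged sketch of the sequential weight-space idea follows: flatten a network's weights, split them into fixed-size subsets, and update a running state with a shared cell, yielding one embedding per network regardless of its size. The chunk size and the simple tanh update are illustrative assumptions, not the SANE model itself.

import numpy as np

def sequential_weight_encode(layer_weights, w_in, w_state, chunk=64):
    flat = np.concatenate([w.ravel() for w in layer_weights])
    n_chunks = int(np.ceil(len(flat) / chunk))
    flat = np.resize(flat, n_chunks * chunk)               # repeat/trim to full chunks
    state = np.zeros(w_state.shape[0])
    for piece in flat.reshape(n_chunks, chunk):            # process weight subsets in order
        state = np.tanh(piece @ w_in + state @ w_state)
    return state                                           # embedding of the whole network

rng = np.random.default_rng(6)
w_in, w_state = 0.1 * rng.normal(size=(64, 32)), 0.1 * rng.normal(size=(32, 32))
net = [rng.normal(size=(3, 3, 8)), rng.normal(size=(8, 10))]
print(sequential_weight_encode(net, w_in, w_state).shape)  # (32,)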
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- FR-NAS: Forward-and-Reverse Graph Predictor for Efficient Neural Architecture Search [10.699485270006601]
We introduce a novel Graph Neural Networks (GNN) predictor for Neural Architecture Search (NAS)
This predictor renders neural architectures into vector representations by combining both the conventional and inverse graph views.
The experimental results showcase a significant improvement in prediction accuracy, with a 3% to 16% increase in Kendall-tau correlation.
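A hedged sketch of the forward-and-reverse view: run message passing on both the original adjacency and its transpose, then combine the two node embeddings. FR-NAS's actual GNN layers and fusion are more involved; this only illustrates why the reversed edge direction gives the predictor a complementary view.

import numpy as np

def mp_step(adj, feats, weight):
    """One mean-aggregation step: node i averages features of nodes j with adj[i, j] = 1."""
    deg = adj.sum(axis=1, keepdims=True).clip(min=1.0)
    return np.tanh((adj @ feats) / deg @ weight)

def forward_reverse_encode(adj, feats, w_fwd, w_rev):
    h_fwd = mp_step(adj, feats, w_fwd)         # aggregate along one edge direction
    h_rev = mp_step(adj.T, feats, w_rev)       # aggregate along the reversed direction
    return np.concatenate([h_fwd, h_rev], axis=-1)   # combined node embeddings

rng = np.random.default_rng(0)
adj = np.array([[0, 1, 1, 0], [0, 0, 0, 1], [0, 0, 0, 1], [0, 0, 0, 0]], float)
feats = rng.normal(size=(4, 8))
emb = forward_reverse_encode(adj, feats, rng.normal(size=(8, 8)), rng.normal(size=(8, 8)))
print(emb.shape)  # (4, 16); pooling these would give a vector for the predictor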
arXiv Detail & Related papers (2024-04-24T03:22:49Z)
- NAR-Former V2: Rethinking Transformer for Universal Neural Network Representation Learning [25.197394237526865]
We propose a modified Transformer-based universal neural network representation learning model NAR-Former V2.
Specifically, we take the network as a graph and design a straightforward tokenizer to encode the network into a sequence.
We incorporate the inductive representation learning capability of GNNs into the Transformer, enabling it to generalize better when encountering unseen architectures.
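One generic way to mix such a graph inductive bias into a Transformer is to bias the attention logits with the (symmetrized) adjacency so connected nodes attend to each other more strongly; the sketch below shows only this simple variant and is not the exact NAR-Former V2 block.

import numpy as np

def graph_biased_attention(x, adj, wq, wk, wv, bias_strength=2.0):
    q, k, v = x @ wq, x @ wk, x @ wv
    logits = q @ k.T / np.sqrt(k.shape[-1])
    logits = logits + bias_strength * (adj + adj.T)   # favor graph neighbors
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 16))                          # one token per network node
adj = np.array([[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 0, 0]], float)
w = [rng.normal(size=(16, 16)) for _ in range(3)]
print(graph_biased_attention(x, adj, *w).shape)       # (4, 16)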
arXiv Detail & Related papers (2023-06-19T09:11:04Z)
- Set-based Neural Network Encoding Without Weight Tying [91.37161634310819]
We propose a neural network weight encoding method for network property prediction.
Our approach is capable of encoding neural networks in a model zoo of mixed architecture.
We introduce two new tasks for neural network property prediction: cross-dataset and cross-architecture.
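A hedged sketch of set-based weight encoding: treat each layer's flattened weights as an element of a set, embed each element with a shared function, and pool so the code does not depend on the number or order of layers, which is what lets networks of mixed architectures share one encoder. The chunking and mean-pool below are illustrative choices, not the paper's encoder.

import numpy as np

def encode_weight_set(layer_weights, proj, chunk=64):
    """layer_weights: list of arrays with arbitrary shapes (mixed architectures)."""
    elements = []
    for w in layer_weights:
        flat = np.resize(w.ravel(), chunk)       # repeat/trim each layer to a fixed chunk
        elements.append(np.tanh(flat @ proj))    # shared per-element embedding
    return np.mean(elements, axis=0)             # permutation-invariant pooling

rng = np.random.default_rng(2)
proj = rng.normal(size=(64, 32))
net_a = [rng.normal(size=(3, 3, 16)), rng.normal(size=(16, 10))]   # small CNN
net_b = [rng.normal(size=(128, 128)) for _ in range(5)]            # deeper MLP
print(encode_weight_set(net_a, proj).shape, encode_weight_set(net_b, proj).shape)  # (32,) (32,)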
arXiv Detail & Related papers (2023-05-26T04:34:28Z)
- GENNAPE: Towards Generalized Neural Architecture Performance Estimators [25.877126553261434]
GENNAPE represents a given neural network as a Computation Graph (CG) of atomic operations.
It first learns a graph encoder via Contrastive Learning to encourage network separation by topological features.
Experiments show that GENNAPE pretrained on NAS-Bench-101 can achieve superior transferability to 5 different public neural network benchmarks.
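A rough sketch of the contrastive pretraining idea: pull embeddings of two views of the same computation graph together and push different graphs apart with an InfoNCE-style loss. GENNAPE's actual encoder, augmentations, and similarity measure differ; the names and temperature below are assumptions.

import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """anchors[i] and positives[i] are embeddings of two views of graph i."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # similarity of every pair
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))            # matching views are the targets

rng = np.random.default_rng(3)
z1 = rng.normal(size=(8, 32))                      # embeddings of 8 graphs, view 1
z2 = z1 + 0.05 * rng.normal(size=(8, 32))          # slightly perturbed view 2
print(float(info_nce(z1, z2)))                     # low loss: views already aligned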
arXiv Detail & Related papers (2022-11-30T18:27:41Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- Sparse Flows: Pruning Continuous-depth Models [107.98191032466544]
We show that pruning improves generalization for neural ODEs in generative modeling.
We also show that pruning finds minimal and efficient neural ODE representations with up to 98% less parameters compared to the original network, without loss of accuracy.
arXiv Detail & Related papers (2021-06-24T01:40:17Z)
- Differentiable Neural Architecture Learning for Efficient Neural Network Design [31.23038136038325]
We introduce a novel architecture parameterisation based on a scaled sigmoid function.
We then propose a general Differentiable Neural Architecture Learning (DNAL) method to optimize the neural architecture without the need to evaluate candidate neural networks.
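An illustrative sketch of a scaled sigmoid architecture parameter: g(a) = sigmoid(beta * a) gates a candidate operation, and annealing beta pushes the gate toward a hard 0/1 choice, so the final discrete architecture emerges without separately evaluating candidates. The gating form and schedule below are generic assumptions, not DNAL's exact formulation.

import numpy as np

def scaled_sigmoid(a, beta):
    return 1.0 / (1.0 + np.exp(-beta * a))

a = np.array([-0.8, 0.1, 1.2])                     # learnable architecture parameters
for beta in (1.0, 5.0, 50.0):                      # annealed scale factor
    print(beta, np.round(scaled_sigmoid(a, beta), 3))
# As beta grows the gates saturate toward 0/1, i.e. a discrete architecture choice.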
arXiv Detail & Related papers (2021-03-03T02:03:08Z)
- Self-supervised Representation Learning for Evolutionary Neural Architecture Search [9.038625856798227]
Recently proposed neural architecture search (NAS) algorithms adopt neural predictors to accelerate the architecture search.
How to obtain a neural predictor with high prediction accuracy using a small amount of training data is a central problem to neural predictor-based NAS.
We devise two self-supervised learning methods to pre-train the architecture embedding part of neural predictors.
We achieve state-of-the-art performance on the NAS-Bench-101 and NAS-Bench-201 benchmarks when integrating the pre-trained neural predictors with an evolutionary NAS algorithm.
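A generic sketch of the pretraining idea: train the architecture-embedding part with a label-free objective (here, reconstructing one randomly masked token of the encoding) before fine-tuning a predictor head on scarce accuracy labels. The masked-reconstruction task and the simplified decoder-only update are stand-ins; the paper's two specific self-supervised methods are not reproduced here.

import numpy as np

rng = np.random.default_rng(4)

def pretrain_step(tokens, enc_w, dec_w, lr=1e-2):
    masked = tokens.copy()
    masked[rng.integers(len(tokens))] = 0.0          # hide one token
    h = np.tanh(masked @ enc_w)                      # architecture embedding
    err = h @ dec_w - tokens                         # reconstruction error
    dec_w -= lr * (h.T @ err)                        # simplified in-place update (decoder only)
    return float((err ** 2).mean())

enc_w = 0.1 * rng.normal(size=(13, 32))              # encoder, later reused by the predictor
dec_w = 0.1 * rng.normal(size=(32, 13))              # decoder, used only for pretraining
tokens = rng.normal(size=(7, 13))                    # e.g. one tokenized cell architecture
print([round(pretrain_step(tokens, enc_w, dec_w), 4) for _ in range(3)])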
arXiv Detail & Related papers (2020-10-31T04:57:16Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
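As a point of reference, the classical one-hidden-layer mean-field picture writes a wide network as an integral over a distribution of neuron parameters; the paper's feature-based, multi-layer formulation generalizes this, so the form below is only an illustrative special case.

f_N(x) = \frac{1}{N}\sum_{i=1}^{N} a_i\,\sigma(\langle w_i, x\rangle)
\;\xrightarrow[N \to \infty]{}\;
f(x) = \int a\,\sigma(\langle w, x\rangle)\,\mathrm{d}\rho(a, w)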
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
- A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures.
A graph convolutional neural network is introduced to predict the performance of architectures.
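A minimal sketch of the predictor side: one graph-convolution step over the architecture graph, followed by pooling and a linear head that regresses performance. The auto-encoder that would supply the node features and the semi-supervised training scheme are omitted; shapes and names here are assumptions.

import numpy as np

def gcn_predict(adj, feats, w_gcn, w_head):
    a_hat = adj + np.eye(len(adj))                       # add self-loops
    d_inv = np.diag(1.0 / a_hat.sum(axis=1))             # degree normalization
    h = np.maximum(d_inv @ a_hat @ feats @ w_gcn, 0.0)   # one GCN layer with ReLU
    return float(h.mean(axis=0) @ w_head)                # pooled graph -> predicted score

rng = np.random.default_rng(5)
adj = np.array([[0, 1, 1, 0], [0, 0, 0, 1], [0, 0, 0, 1], [0, 0, 0, 0]], float)
feats = rng.normal(size=(4, 16))                          # e.g. auto-encoder node features
print(gcn_predict(adj, feats, rng.normal(size=(16, 16)), rng.normal(size=16)))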
arXiv Detail & Related papers (2020-05-14T09:02:33Z)