NAS-Bench-NLP: Neural Architecture Search Benchmark for Natural Language
Processing
- URL: http://arxiv.org/abs/2006.07116v1
- Date: Fri, 12 Jun 2020 12:19:06 GMT
- Title: NAS-Bench-NLP: Neural Architecture Search Benchmark for Natural Language
Processing
- Authors: Nikita Klyuchnikov, Ilya Trofimov, Ekaterina Artemova, Mikhail
Salnikov, Maxim Fedorov, Evgeny Burnaev
- Abstract summary: We step outside the computer vision domain by leveraging the language modeling task, which is the core of natural language processing (NLP).
We have provided a search space of recurrent neural networks on text datasets and trained 14k architectures within it.
We have conducted both intrinsic and extrinsic evaluation of the trained models using datasets for semantic relatedness and language understanding evaluation.
- Score: 12.02718579660613
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural Architecture Search (NAS) is a promising and rapidly evolving research
area. Training a large number of neural networks requires an exceptional amount
of computational power, which makes NAS unreachable for those researchers who
have limited or no access to high-performance clusters and supercomputers. A
few benchmarks with precomputed neural architecture performances have been
recently introduced to overcome this problem and ensure more reproducible
experiments. However, these benchmarks are only for the computer vision domain
and, thus, are built from the image datasets and convolution-derived
architectures. In this work, we step outside the computer vision domain by
leveraging the language modeling task, which is the core of natural language
processing (NLP). Our main contribution is as follows: we have provided a search
space of recurrent neural networks on text datasets and trained 14k
architectures within it; we have conducted both intrinsic and extrinsic
evaluation of the trained models using datasets for semantic relatedness and
language understanding evaluation; finally, we have tested several NAS
algorithms to demonstrate how the precomputed results can be utilized. We
believe that our results have high potential for use by both the NAS and NLP
communities.
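As an illustration of how precomputed results can be used by a NAS algorithm, here is a minimal sketch of random search over a tabular benchmark. It is not the authors' code or the NAS-Bench-NLP API: the functions `make_toy_benchmark` and `random_search`, the integer architecture ids, and the validation-loss values are all hypothetical, and the data is synthetic.

```python
import random


def make_toy_benchmark(num_archs=1000, seed=0):
    """Build a hypothetical lookup table: architecture id -> validation loss.

    The values are synthetic (lower is better) and are NOT taken from
    NAS-Bench-NLP; they only stand in for precomputed training results.
    """
    rng = random.Random(seed)
    return {arch_id: 4.0 + 2.0 * rng.random() for arch_id in range(num_archs)}


def random_search(benchmark, budget, seed=0):
    """Sample `budget` distinct architectures and return the best one found.

    Each evaluation is a dictionary lookup instead of a training run, which
    is what makes searching a precomputed benchmark cheap.
    """
    rng = random.Random(seed)
    candidates = rng.sample(sorted(benchmark), budget)
    best_id = min(candidates, key=benchmark.__getitem__)
    return best_id, benchmark[best_id]


if __name__ == "__main__":
    bench = make_toy_benchmark()
    arch, loss = random_search(bench, budget=50)
    print(f"best architecture id: {arch}, validation loss: {loss:.3f}")
```

Other search strategies can be compared the same way by swapping the sampling step while keeping the lookup, so every algorithm is scored against identical precomputed results.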
Related papers
- DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions [121.05720140641189]
We develop a family of models with the distilling neural architecture (DNA) techniques.
Our proposed DNA models can rate all architecture candidates, as opposed to previous works that can only access a sub-search space using algorithms.
Our models achieve state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile convolutional network and a small vision transformer, respectively.
arXiv Detail & Related papers (2024-03-02T22:16:47Z)
- Efficacy of Neural Prediction-Based Zero-Shot NAS [0.04096453902709291]
We propose a novel approach for zero-shot Neural Architecture Search (NAS) using deep learning.
Our method employs Fourier sum of sines encoding for convolutional kernels, enabling the construction of a computational feed-forward graph with a structure similar to the architecture under evaluation.
Experimental results show that our approach surpasses previous methods using graph convolutional networks in terms of correlation on the NAS-Bench-201 dataset and exhibits a higher convergence rate.
arXiv Detail & Related papers (2023-08-31T14:54:06Z)
- GeNAS: Neural Architecture Search with Better Generalization [14.92869716323226]
Recent neural architecture search (NAS) approaches rely on validation loss or accuracy to find the superior network for the target data.
In this paper, we investigate a new neural architecture search measure for excavating architectures with better generalization.
arXiv Detail & Related papers (2023-05-15T12:44:54Z)
- Generalization Properties of NAS under Activation and Skip Connection Search [66.8386847112332]
We study the generalization properties of Neural Architecture Search (NAS) under a unifying framework.
We derive the lower (and upper) bounds of the minimum eigenvalue of the Neural Tangent Kernel (NTK) under the (in)finite-width regime.
We show how the derived results can guide NAS to select the top-performing architectures, even in the case without training.
arXiv Detail & Related papers (2022-09-15T12:11:41Z)
- UnrealNAS: Can We Search Neural Architectures with Unreal Data? [84.78460976605425]
Neural architecture search (NAS) has shown great success in the automatic design of deep neural networks (DNNs).
Previous work has analyzed the necessity of having ground-truth labels in NAS and inspired broad interest.
We take a further step to question whether real data is necessary for NAS to be effective.
arXiv Detail & Related papers (2022-05-04T16:30:26Z)
- Accelerating Neural Architecture Exploration Across Modalities Using Genetic Algorithms [5.620334754517149]
We show how genetic algorithms can be paired with lightly trained objective predictors in an iterative cycle to accelerate multi-objective architectural exploration.
NAS research efforts have centered around computer vision tasks and only recently have other modalities, such as the rapidly growing field of natural language processing, been investigated in depth.
arXiv Detail & Related papers (2022-02-25T20:01:36Z)
- Neural Architecture Search for Dense Prediction Tasks in Computer Vision [74.9839082859151]
Deep learning has led to a rising demand for neural network architecture engineering.
Neural architecture search (NAS) aims at automatically designing neural network architectures in a data-driven manner rather than manually.
NAS has become applicable to a much wider range of problems in computer vision.
arXiv Detail & Related papers (2022-02-15T08:06:50Z)
- Generic Neural Architecture Search via Regression [27.78105839644199]
We propose a novel and generic neural architecture search (NAS) framework, termed Generic NAS (GenNAS).
GenNAS does not use task-specific labels but instead adopts regression on a set of manually designed synthetic signal bases for architecture evaluation.
We then propose an automatic task search to optimize the combination of synthetic signals using limited downstream-task-specific labels.
arXiv Detail & Related papers (2021-08-04T08:21:12Z)
- Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective [88.39981851247727]
We propose a novel framework called training-free neural architecture search (TE-NAS).
TE-NAS ranks architectures by analyzing the spectrum of the neural tangent kernel (NTK) and the number of linear regions in the input space.
We show that: (1) these two measurements imply the trainability and expressivity of a neural network; (2) they strongly correlate with the network's test accuracy.
arXiv Detail & Related papers (2021-02-23T07:50:44Z)
- Hierarchical Neural Architecture Search for Deep Stereo Matching [131.94481111956853]
We propose the first end-to-end hierarchical NAS framework for deep stereo matching.
Our framework incorporates task-specific human knowledge into the neural architecture search framework.
It ranks first in accuracy on the KITTI stereo 2012, 2015 and Middlebury benchmarks, and first on the SceneFlow dataset.
arXiv Detail & Related papers (2020-10-26T11:57:37Z)
- Neural Architecture Performance Prediction Using Graph Neural Networks [17.224223176258334]
We propose a surrogate model for neural architecture performance prediction built upon Graph Neural Networks (GNN).
We demonstrate the effectiveness of this surrogate model on neural architecture performance prediction for structurally unknown architectures.
arXiv Detail & Related papers (2020-10-19T09:33:57Z)