Improving Differentiable Architecture Search with a Generative Model
- URL: http://arxiv.org/abs/2112.00171v1
- Date: Tue, 30 Nov 2021 23:28:02 GMT
- Title: Improving Differentiable Architecture Search with a Generative Model
- Authors: Ruisi Zhang, Youwei Liang, Sai Ashish Somayajula, Pengtao Xie
- Abstract summary: We introduce a training strategy called Differentiable Architecture Search with a Generative Model(DASGM)"
In DASGM, the training set is used to update the classification model weight, while a synthesized dataset is used to train its architecture.
The generated images have different distributions from the training set, which can help the classification model learn better features to identify its weakness.
- Score: 10.618008515822483
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In differentiable neural architecture search (NAS) algorithms like DARTS, the
training set used to update model weight and the validation set used to update
model architectures are sampled from the same data distribution. Thus, the
uncommon features in the dataset fail to receive enough attention during
training. In this paper, instead of introducing more complex NAS algorithms, we
explore the idea that adding quality synthesized datasets into training can
help the classification model identify its weakness and improve recognition
accuracy. We introduce a training strategy called ``Differentiable Architecture
Search with a Generative Model(DASGM)." In DASGM, the training set is used to
update the classification model weight, while a synthesized dataset is used to
train its architecture. The generated images have different distributions from
the training set, which can help the classification model learn better features
to identify its weakness. We formulate DASGM into a multi-level optimization
framework and develop an effective algorithm to solve it. Experiments on
CIFAR-10, CIFAR-100, and ImageNet have demonstrated the effectiveness of DASGM.
Code will be made available.
Related papers
- Neural Architecture Search using Particle Swarm and Ant Colony
Optimization [0.0]
This paper focuses on training and optimizing CNNs using the Swarm Intelligence (SI) components of OpenNAS.
A system integrating open source tools for Neural Architecture Search (OpenNAS), in the classification of images, has been developed.
arXiv Detail & Related papers (2024-03-06T15:23:26Z) - POPNASv3: a Pareto-Optimal Neural Architecture Search Solution for Image
and Time Series Classification [8.190723030003804]
This article presents the third version of a sequential model-based NAS algorithm targeting different hardware environments and multiple classification tasks.
Our method is able to find competitive architectures within large search spaces, while keeping a flexible structure and data processing pipeline to adapt to different tasks.
The experiments performed on images and time series classification datasets provide evidence that POPNASv3 can explore a large set of assorted operators and converge to optimal architectures suited for the type of data provided under different scenarios.
arXiv Detail & Related papers (2022-12-13T17:14:14Z) - NAR-Former: Neural Architecture Representation Learning towards Holistic
Attributes Prediction [37.357949900603295]
We propose a neural architecture representation model that can be used to estimate attributes holistically.
Experiment results show that our proposed framework can be used to predict the latency and accuracy attributes of both cell architectures and whole deep neural networks.
arXiv Detail & Related papers (2022-11-15T10:15:21Z) - Neural Attentive Circuits [93.95502541529115]
We introduce a general purpose, yet modular neural architecture called Neural Attentive Circuits (NACs)
NACs learn the parameterization and a sparse connectivity of neural modules without using domain knowledge.
NACs achieve an 8x speedup at inference time while losing less than 3% performance.
arXiv Detail & Related papers (2022-10-14T18:00:07Z) - FlowNAS: Neural Architecture Search for Optical Flow Estimation [65.44079917247369]
We propose a neural architecture search method named FlowNAS to automatically find the better encoder architecture for flow estimation task.
Experimental results show that the discovered architecture with the weights inherited from the super-network achieves 4.67% F1-all error on KITTI.
arXiv Detail & Related papers (2022-07-04T09:05:25Z) - DST: Dynamic Substitute Training for Data-free Black-box Attack [79.61601742693713]
We propose a novel dynamic substitute training attack method to encourage substitute model to learn better and faster from the target model.
We introduce a task-driven graph-based structure information learning constrain to improve the quality of generated training data.
arXiv Detail & Related papers (2022-04-03T02:29:11Z) - AutoBERT-Zero: Evolving BERT Backbone from Scratch [94.89102524181986]
We propose an Operation-Priority Neural Architecture Search (OP-NAS) algorithm to automatically search for promising hybrid backbone architectures.
We optimize both the search algorithm and evaluation of candidate models to boost the efficiency of our proposed OP-NAS.
Experiments show that the searched architecture (named AutoBERT-Zero) significantly outperforms BERT and its variants of different model capacities in various downstream tasks.
arXiv Detail & Related papers (2021-07-15T16:46:01Z) - Understanding Dynamics of Nonlinear Representation Learning and Its
Application [12.697842097171119]
We study the dynamics of implicit nonlinear representation learning.
We show that the data-architecture alignment condition is sufficient for the global convergence.
We derive a new training framework, which satisfies the data-architecture alignment condition without assuming it.
arXiv Detail & Related papers (2021-06-28T16:31:30Z) - Rank-R FNN: A Tensor-Based Learning Model for High-Order Data
Classification [69.26747803963907]
Rank-R Feedforward Neural Network (FNN) is a tensor-based nonlinear learning model that imposes Canonical/Polyadic decomposition on its parameters.
First, it handles inputs as multilinear arrays, bypassing the need for vectorization, and can thus fully exploit the structural information along every data dimension.
We establish the universal approximation and learnability properties of Rank-R FNN, and we validate its performance on real-world hyperspectral datasets.
arXiv Detail & Related papers (2021-04-11T16:37:32Z) - Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z) - Sparse Signal Models for Data Augmentation in Deep Learning ATR [0.8999056386710496]
We propose a data augmentation approach to incorporate domain knowledge and improve the generalization power of a data-intensive learning algorithm.
We exploit the sparsity of the scattering centers in the spatial domain and the smoothly-varying structure of the scattering coefficients in the azimuthal domain to solve the ill-posed problem of over-parametrized model fitting.
arXiv Detail & Related papers (2020-12-16T21:46:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.