Fitting the Search Space of Weight-sharing NAS with Graph Convolutional
Networks
- URL: http://arxiv.org/abs/2004.08423v2
- Date: Tue, 15 Dec 2020 09:47:03 GMT
- Title: Fitting the Search Space of Weight-sharing NAS with Graph Convolutional
Networks
- Authors: Xin Chen, Lingxi Xie, Jun Wu, Longhui Wei, Yuhui Xu and Qi Tian
- Abstract summary: We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
- Score: 100.14670789581811
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural architecture search has attracted wide attention in both academia and
industry. To accelerate it, researchers proposed weight-sharing methods which
first train a super-network to reuse computation among different operators,
from which exponentially many sub-networks can be sampled and efficiently
evaluated. These methods enjoy great advantages in terms of computational
costs, but the sampled sub-networks are not guaranteed to be estimated
precisely unless each one is trained individually. This paper attributes such
inaccuracy to the inevitable mismatch between assembled network layers, so that
there is a random error term added to each estimation. We alleviate this issue
by training a graph convolutional network to fit the performance of sampled
sub-networks so that the impact of random errors becomes minimal. With this
strategy, we achieve a higher rank correlation coefficient in the selected set
of candidates, which consequently leads to better performance of the final
architecture. In addition, our approach also enjoys the flexibility of being
used under different hardware constraints, since the graph convolutional
network has provided an efficient lookup table of the performance of
architectures in the entire search space.
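
The fitting step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the toy search space (a chain of 6 layers with 4 candidate operators each), the per-operator quality and cost values, the noise level, and all function names are assumptions made up for illustration. The predictor is a single linear graph-convolution layer with mean pooling, fit in closed form by least squares, which stands in here for the trained graph convolutional network.

```python
import itertools
import numpy as np

NUM_LAYERS, NUM_OPS = 6, 4           # hypothetical toy search space
rng = np.random.default_rng(0)

def chain_adjacency(n):
    """Undirected chain graph: layer i is connected to layer i + 1."""
    adj = np.zeros((n, n))
    idx = np.arange(n - 1)
    adj[idx, idx + 1] = 1.0
    adj[idx + 1, idx] = 1.0
    return adj

def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

A_NORM = normalize_adjacency(chain_adjacency(NUM_LAYERS))

def embed(archs):
    """Graph-convolve one-hot operator features, then mean-pool over layers."""
    feats = np.eye(NUM_OPS)[archs]            # (n_archs, layers, ops)
    return (A_NORM @ feats).mean(axis=1)      # (n_archs, ops)

def fit_predictor(archs, noisy_scores):
    """Least-squares fit of the readout weights to the noisy estimates."""
    weights, *_ = np.linalg.lstsq(embed(archs), noisy_scores, rcond=None)
    return weights

def kendall_tau(x, y):
    """Kendall rank correlation (tau-a) over all pairs."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    dx = np.sign(x[:, None] - x[None, :])
    dy = np.sign(y[:, None] - y[None, :])
    n = len(x)
    return float((dx * dy).sum() / (n * (n - 1)))

def best_under_budget(pred, costs, budget):
    """Index of the best predicted architecture whose cost fits the budget."""
    feasible = np.flatnonzero(costs <= budget)
    return int(feasible[np.argmax(pred[feasible])])

# Made-up ground truth: accuracy and cost depend only on the chosen operators.
OP_QUALITY = np.array([0.0, 0.3, 0.6, 1.0])
OP_COST = np.array([1.0, 2.0, 3.0, 5.0])
all_archs = np.array(list(itertools.product(range(NUM_OPS), repeat=NUM_LAYERS)))
true_acc = OP_QUALITY[all_archs].mean(axis=1)
arch_cost = OP_COST[all_archs].sum(axis=1)

# Super-network estimates are modeled as the true value plus a random error.
sampled = rng.choice(len(all_archs), size=200, replace=False)
noisy_acc = true_acc[sampled] + rng.normal(0.0, 0.1, size=200)

weights = fit_predictor(all_archs[sampled], noisy_acc)
pred_all = embed(all_archs) @ weights         # lookup table for the whole space

print("tau(noisy estimate, truth):", kendall_tau(noisy_acc, true_acc[sampled]))
print("tau(fitted score, truth):  ", kendall_tau(pred_all[sampled], true_acc[sampled]))

BUDGET = 12.0                                 # hypothetical hardware budget
best = best_under_budget(pred_all, arch_cost, BUDGET)
print("selected architecture:", all_archs[best], "cost:", arch_cost[best])
```

Because pred_all covers every architecture in the toy space, changing BUDGET only requires another call to best_under_budget, with no new fitting. Under these assumptions the fitted scores would be expected to rank the sampled architectures more consistently than the raw noisy estimates, though the exact values depend on the noise level and the sample size.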
Related papers
- Optimizing Decentralized Online Learning for Supervised Regression and Classification Problems [0.0]
Decentralized learning networks aim to synthesize a single network inference from a set of raw inferences provided by multiple participants.
Despite the increased prevalence of decentralized learning networks, there exists no systematic study that performs a calibration of the associated free parameters.
Here we present an optimization framework for key parameters governing decentralized online learning in supervised regression and classification problems.
arXiv Detail & Related papers (2025-01-27T21:36:54Z)
- Towards Mitigating Architecture Overfitting on Distilled Datasets [2.3371504588528635]
This paper introduces a series of approaches to mitigate the issue of architecture overfitting.
Specifically, DropPath turns the large model into an implicit ensemble of its sub-networks, and knowledge distillation ensures each sub-network acts similarly to the small but well-performing teacher network.
Our approaches achieve comparable or even superior performance when the test network is larger than the training network.
arXiv Detail & Related papers (2023-09-08T08:12:29Z)
- Generalizing Few-Shot NAS with Gradient Matching [165.5690495295074]
One-Shot methods train one supernet to approximate the performance of every architecture in the search space via weight-sharing.
Few-Shot NAS reduces the level of weight-sharing by splitting the One-Shot supernet into multiple separated sub-supernets.
It significantly outperforms its Few-Shot counterparts while surpassing previous comparable methods in terms of the accuracy of derived architectures.
arXiv Detail & Related papers (2022-03-29T03:06:16Z)
- CONetV2: Efficient Auto-Channel Size Optimization for CNNs [35.951376988552695]
This work introduces a method that is efficient in computationally constrained environments by examining the micro-search space of channel size.
In tackling channel-size optimization, we design an automated algorithm to extract the dependencies within different connected layers of the network.
We also introduce a novel metric that highly correlates with test accuracy and enables analysis of individual network layers.
arXiv Detail & Related papers (2021-10-13T16:17:19Z)
- RAN-GNNs: breaking the capacity limits of graph neural networks [43.66682619000099]
Graph neural networks have become a staple in problems addressing learning and analysis of data defined over graphs.
Recent works attribute this to the need to consider multiple neighborhood sizes at the same time and adaptively tune them.
We show that employing a randomly-wired architecture can be a more effective way to increase the capacity of the network and obtain richer representations.
arXiv Detail & Related papers (2021-03-29T12:34:36Z)
- Task-Adaptive Neural Network Retrieval with Meta-Contrastive Learning [34.27089256930098]
We propose a novel neural network retrieval method, which retrieves the best-suited pre-trained network for a given task.
We train this framework by meta-learning a cross-modal latent space with contrastive loss, to maximize the similarity between a dataset and a network.
We validate the efficacy of our method on ten real-world datasets, against existing NAS baselines.
arXiv Detail & Related papers (2021-03-02T06:30:51Z)
- Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
- Discretization-Aware Architecture Search [81.35557425784026]
This paper presents discretization-aware architecture search (DA²S).
The core idea is to push the super-network towards the configuration of desired topology, so that the accuracy loss brought by discretization is largely alleviated.
Experiments on standard image classification benchmarks demonstrate the superiority of our approach.
arXiv Detail & Related papers (2020-07-07T01:18:58Z)
- DC-NAS: Divide-and-Conquer Neural Architecture Search [108.57785531758076]
We present a divide-and-conquer (DC) approach to effectively and efficiently search deep neural architectures.
We achieve a 75.1% top-1 accuracy on the ImageNet dataset, which is higher than that of state-of-the-art methods using the same search space.
arXiv Detail & Related papers (2020-05-29T09:02:16Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)