Fitting the Search Space of Weight-sharing NAS with Graph Convolutional
Networks
- URL: http://arxiv.org/abs/2004.08423v2
- Date: Tue, 15 Dec 2020 09:47:03 GMT
- Title: Fitting the Search Space of Weight-sharing NAS with Graph Convolutional
Networks
- Authors: Xin Chen, Lingxi Xie, Jun Wu, Longhui Wei, Yuhui Xu and Qi Tian
- Abstract summary: We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
- Score: 100.14670789581811
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural architecture search has attracted wide attention in both academia and
industry. To accelerate it, researchers proposed weight-sharing methods which
first train a super-network to reuse computation among different operators,
from which exponentially many sub-networks can be sampled and efficiently
evaluated. These methods enjoy great advantages in terms of computational
costs, but the sampled sub-networks are not guaranteed to be estimated
precisely unless each of them is trained individually. This paper attributes such
inaccuracy to the inevitable mismatch between assembled network layers, which
adds a random error term to each estimation. We alleviate this issue
by training a graph convolutional network to fit the performance of sampled
sub-networks so that the impact of random errors becomes minimal. With this
strategy, we achieve a higher rank correlation coefficient in the selected set
of candidates, which consequently leads to better performance of the final
architecture. In addition, our approach also enjoys the flexibility of being
used under different hardware constraints, since the graph convolutional
network has provided an efficient lookup table of the performance of
architectures in the entire search space.
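The key ingredient described above is a learned performance predictor: each sampled sub-network is encoded as a graph (nodes for operators, edges for their connections) and a graph convolutional network regresses its accuracy, so that the random error of weight-sharing evaluation is averaged out. The sketch below is a minimal illustration of such a GCN-based predictor, not the authors' code; the architecture encoding, layer widths, and toy training data are all assumptions.

```python
# Minimal sketch: a GCN that regresses the accuracy of architectures encoded as graphs.
# The encoding (adjacency + one-hot operator features) and hyper-parameters are
# illustrative assumptions, not the paper's exact setup.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, adj, x):
        # Symmetrically normalized propagation: D^-1/2 (A + I) D^-1/2 X W
        a_hat = adj + torch.eye(adj.size(-1), device=adj.device)
        d_inv_sqrt = a_hat.sum(-1).pow(-0.5)
        a_norm = d_inv_sqrt.unsqueeze(-1) * a_hat * d_inv_sqrt.unsqueeze(-2)
        return torch.relu(self.lin(a_norm @ x))

class AccuracyPredictor(nn.Module):
    def __init__(self, num_ops, hidden=64):
        super().__init__()
        self.gcn1 = GCNLayer(num_ops, hidden)
        self.gcn2 = GCNLayer(hidden, hidden)
        self.head = nn.Linear(hidden, 1)

    def forward(self, adj, op_onehot):
        h = self.gcn2(adj, self.gcn1(adj, op_onehot))
        return self.head(h.mean(dim=-2)).squeeze(-1)   # graph-level accuracy estimate

# Toy usage: 7-node cells, 5 candidate operator types, noisy "measured" accuracies.
torch.manual_seed(0)
n_arch, n_nodes, n_ops = 256, 7, 5
adj = (torch.rand(n_arch, n_nodes, n_nodes) < 0.3).float().triu(1)
adj = adj + adj.transpose(-1, -2)                       # undirected for simplicity
ops = torch.eye(n_ops)[torch.randint(n_ops, (n_arch, n_nodes))]
acc = 0.7 + 0.1 * ops[..., 0].mean(-1) + 0.02 * torch.randn(n_arch)

model = AccuracyPredictor(n_ops)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(adj, ops), acc)
    loss.backward()
    opt.step()
```

In practice one would validate such a predictor by computing a rank correlation (e.g., Kendall's tau via scipy.stats.kendalltau) between predicted and measured accuracies on held-out sub-networks, which is the rank correlation coefficient the abstract refers to.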
Related papers
- Generalizing Few-Shot NAS with Gradient Matching [165.5690495295074]
One-Shot methods train one supernet to approximate the performance of every architecture in the search space via weight-sharing.
Few-Shot NAS reduces the level of weight-sharing by splitting the One-Shot supernet into multiple separated sub-supernets.
The gradient-matching approach significantly outperforms prior Few-Shot splitting schemes and surpasses previous comparable methods in the accuracy of the derived architectures.
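A minimal sketch of the gradient-matching idea as described here, with the toy model and operation names being illustrative placeholders: candidate operations whose gradients on the shared supernet weights disagree the most are the natural candidates for being separated into different sub-supernets.

```python
# Sketch: measure how candidate operations agree on the gradient of shared weights.
# Low cosine similarity suggests the supernet should be split between those ops.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
shared = nn.Linear(16, 16)                           # weights reused by every candidate path
candidates = [nn.Identity(), nn.ReLU(), nn.Tanh()]   # toy candidate operations
head = nn.Linear(16, 10)

x = torch.randn(32, 16)
y = torch.randint(0, 10, (32,))

def shared_grad(op):
    """Gradient of the loss w.r.t. the shared weights when `op` is the active path."""
    shared.zero_grad()
    head.zero_grad()
    logits = head(op(shared(x)))
    F.cross_entropy(logits, y).backward()
    return shared.weight.grad.detach().flatten().clone()

grads = [shared_grad(op) for op in candidates]
for i in range(len(candidates)):
    for j in range(i + 1, len(candidates)):
        sim = F.cosine_similarity(grads[i], grads[j], dim=0).item()
        print(f"ops {i} vs {j}: gradient cosine similarity = {sim:.3f}")
# A splitting rule would then place low-similarity operations into separate sub-supernets.
```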
arXiv Detail & Related papers (2022-03-29T03:06:16Z) - Bandit Sampling for Multiplex Networks [8.771092194928674]
We propose an algorithm for scalable learning on multiplex networks with a large number of layers.
An online learning algorithm learns how to sample relevant neighboring layers so that only the layers carrying relevant information are aggregated during training.
We present experimental results on both synthetic and real-world scenarios.
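Below is a minimal sketch of the bandit view of layer sampling under assumed interfaces: a UCB-style bandit chooses which layer of the multiplex network to aggregate next, and the reward (how informative the aggregated layer turned out to be) is a placeholder that would come from the training loop in practice.

```python
# Sketch: a UCB bandit that picks which layer of a multiplex network to aggregate next.
# The reward signal is a stand-in; in practice it would come from the training loop.
import math
import random

class LayerUCB:
    def __init__(self, num_layers, c=1.0):
        self.counts = [0] * num_layers
        self.values = [0.0] * num_layers
        self.c = c
        self.t = 0

    def select(self):
        self.t += 1
        for arm, n in enumerate(self.counts):
            if n == 0:                           # play every layer once first
                return arm
        ucb = [v + self.c * math.sqrt(math.log(self.t) / n)
               for v, n in zip(self.values, self.counts)]
        return max(range(len(ucb)), key=ucb.__getitem__)

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]  # running mean

# Toy usage: layer 2 is the "informative" one and yields higher reward on average.
random.seed(0)
bandit = LayerUCB(num_layers=4)
for step in range(500):
    layer = bandit.select()
    reward = random.gauss(0.8 if layer == 2 else 0.3, 0.1)
    bandit.update(layer, reward)
print("times each layer was aggregated:", bandit.counts)
```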
arXiv Detail & Related papers (2022-02-08T03:26:34Z) - CONetV2: Efficient Auto-Channel Size Optimization for CNNs [35.951376988552695]
This work introduces a method that is efficient in computationally constrained environments by examining the micro-search space of channel size.
In tackling channel-size optimization, we design an automated algorithm to extract the dependencies within different connected layers of the network.
We also introduce a novel metric that highly correlates with test accuracy and enables analysis of individual network layers.
arXiv Detail & Related papers (2021-10-13T16:17:19Z) - RAN-GNNs: breaking the capacity limits of graph neural networks [43.66682619000099]
Graph neural networks have become a staple for learning and analysis of data defined over graphs, but their capacity is hard to increase by simply stacking more layers.
Recent works attribute this to the need to consider multiple neighborhood sizes at the same time and to tune them adaptively.
We show that employing a randomly-wired architecture can be a more effective way to increase the capacity of the network and obtain richer representations.
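A minimal sketch of a randomly-wired stack of graph-convolution blocks: blocks are wired by a random DAG (an earlier block may feed a later one with some probability), which is one simple way to realize the randomly-wired architecture mentioned above; the block definition and wiring probability are assumptions, not the paper's construction.

```python
# Sketch: a randomly-wired stack of simple graph-convolution blocks.
# The wiring is a random DAG over blocks (block i may feed block j only if i < j).
import torch
import torch.nn as nn

class MeanConv(nn.Module):
    """One GNN block: propagate with a normalized adjacency, then linear + ReLU."""
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, adj_norm, x):
        return torch.relu(self.lin(adj_norm @ x))

class RandomWiredGNN(nn.Module):
    def __init__(self, dim, num_blocks=6, p=0.4, seed=0):
        super().__init__()
        g = torch.Generator().manual_seed(seed)
        # wiring[j] lists the earlier blocks feeding block j (block -1 is the raw input).
        self.wiring = [[-1] + [i for i in range(j) if torch.rand((), generator=g) < p]
                       for j in range(num_blocks)]
        self.blocks = nn.ModuleList(MeanConv(dim) for _ in range(num_blocks))
        self.out = nn.Linear(dim, dim)

    def forward(self, adj_norm, x):
        outputs = {-1: x}
        for j, block in enumerate(self.blocks):
            agg = sum(outputs[i] for i in self.wiring[j])   # sum incoming branches
            outputs[j] = block(adj_norm, agg)
        return self.out(outputs[len(self.blocks) - 1])

# Toy usage on a random graph with 20 nodes and 8-dimensional features.
n, d = 20, 8
adj = (torch.rand(n, n) < 0.2).float()
adj = ((adj + adj.t()) > 0).float() + torch.eye(n)
adj_norm = adj / adj.sum(-1, keepdim=True)               # row-normalized propagation
model = RandomWiredGNN(d)
print(model(adj_norm, torch.randn(n, d)).shape)          # -> torch.Size([20, 8])
```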
arXiv Detail & Related papers (2021-03-29T12:34:36Z) - Analytically Tractable Inference in Deep Neural Networks [0.0]
The Tractable Approximate Gaussian Inference (TAGI) algorithm was shown to be a viable and scalable alternative to backpropagation for shallow fully-connected neural networks.
This work demonstrates that TAGI matches or exceeds the performance of backpropagation for training classic deep neural network architectures.
arXiv Detail & Related papers (2021-03-09T14:51:34Z) - Task-Adaptive Neural Network Retrieval with Meta-Contrastive Learning [34.27089256930098]
We propose a novel neural network retrieval method, which retrieves the most suitable pre-trained network for a given task.
We train this framework by meta-learning a cross-modal latent space with contrastive loss, to maximize the similarity between a dataset and a network.
We validate the efficacy of our method on ten real-world datasets, against existing NAS baselines.
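A minimal sketch of the cross-modal contrastive objective: a dataset encoder and a network encoder map into a shared latent space, and an InfoNCE-style loss pulls each dataset towards the network that suits it. The encoders, feature dimensions, and batch construction below are placeholders; the actual method meta-learns these encoders across many tasks.

```python
# Sketch: a cross-modal contrastive objective over matched (dataset, network) pairs.
# Encoders and input features are placeholders for the meta-learned components.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    def __init__(self, in_dim, out_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, out_dim))

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)      # unit-norm embeddings

dataset_enc = Encoder(in_dim=32)    # encodes a summary of the dataset (assumed features)
network_enc = Encoder(in_dim=48)    # encodes a description of the architecture (assumed)

def contrastive_loss(d_feat, n_feat, temperature=0.1):
    """InfoNCE over a batch where the i-th dataset matches the i-th network."""
    zd, zn = dataset_enc(d_feat), network_enc(n_feat)
    logits = zd @ zn.t() / temperature               # every dataset vs every network
    targets = torch.arange(zd.size(0))
    return F.cross_entropy(logits, targets)

# Toy usage: 16 matched pairs with random placeholder features.
torch.manual_seed(0)
loss = contrastive_loss(torch.randn(16, 32), torch.randn(16, 48))
loss.backward()
print(float(loss))
```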
arXiv Detail & Related papers (2021-03-02T06:30:51Z) - Anomaly Detection on Attributed Networks via Contrastive Self-Supervised
Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
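A minimal sketch of the contrastive instance-pair idea: pair each target node with a summary of its own neighborhood (positive) and with the neighborhood of a random other node (negative), train a discriminator to tell them apart, and flag nodes whose positive pairs score poorly as anomalous. The one-layer encoder, readout, and anomaly score below are simplified stand-ins for the paper's GNN-based model.

```python
# Sketch: contrastive (node, neighborhood) instance pairs for anomaly scoring.
import torch
import torch.nn as nn

torch.manual_seed(0)
n, d, h = 100, 16, 32
adj = (torch.rand(n, n) < 0.05).float()
adj = ((adj + adj.t()) > 0).float() + torch.eye(n)    # symmetric, with self-loops
x = torch.randn(n, d)                                 # node attributes (placeholder)

encode = nn.Linear(d, h)                              # shared node encoder
disc = nn.Bilinear(h, h, 1)                           # scores (node, context) agreement

def context(node_ids):
    """Mean embedding of each node's neighborhood: a crude local-subgraph readout."""
    a = adj[node_ids] / adj[node_ids].sum(-1, keepdim=True)
    return torch.relu(encode(a @ x))

nodes = torch.arange(n)
opt = torch.optim.Adam(list(encode.parameters()) + list(disc.parameters()), lr=1e-3)
for step in range(200):
    opt.zero_grad()
    z = torch.relu(encode(x))
    pos = disc(z, context(nodes)).squeeze(-1)                      # node vs own neighborhood
    neg = disc(z, context(nodes[torch.randperm(n)])).squeeze(-1)   # node vs random neighborhood
    loss = nn.functional.binary_cross_entropy_with_logits(
        torch.cat([pos, neg]), torch.cat([torch.ones(n), torch.zeros(n)]))
    loss.backward()
    opt.step()

anomaly = (neg - pos).detach()            # high when a node disagrees with its own context
print(anomaly.topk(5).indices)            # five most anomalous nodes under this toy score
```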
arXiv Detail & Related papers (2021-02-27T03:17:20Z) - Discretization-Aware Architecture Search [81.35557425784026]
This paper presents discretization-aware architecture search (DA²S).
The core idea is to push the super-network towards the configuration of the desired topology, so that the accuracy loss brought by discretization is largely alleviated.
Experiments on standard image classification benchmarks demonstrate the superiority of our approach.
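One common way to realize "pushing the super-network towards the desired topology" is to regularize the softmax architecture weights so that they approach one-hot values before discretization. The sketch below shows such an entropy penalty purely as an illustration; it is not claimed to be the exact DA²S objective, and the placeholder task loss stands in for the usual supernet training loss.

```python
# Sketch: an entropy penalty on softmax architecture weights, one way to make the
# continuous super-network approach a discrete (one-hot) topology before pruning.
import torch

alpha = torch.randn(4, 5, requires_grad=True)   # 4 edges, 5 candidate ops (placeholder sizes)
opt = torch.optim.Adam([alpha], lr=0.05)

for step in range(300):
    opt.zero_grad()
    probs = torch.softmax(alpha, dim=-1)
    task_loss = torch.tensor(0.0)               # stand-in for the supernet training loss
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1).mean()
    (task_loss + 0.1 * entropy).backward()      # the penalty pushes probs toward one-hot
    opt.step()

print(torch.softmax(alpha, dim=-1).argmax(dim=-1))  # the op kept on each edge after discretization
```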
arXiv Detail & Related papers (2020-07-07T01:18:58Z) - DC-NAS: Divide-and-Conquer Neural Architecture Search [108.57785531758076]
We present a divide-and-conquer (DC) approach to effectively and efficiently search deep neural architectures.
We achieve a 75.1% top-1 accuracy on the ImageNet dataset, which is higher than that of state-of-the-art methods using the same search space.
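One plausible reading of the divide-and-conquer strategy, sketched below with assumed details: cluster architecture encodings, evaluate a single representative per cluster, and then search only inside the most promising cluster. The clustering criterion and the evaluation proxy are placeholders, not the DC-NAS procedure itself.

```python
# Sketch: divide-and-conquer over a search space by clustering architecture encodings,
# evaluating one representative per cluster, then searching only the best cluster.
import numpy as np

rng = np.random.default_rng(0)
archs = rng.random((500, 12))                       # 500 architectures, 12-dim encodings

def evaluate(a):
    """Placeholder proxy for an (expensive) sub-network evaluation."""
    return float(a[:4].mean() - 0.1 * a[4:].std())

def kmeans(x, k, iters=20):
    centers = x[rng.choice(len(x), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((x[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        centers = np.stack([x[labels == c].mean(0) if (labels == c).any() else centers[c]
                            for c in range(k)])
    return labels, centers

labels, centers = kmeans(archs, k=8)
reps = [int(np.argmin(((archs - c) ** 2).sum(-1))) for c in centers]   # closest to centroid
best_cluster = max(range(8), key=lambda c: evaluate(archs[reps[c]]))

# Conquer step: evaluate only the members of the promising cluster.
members = np.flatnonzero(labels == best_cluster)
best_arch = members[np.argmax([evaluate(archs[i]) for i in members])]
print("best cluster:", best_cluster, "best architecture index:", int(best_arch))
```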
arXiv Detail & Related papers (2020-05-29T09:02:16Z) - Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
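A minimal sketch of treating accuracy as a function of FLOPs at the layer level: compute each convolution's FLOPs, attach a per-layer "utilization" estimate (how much accuracy the layer buys per unit of compute), and move channels from poorly utilized layers to well utilized ones. The utilization numbers and the adjustment rule below are assumptions for illustration, not the paper's exact FLOPs utilization ratio.

```python
# Sketch: per-layer conv FLOPs plus a placeholder "utilization" signal used to move output
# channels from a poorly utilized layer to a well utilized one. Real network adjustment
# would keep total FLOPs constant and re-train; the utilization numbers here are made up.
from dataclasses import dataclass

@dataclass
class ConvSpec:
    in_ch: int
    out_ch: int
    kernel: int
    out_hw: int                      # spatial size of the output feature map

    def flops(self) -> float:
        # multiply-accumulates of a dense convolution
        return self.in_ch * self.out_ch * self.kernel ** 2 * self.out_hw ** 2

def total_flops(layers):
    return sum(l.flops() for l in layers)

# A toy 3-layer network; in_ch of each layer follows out_ch of the previous one.
layers = [ConvSpec(3, 32, 3, 112), ConvSpec(32, 64, 3, 56), ConvSpec(64, 128, 3, 28)]
utilization = [0.9, 0.3, 1.4]        # placeholder: estimated accuracy gain per unit FLOPs

before = total_flops(layers)
best = max(range(len(layers)), key=utilization.__getitem__)
worst = min(range(len(layers)), key=utilization.__getitem__)

# Move 1/8 of the worst layer's output channels to the best layer, keeping the chain valid.
shift = max(1, layers[worst].out_ch // 8)
layers[worst].out_ch -= shift
layers[best].out_ch += shift
for i in range(1, len(layers)):
    layers[i].in_ch = layers[i - 1].out_ch

print(f"FLOPs: {before / 1e9:.2f} G -> {total_flops(layers) / 1e9:.2f} G")
```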
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.