Related papers: Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks

Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks

URL: http://arxiv.org/abs/2004.08423v2
Date: Tue, 15 Dec 2020 09:47:03 GMT
Title: Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks
Authors: Xin Chen, Lingxi Xie, Jun Wu, Longhui Wei, Yuhui Xu and Qi Tian
Abstract summary: We train a graph convolutional network to fit the performance of sampled sub-networks. With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
Score: 100.14670789581811
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Neural architecture search has attracted wide attentions in both academia and industry. To accelerate it, researchers proposed weight-sharing methods which first train a super-network to reuse computation among different operators, from which exponentially many sub-networks can be sampled and efficiently evaluated. These methods enjoy great advantages in terms of computational costs, but the sampled sub-networks are not guaranteed to be estimated precisely unless an individual training process is taken. This paper owes such inaccuracy to the inevitable mismatch between assembled network layers, so that there is a random error term added to each estimation. We alleviate this issue by training a graph convolutional network to fit the performance of sampled sub-networks so that the impact of random errors becomes minimal. With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates, which consequently leads to better performance of the final architecture. In addition, our approach also enjoys the flexibility of being used under different hardware constraints, since the graph convolutional network has provided an efficient lookup table of the performance of architectures in the entire search space.

Related papers

Optimizing Decentralized Online Learning for Supervised Regression and Classification Problems [0.0]
Decentralized learning networks aim to synthesize a single network inference from a set of raw inferences provided by multiple participants. Despite the increased prevalence of decentralized learning networks, there exists no systematic study that performs a calibration of the associated free parameters. Here we present an optimization framework for key parameters governing decentralized online learning in supervised regression and classification problems.
arXiv Detail & Related papers (2025-01-27T21:36:54Z)
Towards Mitigating Architecture Overfitting on Distilled Datasets [2.3371504588528635]
This paper introduces a series of approaches to mitigate the issue of textitarchitecture overfitting. Specifically, DropPath renders the large model to be an implicit ensemble of its sub-networks, and knowledge distillation ensures each sub-network acts similarly to the small but well-performing teacher network. Our approaches achieve comparable or even superior performance when the test network is larger than the training network.
arXiv Detail & Related papers (2023-09-08T08:12:29Z)
Generalizing Few-Shot NAS with Gradient Matching [165.5690495295074]
One-Shot methods train one supernet to approximate the performance of every architecture in the search space via weight-sharing. Few-Shot NAS reduces the level of weight-sharing by splitting the One-Shot supernet into multiple separated sub-supernets. It significantly outperforms its Few-Shot counterparts while surpassing previous comparable methods in terms of the accuracy of derived architectures.
arXiv Detail & Related papers (2022-03-29T03:06:16Z)
Bandit Sampling for Multiplex Networks [8.771092194928674]
We propose an algorithm for scalable learning on multiplex networks with a large number of layers. Online learning algorithm learns how to sample relevant neighboring layers so that only the layers with relevant information are aggregated during training. We present experimental results on both synthetic and real-world scenarios.
arXiv Detail & Related papers (2022-02-08T03:26:34Z)
CONetV2: Efficient Auto-Channel Size Optimization for CNNs [35.951376988552695]
This work introduces a method that is efficient in computationally constrained environments by examining the micro-search space of channel size. In tackling channel-size optimization, we design an automated algorithm to extract the dependencies within different connected layers of the network. We also introduce a novel metric that highly correlates with test accuracy and enables analysis of individual network layers.
arXiv Detail & Related papers (2021-10-13T16:17:19Z)
RAN-GNNs: breaking the capacity limits of graph neural networks [43.66682619000099]
Graph neural networks have become a staple in problems addressing learning and analysis of data defined over graphs. Recent works attribute this to the need to consider multiple neighborhood sizes at the same time and adaptively tune them. We show that employing a randomly-wired architecture can be a more effective way to increase the capacity of the network and obtain richer representations.
arXiv Detail & Related papers (2021-03-29T12:34:36Z)
Analytically Tractable Inference in Deep Neural Networks [0.0]
Tractable Approximate Inference (TAGI) algorithm was shown to be a viable and scalable alternative to backpropagation for shallow fully-connected neural networks. We are demonstrating how TAGI matches or exceeds the performance of backpropagation, for training classic deep neural network architectures.
arXiv Detail & Related papers (2021-03-09T14:51:34Z)
Task-Adaptive Neural Network Retrieval with Meta-Contrastive Learning [34.27089256930098]
We propose a novel neural network retrieval method, which retrieves the most optimal pre-trained network for a given task. We train this framework by meta-learning a cross-modal latent space with contrastive loss, to maximize the similarity between a dataset and a network. We validate the efficacy of our method on ten real-world datasets, against existing NAS baselines.
arXiv Detail & Related papers (2021-03-02T06:30:51Z)
Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks. Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair. A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
Discretization-Aware Architecture Search [81.35557425784026]
This paper presents discretization-aware architecture search (DAtextsuperscript2S) The core idea is to push the super-network towards the configuration of desired topology, so that the accuracy loss brought by discretization is largely alleviated. Experiments on standard image classification benchmarks demonstrate the superiority of our approach.
arXiv Detail & Related papers (2020-07-07T01:18:58Z)
DC-NAS: Divide-and-Conquer Neural Architecture Search [108.57785531758076]
We present a divide-and-conquer (DC) approach to effectively and efficiently search deep neural architectures. We achieve a $75.1%$ top-1 accuracy on the ImageNet dataset, which is higher than that of state-of-the-art methods using the same search space.
arXiv Detail & Related papers (2020-05-29T09:02:16Z)
Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs. Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.