Surprisingly Strong Performance Prediction with Neural Graph Features
- URL: http://arxiv.org/abs/2404.16551v2
- Date: Tue, 13 Aug 2024 09:42:34 GMT
- Title: Surprisingly Strong Performance Prediction with Neural Graph Features
- Authors: Gabriela Kadlecová, Jovita Lukasik, Martin Pilát, Petra Vidnerová, Mahmoud Safari, Roman Neruda, Frank Hutter
- Abstract summary: We propose neural graph features (GRAF), simple-to-compute properties of architectural graphs.
GRAF offers fast and interpretable performance prediction while outperforming zero-cost proxies.
In combination with other zero-cost proxies, GRAF outperforms most existing performance predictors at a fraction of the cost.
- Score: 35.54664728425731
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Performance prediction has been a key part of the neural architecture search (NAS) process, allowing NAS algorithms to be sped up by avoiding resource-consuming network training. Although many performance predictors correlate well with ground-truth performance, they require training data in the form of trained networks. Recently, zero-cost proxies have been proposed as an efficient method to estimate network performance without any training. However, they are still poorly understood, exhibit biases with respect to network properties, and their performance is limited. Motivated by the drawbacks of zero-cost proxies, we propose neural graph features (GRAF), simple-to-compute properties of architectural graphs. GRAF offers fast and interpretable performance prediction while outperforming zero-cost proxies and other common encodings. In combination with other zero-cost proxies, GRAF outperforms most existing performance predictors at a fraction of the cost.
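The GRAF pipeline itself is not reproduced here, but its core idea — scoring architectures with cheap, interpretable statistics of the cell graph and feeding them to a tabular predictor — can be sketched in a few lines. The snippet below is a minimal illustration assuming a NAS-Bench-201-style cell (a DAG with labeled operations); the concrete feature set (per-operation counts plus path statistics) and the synthetic accuracy target are illustrative assumptions, not the exact GRAF feature list.

```python
from itertools import product
from random import Random

from sklearn.ensemble import RandomForestRegressor

OPS = ["none", "skip_connect", "conv_1x1", "conv_3x3", "avg_pool"]

def graph_features(cell):
    """Cheap, interpretable features of a cell graph.

    `cell` maps each edge (u, v) of a NAS-Bench-201-style DAG
    (nodes 0=input .. 3=output) to an operation name. The features
    (per-op counts plus path statistics over non-"none" edges) are
    illustrative, not the exact GRAF feature list.
    """
    feats = [sum(op == o for op in cell.values()) for o in OPS]

    def paths_from(node):
        # Lengths of all input->output paths using effective edges only.
        if node == 3:
            return [0]
        lengths = []
        for (u, v), op in cell.items():
            if u == node and op != "none":
                lengths.extend(1 + p for p in paths_from(v))
        return lengths

    ps = paths_from(0)
    feats += [min(ps, default=0), max(ps, default=0), len(ps)]
    return feats

# Toy demo: random cells with a synthetic "accuracy" that rewards convs.
rng = Random(0)
edges = [(u, v) for u, v in product(range(4), repeat=2) if u < v]
cells = [{e: rng.choice(OPS) for e in edges} for _ in range(200)]
accs = [sum(op.startswith("conv") for op in c.values()) + rng.random()
        for c in cells]

X = [graph_features(c) for c in cells]
rf = RandomForestRegressor(n_estimators=100, random_state=0)
rf.fit(X[:150], accs[:150])
print("predicted scores:", rf.predict(X[150:155]).round(2))
```

Because every feature is a named graph quantity, the fitted forest's feature importances remain directly inspectable, which echoes the interpretability the abstract highlights over learned encodings.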
Related papers
- AZ-NAS: Assembling Zero-Cost Proxies for Network Architecture Search [30.64117903216323]
Training-free network architecture search (NAS) aims to discover high-performing networks with zero-cost proxies.
We propose AZ-NAS, a novel approach that leverages the ensemble of various zero-cost proxies to enhance the correlation between a predicted ranking of networks and the ground truth.
Results conclusively demonstrate the efficacy and efficiency of AZ-NAS, outperforming state-of-the-art methods on standard benchmarks.
arXiv Detail & Related papers (2024-03-28T08:44:36Z)
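AZ-NAS's exact proxy set and weighting are defined in the paper; a common way to assemble zero-cost proxies, sketched below under that assumption, is to convert each proxy's raw scores into ranks and sum them, so that proxies on very different scales can jointly vote on a network's quality. The proxy values here are placeholders.

```python
import numpy as np

def rank_aggregate(proxy_scores):
    """Combine several zero-cost proxies by summing their rankings.

    proxy_scores: shape (n_proxies, n_networks), where higher means
    "predicted better" for every proxy. Returns one combined score per
    network. Rank aggregation is scale-free, so proxies with wildly
    different magnitudes can be mixed without normalization.
    """
    scores = np.asarray(proxy_scores, dtype=float)
    # argsort of argsort turns raw scores into 0..n-1 ranks per proxy.
    ranks = scores.argsort(axis=1).argsort(axis=1)
    return ranks.sum(axis=0)

# Placeholder proxy values for 5 candidate networks.
proxies = [
    [0.1, 0.9, 0.4, 0.8, 0.2],      # e.g. a gradient-norm proxy
    [12.0, 55.0, 30.0, 40.0, 9.0],  # e.g. a linear-region count
]
combined = rank_aggregate(proxies)
print("best candidate:", int(np.argmax(combined)))
```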
- The Sparsity Roofline: Understanding the Hardware Limits of Sparse Neural Networks [4.130528857196844]
We introduce the Sparsity Roofline, a visual performance model for evaluating sparsity in neural networks.
We show how machine learning researchers can predict the performance of unimplemented or unoptimized block-structured sparsity patterns.
We show how hardware designers can predict the performance implications of new sparsity patterns and sparse data formats in hardware.
arXiv Detail & Related papers (2023-09-30T21:29:31Z)
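The Sparsity Roofline is a visual model, but its arithmetic reduces to the classic roofline bound with sparsity-adjusted FLOPs and bytes. The sketch below makes that bound concrete for a weight-sparse matrix-vector product; the peak-compute and bandwidth defaults are A100-like placeholder figures, and the index-overhead term is a modeling assumption rather than a number from the paper.

```python
def sparse_roofline_gflops(density, dense_gflops, gbytes_moved,
                           peak_gflops=19_500.0, bandwidth_gbs=1_555.0):
    """Roofline-style upper bound on a sparse kernel's useful throughput.

    Useful FLOPs shrink with density while bytes moved often shrink less
    (index/metadata overhead), so arithmetic intensity -- and with it the
    bandwidth-bound ceiling -- drops as sparsity grows.
    """
    useful_gflops = density * dense_gflops
    intensity = useful_gflops / gbytes_moved   # FLOPs per byte
    return min(peak_gflops, bandwidth_gbs * intensity)

# Weight-sparse fp32 matrix-vector product (inference, batch size 1),
# where moving the weight matrix dominates memory traffic.
n = 4096
dense_gflops = 2 * n * n / 1e9
dense_gbytes = n * n * 4 / 1e9
for density in (1.0, 0.5, 0.1):
    gbytes = dense_gbytes * (density + 0.1)    # +0.1: assumed index overhead
    bound = sparse_roofline_gflops(density, dense_gflops, gbytes)
    print(f"density {density:.1f}: attainable <= {bound:6.0f} GFLOP/s")
```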
- AIO-P: Expanding Neural Performance Predictors Beyond Image Classification [22.743278613519152]
We propose a novel All-in-One Predictor (AIO-P) to pretrain neural predictors on architecture examples.
AIO-P can achieve Mean Absolute Error (MAE) and Spearman's Rank Correlation (SRCC) below 1% and above 0.5, respectively.
arXiv Detail & Related papers (2022-11-30T18:30:41Z)
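The two figures quoted for AIO-P — MAE below 1% and SRCC above 0.5 — are standard regression and ranking metrics; for reference, both can be computed as below (the prediction and target arrays are placeholders).

```python
import numpy as np
from scipy.stats import spearmanr

# Placeholder predicted vs. true accuracies (as fractions, so an MAE
# below 0.01 corresponds to the "below 1%" figure quoted for AIO-P).
y_true = np.array([0.71, 0.74, 0.69, 0.80, 0.77])
y_pred = np.array([0.70, 0.75, 0.70, 0.79, 0.76])

mae = np.mean(np.abs(y_pred - y_true))   # mean absolute error
srcc, _ = spearmanr(y_pred, y_true)      # Spearman's rank correlation
print(f"MAE = {mae:.3f}, SRCC = {srcc:.3f}")
```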
- Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks [52.566735716983956]
We propose a graph gradual pruning framework termed CGP to dynamically prune GNNs.
Unlike LTH-based methods, the proposed CGP approach requires no re-training, which significantly reduces the computation costs.
Our proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.
arXiv Detail & Related papers (2022-07-18T14:23:31Z)
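CGP's specific pruning criterion is in the paper; what "gradual" pruning generally means — ramping sparsity up over training instead of imposing it at once, so no separate re-training pass is needed — can be sketched with the widely used cubic schedule below (an assumption for illustration, not necessarily CGP's exact schedule).

```python
def gradual_sparsity(step, start_step, end_step, final_sparsity,
                     initial_sparsity=0.0):
    """Cubic sparsity ramp commonly used for gradual pruning.

    Sparsity grows smoothly from initial_sparsity at start_step to
    final_sparsity at end_step, so the network adapts as connections
    are removed instead of being pruned all at once.
    """
    if step <= start_step:
        return initial_sparsity
    if step >= end_step:
        return final_sparsity
    progress = (step - start_step) / (end_step - start_step)
    return final_sparsity + (initial_sparsity - final_sparsity) * (1 - progress) ** 3

for step in (0, 250, 500, 750, 1000):
    s = gradual_sparsity(step, start_step=0, end_step=1000, final_sparsity=0.9)
    print(f"step {step:4d}: prune to {s:.2%} sparsity")
```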
- Neural Capacitance: A New Perspective of Neural Network Selection via Edge Dynamics [85.31710759801705]
Current practice incurs expensive computational costs by training models in order to predict their performance.
We propose a novel framework for neural network selection by analyzing the governing dynamics over synaptic connections (edges) during training.
Our framework is built on the fact that back-propagation during neural network training is equivalent to the dynamical evolution of synaptic connections.
arXiv Detail & Related papers (2022-01-11T20:53:15Z)
- How Powerful are Performance Predictors in Neural Architecture Search? [43.86743225322636]
We give the first large-scale study of performance predictors by analyzing 31 techniques.
We show that certain families of predictors can be combined to achieve even better predictive power.
arXiv Detail & Related papers (2021-04-02T17:57:16Z)
- FBNetV3: Joint Architecture-Recipe Search using Predictor Pretraining [65.39532971991778]
We present an accuracy predictor that scores architecture and training recipes jointly, guiding both sample selection and ranking.
We run fast evolutionary searches in just CPU minutes to generate architecture-recipe pairs for a variety of resource constraints.
FBNetV3 comprises a family of state-of-the-art compact neural networks that outperform both automatically and manually designed competitors.
arXiv Detail & Related papers (2020-06-03T05:20:21Z)
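The "CPU minutes" claim follows from the fact that the evolutionary search only queries the predictor and never trains a network. A minimal version of such a predictor-guided evolutionary loop is sketched below; the integer architecture-recipe encoding and the scoring function are placeholder assumptions.

```python
from random import Random

rng = Random(0)
N_CHOICES, LENGTH = 4, 10   # placeholder space: 10 slots, 4 options each

def predictor_score(arch):
    """Stand-in for a pretrained joint architecture-recipe predictor."""
    return sum(arch) - 0.5 * max(arch)   # arbitrary cheap objective

def mutate(arch, rate=0.2):
    return tuple(rng.randrange(N_CHOICES) if rng.random() < rate else g
                 for g in arch)

# Evolution touches only the predictor -- no network training -- so even
# many generations finish in CPU seconds.
population = [tuple(rng.randrange(N_CHOICES) for _ in range(LENGTH))
              for _ in range(32)]
for _ in range(50):
    population.sort(key=predictor_score, reverse=True)
    parents = population[:8]   # keep the top-scoring candidates
    population = parents + [mutate(rng.choice(parents)) for _ in range(24)]

print("best predicted candidate:", max(population, key=predictor_score))
```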
- Bayesian Neural Networks at Scale: A Performance Analysis and Pruning Study [2.3605348648054463]
This work explores the use of high performance computing with distributed training to address the challenges of training BNNs at scale.
We present a performance and scalability comparison of training the VGG-16 and ResNet-18 models on a Cray-XC40 cluster.
arXiv Detail & Related papers (2020-05-23T23:15:34Z)
- Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
arXiv Detail & Related papers (2020-04-17T19:12:39Z)
- EcoNAS: Finding Proxies for Economical Neural Architecture Search [130.59673917196994]
In this paper, we observe that most existing proxies differ in how well they maintain rank consistency among network candidates.
Inspired by these observations, we present a reliable proxy and further formulate a hierarchical proxy strategy.
The strategy spends more computation on candidate networks that are potentially more accurate, while discarding unpromising ones at an early stage with a fast proxy.
arXiv Detail & Related papers (2020-01-05T13:29:02Z)
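The hierarchical strategy can be read as a two-stage filter: a fast, noisy proxy prunes the candidate pool, and a slower, more rank-consistent proxy is spent only on the survivors. The sketch below uses synthetic proxies with different noise levels as stand-ins for cheap and reliable evaluations.

```python
from random import Random

rng = Random(0)

# Synthetic candidates: each has a hidden "true" accuracy.
candidates = [{"id": i, "true_acc": rng.random()} for i in range(1000)]

def fast_proxy(c):
    """Cheap but noisy estimate (e.g. few epochs at reduced resolution)."""
    return c["true_acc"] + rng.gauss(0, 0.15)

def reliable_proxy(c):
    """More expensive, better rank-consistent estimate."""
    return c["true_acc"] + rng.gauss(0, 0.03)

# Stage 1: score everyone with the fast proxy, discard the bottom 90%.
survivors = sorted(candidates, key=fast_proxy, reverse=True)[:100]
# Stage 2: spend the larger per-network budget only on the survivors.
best = max(survivors, key=reliable_proxy)
print("picked id", best["id"], "with true acc %.3f" % best["true_acc"])
```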