AIO-P: Expanding Neural Performance Predictors Beyond Image
Classification
- URL: http://arxiv.org/abs/2211.17228v2
- Date: Mon, 24 Apr 2023 20:07:09 GMT
- Title: AIO-P: Expanding Neural Performance Predictors Beyond Image
Classification
- Authors: Keith G. Mills, Di Niu, Mohammad Salameh, Weichen Qiu, Fred X. Han,
Puyuan Liu, Jialin Zhang, Wei Lu, Shangling Jui
- Abstract summary: We propose a novel All-in-One Predictor (AIO-P) to pretrain neural predictors on architecture examples.
AIO-P can achieve Mean Absolute Error (MAE) and Spearman's Rank Correlation (SRCC) below 1% and above 0.5, respectively.
- Score: 22.743278613519152
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Evaluating neural network performance is critical to deep neural network
design but a costly procedure. Neural predictors provide an efficient solution
by treating architectures as samples and learning to estimate their performance
on a given task. However, existing predictors are task-dependent, predominantly
estimating neural network performance on image classification benchmarks. They
are also search-space dependent; each predictor is designed to make predictions
for a specific architecture search space with predefined topologies and sets of
operations. In this paper, we propose a novel All-in-One Predictor (AIO-P),
which aims to pretrain neural predictors on architecture examples from
multiple, separate computer vision (CV) task domains and multiple architecture
spaces, and then transfer to unseen downstream CV tasks or neural
architectures. We describe our proposed techniques for general graph
representation, efficient predictor pretraining, and knowledge infusion, as
well as methods to transfer to downstream tasks/spaces.
Extensive experimental results show that AIO-P can achieve Mean Absolute Error
(MAE) and Spearman's Rank Correlation (SRCC) below 1% and above 0.5,
respectively, on a breadth of target downstream CV tasks with or without
fine-tuning, outperforming a number of baselines. Moreover, AIO-P can directly
transfer to new architectures not seen during training, accurately rank them
and serve as an effective performance estimator when paired with an algorithm
designed to preserve performance while reducing FLOPs.
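For intuition, the sketch below shows the general shape of a graph-based neural performance predictor and how the two reported metrics, MAE and SRCC, are computed. It is a minimal illustration with a toy message-passing encoder and made-up data, not the AIO-P implementation; all names and shapes are assumptions.

```python
# Minimal, illustrative sketch of a graph-based performance predictor
# evaluation loop. NOT the AIO-P implementation; encoder, shapes and
# data below are hypothetical placeholders.
import numpy as np
from scipy.stats import spearmanr

def encode_architecture(adj, feats, steps=2):
    """Toy message passing: propagate node features along the adjacency
    matrix, then mean-pool into a fixed-length graph embedding."""
    h = feats
    for _ in range(steps):
        h = np.tanh(adj @ h)          # aggregate neighbor features
    return h.mean(axis=0)             # graph-level embedding

def predict_accuracy(embedding, w, b):
    """Linear regression head on top of the graph embedding."""
    return float(embedding @ w + b)

rng = np.random.default_rng(0)
n_archs, n_nodes, n_feats = 32, 8, 4
w, b = rng.normal(size=n_feats), 0.5

preds, truths = [], []
for _ in range(n_archs):
    adj = (rng.random((n_nodes, n_nodes)) < 0.3).astype(float)
    feats = rng.normal(size=(n_nodes, n_feats))   # e.g. operation one-hots
    emb = encode_architecture(adj, feats)
    preds.append(predict_accuracy(emb, w, b))
    truths.append(rng.random())                   # placeholder ground truth

mae = np.mean(np.abs(np.array(preds) - np.array(truths)))
srcc, _ = spearmanr(preds, truths)
print(f"MAE: {mae:.4f}  SRCC: {srcc:.4f}")
```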
Related papers
- POMONAG: Pareto-Optimal Many-Objective Neural Architecture Generator [4.09225917049674]
Transferable NAS has emerged, generalizing the search process from dataset-dependent to task-dependent.
This paper introduces POMONAG, extending DiffusionNAG via a many-objective diffusion process.
Results were validated on two search spaces -- NAS201 and MobileNetV3 -- and evaluated across 15 image classification datasets.
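POMONAG's many-objective setting rests on Pareto optimality. As a quick, generic illustration (not the paper's code), the following sketch extracts the Pareto-optimal set from hypothetical (error, latency, params) tuples.

```python
# Illustrative Pareto-front extraction for many-objective NAS candidates.
# The objective tuples below are hypothetical; lower is better for all.
def dominates(a, b):
    """True if candidate `a` is no worse than `b` in every objective
    and strictly better in at least one (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o is not c)]

# (error rate, latency ms, params M) -- made-up values
cands = [(0.10, 12.0, 5.2), (0.08, 20.0, 7.1), (0.12, 9.0, 3.3), (0.11, 13.0, 6.0)]
print(pareto_front(cands))  # the last tuple is dominated by the first
```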
arXiv Detail & Related papers (2024-09-30T16:05:29Z)
- FR-NAS: Forward-and-Reverse Graph Predictor for Efficient Neural Architecture Search [10.699485270006601]
We introduce a novel Graph Neural Network (GNN) predictor for Neural Architecture Search (NAS).
This predictor renders neural architectures into vector representations by combining both the conventional and inverse graph views.
The experimental results showcase a significant improvement in prediction accuracy, with a 3%--16% increase in Kendall-tau correlation.
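A minimal sketch of the two-view idea, with made-up shapes: message passing runs over both the adjacency matrix and its transpose (the reversed graph), the pooled embeddings are concatenated, and Kendall-tau, the metric reported above, is computed with SciPy.

```python
# Toy sketch of a forward-and-reverse graph encoding (not FR-NAS itself):
# run message passing over both the adjacency matrix and its transpose,
# then concatenate the pooled embeddings.
import numpy as np
from scipy.stats import kendalltau

def two_view_embedding(adj, feats, steps=2):
    fwd, rev = feats, feats
    for _ in range(steps):
        fwd = np.tanh(adj @ fwd)      # information flows along edges
        rev = np.tanh(adj.T @ rev)    # ... and against them
    return np.concatenate([fwd.mean(axis=0), rev.mean(axis=0)])

rng = np.random.default_rng(1)
adj = (rng.random((6, 6)) < 0.3).astype(float)
feats = rng.normal(size=(6, 3))
print(two_view_embedding(adj, feats).shape)   # (6,) = 2 * n_feats

# Kendall-tau, the ranking metric the paper reports (toy values):
tau, _ = kendalltau([0.71, 0.69, 0.80, 0.75], [0.70, 0.68, 0.81, 0.74])
print(f"Kendall tau: {tau:.3f}")
```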
arXiv Detail & Related papers (2024-04-24T03:22:49Z)
- Deep Learning Architectures for FSCV, a Comparison [0.0]
Suitability is determined by the predictive performance in the "out-of-probe" case, the response to artificially induced electrical noise, and the ability to predict when the model will be errant for a given probe.
The InceptionTime architecture, a deep convolutional neural network, had the best absolute predictive performance of the models tested but was more susceptible to noise.
A naive multilayer perceptron architecture had the second lowest prediction error and was less affected by the artificial noise, suggesting that convolutions may not be as important for this task as one might suspect.
arXiv Detail & Related papers (2022-12-05T00:20:10Z)
- Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise and pointwise convolution) in efficient architectures.
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
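The summary does not spell out the SSC operator itself, but the layers it is said to generalize are standard. The sketch below shows depthwise, groupwise, and pointwise convolutions in PyTorch for reference; shapes are illustrative only.

```python
# The standard structured convolutions that SSC is said to generalize,
# written as ordinary PyTorch layers for reference.
# (Illustrative only; this is not the SSC operator itself.)
import torch
import torch.nn as nn

x = torch.randn(1, 32, 16, 16)                       # N, C, H, W

depthwise = nn.Conv2d(32, 32, kernel_size=3, padding=1, groups=32)
groupwise = nn.Conv2d(32, 64, kernel_size=3, padding=1, groups=4)
pointwise = nn.Conv2d(32, 64, kernel_size=1)

for name, conv in [("depthwise", depthwise),
                   ("groupwise", groupwise),
                   ("pointwise", pointwise)]:
    n_params = sum(p.numel() for p in conv.parameters())
    print(f"{name:9s} out={tuple(conv(x).shape)} params={n_params}")
```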
arXiv Detail & Related papers (2022-10-23T18:37:22Z)
- Towards Theoretically Inspired Neural Initialization Optimization [66.04735385415427]
We propose a differentiable quantity, named GradCosine, with theoretical insights to evaluate the initial state of a neural network.
We show that both the training and test performance of a network can be improved by maximizing GradCosine under norm constraint.
Generalizing the sample-wise analysis to the real batch setting, the resulting method, Neural Initialization Optimization (NIO), automatically finds a better initialization with negligible cost.
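As a rough illustration of the underlying quantity, the sketch below computes the cosine similarity between gradients taken on two individual samples of a toy model. The model, loss, and data are placeholders; this is not the paper's NIO algorithm.

```python
# Minimal sketch of a sample-wise gradient cosine: compute the loss
# gradient for two individual samples and measure their alignment.
# (Illustrative placeholders; not the paper's NIO algorithm.)
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 1)
loss_fn = nn.MSELoss()
x, y = torch.randn(2, 4), torch.randn(2, 1)

def flat_grad(sample_x, sample_y):
    model.zero_grad()
    loss_fn(model(sample_x), sample_y).backward()
    return torch.cat([p.grad.flatten() for p in model.parameters()])

g1 = flat_grad(x[0:1], y[0:1])
g2 = flat_grad(x[1:2], y[1:2])
cos = torch.dot(g1, g2) / (g1.norm() * g2.norm())
print(f"gradient cosine: {cos.item():.4f}")
```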
arXiv Detail & Related papers (2022-10-12T06:49:16Z)
- FlowNAS: Neural Architecture Search for Optical Flow Estimation [65.44079917247369]
We propose a neural architecture search method named FlowNAS to automatically find a better encoder architecture for the flow estimation task.
Experimental results show that the discovered architecture with the weights inherited from the super-network achieves 4.67% F1-all error on KITTI.
arXiv Detail & Related papers (2022-07-04T09:05:25Z)
- SCAI: A Spectral data Classification framework with Adaptive Inference for the IoT platform [0.0]
We propose a Spectral data Classification framework with Adaptive Inference.
Specifically, it allocates different amounts of computation to different samples while better exploiting the collaboration among different devices.
To the best of our knowledge, this paper is the first attempt to conduct optimization by adaptive inference for spectral detection under the IoT platform.
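Adaptive inference of this kind is commonly realized with early exits, where confident samples leave the network at a shallow head. The following generic sketch illustrates that idea only; it is an assumption for illustration, not SCAI's actual architecture.

```python
# Generic early-exit inference sketch: a sample leaves at the first
# classifier whose confidence clears a threshold, so easy inputs use
# less computation. (Illustrative; not the SCAI framework itself.)
import torch
import torch.nn as nn

class EarlyExitNet(nn.Module):
    def __init__(self, n_classes=3, threshold=0.9):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
        self.exit1 = nn.Linear(16, n_classes)        # cheap early head
        self.stage2 = nn.Sequential(nn.Linear(16, 16), nn.ReLU())
        self.exit2 = nn.Linear(16, n_classes)        # full-depth head
        self.threshold = threshold

    def forward(self, x):
        h = self.stage1(x)
        probs = self.exit1(h).softmax(dim=-1)
        if probs.max() >= self.threshold:            # confident: stop early
            return probs, "exit1"
        return self.exit2(self.stage2(h)).softmax(dim=-1), "exit2"

torch.manual_seed(0)
net = EarlyExitNet()
probs, taken = net(torch.randn(1, 8))
print(taken, probs.argmax().item())
```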
arXiv Detail & Related papers (2022-06-24T09:22:52Z)
- RANK-NOSH: Efficient Predictor-Based Architecture Search via Non-Uniform Successive Halving [74.61723678821049]
We propose NOn-uniform Successive Halving (NOSH), a hierarchical scheduling algorithm that terminates the training of underperforming architectures early to avoid wasting budget.
We formulate predictor-based architecture search as learning to rank with pairwise comparisons.
The resulting method, RANK-NOSH, reduces the search budget by 5x while achieving competitive or even better performance than previous state-of-the-art predictor-based methods on various spaces and datasets.
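For context, plain (uniform) successive halving looks like the sketch below: train every candidate briefly, keep the top half, double the budget, and repeat. NOSH's contribution is a non-uniform version of this schedule; the scoring function here is a made-up stand-in.

```python
# Vanilla successive halving for illustration: score all candidates
# under a small budget, keep the top half, double the budget, repeat.
# (The hidden-quality model below is a stand-in, not the paper's.)
import random

random.seed(0)
candidates = {f"arch_{i}": random.random() for i in range(16)}  # hidden quality

def noisy_score(arch, budget):
    # more budget -> less noisy estimate of the hidden quality
    return candidates[arch] + random.gauss(0, 0.2 / budget)

pool, budget = list(candidates), 1
while len(pool) > 1:
    scores = {a: noisy_score(a, budget) for a in pool}
    pool = sorted(pool, key=scores.get, reverse=True)[: len(pool) // 2]
    budget *= 2                        # survivors get more training budget
print("selected:", pool[0], "true quality:", round(candidates[pool[0]], 3))
```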
arXiv Detail & Related papers (2021-08-18T07:45:21Z)
- FBNetV3: Joint Architecture-Recipe Search using Predictor Pretraining [65.39532971991778]
We present an accuracy predictor that scores architecture and training recipes jointly, guiding both sample selection and ranking.
We run fast evolutionary searches in just CPU minutes to generate architecture-recipe pairs for a variety of resource constraints.
FBNetV3 comprises a family of state-of-the-art compact neural networks that outperform both automatically and manually designed competitors.
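A minimal sketch of predictor-guided evolution over joint (architecture, recipe) configurations: mutate a candidate, score it with a cheap predictor, keep the better one. The encoding and predictor below are invented stand-ins, not FBNetV3's.

```python
# Sketch of predictor-guided evolution over joint (architecture, recipe)
# configurations: mutate, score with a cheap predictor, keep the best.
# (The encoding and predictor here are made-up stand-ins.)
import random

random.seed(0)

def predictor(cfg):
    # stand-in accuracy predictor over width, depth and learning rate
    return (-abs(cfg["width"] - 64) - abs(cfg["depth"] - 12)
            - 100 * abs(cfg["lr"] - 0.1))

def mutate(cfg):
    child = dict(cfg)
    key = random.choice(list(child))
    if key == "lr":
        child["lr"] *= random.choice([0.5, 2.0])
    elif key == "width":
        child["width"] += random.choice([-8, 8])
    else:
        child["depth"] += random.choice([-2, 2])
    return child

best = {"width": 32, "depth": 8, "lr": 0.4}          # architecture + recipe
for _ in range(200):
    child = mutate(best)
    if predictor(child) > predictor(best):            # predictor as fitness
        best = child
print(best)
```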
arXiv Detail & Related papers (2020-06-03T05:20:21Z)
- A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures.
A graph convolutional neural network is introduced to predict the performance of architectures.
arXiv Detail & Related papers (2020-05-14T09:02:33Z)