Accuracy Prediction for NAS Acceleration using Feature Selection and
Extrapolation
- URL: http://arxiv.org/abs/2211.12419v1
- Date: Tue, 22 Nov 2022 17:27:14 GMT
- Title: Accuracy Prediction for NAS Acceleration using Feature Selection and
Extrapolation
- Authors: Tal Hakim
- Abstract summary: Predicting the accuracy of candidate neural architectures is an important capability of NAS-based solutions.
We address two problems: improving regression accuracy using feature selection, and evaluating regression algorithms on extrapolating accuracy prediction tasks.
The extended dataset and code used in the study have been made public in the NAAP-440 repository.
- Score: 1.2183405753834562
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Predicting the accuracy of candidate neural architectures is an important
capability of NAS-based solutions. When a candidate architecture has properties
that are similar to other known architectures, the prediction task is rather
straightforward using off-the-shelf regression algorithms. However, when a
candidate architecture lies outside of the known space of architectures, a
regression model has to perform extrapolated predictions, which is not only a
challenging task, but also technically impossible using the most popular
regression algorithm families, which are based on decision trees. In this work,
we address two problems. The first is improving regression accuracy using
feature selection; the second is evaluating regression algorithms on
extrapolating accuracy prediction tasks. We extend the
NAAP-440 dataset with new tabular features and introduce NAAP-440e, which we
use for evaluation. We observe a dramatic improvement from the old baseline,
namely, the new baseline requires 3x shorter training processes of candidate
architectures, while maintaining the same mean-absolute-error and achieving
almost 2x fewer monotonicity violations, compared to the old baseline's best
reported performance. The extended dataset and code used in the study have been
made public in the NAAP-440 repository.
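The extrapolation problem described in the abstract can be illustrated with a small, self-contained sketch. It is not the paper's code: the toy data is made up, `fit_stump` is a hypothetical depth-1 regression tree written for illustration, and the pairwise definition of a monotonicity violation used here is an assumption about how that metric is counted. The sketch shows that a tree-style predictor cannot output a value beyond the targets it saw during training, while a linear fit can.

```python
from itertools import combinations
from statistics import mean


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def monotonicity_violations(y_true, y_pred):
    """Count pairs whose predicted order contradicts their true order.

    Assumed definition for illustration; ties are not counted.
    """
    return sum(
        1
        for i, j in combinations(range(len(y_true)), 2)
        if (y_true[i] - y_true[j]) * (y_pred[i] - y_pred[j]) < 0
    )


def fit_stump(xs, ys):
    """Fit a depth-1 regression tree on one feature; return a predict function."""
    pairs = sorted(zip(xs, ys))
    best = None
    for k in range(1, len(pairs)):
        left = [y for _, y in pairs[:k]]
        right = [y for _, y in pairs[k:]]
        lm, rm = mean(left), mean(right)
        # Sum of squared errors when each side predicts its own mean.
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
        if best is None or sse < best[0]:
            best = (sse, threshold, lm, rm)
    _, threshold, lm, rm = best
    return lambda x: lm if x <= threshold else rm


def fit_line(xs, ys):
    """Ordinary least squares on one feature; return a predict function."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return lambda x: my + slope * (x - mx)


# Toy data (made up): accuracy grows linearly with a single feature.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [0.60, 0.65, 0.70, 0.75]
# Test points lie outside the training range, so predicting them is extrapolation.
test_x = [5.0, 6.0]
test_y = [0.80, 0.85]

stump = fit_stump(train_x, train_y)
line = fit_line(train_x, train_y)

stump_pred = [stump(x) for x in test_x]
line_pred = [line(x) for x in test_x]

print("stump MAE:", mae(test_y, stump_pred))
print("line MAE:", mae(test_y, line_pred))
print("violations:", monotonicity_violations([0.7, 0.8, 0.9], [0.72, 0.91, 0.85]))
```

In this toy setting the stump returns the same leaf mean for every test point, which is bounded by the largest training target, so its error grows as the test points move further from the training range. Deeper trees and tree ensembles share this bound because their outputs are averages or sums of values learned from training targets.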
Related papers
- OPUS: Occupancy Prediction Using a Sparse Set [64.60854562502523]
We present a framework to simultaneously predict occupied locations and classes using a set of learnable queries.
OPUS incorporates a suite of non-trivial strategies to enhance model performance.
Our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at near 2x FPS, while our heaviest model surpasses previous best results by 6.1 RayIoU.
arXiv Detail & Related papers (2024-09-14T07:44:22Z)
- Evaluating Graph Neural Networks for Link Prediction: Current Pitfalls and New Benchmarking [66.83273589348758]
Link prediction attempts to predict whether an unseen edge exists based on only a portion of edges of a graph.
A flurry of methods have been introduced in recent years that attempt to make use of graph neural networks (GNNs) for this task.
New and diverse datasets have also been created to better evaluate the effectiveness of these new models.
arXiv Detail & Related papers (2023-06-18T01:58:59Z)
- NAAP-440 Dataset and Baseline for Neural Architecture Accuracy Prediction [1.2183405753834562]
We introduce the NAAP-440 dataset of 440 neural architectures, which were trained on CIFAR10 using a fixed recipe.
Experiments indicate that by using off-the-shelf regression algorithms and running up to 10% of the training process, it is possible to predict an architecture's accuracy rather precisely.
This approach may serve as a powerful tool for accelerating NAS-based studies and thus dramatically increase their efficiency.
arXiv Detail & Related papers (2022-09-14T13:21:39Z)
- A Hybrid Framework for Sequential Data Prediction with End-to-End Optimization [0.0]
We investigate nonlinear prediction in an online setting and introduce a hybrid model that effectively mitigates hand-designed features and manual model selection issues.
We employ a recurrent neural network (LSTM) for adaptive feature extraction from sequential data and a gradient boosting machinery (soft GBDT) for effective supervised regression.
We demonstrate the learning behavior of our algorithm on synthetic data and significant performance improvements over conventional methods on various real-life datasets.
arXiv Detail & Related papers (2022-03-25T17:13:08Z)
- Domain-Adjusted Regression or: ERM May Already Learn Features Sufficient for Out-of-Distribution Generalization [52.7137956951533]
We argue that devising simpler methods for learning predictors on existing features is a promising direction for future research.
We introduce Domain-Adjusted Regression (DARE), a convex objective for learning a linear predictor that is provably robust under a new model of distribution shift.
Under a natural model, we prove that the DARE solution is the minimax-optimal predictor for a constrained set of test distributions.
arXiv Detail & Related papers (2022-02-14T16:42:16Z)
- Learning to Fit Morphable Models [12.469605679847085]
We build upon recent advances in learned optimization and propose an update rule inspired by the classic Levenberg-Marquardt algorithm.
We show the effectiveness of the proposed neural optimizer on the problems of 3D body surface estimation from a head-mounted device and face fitting from 2D landmarks.
arXiv Detail & Related papers (2021-11-29T18:59:53Z)
- Backward-Compatible Prediction Updates: A Probabilistic Approach [12.049279991559091]
We formalize the Prediction Update Problem and present an efficient probabilistic approach to solving it.
In extensive experiments on standard classification benchmark data sets, we show that our method outperforms alternative strategies for backward-compatible prediction updates.
arXiv Detail & Related papers (2021-07-02T13:05:31Z)
- Confidence Adaptive Anytime Pixel-Level Recognition [86.75784498879354]
Anytime inference requires a model to make a progression of predictions which might be halted at any time.
We propose the first unified and end-to-end model approach for anytime pixel-level recognition.
arXiv Detail & Related papers (2021-04-01T20:01:57Z)
- Streaming Linear System Identification with Reverse Experience Replay [45.17023170054112]
We consider the problem of estimating a linear time-invariant (LTI) dynamical system from a single trajectory via streaming algorithms.
In many problems of interest, as encountered in reinforcement learning (RL), it is important to estimate the parameters on the go using a gradient oracle.
We propose a novel algorithm, SGD with Reverse Experience Replay (SGD-RER), that is inspired by the experience replay (ER) technique popular in the RL literature.
arXiv Detail & Related papers (2021-03-10T06:51:55Z)
- Weak NAS Predictors Are All You Need [91.11570424233709]
Recent predictor-based NAS approaches attempt to solve the problem with two key steps: sampling some architecture-performance pairs and fitting a proxy accuracy predictor.
We shift the paradigm from finding a complicated predictor that covers the whole architecture space to a set of weaker predictors that progressively move towards the high-performance sub-space.
Our method costs fewer samples to find the top-performance architectures on NAS-Bench-101 and NAS-Bench-201, and it achieves the state-of-the-art ImageNet performance on the NASNet search space.
arXiv Detail & Related papers (2021-02-21T01:58:43Z)
- Accuracy Prediction with Non-neural Model for Neural Architecture Search [185.0651567642238]
We study an alternative approach which uses a non-neural model for accuracy prediction.
We leverage a gradient boosting decision tree (GBDT) as the predictor for neural architecture search (NAS).
Experiments on NASBench-101 and ImageNet demonstrate the effectiveness of using GBDT as the predictor for NAS.
arXiv Detail & Related papers (2020-07-09T13:28:49Z)