GreenFactory: Ensembling Zero-Cost Proxies to Estimate Performance of Neural Networks
- URL: http://arxiv.org/abs/2505.09344v1
- Date: Wed, 14 May 2025 12:40:34 GMT
- Title: GreenFactory: Ensembling Zero-Cost Proxies to Estimate Performance of Neural Networks
- Authors: Gabriel Cortês, Nuno Lourenço, Paolo Romano, Penousal Machado
- Abstract summary: GreenFactory is an ensemble of zero-cost proxies that directly predicts model test accuracy. We evaluate GreenFactory on NATS-Bench, achieving robust results across multiple datasets.
- Score: 1.6986870945319288
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Determining the performance of a Deep Neural Network during Neural Architecture Search processes is essential for identifying optimal architectures and hyperparameters. Traditionally, this process requires training and evaluation of each network, which is time-consuming and resource-intensive. Zero-cost proxies estimate performance without training, serving as an alternative to traditional training. However, recent proxies often lack generalization across diverse scenarios and provide only relative rankings rather than predicted accuracies. To address these limitations, we propose GreenFactory, an ensemble of zero-cost proxies that leverages a random forest regressor to combine multiple predictors' strengths and directly predict model test accuracy. We evaluate GreenFactory on NATS-Bench, achieving robust results across multiple datasets. Specifically, GreenFactory achieves high Kendall correlations on NATS-Bench-SSS, indicating substantial agreement between its predicted scores and actual performance: 0.907 for CIFAR-10, 0.945 for CIFAR-100, and 0.920 for ImageNet-16-120. Similarly, on NATS-Bench-TSS, we achieve correlations of 0.921 for CIFAR-10, 0.929 for CIFAR-100, and 0.908 for ImageNet-16-120, showcasing its reliability in both search spaces.
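The ensembling idea in the abstract can be sketched in a few lines: fit a random forest regressor on per-architecture zero-cost proxy scores to predict test accuracy, then report Kendall's tau between predictions and measured accuracies. The proxy features and accuracies below are random placeholders standing in for NATS-Bench data, not the authors' released code.

```python
# Illustrative sketch: combine several zero-cost proxy scores with a random
# forest regressor to predict test accuracy, then check ranking agreement
# with Kendall's tau. Proxy scores and accuracies here are placeholders.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from scipy.stats import kendalltau

rng = np.random.default_rng(0)

# Each row holds the zero-cost proxy scores of one candidate architecture,
# e.g. [synflow, snip, grad_norm, ...]; y is its measured test accuracy.
X = rng.normal(size=(500, 6))            # placeholder proxy scores
y = rng.uniform(0.5, 0.95, size=500)     # placeholder test accuracies

train, test = slice(0, 400), slice(400, 500)
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X[train], y[train])

pred = model.predict(X[test])
tau, _ = kendalltau(pred, y[test])       # rank agreement, as reported on NATS-Bench
print(f"Kendall tau on held-out architectures: {tau:.3f}")
```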
Related papers
- RGB-Event Fusion with Self-Attention for Collision Prediction [9.268995547414777]
This paper proposes a neural network framework for predicting the time and position of a collision between an unmanned aerial vehicle and a dynamic object. The proposed architecture consists of two separate encoder branches, one for each modality, followed by self-attention fusion to improve prediction accuracy. We show that the fusion-based model improves prediction accuracy over single-modality approaches by 1% on average and 10% for distances beyond 0.5m, but comes at the cost of +71% in memory and +105% in FLOPs.
arXiv Detail & Related papers (2025-05-07T09:03:26Z) - GreenMachine: Automatic Design of Zero-Cost Proxies for Energy-Efficient NAS [0.8192907805418583]
This paper addresses the challenges of model evaluation by automatically designing zero-cost proxies to assess Deep Neural Networks (DNNs) efficiently.
Our method begins with a randomly generated set of zero-cost proxies, which are evolved and tested using the NATS-Bench benchmark.
Results show our method outperforms existing approaches under the stratified sampling strategy.
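A minimal sketch of the evolve-and-test loop described above, with toy features standing in for NATS-Bench statistics and a deliberately tiny proxy representation (one operator over two features); the operator set, selection scheme, and data are illustrative assumptions, not GreenMachine's actual encoding.

```python
# Illustrative evolve-and-test loop for zero-cost proxies. Toy features and
# accuracies stand in for NATS-Bench; a candidate proxy is just one operator
# applied to two architecture features, scored by rank correlation.
import random
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(1)
feats = rng.normal(size=(200, 4))         # placeholder architecture statistics
acc = feats @ np.array([0.5, -0.2, 0.1, 0.3]) + rng.normal(scale=0.1, size=200)

OPS = [np.add, np.subtract, np.multiply]

def random_proxy():
    op = random.choice(OPS)
    i, j = random.sample(range(feats.shape[1]), 2)
    return op, i, j

def fitness(proxy):
    op, i, j = proxy
    tau, _ = kendalltau(op(feats[:, i], feats[:, j]), acc)
    return abs(tau)                       # ranking agreement with accuracy

# Evolve: keep the best half, refill with fresh random candidates.
population = [random_proxy() for _ in range(20)]
for generation in range(10):
    population.sort(key=fitness, reverse=True)
    population = population[:10] + [random_proxy() for _ in range(10)]

print("best proxy rank correlation:", round(fitness(population[0]), 3))
```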
arXiv Detail & Related papers (2024-11-22T17:24:19Z) - EffiSegNet: Gastrointestinal Polyp Segmentation through a Pre-Trained EfficientNet-based Network with a Simplified Decoder [0.8892527836401773]
This work introduces EffiSegNet, a novel segmentation framework leveraging transfer learning with a pre-trained Convolutional Neural Network (CNN) as its backbone.
We evaluate our model on the gastrointestinal polyp segmentation task using the publicly available Kvasir-SEG dataset, achieving state-of-the-art results.
arXiv Detail & Related papers (2024-07-23T08:54:55Z) - Randomness Helps Rigor: A Probabilistic Learning Rate Scheduler Bridging Theory and Deep Learning Practice [7.494722456816369]
We propose a probabilistic learning rate scheduler (PLRS) that does not conform to the monotonically decreasing condition yet retains provable convergence guarantees. We show that PLRS performs as well as or better than existing state-of-the-art learning rate schedulers in terms of both convergence and accuracy.
arXiv Detail & Related papers (2024-07-10T12:52:24Z) - TG-NAS: Leveraging Zero-Cost Proxies with Transformer and Graph Convolution Networks for Efficient Neural Architecture Search [1.30891455653235]
TG-NAS aims to create training-free proxies for architecture performance prediction.
We introduce TG-NAS, a novel model-based universal proxy that leverages a transformer-based operator embedding generator and a graph convolution network (GCN) to predict architecture performance.
TG-NAS achieves up to 300X improvements in search efficiency compared to previous SOTA ZC proxy methods.
arXiv Detail & Related papers (2024-03-30T07:25:30Z) - SWAP-NAS: Sample-Wise Activation Patterns for Ultra-fast NAS [35.041289296298565]
Training-free metrics are widely used to avoid resource-intensive neural network training.
We propose Sample-Wise Activation Patterns and its derivative, SWAP-Score, a novel high-performance training-free metric.
The SWAP-Score is strongly correlated with ground-truth performance across various search spaces and tasks.
arXiv Detail & Related papers (2024-03-07T02:40:42Z) - Robust representations of oil wells' intervals via sparse attention mechanism [2.604557228169423]
We introduce a class of efficient Transformers named Regularized Transformers (Reguformers).
Our experiments focus on oil and gas data, namely well logs.
To evaluate our models for such problems, we work with an industry-scale open dataset consisting of well logs of more than 20 wells.
arXiv Detail & Related papers (2022-12-29T09:56:33Z) - EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications [68.35683849098105]
We introduce a split depth-wise transpose attention (SDTA) encoder that splits input tensors into multiple channel groups.
Our EdgeNeXt model with 1.3M parameters achieves 71.2% top-1 accuracy on ImageNet-1K.
Our EdgeNeXt model with 5.6M parameters achieves 79.4% top-1 accuracy on ImageNet-1K.
arXiv Detail & Related papers (2022-06-21T17:59:56Z) - Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z) - ZARTS: On Zero-order Optimization for Neural Architecture Search [94.41017048659664]
Differentiable architecture search (DARTS) has been a popular one-shot paradigm for NAS due to its high efficiency.
This work turns to zero-order optimization and proposes a novel NAS scheme, called ZARTS, to search without enforcing the gradient approximation that DARTS relies on.
In particular, results on 12 benchmarks verify the outstanding robustness of ZARTS, where the performance of DARTS collapses due to its known instability issue.
arXiv Detail & Related papers (2021-10-10T09:35:15Z) - ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware.
The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation.
We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
arXiv Detail & Related papers (2021-05-07T11:39:05Z) - Zero-Cost Proxies for Lightweight NAS [19.906217380811373]
We evaluate conventional reduced-training proxies and quantify how well they preserve ranking between multiple models during search.
We propose a series of zero-cost proxies that use just a single minibatch of training data to compute a model's score.
Our zero-cost proxies use 3 orders of magnitude less computation but can match and even outperform conventional proxies.
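The single-minibatch idea can be illustrated with a gradient-norm-style score: one forward and backward pass on a random minibatch, then the summed L2 norms of the parameter gradients. The toy network and data below are assumptions for illustration; the paper evaluates a series of such proxies on NAS benchmarks rather than this exact snippet.

```python
# Illustrative single-minibatch zero-cost score: one forward/backward pass,
# then sum the L2 norms of the parameter gradients. Model and data are toy
# placeholders standing in for a NAS candidate and its training set.
import torch
import torch.nn as nn

def grad_norm_score(model: nn.Module, inputs: torch.Tensor, targets: torch.Tensor) -> float:
    model.zero_grad()
    loss = nn.functional.cross_entropy(model(inputs), targets)
    loss.backward()
    return sum(p.grad.norm(2).item() for p in model.parameters() if p.grad is not None)

# Toy usage: score an untrained network with a single random minibatch.
net = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 128), nn.ReLU(), nn.Linear(128, 10))
x, y = torch.randn(16, 3, 32, 32), torch.randint(0, 10, (16,))
print(f"grad-norm proxy score: {grad_norm_score(net, x, y):.3f}")
```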
arXiv Detail & Related papers (2021-01-20T13:59:52Z) - APQ: Joint Search for Network Architecture, Pruning and Quantization Policy [49.3037538647714]
We present APQ for efficient deep learning inference on resource-constrained hardware.
Unlike previous methods that separately search the neural architecture, pruning policy, and quantization policy, we optimize them in a joint manner.
With the same accuracy, APQ reduces the latency/energy by 2x/1.3x over MobileNetV2+HAQ.
arXiv Detail & Related papers (2020-06-15T16:09:17Z) - Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)