Predicting Neural Network Accuracy from Weights
- URL: http://arxiv.org/abs/2002.11448v4
- Date: Fri, 9 Apr 2021 10:38:15 GMT
- Title: Predicting Neural Network Accuracy from Weights
- Authors: Thomas Unterthiner, Daniel Keysers, Sylvain Gelly, Olivier Bousquet,
Ilya Tolstikhin
- Abstract summary: We show experimentally that the accuracy of a trained neural network can be predicted surprisingly well by looking only at its weights.
We release a collection of 120k convolutional neural networks trained on four different datasets to encourage further research in this area.
- Score: 25.73213712719546
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We show experimentally that the accuracy of a trained neural network can be
predicted surprisingly well by looking only at its weights, without evaluating
it on input data. We motivate this task and introduce a formal setting for it.
Even when using simple statistics of the weights, the predictors are able to
rank neural networks by their performance with very high accuracy (R2 score
more than 0.98). Furthermore, the predictors are able to rank networks trained
on different, unobserved datasets and with different architectures. We release
a collection of 120k convolutional neural networks trained on four different
datasets to encourage further research in this area, with the goal of
understanding network training and performance better.
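As a rough illustration of the setting described in the abstract, the sketch below turns each trained network into a feature vector of simple per-layer weight statistics and fits an off-the-shelf regressor to predict its test accuracy. The particular statistics, the GradientBoostingRegressor, and the helper and variable names (weight_features, fit_accuracy_predictor, nets, accs) are illustrative assumptions, not the authors' released pipeline.

```python
# Minimal sketch (not the authors' code): predict a trained network's test
# accuracy from simple statistics of its weights. Statistics, regressor, and
# all names below are illustrative assumptions.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score

def weight_features(layer_weights):
    """Concatenate simple statistics of each layer's flattened weights."""
    feats = []
    for w in layer_weights:                      # one array per layer (kernel or bias)
        w = np.asarray(w).ravel()
        feats.extend([w.mean(), w.std(), *np.percentile(w, [0, 25, 50, 75, 100])])
    return np.asarray(feats)

def fit_accuracy_predictor(nets, accs):
    """nets: per-layer weight arrays for networks of one architecture; accs: measured test accuracies."""
    X = np.stack([weight_features(n) for n in nets])
    y = np.asarray(accs)
    model = GradientBoostingRegressor().fit(X, y)
    # In-sample R^2 for illustration only; the paper evaluates on held-out networks.
    return model, r2_score(y, model.predict(X))
```

In this framing, each trained network collapses to a single feature vector, and ranking networks by performance reduces to ordinary regression on those vectors.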
Related papers
- Verified Neural Compressed Sensing [58.98637799432153]
We develop the first (to the best of our knowledge) provably correct neural networks for a precise computational task.
We show that for modest problem dimensions (up to 50), we can train neural networks that provably recover a sparse vector from linear and binarized linear measurements.
We show that the complexity of the network can be adapted to the problem difficulty and solve problems where traditional compressed sensing methods are not known to provably work.
arXiv Detail & Related papers (2024-05-07T12:20:12Z)
- FR-NAS: Forward-and-Reverse Graph Predictor for Efficient Neural Architecture Search [10.699485270006601]
We introduce a novel Graph Neural Network (GNN) predictor for Neural Architecture Search (NAS).
This predictor renders neural architectures into vector representations by combining both the conventional and inverse graph views.
The experimental results showcase a significant improvement in prediction accuracy, with a 3%--16% increase in Kendall-tau correlation.
arXiv Detail & Related papers (2024-04-24T03:22:49Z)
- When do Convolutional Neural Networks Stop Learning? [0.0]
Convolutional Neural Networks (CNNs) have demonstrated outstanding performance in computer vision tasks.
Current practice is to stop training when the training loss decreases and the gap between training and validation error increases.
This research work introduces a hypothesis that analyzes the data variation across all the layers of a CNN variant to anticipate its near-optimal learning capacity.
arXiv Detail & Related papers (2024-03-04T20:35:09Z)
- Neural Priming for Sample-Efficient Adaptation [92.14357804106787]
We propose Neural Priming, a technique for adapting large pretrained models to distribution shifts and downstream tasks.
Neural Priming can be performed at test time, even for pretraining datasets as large as LAION-2B.
arXiv Detail & Related papers (2023-06-16T21:53:16Z)
- Diffused Redundancy in Pre-trained Representations [98.55546694886819]
We take a closer look at how features are encoded in pre-trained representations.
We find that learned representations in a given layer exhibit a degree of diffuse redundancy.
Our findings shed light on the nature of representations learned by pre-trained deep neural networks.
arXiv Detail & Related papers (2023-05-31T21:00:50Z)
- DCLP: Neural Architecture Predictor with Curriculum Contrastive Learning [5.2319020651074215]
We propose a Curriculum-guided Contrastive Learning framework for neural predictors (DCLP).
Our method simplifies the contrastive task by designing a novel curriculum to enhance the stability of unlabeled training data distribution.
We experimentally demonstrate that DCLP has high accuracy and efficiency compared with existing predictors.
arXiv Detail & Related papers (2023-02-25T08:16:21Z)
- Boosted Dynamic Neural Networks [53.559833501288146]
A typical early-exiting dynamic neural network (EDNN) has multiple prediction heads at different layers of the network backbone.
To optimize the model, these prediction heads together with the network backbone are trained on every batch of training data.
Treating training and testing inputs differently at the two phases causes a mismatch between the training and testing data distributions.
We formulate an EDNN as an additive model inspired by gradient boosting, and propose multiple training techniques to optimize the model effectively.
arXiv Detail & Related papers (2022-11-30T04:23:12Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Learning Neural Network Subspaces [74.44457651546728]
Recent observations have advanced our understanding of the neural network optimization landscape.
With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks (a minimal sketch of the line parameterization appears after this list).
arXiv Detail & Related papers (2021-02-20T23:26:58Z)
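For the "Learning Neural Network Subspaces" entry above, here is a minimal sketch of the line parameterization, assuming a PyTorch-style layer and joint training of two endpoint parameter sets; the class name, initialization, and training details are hypothetical, not the paper's implementation.

```python
# Hedged sketch of a "line of networks": each layer keeps two endpoint parameter
# sets, and the forward pass uses their convex combination. Sampling a random
# alpha per training batch pushes every point on the line to be accurate.
import torch
import torch.nn as nn

class LineLinear(nn.Module):
    """Linear layer whose weights live on a line between two learned endpoints."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.w1 = nn.Parameter(0.01 * torch.randn(d_out, d_in))
        self.w2 = nn.Parameter(0.01 * torch.randn(d_out, d_in))
        self.b1 = nn.Parameter(torch.zeros(d_out))
        self.b2 = nn.Parameter(torch.zeros(d_out))

    def forward(self, x, alpha):
        w = alpha * self.w1 + (1.0 - alpha) * self.w2
        b = alpha * self.b1 + (1.0 - alpha) * self.b2
        return x @ w.t() + b

# During training: alpha = torch.rand(1).item() per batch; at test time any
# fixed alpha (e.g. 0.5) yields a usable network from the learned line.
```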
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.