Quantifying lottery tickets under label noise: accuracy, calibration, and complexity
- URL: http://arxiv.org/abs/2306.12190v1
- Date: Wed, 21 Jun 2023 11:35:59 GMT
- Title: Quantifying lottery tickets under label noise: accuracy, calibration, and complexity
- Authors: Viplove Arora, Daniele Irto, Sebastian Goldt, Guido Sanguinetti
- Abstract summary: Pruning deep neural networks is a widely used strategy to alleviate the computational burden in machine learning.
We use the sparse double descent approach to identify univocally and characterise pruned models associated with classification tasks.
- Score: 6.232071870655069
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pruning deep neural networks is a widely used strategy to alleviate the
computational burden in machine learning. Overwhelming empirical evidence
suggests that pruned models retain very high accuracy even with a tiny fraction
of parameters. However, relatively little work has gone into characterising the
small pruned networks obtained, beyond a measure of their accuracy. In this
paper, we use the sparse double descent approach to identify univocally and
characterise pruned models associated with classification tasks. We observe
empirically that, for a given task, iterative magnitude pruning (IMP) tends to
converge to networks of comparable sizes even when starting from full networks
with sizes ranging over orders of magnitude. We analyse the best pruned models
in a controlled experimental setup and show that their number of parameters
reflects task difficulty and that they are much better than full networks at
capturing the true conditional probability distribution of the labels. On real
data, we similarly observe that pruned models are less prone to overconfident
predictions. Our results suggest that pruned models obtained via IMP not only
have advantageous computational properties but also provide a better
representation of uncertainty in learning.
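The iterative magnitude pruning (IMP) procedure the abstract refers to can be sketched on a toy linear model. This is a minimal NumPy illustration, not the authors' actual setup (they study deep networks on classification tasks): the least-squares model, the `prune_frac` schedule, and the rewinding to the original initialisation after each round are all assumptions chosen to keep the sketch self-contained.

```python
import numpy as np

def train(w, X, y, mask, lr=0.1, steps=200):
    """Gradient descent on the squared loss of a linear model; masked weights stay at zero."""
    for _ in range(steps):
        grad = X.T @ (X @ (w * mask) - y) / len(y)
        w = w - lr * grad * mask
    return w

def iterative_magnitude_pruning(X, y, n_rounds=5, prune_frac=0.3, seed=0):
    """Repeatedly train, prune the smallest-magnitude surviving weights,
    and rewind to the original initialisation (lottery-ticket style)."""
    rng = np.random.default_rng(seed)
    w_init = rng.normal(size=X.shape[1])
    mask = np.ones_like(w_init)
    for _ in range(n_rounds):
        w = train(w_init.copy(), X, y, mask)
        alive = np.flatnonzero(mask)
        k = int(len(alive) * prune_frac)  # number of weights removed this round
        if k == 0:
            break
        drop = alive[np.argsort(np.abs(w[alive]))[:k]]
        mask[drop] = 0.0
    w = train(w_init.copy(), X, y, mask)  # final retrain on the last mask
    return w * mask, mask
```

On noiseless data generated from a sparse linear model, the surviving mask concentrates on the true support, mirroring the paper's observation that the size of the pruned network reflects task difficulty rather than the size of the full network one starts from.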
Related papers
- Estimating Uncertainty with Implicit Quantile Network [0.0]
Uncertainty quantification is an important part of many performance critical applications.
This paper provides a simple alternative to existing approaches such as ensemble learning and Bayesian neural networks.
arXiv Detail & Related papers (2024-08-26T13:33:14Z)
- Improving Network Interpretability via Explanation Consistency Evaluation [56.14036428778861]
We propose a framework that acquires more explainable activation heatmaps and simultaneously increases model performance.
Specifically, our framework introduces a new metric, i.e., explanation consistency, to reweight the training samples adaptively in model learning.
Our framework then promotes the model learning by paying closer attention to those training samples with a high difference in explanations.
arXiv Detail & Related papers (2024-08-08T17:20:08Z)
- Just How Flexible are Neural Networks in Practice? [89.80474583606242]
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters.
In practice, however, we only find solutions reachable by the training procedure, including the gradient dynamics and regularizers, which limits flexibility.
arXiv Detail & Related papers (2024-06-17T12:24:45Z)
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- Efficient Stein Variational Inference for Reliable Distribution-lossless Network Pruning [23.22021752821507]
We propose a novel distribution-lossless pruning method, named vanillaP, to theoretically find the pruned lottery ticket within a Bayesian treatment.
Our method can obtain sparser networks with great performance while providing quantified reliability for the pruned model.
arXiv Detail & Related papers (2022-12-07T09:31:47Z)
- Mitigating Performance Saturation in Neural Marked Point Processes: Architectures and Loss Functions [50.674773358075015]
We propose a simple graph-based network structure called GCHP, which utilizes only graph convolutional layers.
We show that GCHP can significantly reduce training time and the likelihood ratio loss with interarrival time probability assumptions can greatly improve the model performance.
arXiv Detail & Related papers (2021-07-07T16:59:14Z)
- Robust Implicit Networks via Non-Euclidean Contractions [63.91638306025768]
Implicit neural networks show improved accuracy and significant reduction in memory consumption.
They can suffer from ill-posedness and convergence instability.
This paper provides a new framework to design well-posed and robust implicit neural networks.
arXiv Detail & Related papers (2021-06-06T18:05:02Z)
- The Case for High-Accuracy Classification: Think Small, Think Many! [4.817521691828748]
We propose an efficient and lightweight deep classification ensemble structure based on a combination of simple color features.
Our evaluation results show considerable improvements in prediction accuracy compared to the popular ResNet-50 model.
arXiv Detail & Related papers (2021-03-18T16:15:31Z)
- Robustness to Pruning Predicts Generalization in Deep Neural Networks [29.660568281957072]
We introduce prunability: the smallest fraction of a network's parameters that can be kept while pruning without adversely affecting its training loss.
We show that this measure is highly predictive of a model's generalization performance across a large set of convolutional networks trained on CIFAR-10.
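The prunability measure defined above can be sketched for a linear model. This is a hypothetical NumPy illustration (the paper itself evaluates convolutional networks trained on CIFAR-10); the loss tolerance `tol` and the keep-fraction grid are assumptions introduced for the sketch.

```python
import numpy as np

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

def prunability(w, X, y, tol=1e-3, grid=None):
    """Smallest fraction of weights, kept by magnitude, whose masked
    training loss stays within `tol` of the dense training loss."""
    if grid is None:
        grid = np.linspace(1.0, 0.05, 20)  # keep-fractions, largest first
    base = mse(w, X, y)
    order = np.argsort(np.abs(w))[::-1]    # indices, largest magnitude first
    best = 1.0
    for frac in grid:
        k = max(1, int(round(frac * len(w))))
        mask = np.zeros_like(w)
        mask[order[:k]] = 1.0
        if mse(w * mask, X, y) <= base + tol:
            best = frac
        else:
            break                          # loss degraded: stop shrinking
    return float(best)
```

For a sparse ground-truth model, the score recovers the fraction corresponding to the true support: a highly prunable model, which the paper links to better generalization.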
arXiv Detail & Related papers (2021-03-10T11:39:14Z)
- Lost in Pruning: The Effects of Pruning Neural Networks beyond Test Accuracy [42.15969584135412]
Neural network pruning is a popular technique used to reduce the inference costs of modern networks.
We evaluate whether the use of test accuracy alone in the terminating condition is sufficient to ensure that the resulting model performs well.
We find that pruned networks effectively approximate the unpruned model, however, the prune ratio at which pruned networks achieve commensurate performance varies significantly across tasks.
arXiv Detail & Related papers (2021-03-04T13:22:16Z)
- Exploring Weight Importance and Hessian Bias in Model Pruning [55.75546858514194]
We provide a principled exploration of pruning by building on a natural notion of importance.
For linear models, we show that this notion of importance is captured by scaling which connects to the well-known Hessian-based pruning algorithm.
We identify settings in which weights become more important despite becoming smaller, which in turn leads to a catastrophic failure of magnitude-based pruning.
arXiv Detail & Related papers (2020-06-19T00:15:55Z)
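The failure mode described in the last entry — small weights that are nonetheless important — can be reproduced in a toy setting. The sketch below is an assumption-laden illustration, not the paper's construction: it uses a linear model with squared loss, where the Hessian diagonal is the per-feature second moment, and an OBD-style saliency w_i² H_ii / 2 as the Hessian-based importance score.

```python
import numpy as np

def saliencies(w, X):
    """Two pruning scores for a linear model with loss L(w) = mean((Xw - y)**2) / 2:
    plain magnitude |w_i|, and the OBD-style saliency w_i**2 * H_ii / 2,
    where H = X.T @ X / n is the Hessian, so H_ii = mean(X[:, i]**2)."""
    H_diag = np.mean(X ** 2, axis=0)
    magnitude = np.abs(w)
    saliency = 0.5 * w ** 2 * H_diag
    return magnitude, saliency
```

When feature scales differ, the two scores disagree: a small weight on a high-variance feature contributes more to the loss than a large weight on a low-variance one, so magnitude-based pruning removes the wrong weight — the catastrophic failure the entry refers to.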
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.