Quantifying lottery tickets under label noise: accuracy, calibration, and complexity
- URL: http://arxiv.org/abs/2306.12190v1
- Date: Wed, 21 Jun 2023 11:35:59 GMT
- Title: Quantifying lottery tickets under label noise: accuracy, calibration, and complexity
- Authors: Viplove Arora, Daniele Irto, Sebastian Goldt, Guido Sanguinetti
- Abstract summary: Pruning deep neural networks is a widely used strategy to alleviate the computational burden in machine learning.
We use the sparse double descent approach to identify univocally and characterise pruned models associated with classification tasks.
- Score: 6.232071870655069
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pruning deep neural networks is a widely used strategy to alleviate the
computational burden in machine learning. Overwhelming empirical evidence
suggests that pruned models retain very high accuracy even with a tiny fraction
of parameters. However, relatively little work has gone into characterising the
small pruned networks obtained, beyond a measure of their accuracy. In this
paper, we use the sparse double descent approach to identify univocally and
characterise pruned models associated with classification tasks. We observe
empirically that, for a given task, iterative magnitude pruning (IMP) tends to
converge to networks of comparable sizes even when starting from full networks
with sizes ranging over orders of magnitude. We analyse the best pruned models
in a controlled experimental setup and show that their number of parameters
reflects task difficulty and that they are much better than full networks at
capturing the true conditional probability distribution of the labels. On real
data, we similarly observe that pruned models are less prone to overconfident
predictions. Our results suggest that pruned models obtained via IMP not only
have advantageous computational properties but also provide a better
representation of uncertainty in learning.
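The iterative magnitude pruning (IMP) procedure the abstract refers to can be sketched on a toy linear model. This is a minimal NumPy illustration, not the authors' actual setup (they study deep networks on classification tasks): the least-squares model, the `prune_frac` schedule, and the rewinding to the original initialisation after each round are all assumptions chosen to keep the sketch self-contained.

```python
import numpy as np

def train(w, X, y, mask, lr=0.1, steps=200):
    """Gradient descent on the squared loss of a linear model; masked weights stay at zero."""
    for _ in range(steps):
        grad = X.T @ (X @ (w * mask) - y) / len(y)
        w = w - lr * grad * mask
    return w

def iterative_magnitude_pruning(X, y, n_rounds=5, prune_frac=0.3, seed=0):
    """Repeatedly train, prune the smallest-magnitude surviving weights,
    and rewind to the original initialisation (lottery-ticket style)."""
    rng = np.random.default_rng(seed)
    w_init = rng.normal(size=X.shape[1])
    mask = np.ones_like(w_init)
    for _ in range(n_rounds):
        w = train(w_init.copy(), X, y, mask)
        alive = np.flatnonzero(mask)
        k = int(len(alive) * prune_frac)  # number of weights removed this round
        if k == 0:
            break
        drop = alive[np.argsort(np.abs(w[alive]))[:k]]
        mask[drop] = 0.0
    w = train(w_init.copy(), X, y, mask)  # final retrain on the last mask
    return w * mask, mask
```

On noiseless data generated from a sparse linear model, the surviving mask concentrates on the true support, mirroring the paper's observation that the size of the pruned network reflects task difficulty rather than the size of the full network one starts from.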
Related papers
- Estimating Uncertainty with Implicit Quantile Network [0.0]
Uncertainty quantification is an important part of many performance critical applications.
This paper provides a simple alternative to existing approaches such as ensemble learning and Bayesian neural networks.
arXiv Detail & Related papers (2024-08-26T13:33:14Z)
- Improving Network Interpretability via Explanation Consistency Evaluation [56.14036428778861]
We propose a framework that acquires more explainable activation heatmaps and simultaneously increases model performance.
Specifically, our framework introduces a new metric, i.e., explanation consistency, to reweight the training samples adaptively in model learning.
Our framework then promotes the model learning by paying closer attention to those training samples with a high difference in explanations.
arXiv Detail & Related papers (2024-08-08T17:20:08Z)
- Just How Flexible are Neural Networks in Practice? [89.80474583606242]
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters.
In practice, however, we only find solutions reachable by the training procedure, including the gradient dynamics and regularizers, which limits flexibility.
arXiv Detail & Related papers (2024-06-17T12:24:45Z)
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- Efficient Stein Variational Inference for Reliable Distribution-lossless Network Pruning [23.22021752821507]
We propose a novel distribution-lossless pruning method, named vanillaP, to theoretically find the pruned lottery ticket within a Bayesian treatment.
Our method can obtain sparser networks with great performance while providing quantified reliability for the pruned model.
arXiv Detail & Related papers (2022-12-07T09:31:47Z)
- Mitigating Performance Saturation in Neural Marked Point Processes: Architectures and Loss Functions [50.674773358075015]
We propose a simple graph-based network structure called GCHP, which utilizes only graph convolutional layers.
We show that GCHP can significantly reduce training time and the likelihood ratio loss with interarrival time probability assumptions can greatly improve the model performance.
arXiv Detail & Related papers (2021-07-07T16:59:14Z)
- Robust Implicit Networks via Non-Euclidean Contractions [63.91638306025768]
Implicit neural networks show improved accuracy and significant reduction in memory consumption.
They can suffer from ill-posedness and convergence instability.
This paper provides a new framework to design well-posed and robust implicit neural networks.
arXiv Detail & Related papers (2021-06-06T18:05:02Z)
- The Case for High-Accuracy Classification: Think Small, Think Many! [4.817521691828748]
We propose an efficient and lightweight deep classification ensemble structure based on a combination of simple color features.
Our evaluation results show considerable improvements in prediction accuracy compared to the popular ResNet-50 model.
arXiv Detail & Related papers (2021-03-18T16:15:31Z)
- Robustness to Pruning Predicts Generalization in Deep Neural Networks [29.660568281957072]
We introduce prunability: the smallest fraction of a network's parameters that can be kept while pruning without adversely affecting its training loss.
We show that this measure is highly predictive of a model's generalization performance across a large set of convolutional networks trained on CIFAR-10.
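The prunability measure defined above can be sketched for a linear model. This is a hypothetical NumPy illustration (the paper itself evaluates convolutional networks trained on CIFAR-10); the loss tolerance `tol` and the keep-fraction grid are assumptions introduced for the sketch.

```python
import numpy as np

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

def prunability(w, X, y, tol=1e-3, grid=None):
    """Smallest fraction of weights, kept by magnitude, whose masked
    training loss stays within `tol` of the dense training loss."""
    if grid is None:
        grid = np.linspace(1.0, 0.05, 20)  # keep-fractions, largest first
    base = mse(w, X, y)
    order = np.argsort(np.abs(w))[::-1]    # indices, largest magnitude first
    best = 1.0
    for frac in grid:
        k = max(1, int(round(frac * len(w))))
        mask = np.zeros_like(w)
        mask[order[:k]] = 1.0
        if mse(w * mask, X, y) <= base + tol:
            best = frac
        else:
            break                          # loss degraded: stop shrinking
    return float(best)
```

For a sparse ground-truth model, the score recovers the fraction corresponding to the true support: a highly prunable model, which the paper links to better generalization.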
arXiv Detail & Related papers (2021-03-10T11:39:14Z)
- Lost in Pruning: The Effects of Pruning Neural Networks beyond Test Accuracy [42.15969584135412]
Neural network pruning is a popular technique used to reduce the inference costs of modern networks.
We evaluate whether the use of test accuracy alone in the terminating condition is sufficient to ensure that the resulting model performs well.
We find that pruned networks effectively approximate the unpruned model, however, the prune ratio at which pruned networks achieve commensurate performance varies significantly across tasks.
arXiv Detail & Related papers (2021-03-04T13:22:16Z)
- Exploring Weight Importance and Hessian Bias in Model Pruning [55.75546858514194]
We provide a principled exploration of pruning by building on a natural notion of importance.
For linear models, we show that this notion of importance is captured by scaling which connects to the well-known Hessian-based pruning algorithm.
We identify settings in which weights become more important despite becoming smaller, which in turn leads to a catastrophic failure of magnitude-based pruning.
arXiv Detail & Related papers (2020-06-19T00:15:55Z)
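The failure mode described in the last entry — small weights that are nonetheless important — can be reproduced in a toy setting. The sketch below is an assumption-laden illustration, not the paper's construction: it uses a linear model with squared loss, where the Hessian diagonal is the per-feature second moment, and an OBD-style saliency w_i² H_ii / 2 as the Hessian-based importance score.

```python
import numpy as np

def saliencies(w, X):
    """Two pruning scores for a linear model with loss L(w) = mean((Xw - y)**2) / 2:
    plain magnitude |w_i|, and the OBD-style saliency w_i**2 * H_ii / 2,
    where H = X.T @ X / n is the Hessian, so H_ii = mean(X[:, i]**2)."""
    H_diag = np.mean(X ** 2, axis=0)
    magnitude = np.abs(w)
    saliency = 0.5 * w ** 2 * H_diag
    return magnitude, saliency
```

When feature scales differ, the two scores disagree: a small weight on a high-variance feature contributes more to the loss than a large weight on a low-variance one, so magnitude-based pruning removes the wrong weight — the catastrophic failure the entry refers to.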
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.