Bias in Pruned Vision Models: In-Depth Analysis and Countermeasures
- URL: http://arxiv.org/abs/2304.12622v1
- Date: Tue, 25 Apr 2023 07:42:06 GMT
- Title: Bias in Pruned Vision Models: In-Depth Analysis and Countermeasures
- Authors: Eugenia Iofinova, Alexandra Peste, Dan Alistarh
- Abstract summary: Pruning, setting a significant subset of the parameters of a neural network to zero, is one of the most popular methods of model compression.
Despite existing evidence that pruning can induce or exacerbate bias, the relationship between neural network pruning and induced bias is not well-understood.
- Score: 93.17009514112702
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pruning - that is, setting a significant subset of the parameters of a neural
network to zero - is one of the most popular methods of model compression. Yet,
several recent works have raised the issue that pruning may induce or
exacerbate bias in the output of the compressed model. Despite existing
evidence for this phenomenon, the relationship between neural network pruning
and induced bias is not well-understood. In this work, we systematically
investigate and characterize this phenomenon in Convolutional Neural Networks
for computer vision. First, we show that it is in fact possible to obtain
highly-sparse models, e.g. with less than 10% of weights remaining, which neither
decrease in accuracy nor substantially increase in bias when compared to dense
models. At the same time, we also find that, at higher sparsities, pruned
models exhibit higher uncertainty in their outputs, as well as increased
correlations, which we directly link to increased bias. We propose easy-to-use
criteria which, based only on the uncompressed model, establish whether bias
will increase with pruning, and identify the samples most susceptible to biased
predictions post-compression.
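The pruning operation described in the abstract, zeroing a subset of a network's weights, can be sketched as one-shot global magnitude pruning. This is a common baseline scheme, shown here for illustration; the paper's actual pruning method and schedule may differ, and the function name and array shapes below are illustrative.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """One-shot global magnitude pruning: zero out the fraction
    `sparsity` of entries with the smallest absolute value,
    pooled across all given weight arrays."""
    flat = np.concatenate([w.ravel() for w in weights])
    k = int(sparsity * flat.size)
    if k == 0:
        return [w.copy() for w in weights]
    # The k-th smallest absolute value becomes the pruning threshold.
    threshold = np.partition(np.abs(flat), k - 1)[k - 1]
    return [np.where(np.abs(w) <= threshold, 0.0, w) for w in weights]

rng = np.random.default_rng(0)
layers = [rng.standard_normal((4, 4)), rng.standard_normal(4)]  # 20 weights total
pruned = magnitude_prune(layers, sparsity=0.9)  # keep 10% of weights

remaining = sum(int(np.count_nonzero(w)) for w in pruned)
total = sum(w.size for w in layers)
print(remaining / total)  # → 0.1
```

Pooling the magnitudes globally (rather than per layer) lets sparsity distribute unevenly across layers, which is why highly-sparse models can retain accuracy: layers with many small weights absorb most of the pruning.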
Related papers
- Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think [53.2706196341054]
We show that the perceived inefficiency was caused by a flaw in the inference pipeline that has so far gone unnoticed.
We perform end-to-end fine-tuning on top of the single-step model with task-specific losses and get a deterministic model that outperforms all other diffusion-based depth and normal estimation models.
arXiv Detail & Related papers (2024-09-17T16:58:52Z)
- Looking at Model Debiasing through the Lens of Anomaly Detection [11.113718994341733]
Deep neural networks are sensitive to bias in the data.
We propose a new bias identification method based on anomaly detection.
We reach state-of-the-art performance on synthetic and real benchmark datasets.
arXiv Detail & Related papers (2024-07-24T17:30:21Z)
- A U-turn on Double Descent: Rethinking Parameter Counting in Statistical Learning [68.76846801719095]
We re-examine when and where double descent occurs, and show that its location is not inherently tied to the interpolation threshold p = n.
This provides a resolution to tensions between double descent and statistical intuition.
arXiv Detail & Related papers (2023-10-29T12:05:39Z)
- Self-supervised debiasing using low rank regularization [59.84695042540525]
Spurious correlations can cause strong biases in deep neural networks, impairing generalization ability.
We propose a self-supervised debiasing framework potentially compatible with unlabeled samples.
Remarkably, the proposed debiasing framework significantly improves the generalization performance of self-supervised learning baselines.
arXiv Detail & Related papers (2022-10-11T08:26:19Z)
- How Much is Enough? A Study on Diffusion Times in Score-based Generative Models [76.76860707897413]
Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution.
We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
arXiv Detail & Related papers (2022-06-10T15:09:46Z)
- How Tempering Fixes Data Augmentation in Bayesian Neural Networks [22.188535244056016]
We show that tempering implicitly reduces the misspecification arising from modeling augmentations as i.i.d. data.
The temperature mimics the role of the effective sample size, reflecting the gain in information provided by the augmentations.
arXiv Detail & Related papers (2022-05-27T11:06:56Z)
- Model Compression for Dynamic Forecast Combination [9.281199058905017]
We show that compressing dynamic forecasting ensembles into an individual model leads to a comparable predictive performance.
We also show that the compressed individual model with best average rank is a rule-based regression model.
arXiv Detail & Related papers (2021-04-05T09:55:35Z)
- The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization [34.235007566913396]
Modern deep learning models employ considerably more parameters than required to fit the training data. Whereas conventional statistical wisdom suggests such models should drastically overfit, in practice these models generalize remarkably well.
An emerging paradigm for describing this unexpected behavior is the "double descent" curve.
We provide a precise high-dimensional analysis of generalization with the Neural Tangent Kernel, which characterizes the behavior of wide neural networks with gradient descent.
arXiv Detail & Related papers (2020-07-06T07:20:29Z)
- Learning from Failure: Training Debiased Classifier from Biased Classifier [76.52804102765931]
We show that neural networks learn to rely on spurious correlation only when it is "easier" to learn than the desired knowledge.
We propose a failure-based debiasing scheme by training a pair of neural networks simultaneously.
Our method significantly improves the training of the network against various types of biases in both synthetic and real-world datasets.
arXiv Detail & Related papers (2020-02-26T07:21:54Z)
- Rethinking Bias-Variance Trade-off for Generalization of Neural Networks [40.04927952870877]
We provide a simple explanation for this by measuring the bias and variance of neural networks.
We find that variance unimodality occurs robustly for all models we considered.
Deeper models decrease bias and increase variance for both in-distribution and out-of-distribution data.
arXiv Detail & Related papers (2020-02-26T07:21:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.