Bias in Pruned Vision Models: In-Depth Analysis and Countermeasures
- URL: http://arxiv.org/abs/2304.12622v1
- Date: Tue, 25 Apr 2023 07:42:06 GMT
- Title: Bias in Pruned Vision Models: In-Depth Analysis and Countermeasures
- Authors: Eugenia Iofinova, Alexandra Peste, Dan Alistarh
- Abstract summary: Pruning, setting a significant subset of the parameters of a neural network to zero, is one of the most popular methods of model compression.
Despite existing evidence that pruning can induce or exacerbate bias, the relationship between neural network pruning and induced bias is not well-understood.
- Score: 93.17009514112702
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pruning - that is, setting a significant subset of the parameters of a neural
network to zero - is one of the most popular methods of model compression. Yet,
several recent works have raised the issue that pruning may induce or
exacerbate bias in the output of the compressed model. Despite existing
evidence for this phenomenon, the relationship between neural network pruning
and induced bias is not well-understood. In this work, we systematically
investigate and characterize this phenomenon in Convolutional Neural Networks
for computer vision. First, we show that it is in fact possible to obtain
highly-sparse models, e.g. with less than 10% of weights remaining, which neither
decrease in accuracy nor substantially increase in bias when compared to dense
models. At the same time, we also find that, at higher sparsities, pruned
models exhibit higher uncertainty in their outputs, as well as increased
correlations, which we directly link to increased bias. We propose easy-to-use
criteria which, based only on the uncompressed model, establish whether bias
will increase with pruning, and identify the samples most susceptible to biased
predictions post-compression.
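The pruning operation described in the abstract, zeroing a subset of a network's weights, can be sketched as one-shot global magnitude pruning. This is a common baseline scheme, shown here for illustration; the paper's actual pruning method and schedule may differ, and the function name and array shapes below are illustrative.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """One-shot global magnitude pruning: zero out the fraction
    `sparsity` of entries with the smallest absolute value,
    pooled across all given weight arrays."""
    flat = np.concatenate([w.ravel() for w in weights])
    k = int(sparsity * flat.size)
    if k == 0:
        return [w.copy() for w in weights]
    # The k-th smallest absolute value becomes the pruning threshold.
    threshold = np.partition(np.abs(flat), k - 1)[k - 1]
    return [np.where(np.abs(w) <= threshold, 0.0, w) for w in weights]

rng = np.random.default_rng(0)
layers = [rng.standard_normal((4, 4)), rng.standard_normal(4)]  # 20 weights total
pruned = magnitude_prune(layers, sparsity=0.9)  # keep 10% of weights

remaining = sum(int(np.count_nonzero(w)) for w in pruned)
total = sum(w.size for w in layers)
print(remaining / total)  # → 0.1
```

Pooling the magnitudes globally (rather than per layer) lets sparsity distribute unevenly across layers, which is why highly-sparse models can retain accuracy: layers with many small weights absorb most of the pruning.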
Related papers
- Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think [53.2706196341054]
We show that the perceived inefficiency was caused by a flaw in the inference pipeline that has so far gone unnoticed.
We perform end-to-end fine-tuning on top of the single-step model with task-specific losses and get a deterministic model that outperforms all other diffusion-based depth and normal estimation models.
arXiv Detail & Related papers (2024-09-17T16:58:52Z)
- Looking at Model Debiasing through the Lens of Anomaly Detection [11.113718994341733]
Deep neural networks are sensitive to bias in the data.
We propose a new bias identification method based on anomaly detection.
We reach state-of-the-art performance on synthetic and real benchmark datasets.
arXiv Detail & Related papers (2024-07-24T17:30:21Z)
- A U-turn on Double Descent: Rethinking Parameter Counting in Statistical Learning [68.76846801719095]
We re-examine when and where double descent occurs, and show that its location is not inherently tied to the interpolation threshold p = n.
This provides a resolution to tensions between double descent and statistical intuition.
arXiv Detail & Related papers (2023-10-29T12:05:39Z)
- Self-supervised debiasing using low rank regularization [59.84695042540525]
Spurious correlations can cause strong biases in deep neural networks, impairing generalization ability.
We propose a self-supervised debiasing framework potentially compatible with unlabeled samples.
Remarkably, the proposed debiasing framework significantly improves the generalization performance of self-supervised learning baselines.
arXiv Detail & Related papers (2022-10-11T08:26:19Z)
- How Much is Enough? A Study on Diffusion Times in Score-based Generative Models [76.76860707897413]
Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution.
We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
arXiv Detail & Related papers (2022-06-10T15:09:46Z)
- How Tempering Fixes Data Augmentation in Bayesian Neural Networks [22.188535244056016]
We show that tempering implicitly reduces the misspecification arising from modeling augmentations as i.i.d. data.
The temperature mimics the role of the effective sample size, reflecting the gain in information provided by the augmentations.
arXiv Detail & Related papers (2022-05-27T11:06:56Z)
- Model Compression for Dynamic Forecast Combination [9.281199058905017]
We show that compressing dynamic forecasting ensembles into an individual model leads to a comparable predictive performance.
We also show that the compressed individual model with best average rank is a rule-based regression model.
arXiv Detail & Related papers (2021-04-05T09:55:35Z)
- The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization [34.235007566913396]
Modern deep learning models employ considerably more parameters than required to fit the training data. Whereas conventional statistical wisdom suggests such models should drastically overfit, in practice these models generalize remarkably well.
An emerging paradigm for describing this unexpected behavior is the "double descent" curve.
We provide a precise high-dimensional analysis of generalization with the Neural Tangent Kernel, which characterizes the behavior of wide neural networks with gradient descent.
arXiv Detail & Related papers (2020-07-06T07:20:29Z)
- Learning from Failure: Training Debiased Classifier from Biased Classifier [76.52804102765931]
We show that neural networks learn to rely on spurious correlation only when it is "easier" to learn than the desired knowledge.
We propose a failure-based debiasing scheme by training a pair of neural networks simultaneously.
Our method significantly improves the training of the network against various types of biases in both synthetic and real-world datasets.
arXiv Detail & Related papers (2020-02-26T07:21:54Z)
- Rethinking Bias-Variance Trade-off for Generalization of Neural Networks [40.04927952870877]
We provide a simple explanation for this by measuring the bias and variance of neural networks.
We find that variance unimodality occurs robustly for all models we considered.
Deeper models decrease bias and increase variance for both in-distribution and out-of-distribution data.
arXiv Detail & Related papers (2020-02-26T07:21:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.