FairDropout: Using Example-Tied Dropout to Enhance Generalization of Minority Groups
- URL: http://arxiv.org/abs/2502.06695v1
- Date: Mon, 10 Feb 2025 17:18:54 GMT
- Title: FairDropout: Using Example-Tied Dropout to Enhance Generalization of Minority Groups
- Authors: Geraldin Nanfack, Eugene Belilovsky
- Abstract summary: We show that models trained with empirical risk minimization tend to generalize well for examples from the majority groups while memorizing instances from minority groups.
We apply example-tied dropout as a method we term FairDropout, aimed at redirecting this memorization to specific neurons that we subsequently drop out during inference.
We empirically evaluate FairDropout using the subpopulation benchmark suite encompassing vision, language, and healthcare tasks, demonstrating that it significantly reduces reliance on spurious correlations, and outperforms state-of-the-art methods.
- Abstract: Deep learning models frequently exploit spurious features in training data to achieve low training error, often resulting in poor generalization when faced with shifted testing distributions. To address this issue, various methods from imbalanced learning, representation learning, and classifier recalibration have been proposed to enhance the robustness of deep neural networks against spurious correlations. In this paper, we observe that models trained with empirical risk minimization tend to generalize well for examples from the majority groups while memorizing instances from minority groups. Building on recent findings that show memorization can be localized to a limited number of neurons, we apply example-tied dropout as a method we term FairDropout, aimed at redirecting this memorization to specific neurons that we subsequently drop out during inference. We empirically evaluate FairDropout using the subpopulation benchmark suite encompassing vision, language, and healthcare tasks, demonstrating that it significantly reduces reliance on spurious correlations, and outperforms state-of-the-art methods.
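The abstract describes the mechanism only at a high level, so the following is a minimal sketch of example-tied dropout rather than the authors' implementation. It assumes a layer's units are split into shared units that stay on for every example and a remaining pool in which each training example always gets the same random subset, and that the whole pool is dropped at inference. The names `example_tied_mask`, `inference_mask`, `p_gen`, and `p_mem` are hypothetical, as are their default values.

```python
import numpy as np


def example_tied_mask(example_ids, num_units, p_gen=0.4, p_mem=0.2, seed=0):
    """Build a (batch, num_units) 0/1 mask that is fixed per example id.

    p_gen: assumed fraction of units shared by all examples (always on).
    p_mem: assumed probability that a non-shared unit is on for an example.
    """
    num_gen = int(round(p_gen * num_units))
    masks = np.zeros((len(example_ids), num_units), dtype=np.float32)
    masks[:, :num_gen] = 1.0  # shared units are never dropped
    for row, ex_id in enumerate(example_ids):
        # Seeding with the example id ties the mask to the example,
        # so the same units are active for it in every epoch.
        rng = np.random.default_rng([seed, int(ex_id)])
        masks[row, num_gen:] = rng.random(num_units - num_gen) < p_mem
    return masks


def inference_mask(num_units, p_gen=0.4):
    """Keep only the shared units; the per-example pool is dropped."""
    num_gen = int(round(p_gen * num_units))
    mask = np.zeros(num_units, dtype=np.float32)
    mask[:num_gen] = 1.0
    return mask


# Training-time use: multiply a batch of activations by the tied masks.
activations = np.ones((2, 10), dtype=np.float32)
train_out = activations * example_tied_mask([3, 7], num_units=10)
# Inference-time use: the same activations with the pool switched off.
test_out = activations * inference_mask(num_units=10)
```

Deriving the mask from the example id instead of storing it keeps the sketch stateless; a real training loop would need the dataset to return a stable index with each example.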
Related papers
- The Silent Majority: Demystifying Memorization Effect in the Presence of Spurious Correlations [19.824897288786303]
This paper systematically shows the ubiquitous existence of spurious features in a small set of neurons within the network.
We find that a small subset of neurons or channels memorizes minority group information.
To substantiate this hypothesis, we show that eliminating these unnecessary spurious memorization patterns via a novel framework during training can significantly affect the model performance on minority groups.
arXiv Detail & Related papers (2025-01-01T21:45:00Z) - Alpha and Prejudice: Improving $α$-sized Worst-case Fairness via Intrinsic Reweighting [34.954141077528334]
Worst-case fairness with off-the-shelf demographic groups achieves parity by maximizing the model utility of the worst-off group.
Recent advances have reframed this learning problem by introducing the lower bound of minimal partition ratio.
arXiv Detail & Related papers (2024-11-05T13:04:05Z) - Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation [53.27596811146316]
Diffusion models operate over a sequence of timesteps instead of instantaneous input-output relationships in previous contexts.
We present Diffusion-TracIn, which incorporates these temporal dynamics, and observe that samples' loss gradient norms are highly dependent on timestep.
We introduce Diffusion-ReTrac as a re-normalized adaptation that enables the retrieval of training samples more targeted to the test sample of interest.
arXiv Detail & Related papers (2024-01-17T07:58:18Z) - Adversarial Reweighting Guided by Wasserstein Distance for Bias Mitigation [24.160692009892088]
Under-representation of minorities in the data makes the disparate treatment of subpopulations difficult to deal with during learning.
We propose a novel adversarial reweighting method to address such representation bias.
arXiv Detail & Related papers (2023-11-21T15:46:11Z) - Social NCE: Contrastive Learning of Socially-aware Motion Representations [87.82126838588279]
Experimental results show that the proposed method dramatically reduces the collision rates of recent trajectory forecasting, behavioral cloning and reinforcement learning algorithms.
Our method makes few assumptions about neural architecture designs, and hence can be used as a generic way to promote the robustness of neural motion models.
arXiv Detail & Related papers (2020-12-21T22:25:06Z) - Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z) - Automatic Recall Machines: Internal Replay, Continual Learning and the Brain [104.38824285741248]
Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity.
We present a method where these auxiliary samples are generated on the fly, given only the model that is being trained for the assessed objective.
Instead, the implicit memory of learned samples within the assessed model itself is exploited.
arXiv Detail & Related papers (2020-06-22T15:07:06Z) - Learning Diverse Representations for Fast Adaptation to Distribution Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z) - An Investigation of Why Overparameterization Exacerbates Spurious Correlations [98.3066727301239]
We identify two key properties of the training data that drive this behavior.
We show how the inductive bias of models towards "memorizing" fewer examples can cause overparameterization to hurt.
arXiv Detail & Related papers (2020-05-09T01:59:13Z)