Fairness via In-Processing in the Over-parameterized Regime: A
Cautionary Tale
- URL: http://arxiv.org/abs/2206.14853v1
- Date: Wed, 29 Jun 2022 18:40:35 GMT
- Title: Fairness via In-Processing in the Over-parameterized Regime: A
Cautionary Tale
- Authors: Akshaj Kumar Veldanda, Ivan Brugere, Jiahao Chen, Sanghamitra Dutta,
Alan Mishler, Siddharth Garg
- Abstract summary: MinDiff is a fairness-constrained training procedure that aims to achieve Equality of Opportunity.
We show that although MinDiff improves fairness for under-parameterized models, it is likely to be ineffective in the over-parameterized regime.
We suggest using previously proposed regularization techniques, namely L2, early stopping, and flooding, in conjunction with MinDiff to train fair over-parameterized models.
- Score: 15.966815398160742
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The success of DNNs is driven by the counter-intuitive ability of
over-parameterized networks to generalize, even when they perfectly fit the
training data. In practice, test error often continues to decrease with
increasing over-parameterization, referred to as double descent. This allows
practitioners to instantiate large models without having to worry about
over-fitting. Despite its benefits, however, prior work has shown that
over-parameterization can exacerbate bias against minority subgroups. Several
fairness-constrained DNN training methods have been proposed to address this
concern. Here, we critically examine MinDiff, a fairness-constrained training
procedure implemented within TensorFlow's Responsible AI Toolkit that aims to
achieve Equality of Opportunity. We show that although MinDiff improves
fairness for under-parameterized models, it is likely to be ineffective in the
over-parameterized regime. This is because an overfit model with zero training
loss is trivially group-wise fair on training data, creating an "illusion of
fairness" that turns off the MinDiff optimization (this applies to any
disparity-based measure that depends on errors or accuracy, though not to
demographic parity). For a given fairness constraint, under-parameterized
MinDiff models can even achieve lower error than their over-parameterized
counterparts, even though baseline over-parameterized models have lower error
than baseline under-parameterized ones. We further show that MinDiff
optimization is highly sensitive to the choice of batch size in the
under-parameterized regime. Fair model training with MinDiff therefore requires
time-consuming hyper-parameter searches. Finally, we suggest using previously
proposed regularization techniques, namely L2, early stopping, and flooding, in
conjunction with MinDiff to train fair over-parameterized models.
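To make the "illusion of fairness" concrete, below is a minimal sketch of a MinDiff-style objective. It is written in PyTorch purely for illustration (the paper studies the MinDiff implementation in TensorFlow's Responsible AI Toolkit), and the kernel width, penalty weight, and flooding level are assumed values rather than settings from the paper.

```python
import torch

def rbf_mmd2(x, y, sigma=1.0):
    # Squared MMD between two 1-D score samples under an RBF kernel.
    def k(a, b):
        return torch.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * sigma**2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

def mindiff_style_loss(scores, labels, groups, lam=1.5, flood=0.0):
    # scores: sigmoid outputs in [0, 1]; labels: {0, 1}; groups: {0, 1}.
    # Primary loss plus an MMD penalty between the two groups' scores on
    # negatively-labeled examples (an error-based disparity penalty in the
    # spirit of Equality of Opportunity).
    primary = torch.nn.functional.binary_cross_entropy(scores, labels.float())
    if flood > 0:
        # Flooding trick |L - b| + b: the training loss cannot reach zero.
        primary = (primary - flood).abs() + flood
    neg = labels == 0
    s0 = scores[neg & (groups == 0)]
    s1 = scores[neg & (groups == 1)]
    if len(s0) < 2 or len(s1) < 2:
        # Too few examples from one group in the batch: the penalty is
        # undefined -- one source of the batch-size sensitivity noted above.
        return primary
    # If the model interpolates the training data, scores on true negatives
    # are ~0 for BOTH groups, the two score distributions coincide, and the
    # penalty and its gradient vanish: the "illusion of fairness".
    return primary + lam * rbf_mmd2(s0, s1)
```

Flooding replaces the loss L with |L - b| + b, so optimization hovers near loss level b instead of driving the training loss to zero; per-group training errors then stay nonzero and the disparity penalty remains informative.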
Related papers
- LoRTA: Low Rank Tensor Adaptation of Large Language Models [70.32218116940393]
Low Rank Adaptation (LoRA) is a popular Parameter-Efficient Fine-Tuning (PEFT) method that effectively adapts large pre-trained models for downstream tasks.
We propose a novel approach that employs a low rank tensor parametrization for model updates (a sketch of the matrix-rank baseline follows this entry).
Our method is both efficient and effective for fine-tuning large language models, achieving a substantial reduction in the number of parameters while maintaining comparable performance.
arXiv Detail & Related papers (2024-10-05T06:59:50Z)
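For context, a minimal sketch of the standard matrix-rank (LoRA) update that LoRTA's low-rank tensor parametrization generalizes; the rank and scaling values are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    # A frozen pre-trained linear layer plus a trainable low-rank update
    # scale * (B @ A). This is the per-layer matrix (LoRA) case; LoRTA
    # instead factorizes updates across layers as a low-rank tensor,
    # which this sketch does not reproduce.
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # freeze the pre-trained weights
        d_out, d_in = base.weight.shape
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, r))  # zero init: no-op at start
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale
```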
- Just How Flexible are Neural Networks in Practice? [89.80474583606242]
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters.
In practice, however, we only find the solutions reachable by our training procedure, including gradient-based optimization and regularizers, which limits flexibility.
arXiv Detail & Related papers (2024-06-17T12:24:45Z)
- FairTune: Optimizing Parameter Efficient Fine Tuning for Fairness in Medical Image Analysis [15.166588667072888]
Training models with robust group fairness properties is crucial in ethically sensitive application areas such as medical diagnosis.
High-capacity deep learning models can fit all training data nearly perfectly, and thus also exhibit perfect fairness during training.
We propose FairTune, a framework to optimise the choice of PEFT parameters with respect to fairness.
arXiv Detail & Related papers (2023-10-08T07:41:15Z)
- FairAdaBN: Mitigating unfairness with adaptive batch normalization and its application to dermatological disease classification [14.589159162086926]
We propose FairAdaBN, which makes batch normalization adaptive to the sensitive attribute (a simplified sketch follows this entry).
We propose a new metric, named Fairness-Accuracy Trade-off Efficiency (FATE), to compute normalized fairness improvement over accuracy drop.
Experiments on two dermatological datasets show that our proposed method outperforms other methods on fairness criteria and FATE.
arXiv Detail & Related papers (2023-03-15T02:22:07Z)
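A simplified sketch of the adaptive-normalization idea: one set of batch-norm statistics and affine parameters per sensitive-attribute value, selected per sample. Which components FairAdaBN actually shares across groups may differ, so treat this as an assumption rather than the paper's exact layer.

```python
import torch
import torch.nn as nn

class GroupAdaptiveBN2d(nn.Module):
    # One BatchNorm2d (running statistics + affine parameters) per
    # sensitive-attribute value; each sample is normalized by its
    # group's own normalizer.
    def __init__(self, num_features: int, num_groups: int = 2):
        super().__init__()
        self.bns = nn.ModuleList(
            nn.BatchNorm2d(num_features) for _ in range(num_groups))

    def forward(self, x, group):
        # x: (N, C, H, W); group: (N,) integer sensitive-attribute labels
        out = torch.empty_like(x)
        for g, bn in enumerate(self.bns):
            mask = group == g
            if mask.any():
                out[mask] = bn(x[mask])
        return out
```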
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios (a generic reweighting sketch follows this entry).
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
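A generic sketch of DRO with a parametric reweighting adversary, not necessarily the paper's exact construction: an adversary network's logits over the batch define example weights (a parametric likelihood ratio), and a KL term keeps the weights near uniform.

```python
import torch

def dro_adversary_objective(per_example_loss, adv_logits, tau=0.1):
    # per_example_loss: (N,) losses from the model on the batch;
    # adv_logits: (N,) scores from a parametric adversary on the same batch.
    w = torch.softmax(adv_logits, dim=0)            # batch weights, sum to 1
    weighted_risk = (w * per_example_loss).sum()    # adversarially reweighted risk
    kl_to_uniform = (w * (w * len(w)).log()).sum()  # KL(w || uniform)
    # The adversary ASCENDS this objective; the model DESCENDS the
    # weighted risk with the weights held fixed.
    return weighted_risk - tau * kl_to_uniform
```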
- Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the Importance-Guided Stochastic Gradient Descent (IGSGD) method to train models that perform inference directly from inputs containing missing values, without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z)
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood-based model selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
- Memorizing without overfitting: Bias, variance, and interpolation in over-parameterized models [0.0]
The bias-variance trade-off is a central concept in supervised learning.
Modern deep learning methods flout this dogma, interpolating the training data while still achieving state-of-the-art generalization.
arXiv Detail & Related papers (2020-10-26T22:31:04Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of extrapolation variants can be covered in a unified framework that we propose (a minimal extra-gradient sketch follows this entry).
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
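For reference, a minimal sketch of the classical extra-gradient (extrapolation) step that such schemes build on; the step size is illustrative and this is not the paper's exact algorithm.

```python
import torch

def extragradient_step(params, loss_fn, lr=0.1):
    # params: list of tensors with requires_grad=True; loss_fn: closure
    # that recomputes the loss from the current parameter values.
    # 1) gradient at the current iterate
    grads = torch.autograd.grad(loss_fn(), params)
    # 2) move to a look-ahead (extrapolated) point
    with torch.no_grad():
        for p, g in zip(params, grads):
            p -= lr * g
    # 3) gradient at the look-ahead point
    grads_ahead = torch.autograd.grad(loss_fn(), params)
    # 4) undo the look-ahead, then update with the look-ahead gradient
    with torch.no_grad():
        for p, g, ga in zip(params, grads, grads_ahead):
            p += lr * g   # back to the original iterate
            p -= lr * ga  # extra-gradient update
```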