To Pool or Not To Pool: Analyzing the Regularizing Effects of Group-Fair Training on Shared Models
- URL: http://arxiv.org/abs/2402.18803v1
- Date: Thu, 29 Feb 2024 02:16:57 GMT
- Title: To Pool or Not To Pool: Analyzing the Regularizing Effects of Group-Fair Training on Shared Models
- Authors: Cyrus Cousins, I. Elizabeth Kumar, Suresh Venkatasubramanian
- Abstract summary: We derive group-specific bounds on the generalization error of welfare-centric fair machine learning.
We do this by considering group-specific Rademacher averages over a restricted hypothesis class.
Our simulations demonstrate these bounds improve over a naive method, as expected by theory, with particularly significant improvement for smaller group sizes.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In fair machine learning, one source of performance disparities between
groups is over-fitting to groups with relatively few training samples. We
derive group-specific bounds on the generalization error of welfare-centric
fair machine learning that benefit from the larger sample size of the majority
group. We do this by considering group-specific Rademacher averages over a
restricted hypothesis class, which contains the family of models likely to
perform well with respect to a fair learning objective (e.g., a power-mean).
Our simulations demonstrate these bounds improve over a naive method, as
expected by theory, with particularly significant improvement for smaller group
sizes.
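To make these ingredients concrete, here is a minimal sketch (not the authors' code) of a power-mean objective over per-group losses and a Monte Carlo estimate of a group-specific empirical Rademacher average, assuming for illustration that the restricted hypothesis class is a finite set of candidate models; the function names and parameters are hypothetical.
```python
import numpy as np

def power_mean(group_losses, p):
    """p-power mean of per-group losses: p = 1 recovers the plain average,
    and larger p places more weight on the worst-off group."""
    losses = np.asarray(group_losses, dtype=float)
    return float(np.mean(losses ** p) ** (1.0 / p))

def empirical_rademacher(preds, n_draws=1000, seed=0):
    """Monte Carlo estimate of the empirical Rademacher average of a finite
    hypothesis class on one group's sample.

    preds: array of shape (n_hypotheses, n_samples), h(x_i) for every
    hypothesis h in the restricted class and sample x_i in the group.
    """
    rng = np.random.default_rng(seed)
    n = preds.shape[1]
    sups = []
    for _ in range(n_draws):
        sigma = rng.choice([-1.0, 1.0], size=n)   # Rademacher signs
        sups.append(np.max(preds @ sigma) / n)    # sup over the class
    return float(np.mean(sups))

# Toy usage: 50 candidate models scored on a minority group of 20 samples.
preds = np.random.default_rng(1).uniform(-1.0, 1.0, size=(50, 20))
print(power_mean([0.10, 0.35], p=2.0))   # welfare-style aggregate of group losses
print(empirical_rademacher(preds))       # smaller restricted class -> tighter group bound
```
The intuition matches the abstract: restricting the class to models that do well on the fair objective shrinks the supremum inside the estimate, which tightens the bound most for small groups.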
Related papers
- The Group Robustness is in the Details: Revisiting Finetuning under Spurious Correlations [8.844894807922902]
In this paper, we identify surprising and nuanced behavior of finetuned models on worst-group accuracy.
We first show that the commonly used class-balancing techniques of mini-batch upsampling and loss upweighting can induce a decrease in worst-group accuracy.
Next, we show that scaling pretrained models is generally beneficial for worst-group accuracy, but only in conjunction with appropriate class-balancing.
arXiv Detail & Related papers (2024-07-19T00:34:03Z)
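For reference, a hedged PyTorch sketch of the two class-balancing baselines named above, mini-batch upsampling and loss upweighting; the toy labels and the inverse-frequency weighting are illustrative assumptions, not the paper's setup.
```python
import torch
from torch.utils.data import WeightedRandomSampler

labels = torch.tensor([0] * 90 + [1] * 10)        # imbalanced toy labels
class_counts = torch.bincount(labels).float()

# (a) Mini-batch upsampling: draw minority-class examples more often.
sample_weights = 1.0 / class_counts[labels]
sampler = WeightedRandomSampler(sample_weights, num_samples=len(labels))
# loader = DataLoader(dataset, batch_size=32, sampler=sampler)

# (b) Loss upweighting: scale each class's loss by its inverse frequency.
class_weights = class_counts.sum() / (len(class_counts) * class_counts)
criterion = torch.nn.CrossEntropyLoss(weight=class_weights)
```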
- How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance [64.1656365676171]
Group imbalance has been a known problem in empirical risk minimization.
This paper quantifies the impact of individual groups on the sample complexity, the convergence rate, and the average and group-level testing performance.
arXiv Detail & Related papers (2024-03-12T04:38:05Z)
- Bias Amplification Enhances Minority Group Performance [10.380812738348899]
We propose BAM, a novel two-stage training algorithm.
In the first stage, the model is trained using a bias amplification scheme that introduces a learnable auxiliary variable for each training sample.
In the second stage, we upweight the samples that the bias-amplified model misclassifies, and then continue training the same model on the reweighted dataset.
arXiv Detail & Related papers (2023-09-13T04:40:08Z)
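A hedged sketch of the two-stage BAM scheme as summarized above; the additive form of the auxiliary logits, the coefficient `lam`, and the upweighting factor are assumptions for illustration, not the paper's implementation.
```python
import torch
import torch.nn.functional as F

n_samples, n_classes = 1000, 2
aux = torch.nn.Parameter(torch.zeros(n_samples, n_classes))  # one logit vector per sample

def stage1_loss(logits, targets, idx, lam=0.5):
    # Bias amplification: adding a trainable per-sample logit lets easy,
    # bias-aligned examples be fit through `aux` rather than the shared model.
    return F.cross_entropy(logits + lam * aux[idx], targets)

def stage2_weights(preds, targets, upweight=20.0):
    # Continue training the same model with stage-1 mistakes upweighted.
    wrong = (preds != targets).float()
    return 1.0 + (upweight - 1.0) * wrong
```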
- On The Impact of Machine Learning Randomness on Group Fairness [11.747264308336012]
We investigate how different sources of randomness in training neural networks affect group fairness.
We show that the variance in group fairness measures is rooted in the high volatility of the learning process on under-represented groups.
We show how one can control group-level accuracy, with high efficiency and negligible impact on the model's overall performance, by simply changing the data order for a single epoch.
arXiv Detail & Related papers (2023-07-09T09:36:31Z)
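A minimal sketch of the single-epoch data-order intervention described above, assuming a PyTorch `DataLoader`; which order to choose for the special epoch is the paper's contribution and is not modeled here.
```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(100, 4), torch.randint(0, 2, (100,)))
for epoch in range(5):
    g = torch.Generator()
    # Give one chosen epoch (here epoch 3) a different shuffling seed,
    # i.e. a different single-epoch data order.
    g.manual_seed(12345 if epoch == 3 else epoch)
    loader = DataLoader(dataset, batch_size=16, shuffle=True, generator=g)
    for xb, yb in loader:
        pass  # one training step per batch would go here
```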
- Learning Informative Representation for Fairness-aware Multivariate Time-series Forecasting: A Group-based Perspective [50.093280002375984]
Performance unfairness among variables widely exists in multivariate time series (MTS) forecasting models.
We propose a novel framework, named FairFor, for fairness-aware MTS forecasting.
arXiv Detail & Related papers (2023-01-27T04:54:12Z)
- Outlier-Robust Group Inference via Gradient Space Clustering [50.87474101594732]
Existing methods can improve the worst-group performance, but they require group annotations, which are often expensive and sometimes infeasible to obtain.
We address the problem of learning group annotations in the presence of outliers by clustering the data in the space of gradients of the model parameters.
We show that data in the gradient space has a simpler structure while preserving information about minority groups and outliers, making it suitable for standard clustering methods like DBSCAN.
arXiv Detail & Related papers (2022-10-13T06:04:43Z)
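A short sketch of group inference in gradient space, following the description above: per-sample last-layer gradients are clustered with DBSCAN, whose `-1` label naturally flags outliers. The tiny linear model and the `eps`/`min_samples` values are placeholder assumptions.
```python
import numpy as np
import torch
from sklearn.cluster import DBSCAN

model = torch.nn.Linear(10, 2)                 # stand-in for the real model
loss_fn = torch.nn.CrossEntropyLoss()
X, y = torch.randn(200, 10), torch.randint(0, 2, (200,))

grads = []
for xi, yi in zip(X, y):
    model.zero_grad()
    loss_fn(model(xi.unsqueeze(0)), yi.unsqueeze(0)).backward()
    grads.append(model.weight.grad.flatten().clone().numpy())

# DBSCAN in gradient space: cluster labels become pseudo group annotations,
# and label -1 marks outliers by construction.
groups = DBSCAN(eps=0.5, min_samples=5).fit_predict(np.array(grads))
```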
- Fair Group-Shared Representations with Normalizing Flows [68.29997072804537]
We develop a fair representation learning algorithm that maps individuals belonging to different groups into a single group.
We show experimentally that our methodology is competitive with other fair representation learning algorithms.
arXiv Detail & Related papers (2022-01-17T10:49:49Z)
- Unified Group Fairness on Federated Learning [22.143427873780404]
Federated learning (FL) has emerged as an important machine learning paradigm where a global model is trained based on private data from distributed clients.
Recent research focuses on achieving fairness among clients, but ignores fairness towards the different groups formed by sensitive attributes (e.g., gender and/or race).
We propose a novel FL algorithm, named Group Distributionally Robust Federated Averaging (G-DRFA), which mitigates the distribution shift across groups, with a theoretical analysis of its convergence rate.
arXiv Detail & Related papers (2021-11-09T08:21:38Z)
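As rough intuition for the group-distributionally-robust component, here is a generic multiplicative-weights update over group losses; this is a standard group-DRO step, offered as an assumption about the flavor of G-DRFA rather than its actual update rule.
```python
import numpy as np

def dro_group_weights(weights, group_losses, step=0.1):
    """One multiplicative-weights step: shift mass toward the worst-off group."""
    w = weights * np.exp(step * np.asarray(group_losses, dtype=float))
    return w / w.sum()

w = np.ones(3) / 3                          # three sensitive groups
w = dro_group_weights(w, [0.2, 0.9, 0.4])   # group 2 gains weight
print(w)
```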
- Federating for Learning Group Fair Models [19.99325961328706]
Federated learning is an increasingly popular paradigm that enables a large number of entities to collaboratively learn better models.
We study minmax group fairness in paradigms where different participating entities may only have access to a subset of the population groups during the training phase.
arXiv Detail & Related papers (2021-10-05T12:42:43Z)
- Just Train Twice: Improving Group Robustness without Training Group Information [101.84574184298006]
Standard training via empirical risk minimization can produce models that achieve high accuracy on average but low accuracy on certain groups.
Prior approaches that achieve high worst-group accuracy, like group distributionally robust optimization (group DRO), require expensive group annotations for each training point.
We propose a simple two-stage approach, JTT, that first trains a standard ERM model for several epochs, and then trains a second model that upweights the training examples that the first model misclassified.
arXiv Detail & Related papers (2021-07-19T17:52:32Z)
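Because the two-stage JTT recipe is fully specified at this level of detail, a compact sketch is easy to give; the epoch counts, the upweighting factor `lam`, and the toy model are placeholder hyperparameters, not the paper's values.
```python
import torch

def jtt(model_fn, X, y, id_epochs=1, final_epochs=5, lam=50.0, lr=1e-2):
    # Stage 1: short ERM run to identify hard (likely minority-group) examples.
    m1 = model_fn()
    opt = torch.optim.SGD(m1.parameters(), lr=lr)
    for _ in range(id_epochs):
        opt.zero_grad()
        torch.nn.functional.cross_entropy(m1(X), y).backward()
        opt.step()
    error_set = (m1(X).argmax(1) != y).float()

    # Stage 2: train a second model with stage-1 errors upweighted by lam.
    weights = 1.0 + (lam - 1.0) * error_set
    m2 = model_fn()
    opt = torch.optim.SGD(m2.parameters(), lr=lr)
    for _ in range(final_epochs):
        opt.zero_grad()
        losses = torch.nn.functional.cross_entropy(m2(X), y, reduction="none")
        (weights * losses).mean().backward()
        opt.step()
    return m2

# Toy usage:
# model = jtt(lambda: torch.nn.Linear(10, 2), torch.randn(64, 10),
#             torch.randint(0, 2, (64,)))
```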
- Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias in the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z)