Group-robust Sample Reweighting for Subpopulation Shifts via Influence Functions
- URL: http://arxiv.org/abs/2503.07315v1
- Date: Mon, 10 Mar 2025 13:34:18 GMT
- Title: Group-robust Sample Reweighting for Subpopulation Shifts via Influence Functions
- Authors: Rui Qiao, Zhaoxuan Wu, Jingtan Wang, Pang Wei Koh, Bryan Kian Hsiang Low,
- Abstract summary: We introduce Group-robust Sample Reweighting (GSR), a two-stage approach that first learns the representations from group-unlabeled data.<n>GSR is theoretically sound, practically lightweight, and effective in improving the robustness to subpopulation shifts.
- Score: 37.0753553356624
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning models often have uneven performance among subpopulations (a.k.a., groups) in the data distributions. This poses a significant challenge for the models to generalize when the proportions of the groups shift during deployment. To improve robustness to such shifts, existing approaches have developed strategies that train models or perform hyperparameter tuning using the group-labeled data to minimize the worst-case loss over groups. However, a non-trivial amount of high-quality labels is often required to obtain noticeable improvements. Given the costliness of the labels, we propose to adopt a different paradigm to enhance group label efficiency: utilizing the group-labeled data as a target set to optimize the weights of other group-unlabeled data. We introduce Group-robust Sample Reweighting (GSR), a two-stage approach that first learns the representations from group-unlabeled data, and then tinkers the model by iteratively retraining its last layer on the reweighted data using influence functions. Our GSR is theoretically sound, practically lightweight, and effective in improving the robustness to subpopulation shifts. In particular, GSR outperforms the previous state-of-the-art approaches that require the same amount or even more group labels.
Related papers
- Project-Probe-Aggregate: Efficient Fine-Tuning for Group Robustness [53.96714099151378]
We propose a three-step approach for parameter-efficient fine-tuning of image-text foundation models.
Our method improves its two key components: minority samples identification and the robust training algorithm.
Our theoretical analysis shows that our PPA enhances minority group identification and is Bayes optimal for minimizing the balanced group error.
arXiv Detail & Related papers (2025-03-12T15:46:12Z) - Trained Models Tell Us How to Make Them Robust to Spurious Correlation without Group Annotation [3.894771553698554]
Empirical Risk Minimization (ERM) models tend to rely on attributes that have high spurious correlation with the target.
This can degrade the performance on underrepresented (or'minority') groups that lack these attributes.
We propose Environment-based Validation and Loss-based Sampling (EVaLS) to enhance robustness to spurious correlation.
arXiv Detail & Related papers (2024-10-07T08:17:44Z) - Mind the GAP: Improving Robustness to Subpopulation Shifts with Group-Aware Priors [46.03963664373476]
We develop a family of group-aware prior (GAP) distributions over neural network parameters that explicitly favor models that generalize well under subpopulation shifts.
We demonstrate that training withGAP yields state-of-the-art performance -- even when only retraining the final layer of a previously trained non-robust model.
arXiv Detail & Related papers (2024-03-14T21:00:26Z) - Modeling the Q-Diversity in a Min-max Play Game for Robust Optimization [61.39201891894024]
Group distributionally robust optimization (group DRO) can minimize the worst-case loss over pre-defined groups.
We reformulate the group DRO framework by proposing Q-Diversity.
Characterized by an interactive training mode, Q-Diversity relaxes the group identification from annotation into direct parameterization.
arXiv Detail & Related papers (2023-05-20T07:02:27Z) - Ranking & Reweighting Improves Group Distributional Robustness [14.021069321266516]
We propose a ranking-based training method called Discounted Rank Upweighting (DRU) to learn models that exhibit strong OOD performance on the test data.
Results on several synthetic and real-world datasets highlight the superior ability of our group-ranking-based (akin to soft-minimax) approach in selecting and learning models that are robust to group distributional shifts.
arXiv Detail & Related papers (2023-05-09T20:37:16Z) - AGRO: Adversarial Discovery of Error-prone groups for Robust
Optimization [109.91265884632239]
Group distributionally robust optimization (G-DRO) can minimize the worst-case loss over a set of pre-defined groups over training data.
We propose AGRO -- Adversarial Group discovery for Distributionally Robust Optimization.
AGRO results in 8% higher model performance on average on known worst-groups, compared to prior group discovery approaches.
arXiv Detail & Related papers (2022-12-02T00:57:03Z) - Take One Gram of Neural Features, Get Enhanced Group Robustness [23.541213868620837]
Predictive performance of machine learning models trained with empirical risk minimization can degrade considerably under distribution shifts.
We propose to partition the training dataset into groups based on Gram matrices of features extracted by an identification'' model.
Our approach not only improves group robustness over ERM but also outperforms all recent baselines.
arXiv Detail & Related papers (2022-08-26T12:34:55Z) - Improved Group Robustness via Classifier Retraining on Independent
Splits [6.930560177764658]
Group distributionally robust optimization is a widely used baseline for learning models with strong worst-group performance.
This paper designs a simple method based on the idea of retraining on independent splits of the training data.
We find that using a novel sample-splitting procedure achieves robust worst-group performance in the fine-tuning step.
arXiv Detail & Related papers (2022-04-20T16:22:27Z) - Towards Group Robustness in the presence of Partial Group Labels [61.33713547766866]
spurious correlations between input samples and the target labels wrongly direct the neural network predictions.
We propose an algorithm that optimize for the worst-off group assignments from a constraint set.
We show improvements in the minority group's performance while preserving overall aggregate accuracy across groups.
arXiv Detail & Related papers (2022-01-10T22:04:48Z) - Just Train Twice: Improving Group Robustness without Training Group
Information [101.84574184298006]
Standard training via empirical risk minimization can produce models that achieve high accuracy on average but low accuracy on certain groups.
Prior approaches that achieve high worst-group accuracy, like group distributionally robust optimization (group DRO) require expensive group annotations for each training point.
We propose a simple two-stage approach, JTT, that first trains a standard ERM model for several epochs, and then trains a second model that upweights the training examples that the first model misclassified.
arXiv Detail & Related papers (2021-07-19T17:52:32Z) - Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias of the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z) - Overcoming Data Sparsity in Group Recommendation [52.00998276970403]
Group recommender systems should be able to accurately learn not only users' personal preferences but also preference aggregation strategy.
In this paper, we take Bipartite Graphding Model (BGEM), the self-attention mechanism and Graph Convolutional Networks (GCNs) as basic building blocks to learn group and user representations in a unified way.
arXiv Detail & Related papers (2020-10-02T07:11:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.