AGRO: Adversarial Discovery of Error-prone groups for Robust
Optimization
- URL: http://arxiv.org/abs/2212.00921v1
- Date: Fri, 2 Dec 2022 00:57:03 GMT
- Title: AGRO: Adversarial Discovery of Error-prone groups for Robust
Optimization
- Authors: Bhargavi Paranjape, Pradeep Dasigi, Vivek Srikumar, Luke Zettlemoyer
and Hannaneh Hajishirzi
- Abstract summary: Group distributionally robust optimization (G-DRO) can minimize the worst-case loss over a set of pre-defined groups of training data.
We propose AGRO -- Adversarial Group discovery for Distributionally Robust Optimization.
AGRO results in 8% higher model performance on average on known worst-groups, compared to prior group discovery approaches.
- Score: 109.91265884632239
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Models trained via empirical risk minimization (ERM) are known to rely on
spurious correlations between labels and task-independent input features,
resulting in poor generalization to distribution shifts. Group
distributionally robust optimization (G-DRO) can alleviate this problem by
minimizing the worst-case loss over a set of pre-defined groups of training
data. G-DRO successfully improves performance on the worst group, where the
correlation does not hold. However, G-DRO assumes that the spurious
correlations and associated worst groups are known in advance, making it
challenging to apply to new tasks with potentially multiple unknown spurious
correlations. We propose AGRO -- Adversarial Group discovery for
Distributionally Robust Optimization -- an end-to-end approach that jointly
identifies error-prone groups and improves accuracy on them. AGRO equips G-DRO
with an adversarial slicing model to find a group assignment for training
examples which maximizes worst-case loss over the discovered groups. On the
WILDS benchmark, AGRO results in 8% higher model performance on average on
known worst-groups, compared to prior group discovery approaches used with
G-DRO. AGRO also improves out-of-distribution performance on SST2, QQP, and
MS-COCO -- datasets where potential spurious correlations are as yet
uncharacterized. Human evaluation of AGRO groups shows that they contain
well-defined, yet previously unstudied spurious correlations that lead to model
errors.
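To make the stated objective concrete, below is a minimal PyTorch sketch of the min-max game the abstract describes: a task model minimizes the worst-case group loss (as in G-DRO), while an adversarial slicing model assigns soft group memberships so as to maximize that loss. The module choices, soft group weighting, and alternating single-step updates are illustrative assumptions, not AGRO's actual architecture or training procedure.

```python
# Sketch of a G-DRO-style min-max with an adversarial group-assignment model.
import torch
import torch.nn as nn

n_features, n_classes, n_groups = 32, 2, 4

model = nn.Linear(n_features, n_classes)   # stand-in task model
slicer = nn.Linear(n_features, n_groups)   # adversarial group-assignment ("slicing") model

opt_model = torch.optim.SGD(model.parameters(), lr=1e-2)
opt_slicer = torch.optim.SGD(slicer.parameters(), lr=1e-2)

def group_losses(x, y):
    """Per-group losses: weight each example's loss by its soft group membership."""
    per_example = nn.functional.cross_entropy(model(x), y, reduction="none")  # (B,)
    probs = slicer(x).softmax(dim=-1)                                         # (B, G) soft memberships
    weights = probs / probs.sum(dim=0, keepdim=True).clamp_min(1e-8)          # normalize within each group
    return (weights * per_example.unsqueeze(-1)).sum(dim=0)                   # (G,)

x = torch.randn(64, n_features)
y = torch.randint(0, n_classes, (64,))

# Robust-model step (the "min"): minimize the worst soft-group loss, as in G-DRO.
opt_model.zero_grad()
group_losses(x, y).max().backward()
opt_model.step()

# Slicer step (the "max"): adjust group assignments to increase the worst-group loss.
opt_slicer.zero_grad()
(-group_losses(x, y).max()).backward()
opt_slicer.step()
```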
Related papers
- Trained Models Tell Us How to Make Them Robust to Spurious Correlation without Group Annotation [3.894771553698554]
Empirical Risk Minimization (ERM) models tend to rely on attributes that have high spurious correlation with the target.
This can degrade performance on underrepresented (or 'minority') groups that lack these attributes.
We propose Environment-based Validation and Loss-based Sampling (EVaLS) to enhance robustness to spurious correlation.
arXiv Detail & Related papers (2024-10-07T08:17:44Z)
- Improving Group Robustness on Spurious Correlation Requires Preciser Group Inference [15.874604623294427]
Standard empirical risk minimization (ERM) models may prioritize learning correlations between spurious features and true labels, leading to poor accuracy on groups where these correlations do not hold.
We propose GIC, a novel method that accurately infers group labels, resulting in improved worst-group performance.
arXiv Detail & Related papers (2024-04-22T01:28:35Z)
- Modeling the Q-Diversity in a Min-max Play Game for Robust Optimization [61.39201891894024]
Group distributionally robust optimization (group DRO) can minimize the worst-case loss over pre-defined groups.
We reformulate the group DRO framework by proposing Q-Diversity.
Characterized by an interactive training mode, Q-Diversity relaxes the group identification from annotation into direct parameterization.
arXiv Detail & Related papers (2023-05-20T07:02:27Z)
- Take One Gram of Neural Features, Get Enhanced Group Robustness [23.541213868620837]
Predictive performance of machine learning models trained with empirical risk minimization can degrade considerably under distribution shifts.
We propose to partition the training dataset into groups based on Gram matrices of features extracted by an "identification" model (a minimal sketch follows this list).
Our approach not only improves group robustness over ERM but also outperforms all recent baselines.
arXiv Detail & Related papers (2022-08-26T12:34:55Z)
- Correct-N-Contrast: A Contrastive Approach for Improving Robustness to Spurious Correlations [59.24031936150582]
Spurious correlations pose a major challenge for robust machine learning.
Models trained with empirical risk minimization (ERM) may learn to rely on correlations between class labels and spurious attributes.
We propose Correct-N-Contrast (CNC), a contrastive approach to directly learn representations robust to spurious correlations.
arXiv Detail & Related papers (2022-03-03T05:03:28Z)
- Focus on the Common Good: Group Distributional Robustness Follows [47.62596240492509]
This paper proposes a new and simple algorithm that explicitly encourages learning of features that are shared across various groups.
While Group-DRO focuses on the groups with the worst regularized loss, focusing instead on groups that enable better performance even on other groups could lead to learning of shared/common features.
arXiv Detail & Related papers (2021-10-06T09:47:41Z)
- Just Train Twice: Improving Group Robustness without Training Group Information [101.84574184298006]
Standard training via empirical risk minimization can produce models that achieve high accuracy on average but low accuracy on certain groups.
Prior approaches that achieve high worst-group accuracy, like group distributionally robust optimization (group DRO), require expensive group annotations for each training point.
We propose a simple two-stage approach, JTT, that first trains a standard ERM model for several epochs and then trains a second model that upweights the training examples the first model misclassified (sketched after this list).
arXiv Detail & Related papers (2021-07-19T17:52:32Z)
- Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias in the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z)
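As referenced in the Gram-matrix entry above, here is a minimal sketch of that grouping idea: compute a Gram matrix of each training example's features under an auxiliary "identification" model, then cluster those matrices into pseudo-groups. The toy convolutional extractor, flattening, and k-means clustering are assumptions for illustration, not the paper's exact pipeline.

```python
# Sketch: partition training examples by clustering per-example feature Gram matrices.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

torch.manual_seed(0)
images = torch.randn(128, 3, 16, 16)                  # toy image batch

id_model = nn.Conv2d(3, 8, kernel_size=3, padding=1)  # stand-in identification model

with torch.no_grad():
    fmap = id_model(images)                           # (N, C, H, W) feature maps
    flat = fmap.flatten(start_dim=2)                  # (N, C, H*W)
    gram = flat @ flat.transpose(1, 2)                # (N, C, C) per-example Gram matrices
    gram = gram / flat.shape[-1]

# Cluster flattened Gram matrices into pseudo-groups for robust training.
groups = KMeans(n_clusters=4, n_init=10).fit_predict(gram.flatten(1).numpy())
print(groups[:10])                                    # pseudo-group id per training example
```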
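The JTT entry describes a concrete two-stage recipe, sketched below on toy data: train an identification model with plain ERM for a few epochs, collect the training examples it misclassifies, then retrain with those examples upweighted. The toy data, linear models, and upweighting factor are placeholders, not the paper's configuration.

```python
# Sketch of the two-stage JTT recipe: ERM identification pass, then upweighted retraining.
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(256, 16)                        # toy features
y = torch.randint(0, 2, (256,))                 # toy labels

def train(example_weights, epochs):
    """Weighted ERM on the toy data; returns the trained model."""
    model = nn.Linear(16, 2)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(epochs):
        losses = nn.functional.cross_entropy(model(x), y, reduction="none")
        (example_weights * losses).mean().backward()
        opt.step()
        opt.zero_grad()
    return model

# Stage 1: identification model trained with standard ERM for a few epochs.
id_model = train(torch.ones(len(y)), epochs=5)

# Stage 2: upweight the examples the identification model misclassified, then retrain.
with torch.no_grad():
    errors = id_model(x).argmax(dim=-1) != y
lambda_up = 10.0                                # placeholder upweighting factor
weights = torch.ones(len(y))
weights[errors] = lambda_up
final_model = train(weights, epochs=50)
```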