Mitigating Spurious Correlation via Distributionally Robust Learning with Hierarchical Ambiguity Sets
- URL: http://arxiv.org/abs/2510.02818v1
- Date: Fri, 03 Oct 2025 08:50:44 GMT
- Title: Mitigating Spurious Correlation via Distributionally Robust Learning with Hierarchical Ambiguity Sets
- Authors: Sung Ho Jo, Seonghwi Kim, Minwoo Chae
- Abstract summary: We propose a hierarchical extension of Group DRO that addresses both inter-group and intra-group uncertainties. We also introduce new benchmark settings that simulate realistic minority group distribution shifts. These results highlight the importance of broadening the ambiguity set to better capture both inter-group and intra-group distributional uncertainties.
- Score: 5.630530373119448
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conventional supervised learning methods are often vulnerable to spurious correlations, particularly under distribution shifts in test data. To address this issue, several approaches, most notably Group DRO, have been developed. While these methods are highly robust to subpopulation or group shifts, they remain vulnerable to intra-group distributional shifts, which frequently occur in minority groups with limited samples. We propose a hierarchical extension of Group DRO that addresses both inter-group and intra-group uncertainties, providing robustness to distribution shifts at multiple levels. We also introduce new benchmark settings that simulate realistic minority group distribution shifts, an important yet previously underexplored challenge in spurious correlation research. Our method demonstrates strong robustness under these conditions, where existing robust learning methods consistently fail, while also achieving superior performance on standard benchmarks. These results highlight the importance of broadening the ambiguity set to better capture both inter-group and intra-group distributional uncertainties.
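The two-level idea in the abstract can be illustrated with a minimal sketch (an assumption for illustration, not the paper's actual construction): standard Group DRO takes the worst average loss over groups, and one simple way to add intra-group robustness is to replace each group's mean loss with a CVaR over its worst samples. The function names, the CVaR form, and the `alpha` parameter are all illustrative.

```python
import numpy as np

def group_dro_loss(losses, groups):
    """Standard Group DRO objective: the worst average loss over groups."""
    group_ids = np.unique(groups)
    return max(losses[groups == g].mean() for g in group_ids)

def hierarchical_dro_loss(losses, groups, alpha=0.5):
    """Illustrative two-level objective: within each group, average the
    worst alpha-fraction of sample losses (a CVaR, giving intra-group
    robustness), then take the worst group (inter-group robustness).
    alpha and the CVaR choice are assumptions, not the paper's method."""
    group_ids = np.unique(groups)
    cvars = []
    for g in group_ids:
        g_losses = np.sort(losses[groups == g])[::-1]  # descending
        k = max(1, int(np.ceil(alpha * len(g_losses))))  # worst alpha-fraction
        cvars.append(g_losses[:k].mean())
    return max(cvars)
```

Note how the hierarchical objective upper-bounds the standard one: focusing on a group's worst samples makes the intra-group loss at least as large as the group mean.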
Related papers
- Group Contrastive Learning for Weakly Paired Multimodal Data [34.76498775412033]
GROOVE is a semi-supervised multi-modal representation learning approach for high-content perturbation data. GroupCLIP is a novel group-level contrastive loss that bridges the gap between CLIP for paired cross-modal data and SupCon for uni-modal supervised contrastive learning.
arXiv Detail & Related papers (2026-02-03T21:11:06Z) - Mitigating Clever Hans Strategies in Image Classifiers through Generating Counterexamples [15.618934546058277]
Group distributional robustness methods rely on explicit group labels to upweight underrepresented groups. We propose Counterfactual Knowledge Distillation (CFKD), a framework that generates diverse counterfactuals. We demonstrate CFKD's efficacy across five datasets, spanning synthetic tasks to an industrial application.
arXiv Detail & Related papers (2025-10-20T13:22:57Z) - Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers.
We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes.
We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
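The mixing step summarized above can be sketched as a convex combination of one minority and one majority sample (mixup-style). The function name and the weight `lam` are illustrative assumptions, not details from the paper:

```python
import numpy as np

def mix_minority_majority(x_min, x_maj, lam=0.8):
    """Convex combination of a minority and a majority sample.
    lam close to 1 keeps the synthetic point near the minority class;
    both the helper name and lam are hypothetical, for illustration."""
    x_min, x_maj = np.asarray(x_min, float), np.asarray(x_maj, float)
    return lam * x_min + (1.0 - lam) * x_maj
```

Generating such points iteratively populates the sparse region between the minority cluster and the decision boundary, which is the intuition the abstract describes.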
arXiv Detail & Related papers (2023-08-28T18:48:34Z) - Modeling the Q-Diversity in a Min-max Play Game for Robust Optimization [61.39201891894024]
Group distributionally robust optimization (group DRO) can minimize the worst-case loss over pre-defined groups.
We reformulate the group DRO framework by proposing Q-Diversity.
Characterized by an interactive training mode, Q-Diversity relaxes the group identification from annotation into direct parameterization.
arXiv Detail & Related papers (2023-05-20T07:02:27Z) - Group conditional validity via multi-group learning [5.797821810358083]
We consider the problem of distribution-free conformal prediction and the criterion of group conditional validity.
Existing methods achieve such guarantees under either restrictive grouping structure or distributional assumptions.
We propose a simple reduction to the problem of achieving validity guarantees for individual populations by leveraging algorithms for a problem called multi-group learning.
arXiv Detail & Related papers (2023-03-07T15:51:03Z) - AGRO: Adversarial Discovery of Error-prone groups for Robust Optimization [109.91265884632239]
Group distributionally robust optimization (G-DRO) can minimize the worst-case loss over a set of pre-defined groups over training data.
We propose AGRO -- Adversarial Group discovery for Distributionally Robust Optimization.
AGRO results in 8% higher model performance on average on known worst-groups, compared to prior group discovery approaches.
arXiv Detail & Related papers (2022-12-02T00:57:03Z) - Outlier-Robust Group Inference via Gradient Space Clustering [50.87474101594732]
Existing methods can improve the worst-group performance, but they require group annotations, which are often expensive and sometimes infeasible to obtain.
We address the problem of learning group annotations in the presence of outliers by clustering the data in the space of gradients of the model parameters.
We show that data in the gradient space has a simpler structure while preserving information about minority groups and outliers, making it suitable for standard clustering methods like DBSCAN.
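The gradient-space idea above can be sketched for a logistic model: each sample's gradient with respect to the weights is a vector, and samples whose gradients sit far from the bulk are candidate minority-group members or outliers. A real pipeline would run DBSCAN on these vectors; the simple distance-to-mean split below is a stand-in, and all names and the threshold are illustrative assumptions.

```python
import numpy as np

def per_sample_gradients(w, X, y):
    """Per-sample gradients of the logistic loss w.r.t. weights w:
    grad_i = (sigmoid(w . x_i) - y_i) * x_i."""
    p = 1.0 / (1.0 + np.exp(-X @ w))
    return (p - y)[:, None] * X

def split_by_gradient(grads, thresh=2.0):
    """Toy grouping: flag samples whose gradient is far from the mean
    gradient (1 = candidate minority/outlier). A real pipeline would
    cluster `grads` with DBSCAN instead of this threshold rule."""
    dists = np.linalg.norm(grads - grads.mean(axis=0), axis=1)
    return (dists > thresh).astype(int)
```

The point of the abstract survives even in this toy form: the gradient space separates well-fit majority samples (small, similar gradients) from poorly fit minority samples (large, distinctive gradients), so off-the-shelf density-based clustering applies.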
arXiv Detail & Related papers (2022-10-13T06:04:43Z) - When Does Group Invariant Learning Survive Spurious Correlations? [29.750875769713513]
In this paper, we reveal the insufficiency of existing group invariant learning methods.
We propose two criteria on judging such sufficiency.
We show that existing methods can violate both criteria and thus fail in generalizing to spurious correlation shifts.
Motivated by this, we design a new group invariant learning method, which constructs groups with statistical independence tests.
arXiv Detail & Related papers (2022-06-29T11:16:11Z) - Focus on the Common Good: Group Distributional Robustness Follows [47.62596240492509]
This paper proposes a new and simple algorithm that explicitly encourages learning of features that are shared across various groups.
While Group-DRO focuses on the groups with the worst regularized loss, focusing instead on groups that enable better performance even on other groups could lead to the learning of shared/common features.
arXiv Detail & Related papers (2021-10-06T09:47:41Z) - Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias of the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.