Model Patching: Closing the Subgroup Performance Gap with Data Augmentation
- URL: http://arxiv.org/abs/2008.06775v1
- Date: Sat, 15 Aug 2020 20:01:23 GMT
- Title: Model Patching: Closing the Subgroup Performance Gap with Data Augmentation
- Authors: Karan Goel, Albert Gu, Yixuan Li and Christopher Ré
- Abstract summary: We introduce model patching, a framework for improving the robustness of machine learning models.
Model patching encourages the model to be invariant to subgroup differences and to focus on class information shared by subgroups.
We instantiate model patching with CAMEL, which (1) uses a CycleGAN to learn the intra-class, inter-subgroup augmentations, and (2) balances subgroup performance using a theoretically-motivated consistency regularizer.
We demonstrate CAMEL's effectiveness on 3 benchmark datasets, with reductions in robust error of up to 33% relative to the best baseline.
- Score: 50.35010342284508
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Classifiers in machine learning are often brittle when deployed. Particularly
concerning are models with inconsistent performance on specific subgroups of a
class, e.g., exhibiting disparities in skin cancer classification in the
presence or absence of a spurious bandage. To mitigate these performance
differences, we introduce model patching, a two-stage framework for improving
robustness that encourages the model to be invariant to subgroup differences,
and to focus on class information shared by subgroups. Model patching first models
subgroup features within a class and learns semantic transformations between
them, and then trains a classifier with data augmentations that deliberately
manipulate subgroup features. We instantiate model patching with CAMEL, which
(1) uses a CycleGAN to learn the intra-class, inter-subgroup augmentations, and
(2) balances subgroup performance using a theoretically-motivated subgroup
consistency regularizer, accompanied by a new robust objective. We demonstrate
CAMEL's effectiveness on 3 benchmark datasets, with reductions in robust error
of up to 33% relative to the best baseline. Lastly, CAMEL successfully patches
a model that fails due to spurious features on a real-world skin cancer
dataset.
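For concreteness, here is a minimal PyTorch sketch of the training objective described above: cross-entropy on an example and its CycleGAN-translated counterpart, plus a consistency penalty between the two predictions. The Jensen-Shannon form of the penalty and the names `camel_step` and `lam` are illustrative assumptions, not the paper's exact formulation.

```python
import torch.nn.functional as F

def subgroup_consistency_loss(logits_a, logits_b):
    # Jensen-Shannon-style consistency: pull each view's predictive
    # distribution toward their average distribution.
    p, q = F.softmax(logits_a, dim=-1), F.softmax(logits_b, dim=-1)
    m = (0.5 * (p + q)).clamp_min(1e-8)
    kl_pm = (p * (p.clamp_min(1e-8).log() - m.log())).sum(-1)
    kl_qm = (q * (q.clamp_min(1e-8).log() - m.log())).sum(-1)
    return (0.5 * (kl_pm + kl_qm)).mean()

def camel_step(model, x, x_aug, y, lam=1.0):
    # x_aug: CycleGAN translation of x into the other subgroup (same class).
    logits, logits_aug = model(x), model(x_aug)
    ce = 0.5 * (F.cross_entropy(logits, y) + F.cross_entropy(logits_aug, y))
    return ce + lam * subgroup_consistency_loss(logits, logits_aug)
```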
Related papers
- HM3: Heterogeneous Multi-Class Model Merging [0.0]
We explore training-free model merging techniques to consolidate auxiliary guard-rail models into a single, multi-functional model.
We propose Heterogeneous Multi-Class Model Merging (HM3) as a simple technique for merging multi-class classifiers with heterogeneous label spaces.
We report promising results for merging BERT-based guard models, some of which attain an average F1-score higher than the source models while reducing the inference time by up to 44%.
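A rough picture of training-free merging under heterogeneous label spaces, assuming the guard models share one encoder architecture; the `encoder_prefix` and `.classifier` conventions below are illustrative, not HM3's actual recipe.

```python
import copy
import torch

def merge_guard_models(models, encoder_prefix="bert."):
    # Average the shared-encoder parameters across source models...
    merged = copy.deepcopy(models[0])
    state = merged.state_dict()
    for name in state:
        if name.startswith(encoder_prefix):
            state[name] = torch.stack(
                [m.state_dict()[name].float() for m in models]
            ).mean(dim=0)
    merged.load_state_dict(state)
    # ...but keep every source head, since each has its own label space.
    heads = [m.classifier for m in models]
    return merged, heads
```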
arXiv Detail & Related papers (2024-09-27T22:42:45Z)
- The Group Robustness is in the Details: Revisiting Finetuning under Spurious Correlations [8.844894807922902]
Modern machine learning models are prone to over-reliance on spurious correlations.
In this paper, we identify surprising and nuanced behavior of finetuned models on worst-group accuracy.
Our results show more nuanced interactions of modern finetuned models with group robustness than was previously known.
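The metric behind these observations is worst-group accuracy; a minimal NumPy version for reference (array names are illustrative):

```python
import numpy as np

def worst_group_accuracy(preds, labels, groups):
    # Minimum per-group accuracy over all annotated groups.
    return min(
        (preds[groups == g] == labels[groups == g]).mean()
        for g in np.unique(groups)
    )
```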
arXiv Detail & Related papers (2024-07-19T00:34:03Z)
- A Contrastive Learning Approach to Mitigate Bias in Speech Models [13.192011475857234]
We employ a three-level learning technique that guides the model in focusing on different scopes for the contrastive loss.
Experiments on two spoken language understanding datasets and two languages demonstrate that our approach improves internal subgroup representations.
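One plausible building block for such a scheme, shown only as an assumed form (the paper's multi-level losses may differ): a supervised contrastive loss that, applied at a chosen scope (e.g., class vs. demographic subgroup), pulls same-label embeddings together.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(z, labels, tau=0.1):
    # z: (n, d) embeddings; labels define positives at the chosen scope.
    z = F.normalize(z, dim=-1)
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    pos = (labels[:, None] == labels[None, :]) & ~self_mask
    logits = (z @ z.T / tau).masked_fill(self_mask, float("-inf"))
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    log_prob = log_prob.masked_fill(self_mask, 0.0)  # avoid -inf * 0 = nan
    return -((log_prob * pos.float()).sum(1) / pos.sum(1).clamp_min(1)).mean()
```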
arXiv Detail & Related papers (2024-06-20T19:20:00Z)
- Unified Multi-View Orthonormal Non-Negative Graph Based Clustering Framework [74.25493157757943]
We formulate a novel clustering model, which exploits the non-negative feature property and incorporates the multi-view information into a unified joint learning framework.
We also explore, for the first time, the multi-model non-negative graph-based approach to clustering data based on deep features.
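As a point of reference for the non-negative graph formulation (a standard symmetric-NMF clustering, not necessarily this paper's exact model): factor a non-negative similarity graph S ≈ HHᵀ with H ≥ 0, then read cluster labels off the rows of H.

```python
import numpy as np

def symmetric_nmf_clustering(S, k, iters=200, seed=0):
    # S: symmetric non-negative similarity matrix (e.g., a kNN graph).
    rng = np.random.default_rng(seed)
    H = rng.random((S.shape[0], k))
    for _ in range(iters):
        # Multiplicative update keeps H non-negative throughout.
        H *= 0.5 + 0.5 * (S @ H) / (H @ (H.T @ H) + 1e-12)
    return H.argmax(axis=1)  # hard cluster assignment per sample
```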
arXiv Detail & Related papers (2022-11-03T08:18:27Z)
- Outlier-Robust Group Inference via Gradient Space Clustering [50.87474101594732]
Existing methods can improve the worst-group performance, but they require group annotations, which are often expensive and sometimes infeasible to obtain.
We address the problem of learning group annotations in the presence of outliers by clustering the data in the space of gradients of the model parameters.
We show that data in the gradient space has a simpler structure while preserving information about minority groups and outliers, making it suitable for standard clustering methods like DBSCAN.
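A sketch of that pipeline, with the last-layer choice and the DBSCAN hyperparameters as placeholder assumptions:

```python
import numpy as np
import torch
from sklearn.cluster import DBSCAN

def gradient_space_groups(model, loader, loss_fn, eps=0.5, min_samples=10):
    # Represent each example by the gradient of its loss w.r.t. the final
    # linear layer (low-dimensional), then cluster; DBSCAN labels outliers -1.
    last = model.fc  # assumes the classifier exposes its last layer as .fc
    feats = []
    for x, y in loader:
        for xi, yi in zip(x, y):  # per-example gradients (slow but clear)
            loss = loss_fn(model(xi[None]), yi[None])
            g = torch.autograd.grad(loss, last.weight)[0]
            feats.append(g.flatten().cpu().numpy())
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(np.stack(feats))
```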
arXiv Detail & Related papers (2022-10-13T06:04:43Z)
- Stacking Ensemble Learning in Deep Domain Adaptation for Ophthalmic Image Classification [61.656149405657246]
Domain adaptation is effective in image classification tasks where obtaining sufficient labeled data is challenging.
We propose a novel method, named SELDA, for stacking ensemble learning via extending three domain adaptation methods.
The experimental results using Age-Related Eye Disease Study (AREDS) benchmark ophthalmic dataset demonstrate the effectiveness of the proposed model.
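Generic stacking, for orientation (SELDA's meta-learner details may differ): concatenate the base domain-adaptation models' class probabilities and fit a simple meta-classifier on held-out data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def stack_predictions(base_probs_val, y_val, base_probs_test):
    # base_probs_*: list of (n, n_classes) probability arrays, one per base model.
    meta = LogisticRegression(max_iter=1000).fit(np.hstack(base_probs_val), y_val)
    return meta.predict(np.hstack(base_probs_test))
```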
arXiv Detail & Related papers (2022-09-27T14:19:00Z)
- RealPatch: A Statistical Matching Framework for Model Patching with Real Samples [6.245453620070586]
RealPatch is a framework for simpler, faster, and more data-efficient data augmentation based on statistical matching.
We show that RealPatch can successfully eliminate dataset leakage while reducing model leakage and maintaining high utility.
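The statistical-matching idea, sketched under assumed details: estimate each sample's propensity of belonging to one subgroup and pair it with the nearest-propensity real sample from the other subgroup, rather than synthesizing a counterpart.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def match_across_subgroups(feats, subgroup):
    # Propensity score: P(subgroup == 1 | features).
    prop = LogisticRegression(max_iter=1000).fit(feats, subgroup).predict_proba(feats)[:, 1]
    pairs = {}
    for i in range(len(subgroup)):
        other = np.flatnonzero(subgroup != subgroup[i])
        pairs[i] = int(other[np.argmin(np.abs(prop[other] - prop[i]))])
    return pairs  # sample index -> matched real counterpart
```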
arXiv Detail & Related papers (2022-08-03T16:22:30Z)
- Mimicking the Oracle: An Initial Phase Decorrelation Approach for Class Incremental Learning [141.35105358670316]
We study the difference between a naively-trained initial-phase model and the oracle model.
We propose Class-wise Decorrelation (CwD) that effectively regularizes representations of each class to scatter more uniformly.
Our CwD is simple to implement and easy to plug into existing methods.
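A decorrelation penalty in this spirit (the paper's exact normalization may differ): standardize each class's feature batch and penalize off-diagonal correlation mass.

```python
import torch

def classwise_decorrelation_loss(features, labels):
    loss = features.new_zeros(())
    for c in labels.unique():
        z = features[labels == c]
        if z.size(0) < 2:
            continue
        z = (z - z.mean(0)) / (z.std(0) + 1e-5)   # standardize per dimension
        corr = (z.T @ z) / (z.size(0) - 1)        # (d, d) correlation matrix
        off_diag = corr - torch.diag(torch.diagonal(corr))
        loss = loss + (off_diag ** 2).sum() / corr.numel()
    return loss
```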
arXiv Detail & Related papers (2021-12-09T07:20:32Z)
- Just Train Twice: Improving Group Robustness without Training Group Information [101.84574184298006]
Standard training via empirical risk minimization can produce models that achieve high accuracy on average but low accuracy on certain groups.
Prior approaches that achieve high worst-group accuracy, like group distributionally robust optimization (group DRO) require expensive group annotations for each training point.
We propose a simple two-stage approach, JTT, that first trains a standard ERM model for several epochs, and then trains a second model that upweights the training examples that the first model misclassified.
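The second stage reduces to a reweighted cross-entropy; a minimal sketch (treat `lam_up` as illustrative; the paper tunes it as a hyperparameter):

```python
import torch

def jtt_upweights(stage1_model, loader, lam_up=20.0):
    # Weight 1.0 for examples the ERM model gets right, lam_up where it errs;
    # feed these into a weighted loss when training the stage-two model.
    weights = []
    stage1_model.eval()
    with torch.no_grad():
        for x, y in loader:
            wrong = stage1_model(x).argmax(dim=1) != y
            weights.append(1.0 + (lam_up - 1.0) * wrong.float())
    return torch.cat(weights)
```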
arXiv Detail & Related papers (2021-07-19T17:52:32Z)
- Top-Related Meta-Learning Method for Few-Shot Object Detection [8.144721518458844]
We propose a Top-C classification loss (i.e., TCL-C) for the classification task and a category-based grouping mechanism for category-based meta-features obtained by the meta-model.
Our method significantly outperforms previous state-of-the-art methods for few-shot detection.
arXiv Detail & Related papers (2020-07-14T05:52:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.