Model Patching: Closing the Subgroup Performance Gap with Data Augmentation
- URL: http://arxiv.org/abs/2008.06775v1
- Date: Sat, 15 Aug 2020 20:01:23 GMT
- Title: Model Patching: Closing the Subgroup Performance Gap with Data Augmentation
- Authors: Karan Goel, Albert Gu, Yixuan Li and Christopher Ré
- Abstract summary: We introduce model patching, a framework for improving the robustness of machine learning models.
Model patching encourages the model to be invariant to subgroup differences and to focus on class information shared by subgroups.
We instantiate model patching with CAMEL, which (1) uses a CycleGAN to learn the intra-class, inter-subgroup augmentations, and (2) balances subgroup performance using a theoretically-motivated consistency regularizer.
We demonstrate CAMEL's effectiveness on 3 benchmark datasets, with reductions in robust error of up to 33% relative to the best baseline.
- Score: 50.35010342284508
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Classifiers in machine learning are often brittle when deployed. Particularly
concerning are models with inconsistent performance on specific subgroups of a
class, e.g., exhibiting disparities in skin cancer classification in the
presence or absence of a spurious bandage. To mitigate these performance
differences, we introduce model patching, a two-stage framework for improving
robustness that encourages the model to be invariant to subgroup differences,
and to focus on class information shared by subgroups. Model patching first models
subgroup features within a class and learns semantic transformations between
them, and then trains a classifier with data augmentations that deliberately
manipulate subgroup features. We instantiate model patching with CAMEL, which
(1) uses a CycleGAN to learn the intra-class, inter-subgroup augmentations, and
(2) balances subgroup performance using a theoretically-motivated subgroup
consistency regularizer, accompanied by a new robust objective. We demonstrate
CAMEL's effectiveness on 3 benchmark datasets, with reductions in robust error
of up to 33% relative to the best baseline. Lastly, CAMEL successfully patches
a model that fails due to spurious features on a real-world skin cancer
dataset.
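For concreteness, here is a minimal PyTorch sketch of the training objective described above: cross-entropy on an example and its CycleGAN-translated counterpart, plus a consistency penalty between the two predictions. The Jensen-Shannon form of the penalty and the names `camel_step` and `lam` are illustrative assumptions, not the paper's exact formulation.

```python
import torch.nn.functional as F

def subgroup_consistency_loss(logits_a, logits_b):
    # Jensen-Shannon-style consistency: pull each view's predictive
    # distribution toward their average distribution.
    p, q = F.softmax(logits_a, dim=-1), F.softmax(logits_b, dim=-1)
    m = (0.5 * (p + q)).clamp_min(1e-8)
    kl_pm = (p * (p.clamp_min(1e-8).log() - m.log())).sum(-1)
    kl_qm = (q * (q.clamp_min(1e-8).log() - m.log())).sum(-1)
    return (0.5 * (kl_pm + kl_qm)).mean()

def camel_step(model, x, x_aug, y, lam=1.0):
    # x_aug: CycleGAN translation of x into the other subgroup (same class).
    logits, logits_aug = model(x), model(x_aug)
    ce = 0.5 * (F.cross_entropy(logits, y) + F.cross_entropy(logits_aug, y))
    return ce + lam * subgroup_consistency_loss(logits, logits_aug)
```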
Related papers
- HM3: Heterogeneous Multi-Class Model Merging [0.0]
We explore training-free model merging techniques to consolidate auxiliary guard-rail models into a single, multi-functional model.
We propose Heterogeneous Multi-Class Model Merging (HM3) as a simple technique for merging multi-class classifiers with heterogeneous label spaces.
We report promising results for merging BERT-based guard models, some of which attain an average F1-score higher than the source models while reducing the inference time by up to 44%.
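A rough picture of training-free merging under heterogeneous label spaces, assuming the guard models share one encoder architecture; the `encoder_prefix` and `.classifier` conventions below are illustrative, not HM3's actual recipe.

```python
import copy
import torch

def merge_guard_models(models, encoder_prefix="bert."):
    # Average the shared-encoder parameters across source models...
    merged = copy.deepcopy(models[0])
    state = merged.state_dict()
    for name in state:
        if name.startswith(encoder_prefix):
            state[name] = torch.stack(
                [m.state_dict()[name].float() for m in models]
            ).mean(dim=0)
    merged.load_state_dict(state)
    # ...but keep every source head, since each has its own label space.
    heads = [m.classifier for m in models]
    return merged, heads
```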
arXiv Detail & Related papers (2024-09-27T22:42:45Z)
- The Group Robustness is in the Details: Revisiting Finetuning under Spurious Correlations [8.844894807922902]
Modern machine learning models are prone to over-reliance on spurious correlations.
In this paper, we identify surprising and nuanced behavior of finetuned models on worst-group accuracy.
Our results show more nuanced interactions of modern finetuned models with group robustness than was previously known.
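The metric behind these observations is worst-group accuracy; a minimal NumPy version for reference (array names are illustrative):

```python
import numpy as np

def worst_group_accuracy(preds, labels, groups):
    # Minimum per-group accuracy over all annotated groups.
    return min(
        (preds[groups == g] == labels[groups == g]).mean()
        for g in np.unique(groups)
    )
```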
arXiv Detail & Related papers (2024-07-19T00:34:03Z)
- A Contrastive Learning Approach to Mitigate Bias in Speech Models [13.192011475857234]
We employ a three-level learning technique that guides the model in focusing on different scopes for the contrastive loss.
Experiments on two spoken language understanding datasets and two languages demonstrate that our approach improves internal subgroup representations.
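One plausible building block for such a scheme, shown only as an assumed form (the paper's multi-level losses may differ): a supervised contrastive loss that, applied at a chosen scope (e.g., class vs. demographic subgroup), pulls same-label embeddings together.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(z, labels, tau=0.1):
    # z: (n, d) embeddings; labels define positives at the chosen scope.
    z = F.normalize(z, dim=-1)
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    pos = (labels[:, None] == labels[None, :]) & ~self_mask
    logits = (z @ z.T / tau).masked_fill(self_mask, float("-inf"))
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    log_prob = log_prob.masked_fill(self_mask, 0.0)  # avoid -inf * 0 = nan
    return -((log_prob * pos.float()).sum(1) / pos.sum(1).clamp_min(1)).mean()
```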
arXiv Detail & Related papers (2024-06-20T19:20:00Z)
- Unified Multi-View Orthonormal Non-Negative Graph Based Clustering Framework [74.25493157757943]
We formulate a novel clustering model, which exploits the non-negative feature property and incorporates the multi-view information into a unified joint learning framework.
We also explore, for the first time, the multi-model non-negative graph-based approach to clustering data based on deep features.
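As a point of reference for the non-negative graph formulation (a standard symmetric-NMF clustering, not necessarily this paper's exact model): factor a non-negative similarity graph S ≈ HHᵀ with H ≥ 0, then read cluster labels off the rows of H.

```python
import numpy as np

def symmetric_nmf_clustering(S, k, iters=200, seed=0):
    # S: symmetric non-negative similarity matrix (e.g., a kNN graph).
    rng = np.random.default_rng(seed)
    H = rng.random((S.shape[0], k))
    for _ in range(iters):
        # Multiplicative update keeps H non-negative throughout.
        H *= 0.5 + 0.5 * (S @ H) / (H @ (H.T @ H) + 1e-12)
    return H.argmax(axis=1)  # hard cluster assignment per sample
```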
arXiv Detail & Related papers (2022-11-03T08:18:27Z)
- Outlier-Robust Group Inference via Gradient Space Clustering [50.87474101594732]
Existing methods can improve the worst-group performance, but they require group annotations, which are often expensive and sometimes infeasible to obtain.
We address the problem of learning group annotations in the presence of outliers by clustering the data in the space of gradients of the model parameters.
We show that data in the gradient space has a simpler structure while preserving information about minority groups and outliers, making it suitable for standard clustering methods like DBSCAN.
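A sketch of that pipeline, with the last-layer choice and the DBSCAN hyperparameters as placeholder assumptions:

```python
import numpy as np
import torch
from sklearn.cluster import DBSCAN

def gradient_space_groups(model, loader, loss_fn, eps=0.5, min_samples=10):
    # Represent each example by the gradient of its loss w.r.t. the final
    # linear layer (low-dimensional), then cluster; DBSCAN labels outliers -1.
    last = model.fc  # assumes the classifier exposes its last layer as .fc
    feats = []
    for x, y in loader:
        for xi, yi in zip(x, y):  # per-example gradients (slow but clear)
            loss = loss_fn(model(xi[None]), yi[None])
            g = torch.autograd.grad(loss, last.weight)[0]
            feats.append(g.flatten().cpu().numpy())
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(np.stack(feats))
```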
arXiv Detail & Related papers (2022-10-13T06:04:43Z)
- Stacking Ensemble Learning in Deep Domain Adaptation for Ophthalmic Image Classification [61.656149405657246]
Domain adaptation is effective in image classification tasks where obtaining sufficient labeled data is challenging.
We propose a novel method, named SELDA, for stacking ensemble learning via extending three domain adaptation methods.
The experimental results using Age-Related Eye Disease Study (AREDS) benchmark ophthalmic dataset demonstrate the effectiveness of the proposed model.
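Generic stacking, for orientation (SELDA's meta-learner details may differ): concatenate the base domain-adaptation models' class probabilities and fit a simple meta-classifier on held-out data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def stack_predictions(base_probs_val, y_val, base_probs_test):
    # base_probs_*: list of (n, n_classes) probability arrays, one per base model.
    meta = LogisticRegression(max_iter=1000).fit(np.hstack(base_probs_val), y_val)
    return meta.predict(np.hstack(base_probs_test))
```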
arXiv Detail & Related papers (2022-09-27T14:19:00Z)
- RealPatch: A Statistical Matching Framework for Model Patching with Real Samples [6.245453620070586]
RealPatch is a framework for simpler, faster, and more data-efficient data augmentation based on statistical matching.
We show that RealPatch can successfully eliminate dataset leakage while reducing model leakage and maintaining high utility.
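The statistical-matching idea, sketched under assumed details: estimate each sample's propensity of belonging to one subgroup and pair it with the nearest-propensity real sample from the other subgroup, rather than synthesizing a counterpart.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def match_across_subgroups(feats, subgroup):
    # Propensity score: P(subgroup == 1 | features).
    prop = LogisticRegression(max_iter=1000).fit(feats, subgroup).predict_proba(feats)[:, 1]
    pairs = {}
    for i in range(len(subgroup)):
        other = np.flatnonzero(subgroup != subgroup[i])
        pairs[i] = int(other[np.argmin(np.abs(prop[other] - prop[i]))])
    return pairs  # sample index -> matched real counterpart
```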
arXiv Detail & Related papers (2022-08-03T16:22:30Z)
- Mimicking the Oracle: An Initial Phase Decorrelation Approach for Class Incremental Learning [141.35105358670316]
We study the difference between a naively-trained initial-phase model and the oracle model.
We propose Class-wise Decorrelation (CwD) that effectively regularizes representations of each class to scatter more uniformly.
Our CwD is simple to implement and easy to plug into existing methods.
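A decorrelation penalty in this spirit (the paper's exact normalization may differ): standardize each class's feature batch and penalize off-diagonal correlation mass.

```python
import torch

def classwise_decorrelation_loss(features, labels):
    loss = features.new_zeros(())
    for c in labels.unique():
        z = features[labels == c]
        if z.size(0) < 2:
            continue
        z = (z - z.mean(0)) / (z.std(0) + 1e-5)   # standardize per dimension
        corr = (z.T @ z) / (z.size(0) - 1)        # (d, d) correlation matrix
        off_diag = corr - torch.diag(torch.diagonal(corr))
        loss = loss + (off_diag ** 2).sum() / corr.numel()
    return loss
```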
arXiv Detail & Related papers (2021-12-09T07:20:32Z)
- Just Train Twice: Improving Group Robustness without Training Group Information [101.84574184298006]
Standard training via empirical risk minimization can produce models that achieve high accuracy on average but low accuracy on certain groups.
Prior approaches that achieve high worst-group accuracy, like group distributionally robust optimization (group DRO) require expensive group annotations for each training point.
We propose a simple two-stage approach, JTT, that first trains a standard ERM model for several epochs, and then trains a second model that upweights the training examples that the first model misclassified.
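The second stage reduces to a reweighted cross-entropy; a minimal sketch (treat `lam_up` as illustrative; the paper tunes it as a hyperparameter):

```python
import torch

def jtt_upweights(stage1_model, loader, lam_up=20.0):
    # Weight 1.0 for examples the ERM model gets right, lam_up where it errs;
    # feed these into a weighted loss when training the stage-two model.
    weights = []
    stage1_model.eval()
    with torch.no_grad():
        for x, y in loader:
            wrong = stage1_model(x).argmax(dim=1) != y
            weights.append(1.0 + (lam_up - 1.0) * wrong.float())
    return torch.cat(weights)
```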
arXiv Detail & Related papers (2021-07-19T17:52:32Z)
- Top-Related Meta-Learning Method for Few-Shot Object Detection [8.144721518458844]
We propose a Top-C classification loss (i.e., TCL-C) for the classification task and a category-based grouping mechanism for category-based meta-features obtained by the meta-model.
Our method significantly outperforms previous state-of-the-art methods for few-shot detection.
arXiv Detail & Related papers (2020-07-14T05:52:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.