Clarify: Improving Model Robustness With Natural Language Corrections
- URL: http://arxiv.org/abs/2402.03715v1
- Date: Tue, 6 Feb 2024 05:11:38 GMT
- Title: Clarify: Improving Model Robustness With Natural Language Corrections
- Authors: Yoonho Lee, Michelle S. Lam, Helena Vasconcelos, Michael S. Bernstein,
Chelsea Finn
- Abstract summary: In supervised learning, models are trained to extract correlations from a static dataset.
This often leads to models that rely on high-level misconceptions.
We introduce Clarify, a novel interface and method for interactively correcting model misconceptions.
- Score: 63.342630414000006
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In supervised learning, models are trained to extract correlations from a
static dataset. This often leads to models that rely on high-level
misconceptions. To prevent such misconceptions, we must necessarily provide
additional information beyond the training data. Existing methods incorporate
forms of additional instance-level supervision, such as labels for spurious
features or additional labeled data from a balanced distribution. Such
strategies can become prohibitively costly for large-scale datasets since they
require additional annotation at a scale close to the original training data.
We hypothesize that targeted natural language feedback about a model's
misconceptions is a more efficient form of additional supervision. We introduce
Clarify, a novel interface and method for interactively correcting model
misconceptions. Through Clarify, users need only provide a short text
description to describe a model's consistent failure patterns. Then, in an
entirely automated way, we use such descriptions to improve the training
process by reweighting the training data or gathering additional targeted data.
Our user studies show that non-expert users can successfully describe model
misconceptions via Clarify, improving worst-group accuracy by an average of
17.1% in two datasets. Additionally, we use Clarify to find and rectify 31
novel hard subpopulations in the ImageNet dataset, improving minority-split
accuracy from 21.1% to 28.7%.
Related papers
- Robust Data Pruning under Label Noise via Maximizing Re-labeling
Accuracy [34.02350195269502]
We formalize the problem of data pruning with re-labeling.
We propose a novel data pruning algorithm, Prune4Rel, that finds a subset maximizing the total neighborhood confidence of all training examples.
arXiv Detail & Related papers (2023-11-02T05:40:26Z) - XAL: EXplainable Active Learning Makes Classifiers Better Low-resource Learners [71.8257151788923]
We propose a novel Explainable Active Learning framework (XAL) for low-resource text classification.
XAL encourages classifiers to justify their inferences and delve into unlabeled data for which they cannot provide reasonable explanations.
Experiments on six datasets show that XAL achieves consistent improvement over 9 strong baselines.
arXiv Detail & Related papers (2023-10-09T08:07:04Z) - Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models.
arXiv Detail & Related papers (2023-01-31T20:09:33Z) - Learning to Unlearn: Instance-wise Unlearning for Pre-trained
Classifiers [71.70205894168039]
We consider instance-wise unlearning, of which the goal is to delete information on a set of instances from a pre-trained model.
We propose two methods that reduce forgetting on the remaining data: 1) utilizing adversarial examples to overcome forgetting at the representation-level and 2) leveraging weight importance metrics to pinpoint network parameters guilty of propagating unwanted information.
arXiv Detail & Related papers (2023-01-27T07:53:50Z) - On-the-fly Denoising for Data Augmentation in Natural Language
Understanding [101.46848743193358]
We propose an on-the-fly denoising technique for data augmentation that learns from soft augmented labels provided by an organic teacher model trained on the cleaner original data.
Our method can be applied to general augmentation techniques and consistently improve the performance on both text classification and question-answering tasks.
arXiv Detail & Related papers (2022-12-20T18:58:33Z) - CLIP: Train Faster with Less Data [3.2575001434344286]
Deep learning models require an enormous amount of data for training.
Recently there is a shift in machine learning from model-centric to data-centric approaches.
We propose CLIP i.e., Curriculum Learning with Iterative data Pruning.
arXiv Detail & Related papers (2022-12-02T21:29:48Z) - Self Training with Ensemble of Teacher Models [8.257085583227695]
In order to train robust deep learning models, large amounts of labelled data is required.
In the absence of such large repositories of labelled data, unlabeled data can be exploited for the same.
Semi-Supervised learning aims to utilize such unlabeled data for training classification models.
arXiv Detail & Related papers (2021-07-17T09:44:09Z) - BiFair: Training Fair Models with Bilevel Optimization [8.2509884277533]
We develop a new training algorithm, named BiFair, which jointly minimizes for a utility, and a fairness loss of interest.
Our algorithm consistently performs better, i.e., we reach to better values of a given fairness metric under same, or higher accuracy.
arXiv Detail & Related papers (2021-06-03T22:36:17Z) - Whitening and second order optimization both make information in the
dataset unusable during training, and can reduce or prevent generalization [50.53690793828442]
We show that both data whitening and second order optimization can harm or entirely prevent generalization.
For a general class of models, namely models with a fully connected first layer, we prove that the information contained in this matrix is the only information which can be used to generalize.
arXiv Detail & Related papers (2020-08-17T18:00:05Z) - Learning from Imperfect Annotations [15.306536555936692]
Many machine learning systems today are trained on large amounts of human-annotated data.
We propose a new end-to-end framework that enables us to merge the aggregation step with model training.
We show accuracy gains of up to 25% over the current state-of-the-art approaches for aggregating annotations.
arXiv Detail & Related papers (2020-04-07T15:21:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.