Estimating Example Difficulty Using Variance of Gradients
- URL: http://arxiv.org/abs/2008.11600v4
- Date: Tue, 21 Jun 2022 04:22:54 GMT
- Title: Estimating Example Difficulty Using Variance of Gradients
- Authors: Chirag Agarwal, Daniel D'souza, Sara Hooker
- Abstract summary: Variance of Gradients (VoG) is a valuable and efficient metric for ranking data by difficulty and for surfacing the most challenging examples for human-in-the-loop auditing.
- Score: 5.69361786082969
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In machine learning, a question of great interest is understanding what
examples are challenging for a model to classify. Identifying atypical examples
ensures the safe deployment of models, isolates samples that require further
human inspection and provides interpretability into model behavior. In this
work, we propose Variance of Gradients (VoG) as a valuable and efficient metric
to rank data by difficulty and to surface a tractable subset of the most
challenging examples for human-in-the-loop auditing. We show that data points
with high VoG scores are far more difficult for the model to learn and
over-index on corrupted or memorized examples. Further, restricting the
evaluation to the test set instances with the lowest VoG improves the model's
generalization performance. Finally, we show that VoG is a valuable and
efficient ranking for out-of-distribution detection.
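As a rough illustration of the ranking described above, here is a minimal PyTorch sketch of a per-example VoG score, assuming a list of model snapshots saved during training; the function name, checkpoint handling, and tensor shapes are illustrative assumptions, not the authors' released code.

```python
# Hedged sketch: VoG as the mean per-pixel variance, across training
# checkpoints, of the gradient of the true-class logit w.r.t. the input.
import torch

def vog_score(checkpoints, x, label):
    """Estimate VoG for one input `x` (C, H, W) with true class `label`."""
    grads = []
    for model in checkpoints:              # snapshots saved during training
        model.eval()
        x_in = x.clone().detach().requires_grad_(True)
        logits = model(x_in.unsqueeze(0))  # (1, num_classes)
        logits[0, label].backward()        # gradient of the true-class logit
        grads.append(x_in.grad.detach())
    g = torch.stack(grads)                 # (T, C, H, W)
    return g.var(dim=0, unbiased=False).mean().item()
```

Examples can then be sorted by this score, with the highest-VoG subset surfaced for human auditing.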
Related papers
- Hybrid Classification-Regression Adaptive Loss for Dense Object Detection [19.180514552400883]
We propose a Hybrid Classification-Regression Adaptive Loss, termed HCRAL.
We introduce the Residual of Classification and IoU (RCI) module for cross-task supervision, addressing task inconsistencies, and a Conditioning Factor (CF) that focuses on difficult-to-train samples within each task.
We also introduce a new strategy named Expanded Adaptive Training Sample Selection (EATSS) to provide additional samples that exhibit classification and regression inconsistencies.
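An illustrative sketch of the core idea (not the paper's exact formulation): use the residual between classification confidence and box IoU to up-weight inconsistent, hard samples; the weighting form and `gamma` are invented for illustration.

```python
# Hedged sketch: an RCI-style residual turned into a conditioning factor
# that up-weights samples where classification and localization disagree.
import torch

def rci_weighted_loss(cls_prob, iou, cls_loss, reg_loss, gamma=2.0):
    """cls_prob, iou: (N,) scores in [0, 1]; cls_loss, reg_loss: (N,) raw losses."""
    rci = (cls_prob - iou).abs()       # residual of classification and IoU
    cf = (1.0 + rci) ** gamma          # larger weight on inconsistent samples
    return (cf * (cls_loss + reg_loss)).mean()
```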
arXiv Detail & Related papers (2024-08-30T10:31:39Z)
- Outlier Gradient Analysis: Efficiently Identifying Detrimental Training Samples for Deep Learning Models [36.05242956018461]
In this paper, we establish a bridge between identifying detrimental training samples via influence functions and outlier gradient detection.
We first validate the hypothesis of our proposed outlier gradient analysis approach on synthetic datasets.
We then demonstrate its effectiveness in detecting mislabeled samples in vision models and selecting data samples for improving performance of natural language processing transformer models.
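A hedged sketch of the general approach: compute a per-sample gradient for each training example and flag statistical outliers as candidate detrimental samples; the z-score rule on gradient norms here is a stand-in for the paper's actual detector.

```python
# Hedged sketch: flag training samples whose parameter-gradient norms are
# statistical outliers, as candidates for mislabeled or detrimental data.
import numpy as np
import torch

def flag_outlier_gradients(model, loss_fn, dataset, z_thresh=3.0):
    norms = []
    for x, y in dataset:                    # (input tensor, int label) pairs
        model.zero_grad()
        loss = loss_fn(model(x.unsqueeze(0)), torch.tensor([y]))
        loss.backward()
        g = torch.cat([p.grad.flatten() for p in model.parameters()
                       if p.grad is not None])
        norms.append(g.norm().item())
    z = (np.asarray(norms) - np.mean(norms)) / (np.std(norms) + 1e-12)
    return np.where(z > z_thresh)[0]        # indices of suspected outliers
```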
arXiv Detail & Related papers (2024-05-06T21:34:46Z)
- Fairness-guided Few-shot Prompting for Large Language Models [93.05624064699965]
In-context learning can suffer from high instability due to variations in training examples, example order, and prompt formats.
We introduce a metric to evaluate the predictive bias of a fixed prompt with respect to labels or given attributes.
We propose a novel search strategy based on greedy search to identify a near-optimal prompt for improving the performance of in-context learning.
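A minimal sketch of such a greedy search, assuming a `bias(prompt)` function that scores the predictive bias of a candidate prompt (lower is fairer); both `bias` and the `candidates` pool are placeholders for the paper's definitions.

```python
# Hedged sketch: greedily grow the prompt with whichever demonstration
# yields the least-biased prompt at each step.
def greedy_prompt_search(candidates, bias, k=4):
    prompt, remaining = [], list(candidates)
    for _ in range(k):
        best = min(remaining, key=lambda d: bias(prompt + [d]))
        prompt.append(best)
        remaining.remove(best)
    return prompt
```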
arXiv Detail & Related papers (2023-03-23T12:28:25Z)
- Temporal Output Discrepancy for Loss Estimation-based Active Learning [65.93767110342502]
We present a novel deep active learning approach that queries the oracle for annotation when an unlabeled sample is believed to incur a high loss.
Our approach outperforms state-of-the-art active learning methods on image classification and semantic segmentation tasks.
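A hedged sketch of the idea: the output discrepancy of a sample between two training checkpoints serves as a loss proxy, and the highest-scoring unlabeled samples are queried; checkpoint handling and output shapes are illustrative.

```python
# Hedged sketch: temporal output discrepancy as a per-sample loss estimate.
import torch

def tod_scores(model_now, model_prev, unlabeled_batch):
    with torch.no_grad():
        out_now = model_now(unlabeled_batch)     # (N, num_classes)
        out_prev = model_prev(unlabeled_batch)
    return (out_now - out_prev).norm(dim=1)      # one score per sample

def select_for_annotation(scores, budget):
    return torch.topk(scores, k=budget).indices  # query highest-loss candidates
```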
arXiv Detail & Related papers (2022-12-20T19:29:37Z)
- DiscrimLoss: A Universal Loss for Hard Samples and Incorrect Samples Discrimination [28.599571524763785]
Given data with label noise (i.e., incorrectly labeled samples), deep neural networks gradually memorize the noisy labels, impairing model performance.
To mitigate this issue, curriculum learning improves model performance and generalization by ordering training samples in a meaningful sequence.
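A minimal sketch of the loss-based curriculum ordering the summary alludes to (the general idea, not DiscrimLoss itself): rank samples easy-to-hard by their loss under a warmed-up model.

```python
# Hedged sketch: order training samples easy-to-hard by per-sample loss.
import torch

def curriculum_order(model, loss_fn, dataset):
    model.eval()
    losses = []
    with torch.no_grad():
        for i, (x, y) in enumerate(dataset):
            loss = loss_fn(model(x.unsqueeze(0)), torch.tensor([y])).item()
            losses.append((loss, i))
    return [i for _, i in sorted(losses)]   # low-loss (easy) samples first
```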
arXiv Detail & Related papers (2022-08-21T13:38:55Z)
- Adaptive Fine-Grained Predicates Learning for Scene Graph Generation [122.4588401267544]
General Scene Graph Generation (SGG) models tend to predict head predicates, while re-balancing strategies prefer tail categories.
We propose Adaptive Fine-Grained Predicates Learning (FGPL-A), which aims to differentiate hard-to-distinguish predicates for SGG.
Our proposed model-agnostic strategy significantly boosts performance of benchmark models on VG-SGG and GQA-SGG datasets by up to 175% and 76% on Mean Recall@100, achieving new state-of-the-art performance.
arXiv Detail & Related papers (2022-07-11T03:37:57Z)
- HardVis: Visual Analytics to Handle Instance Hardness Using Undersampling and Oversampling Techniques [48.82319198853359]
HardVis is a visual analytics system designed to handle instance hardness mainly in imbalanced classification scenarios.
Users can explore subsets of the data from different perspectives to decide on the undersampling and oversampling parameters.
The efficacy and effectiveness of HardVis are demonstrated with a hypothetical usage scenario and a use case.
arXiv Detail & Related papers (2022-03-29T17:04:16Z)
- Entropy optimized semi-supervised decomposed vector-quantized variational autoencoder model based on transfer learning for multiclass text classification and generation [3.9318191265352196]
We propose a semi-supervised discrete latent variable model for multi-class text classification and text generation.
The proposed model employs the concept of transfer learning for training a quantized transformer model.
Experimental results indicate that the proposed model surpasses state-of-the-art models by a notable margin.
arXiv Detail & Related papers (2021-11-10T07:07:54Z)
- On the Efficacy of Adversarial Data Collection for Question Answering: Results from a Large-Scale Randomized Study [65.17429512679695]
In adversarial data collection (ADC), a human workforce interacts with a model in real time, attempting to produce examples that elicit incorrect predictions.
Despite ADC's intuitive appeal, it remains unclear when training on adversarial datasets produces more robust models.
arXiv Detail & Related papers (2021-06-02T00:48:33Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
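A hedged sketch of gradient supervision with a counterfactual pair: encourage the input-gradient of the task loss to align with the direction from an example to its minimally-different counterfactual; the cosine penalty and unit weighting are illustrative assumptions.

```python
# Hedged sketch: auxiliary loss aligning the input-gradient with the
# example-to-counterfactual direction.
import torch
import torch.nn.functional as F

def gradient_supervision_loss(model, loss_fn, x, y, x_cf):
    x = x.clone().detach().requires_grad_(True)
    task_loss = loss_fn(model(x.unsqueeze(0)), torch.tensor([y]))
    (grad,) = torch.autograd.grad(task_loss, x, create_graph=True)
    direction = (x_cf - x).detach()         # points toward the counterfactual
    aux = 1.0 - F.cosine_similarity(grad.flatten(), direction.flatten(), dim=0)
    return task_loss + aux                  # task loss plus alignment penalty
```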
arXiv Detail & Related papers (2020-04-20T02:47:49Z)