Contrastive Error Attribution for Finetuned Language Models
- URL: http://arxiv.org/abs/2212.10722v2
- Date: Tue, 11 Jul 2023 17:06:19 GMT
- Title: Contrastive Error Attribution for Finetuned Language Models
- Authors: Faisal Ladhak, Esin Durmus, Tatsunori Hashimoto
- Abstract summary: Noisy and misannotated data are a core cause of hallucinations and unfaithful outputs in Natural Language Generation (NLG) tasks.
We introduce a framework to identify and remove low-quality training instances that lead to undesirable outputs.
We show that existing approaches for error tracing, such as gradient-based influence measures, do not perform reliably for detecting faithfulness errors.
- Score: 35.80256755393739
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent work has identified noisy and misannotated data as a core cause of
hallucinations and unfaithful outputs in Natural Language Generation (NLG)
tasks. Consequently, identifying and removing these examples is a key open
challenge in creating reliable NLG systems. In this work, we introduce a
framework to identify and remove low-quality training instances that lead to
undesirable outputs, such as faithfulness errors in text summarization. We show
that existing approaches for error tracing, such as gradient-based influence
measures, do not perform reliably for detecting faithfulness errors in NLG
datasets. We overcome the drawbacks of existing error tracing methods through a
new, contrast-based estimate that compares undesired generations to
human-corrected outputs. Our proposed method can achieve a mean average
precision of 0.93 at detecting known data errors across synthetic tasks with
known ground truth, substantially outperforming existing approaches. Using this
approach and re-training models on cleaned data leads to a 70% reduction in
entity hallucinations on the NYT dataset and a 55% reduction in semantic errors
on the E2E dataset.
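The contrast-based estimate suggests a compact sketch. The toy model, data, and dot-product scoring below are a hypothetical instantiation of the general recipe (contrast the gradient of the undesired generation with that of the human-corrected output, then rank training examples by alignment with that direction), not the paper's exact estimator.

```python
# Hedged sketch of a contrast-based error-attribution score. The linear
# model and random data are stand-ins for a finetuned LM and its dataset.
import torch

torch.manual_seed(0)

model = torch.nn.Linear(8, 4)          # stand-in for a finetuned LM
loss_fn = torch.nn.CrossEntropyLoss()

def flat_grad(x, y):
    """Flattened gradient of the loss on a single (input, label) pair."""
    model.zero_grad()
    loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
    return torch.cat([p.grad.flatten() for p in model.parameters()])

# Query: the same input paired with the undesired vs. corrected output.
x_query = torch.randn(8)
y_bad, y_good = torch.tensor(2), torch.tensor(0)

# Contrast direction: a training example aligned with it lowers the loss
# on the bad output and raises it on the corrected one under a gradient step.
g_contrast = flat_grad(x_query, y_bad) - flat_grad(x_query, y_good)

# Score each training example by alignment with the contrast direction;
# high scores flag likely culprits to remove before re-training.
train = [(torch.randn(8), torch.randint(4, ())) for _ in range(100)]
scores = [torch.dot(flat_grad(x, y), g_contrast).item() for x, y in train]
suspects = sorted(range(len(train)), key=lambda i: -scores[i])[:10]
print("most suspect training indices:", suspects)
```

High-scoring examples are those whose gradient step pushes the model toward the undesired generation and away from the correction, making them candidates for removal before re-training.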
Related papers
- SoftDedup: an Efficient Data Reweighting Method for Speeding Up Language Model Pre-training [12.745160748376794]
We propose a soft deduplication method that maintains dataset integrity while selectively reducing the sampling weight of data with high commonness.
Central to our approach is the concept of "data commonness", a metric we introduce to quantify the degree of duplication.
Empirical analysis shows that this method significantly improves training efficiency, achieving comparable perplexity scores with at least a 26% reduction in required training steps.
arXiv Detail & Related papers (2024-07-09T08:26:39Z)
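A minimal sketch of the soft deduplication idea: downweight, rather than drop, samples whose content is common across the corpus. The n-gram-frequency proxy and the inverse weighting below are illustrative assumptions, not the paper's exact "data commonness" metric or formula.

```python
# Soft deduplication sketch: common documents get a smaller sampling
# weight instead of being removed, preserving dataset integrity.
from collections import Counter

def ngrams(text, n=3):
    toks = text.split()
    return [tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)]

corpus = [
    "the quick brown fox jumps over the lazy dog",
    "the quick brown fox jumps over the lazy dog",   # near-duplicate
    "a completely different sentence about language models",
]

counts = Counter(g for doc in corpus for g in ngrams(doc))

def commonness(doc):
    """Average corpus frequency of the document's n-grams (toy proxy)."""
    grams = ngrams(doc)
    return sum(counts[g] for g in grams) / max(len(grams), 1)

weights = [1.0 / commonness(doc) for doc in corpus]
total = sum(weights)
probs = [w / total for w in weights]
for doc, p in zip(corpus, probs):
    print(f"{p:.3f}  {doc[:45]}")
```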
- Leveraging Latent Diffusion Models for Training-Free In-Distribution Data Augmentation for Surface Defect Detection [9.784793380119806]
We introduce DIAG, a training-free Diffusion-based In-distribution Anomaly Generation pipeline for data augmentation.
Unlike conventional image generation techniques, we implement a human-in-the-loop pipeline, where domain experts provide multimodal guidance to the model.
We demonstrate the efficacy and versatility of DIAG with respect to state-of-the-art data augmentation approaches on the challenging KSDD2 dataset.
arXiv Detail & Related papers (2024-07-04T14:28:52Z)
- Low-rank finetuning for LLMs: A fairness perspective [54.13240282850982]
Low-rank approximation techniques have become the de facto standard for fine-tuning Large Language Models.
This paper investigates the effectiveness of these methods in capturing the shift of fine-tuning datasets from the initial pre-trained data distribution.
We show that low-rank fine-tuning inadvertently preserves undesirable biases and toxic behaviors.
arXiv Detail & Related papers (2024-05-28T20:43:53Z)
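For readers unfamiliar with the setup the paper studies, here is a minimal sketch of a low-rank update: the pre-trained weight stays frozen and only a rank-r correction is learned, so any change to the model is confined to an r-dimensional subspace. Dimensions and rank are illustrative.

```python
# Low-rank finetuning sketch: effective weight is W + B @ A, with W frozen.
import torch

d_out, d_in, r = 64, 64, 4

W = torch.randn(d_out, d_in)                         # frozen pre-trained weight
A = (0.01 * torch.randn(r, d_in)).requires_grad_()   # trainable down-projection
B = torch.zeros(d_out, r, requires_grad=True)        # trainable up-projection

def forward(x):
    # Gradients flow only to A and B, so updates have rank at most r.
    return x @ (W + B @ A).T

x = torch.randn(2, d_in)
print(forward(x).shape)  # torch.Size([2, 64])
```

The rank constraint is what the paper probes: shifts in the fine-tuning distribution that fall outside this low-rank subspace, such as bias or toxicity corrections, may be under-captured.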
- Towards Integration of Discriminability and Robustness for Document-Level Relation Extraction [41.51148745387936]
Document-level relation extraction (DocRE) predicts relations between entity pairs, which requires long-range, context-dependent reasoning across a document.
In this work, we aim to better integrate discriminability and robustness for the DocRE problem.
We customize entropy minimization and supervised contrastive learning for the challenging multi-label and long-tailed learning settings.
arXiv Detail & Related papers (2023-04-03T09:11:18Z)
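The supervised contrastive term mentioned above can be sketched compactly. This is the generic supervised contrastive (SupCon) form, pulling same-label pairs together and pushing different-label pairs apart, not the paper's multi-label, long-tailed variant; embeddings and labels are toy stand-ins.

```python
# Generic supervised contrastive loss sketch.
import torch
import torch.nn.functional as F

def supcon_loss(feats, labels, tau=0.1):
    """Average over each anchor's same-label positives of -log p(pos | anchor)."""
    feats = F.normalize(feats, dim=1)
    sim = feats @ feats.T / tau
    n = feats.size(0)
    mask_self = torch.eye(n, dtype=torch.bool)
    sim = sim.masked_fill(mask_self, float("-inf"))   # exclude self-pairs
    log_prob = sim - sim.logsumexp(dim=1, keepdim=True)
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~mask_self
    per_anchor = (log_prob * pos.float()).sum(1) / pos.sum(1).clamp(min=1)
    return -per_anchor.mean()

feats = torch.randn(8, 16)
labels = torch.tensor([0, 0, 1, 1, 2, 2, 0, 1])
print(supcon_loss(feats, labels))
```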
- Tailoring Language Generation Models under Total Variation Distance [55.89964205594829]
The standard paradigm of neural language generation adopts maximum likelihood estimation (MLE) as the optimization method.
We propose total variation distance (TVD) as an alternative and develop practical bounds that make it applicable to language generation.
We introduce the TaiLr objective, which balances the tradeoff in estimating TVD.
arXiv Detail & Related papers (2023-02-26T16:32:52Z)
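As a rough illustration of the kind of objective involved, the sketch below reweights each token's negative log-likelihood by a factor that interpolates between plain MLE and a probability weighting motivated by TVD. The exact form of the TaiLr objective may differ; treat the weighting and the gamma smoothing as assumptions.

```python
# TVD-motivated token reweighting sketch (not a verified re-implementation).
import torch

def tailr_style_loss(logits, targets, gamma=0.5):
    logp = torch.log_softmax(logits, dim=-1)
    logp_t = logp.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    p_t = logp_t.exp()
    # gamma -> 0 recovers plain MLE (w = 1); gamma -> 1 gives the pure
    # probability weighting, which shrinks the penalty on tokens the
    # model already deems unlikely (e.g. noisy references).
    w = p_t / (gamma + (1.0 - gamma) * p_t)
    return -(w.detach() * logp_t).mean()

logits = torch.randn(4, 10)    # (tokens, vocab)
targets = torch.randint(10, (4,))
print(tailr_style_loss(logits, targets))
```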
- Detecting Label Errors using Pre-Trained Language Models [37.82128817976385]
We show that large pre-trained language models are extremely capable of identifying label errors in datasets.
We contribute a novel method to produce highly realistic, human-originated label noise from crowdsourced data, and demonstrate the effectiveness of this method on TweetNLP.
arXiv Detail & Related papers (2022-05-25T11:59:39Z)
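The core observation, that examples a strong pre-trained model assigns unusually high loss are disproportionately likely to be mislabeled, reduces to a simple ranking loop. The stand-in model and random data below are placeholders for whatever classifier and dataset you have.

```python
# Rank training examples by model loss; the top of the list is enriched
# with label errors and can be sent for human review.
import torch

def rank_suspect_labels(model, loss_fn, dataset, top_k=10):
    losses = []
    model.eval()
    with torch.no_grad():
        for x, y in dataset:
            loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
            losses.append(loss.item())
    return sorted(range(len(losses)), key=lambda i: -losses[i])[:top_k]

model = torch.nn.Linear(8, 3)          # stand-in for a pre-trained LM head
loss_fn = torch.nn.CrossEntropyLoss()
data = [(torch.randn(8), torch.randint(3, ())) for _ in range(50)]
print(rank_suspect_labels(model, loss_fn, data))
```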
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled target examples whose confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
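The ATC recipe above is concrete enough to sketch end to end: calibrate a threshold on labeled source data so that the fraction of source examples above it matches source accuracy, then apply it to unlabeled target scores. "Confidence" here is whatever scalar score the model exposes (e.g. max softmax probability); the synthetic source/target scores are placeholders.

```python
# Average Thresholded Confidence (ATC) sketch with synthetic scores.
import numpy as np

rng = np.random.default_rng(0)

def learn_threshold(conf_src, correct_src):
    acc = correct_src.mean()
    # Choose t so that P(confidence > t) on source equals source accuracy.
    return np.quantile(conf_src, 1.0 - acc)

def predict_target_accuracy(conf_tgt, t):
    return (conf_tgt > t).mean()

conf_src = rng.uniform(0.5, 1.0, size=1000)              # source confidences
correct_src = (rng.uniform(size=1000) < conf_src).astype(float)
conf_tgt = rng.uniform(0.4, 1.0, size=1000)              # shifted target scores

t = learn_threshold(conf_src, correct_src)
print("estimated target accuracy:", predict_target_accuracy(conf_tgt, t))
```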
- NoiER: An Approach for Training more Reliable Fine-Tuned Downstream Task Models [54.184609286094044]
We propose noise entropy regularisation (NoiER) as an efficient learning paradigm that mitigates over-confidence on out-of-distribution inputs without auxiliary models or additional data.
The proposed approach improved traditional OOD detection evaluation metrics by 55% on average compared to the original fine-tuned models.
arXiv Detail & Related papers (2021-08-29T06:58:28Z)
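A sketch of entropy regularisation on synthetic "noise" inputs, in the spirit of NoiER: keep the task loss on real data while pushing the model toward maximum-entropy (uninformative) predictions on noise, so out-of-distribution inputs yield low-confidence outputs. The Gaussian noise source and the loss weighting are illustrative choices, not the paper's exact recipe.

```python
# Task loss on real data plus an entropy bonus on synthetic noise inputs.
import torch

model = torch.nn.Linear(16, 4)
ce = torch.nn.CrossEntropyLoss()

def noise_entropy_loss(x_real, y_real, lam=0.5):
    task = ce(model(x_real), y_real)
    x_noise = torch.randn_like(x_real)            # synthetic OOD stand-in
    logp = torch.log_softmax(model(x_noise), dim=-1)
    entropy = -(logp.exp() * logp).sum(-1).mean()
    return task - lam * entropy                   # maximize entropy on noise

x = torch.randn(8, 16)
y = torch.randint(4, (8,))
print(noise_entropy_loss(x, y))
```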
- Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the importance-guided stochastic gradient descent (IGSGD) method to train on and infer from inputs containing missing values, without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z)
- Annotating and Modeling Fine-grained Factuality in Summarization [36.88018450067003]
A major barrier to the practical use of abstractive summarization models is their propensity to output summaries that are not faithful to the input and that contain factual errors.
We explore both synthetic and human-labeled data sources for training models to identify factual errors in summarization.
We show that our best factuality detection model enables training of more factual XSum summarization models by allowing us to identify non-factual tokens in the training data.
arXiv Detail & Related papers (2021-04-09T11:20:44Z)
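One natural way to act on token-level factuality labels, sketched below, is to mask flagged tokens out of the training loss so the summarizer is not trained to reproduce them. The masking scheme illustrates how a detector's output could be used; it is not necessarily the authors' exact training setup, and the detector output here is a placeholder tensor.

```python
# Cross-entropy over summary tokens, skipping those a factuality
# detector has flagged as non-factual.
import torch
import torch.nn.functional as F

def masked_nll(logits, targets, nonfactual_mask):
    nll = F.cross_entropy(logits, targets, reduction="none")
    keep = (~nonfactual_mask).float()
    return (nll * keep).sum() / keep.sum().clamp(min=1)

logits = torch.randn(6, 100)                     # (summary tokens, vocab)
targets = torch.randint(100, (6,))
flags = torch.tensor([0, 0, 1, 0, 1, 0], dtype=torch.bool)  # detector output
print(masked_nll(logits, targets, flags))
```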
- A Self-Refinement Strategy for Noise Reduction in Grammatical Error Correction [54.569707226277735]
Existing approaches for grammatical error correction (GEC) rely on supervised learning with manually created GEC datasets.
However, these datasets contain a non-negligible amount of "noise": errors that were inappropriately edited or left uncorrected.
We propose a self-refinement method where the key idea is to denoise these datasets by leveraging the prediction consistency of existing models.
arXiv Detail & Related papers (2020-10-07T04:45:09Z)
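A sketch of consistency-based denoising in the spirit of the self-refinement idea: if an existing model, run on the source sentence, consistently reproduces the annotated correction, the pair is kept; stable disagreements are relabeled with the model's output; unstable cases are dropped. The exact-match agreement test and the toy "model" are crude stand-ins for the paper's consistency criterion.

```python
# Denoise (source, target) GEC pairs via prediction consistency.
def denoise_gec(pairs, correct_fn, n_samples=3):
    cleaned = []
    for src, tgt in pairs:
        preds = [correct_fn(src) for _ in range(n_samples)]
        if all(p == tgt for p in preds):
            cleaned.append((src, tgt))          # model agrees: keep as-is
        elif all(p == preds[0] for p in preds):
            cleaned.append((src, preds[0]))     # stable prediction: relabel
        # otherwise: inconsistent predictions, drop as irrecoverable noise
    return cleaned

# Toy usage with a trivial "model" that lowercases and fixes one typo.
fix = lambda s: s.lower().replace("teh", "the")
data = [("Teh cat sat", "the cat sat"), ("Teh dog ran", "teh dog ran!")]
print(denoise_gec(data, fix))
```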