Related papers: Compact Example-Based Explanations for Language Models

URL: http://arxiv.org/abs/2601.03786v1
Date: Wed, 07 Jan 2026 10:36:46 GMT
Title: Compact Example-Based Explanations for Language Models
Authors: Loris Schoenegger, Benjamin Roth
Abstract summary: Training data influence estimation methods quantify the contribution of training documents to a model's output. As humans cannot interpret thousands of documents, only a small subset of the training data can be presented as an explanation. We propose a novel selection relevance score, a retraining-free metric that quantifies how useful a set of examples is for explaining a model's output.
Score: 1.8772057593980798
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Training data influence estimation methods quantify the contribution of training documents to a model's output, making them a promising source of information for example-based explanations. As humans cannot interpret thousands of documents, only a small subset of the training data can be presented as an explanation. Although the choice of which documents to include directly affects explanation quality, previous evaluations of such systems have largely ignored any selection strategies. To address this, we propose a novel selection relevance score, a retraining-free metric that quantifies how useful a set of examples is for explaining a model's output. We validate this score through fine-tuning experiments, confirming that it can predict whether a set of examples supports or undermines the model's predictions. Using this metric, we further show that common selection strategies often underperform random selection. Motivated by this finding, we propose a strategy that balances influence and representativeness, enabling better use of selection budgets than naively selecting the highest-ranking examples.
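The abstract does not spell out the proposed selection strategy, so the following is only a minimal sketch of the general idea of trading off influence against redundancy when filling a small selection budget. It uses a generic greedy, maximal-marginal-relevance-style heuristic as a stand-in; it is not the paper's method. The function names, the `trade_off` parameter, and the use of cosine similarity over document embeddings are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_examples(influence, embeddings, budget, trade_off=0.5):
    """Greedily pick up to `budget` example indices (hypothetical helper).

    Each step picks the candidate maximizing
        trade_off * influence - (1 - trade_off) * redundancy,
    where redundancy is the highest cosine similarity to any example
    already selected. With trade_off=1.0 this reduces to plain top-k
    selection by influence score.
    """
    remaining = set(range(len(influence)))
    selected = []
    while remaining and len(selected) < budget:

        def gain(i):
            redundancy = max(
                (cosine(embeddings[i], embeddings[j]) for j in selected),
                default=0.0,
            )
            return trade_off * influence[i] - (1 - trade_off) * redundancy

        # Iterate in sorted order so ties resolve to the lowest index.
        best = max(sorted(remaining), key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: examples 0 and 1 are near-duplicates with high influence,
# example 2 is less influential but covers a different direction.
influence = [0.9, 0.85, 0.3]
embeddings = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
print(select_examples(influence, embeddings, budget=2, trade_off=1.0))
print(select_examples(influence, embeddings, budget=2, trade_off=0.5))
```

With `trade_off=1.0` the toy data yields the two near-duplicate top-ranked examples, while `trade_off=0.5` spends the second slot on the dissimilar example instead. This illustrates why naively taking the highest-ranking examples can use a selection budget poorly.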

Related papers

Enhancing Sample Selection Against Label Noise by Cutting Mislabeled Easy Examples [74.60723854735237]
We show that mislabeled examples correctly predicted by the model early in the training process are particularly harmful to model performance. We propose Early Cutting, which employs the model's later training state to re-select the confident subset identified early in training.
arXiv Detail & Related papers (2025-02-12T09:12:45Z)
Scalable Influence and Fact Tracing for Large Language Model Pretraining [14.598556308631018]
Training data attribution (TDA) methods aim to attribute model outputs back to specific training examples. We refine existing gradient-based methods to work effectively at scale. We release our prompt set and model outputs, along with a web-based visualization tool to explore influential examples.
arXiv Detail & Related papers (2024-10-22T20:39:21Z)
Label-Efficient Model Selection for Text Generation [14.61636207880449]
We introduce DiffUse, a method to make an informed decision between candidate text generation models based on preference annotations. In a series of experiments over hundreds of model pairs, we demonstrate that DiffUse can dramatically reduce the required number of annotations.
arXiv Detail & Related papers (2024-02-12T18:54:02Z)
Deep Neural Network Benchmarks for Selective Classification [27.098996474946446]
Multiple selective classification frameworks exist, most of which rely on deep neural network architectures. We evaluate these approaches using several criteria, including selective error rate, empirical coverage, distribution of rejected instance's classes, and performance on out-of-distribution instances.
arXiv Detail & Related papers (2024-01-23T12:15:47Z)
Revisiting Demonstration Selection Strategies in In-Context Learning [66.11652803887284]
Large language models (LLMs) have shown an impressive ability to perform a wide range of tasks using in-context learning (ICL). In this work, we first revisit the factors contributing to this variance from both data and model aspects, and find that the choice of demonstration is both data- and model-dependent. We propose a data- and model-dependent demonstration selection method, TopK + ConE, based on the assumption that the performance of a demonstration positively correlates with its contribution to the model's understanding of the test samples.
arXiv Detail & Related papers (2024-01-22T16:25:27Z)
One-Shot Learning as Instruction Data Prospector for Large Language Models [108.81681547472138]
Nuggets uses one-shot learning to select high-quality instruction data from extensive datasets. We show that instruction tuning with the top 1% of examples curated by Nuggets substantially outperforms conventional methods employing the entire dataset.
arXiv Detail & Related papers (2023-12-16T03:33:12Z)
IDEAL: Influence-Driven Selective Annotations Empower In-Context Learners in Large Language Models [63.15355173909631]
This paper introduces an influence-driven selective annotation method. It aims to minimize annotation costs while improving the quality of in-context examples. Experiments confirm the superiority of the proposed method on various benchmarks.
arXiv Detail & Related papers (2023-10-16T22:53:54Z)
Towards Informative Few-Shot Prompt with Maximum Information Gain for In-Context Learning [30.536184852029386]
Large Language models (LLMs) possess the capability to engage in In-context Learning (ICL) by leveraging a few demonstrations pertaining to a new downstream task as conditions. However, this particular learning paradigm suffers from high instability stemming from substantial variances induced by factors such as the input distribution of selected examples, their ordering, and prompt formats.
arXiv Detail & Related papers (2023-10-13T07:49:11Z)
Finding Support Examples for In-Context Learning [73.90376920653507]
We propose LENS, a fiLter-thEN-Search method to tackle this challenge in two stages. First we filter the dataset to obtain informative in-context examples individually. Then we propose diversity-guided example search which iteratively refines and evaluates the selected example permutations.
arXiv Detail & Related papers (2023-02-27T06:32:45Z)
An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system. Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches. This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
Online Active Model Selection for Pre-trained Classifiers [72.84853880948894]
We design an online selective sampling approach that actively selects informative examples to label and outputs the best model with high probability at any round. Our algorithm can be used for online prediction tasks for both adversarial and stochastic streams.
arXiv Detail & Related papers (2020-10-19T19:53:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.