What's the Difference? Supporting Users in Identifying the Effects of Prompt and Model Changes Through Token Patterns
- URL: http://arxiv.org/abs/2504.15815v1
- Date: Tue, 22 Apr 2025 11:53:33 GMT
- Title: What's the Difference? Supporting Users in Identifying the Effects of Prompt and Model Changes Through Token Patterns
- Authors: Michael A. Hedderich, Anyi Wang, Raoyuan Zhao, Florian Eichin, Barbara Plank,
- Abstract summary: Spotlight is a new approach that combines both automation and human analysis.<n>Based on data mining techniques, we automatically distinguish between random (decoding) variations and systematic differences in language model outputs.<n>We show that our token pattern approach helps users understand the systematic differences of language model outputs.
- Score: 23.505782809734512
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prompt engineering for large language models is challenging, as even small prompt perturbations or model changes can significantly impact the generated output texts. Existing evaluation methods, either automated metrics or human evaluation, have limitations, such as providing limited insights or being labor-intensive. We propose Spotlight, a new approach that combines both automation and human analysis. Based on data mining techniques, we automatically distinguish between random (decoding) variations and systematic differences in language model outputs. This process provides token patterns that describe the systematic differences and guide the user in manually analyzing the effects of their prompt and model changes efficiently. We create three benchmarks to quantitatively test the reliability of token pattern extraction methods and demonstrate that our approach provides new insights into established prompt data. From a human-centric perspective, through demonstration studies and a user study, we show that our token pattern approach helps users understand the systematic differences of language model outputs, and we are able to discover relevant differences caused by prompt and model changes (e.g. related to gender or culture), thus supporting the prompt engineering process and human-centric model behavior research.
Related papers
- The Geometry of Prompting: Unveiling Distinct Mechanisms of Task Adaptation in Language Models [40.128112851978116]
We study how different prompting methods affect the geometry of representations in language models.<n>Our analysis highlights the critical role of input distribution samples and label semantics in few-shot in-context learning.<n>Our work contributes to the theoretical understanding of large language models and lays the groundwork for developing more effective, representation-aware prompting strategies.
arXiv Detail & Related papers (2025-02-11T23:09:50Z) - LLMTemporalComparator: A Tool for Analysing Differences in Temporal Adaptations of Large Language Models [17.021220773165016]
This study addresses the challenges of analyzing temporal discrepancies in large language models (LLMs) trained on data from different time periods.
We propose a novel system that compares in a systematic way the outputs of two LLM versions based on user-defined queries.
arXiv Detail & Related papers (2024-10-05T15:17:07Z) - Revisiting Self-supervised Learning of Speech Representation from a
Mutual Information Perspective [68.20531518525273]
We take a closer look into existing self-supervised methods of speech from an information-theoretic perspective.
We use linear probes to estimate the mutual information between the target information and learned representations.
We explore the potential of evaluating representations in a self-supervised fashion, where we estimate the mutual information between different parts of the data without using any labels.
arXiv Detail & Related papers (2024-01-16T21:13:22Z) - Explaining Relation Classification Models with Semantic Extents [1.7604348079019634]
A lack of explainability is currently a complicating factor in many real-world applications.
We introduce semantic extents, a concept to analyze decision patterns for the relation classification task.
We provide an annotation tool and a software framework to determine semantic extents for humans and models.
arXiv Detail & Related papers (2023-08-04T08:17:52Z) - Interpretable Differencing of Machine Learning Models [20.99877540751412]
We formalize the problem of model differencing as one of predicting a dissimilarity function of two ML models' outputs.
A Joint Surrogate Tree (JST) is composed of two conjoined decision tree surrogates for the two models.
A JST provides an intuitive representation of differences and places the changes in the context of the models' decision logic.
arXiv Detail & Related papers (2023-06-10T16:15:55Z) - Task Formulation Matters When Learning Continually: A Case Study in
Visual Question Answering [58.82325933356066]
Continual learning aims to train a model incrementally on a sequence of tasks without forgetting previous knowledge.
We present a detailed study of how different settings affect performance for Visual Question Answering.
arXiv Detail & Related papers (2022-09-30T19:12:58Z) - AES Systems Are Both Overstable And Oversensitive: Explaining Why And
Proposing Defenses [66.49753193098356]
We investigate the reason behind the surprising adversarial brittleness of scoring models.
Our results indicate that autoscoring models, despite getting trained as "end-to-end" models, behave like bag-of-words models.
We propose detection-based protection models that can detect oversensitivity and overstability causing samples with high accuracies.
arXiv Detail & Related papers (2021-09-24T03:49:38Z) - Empowering Language Understanding with Counterfactual Reasoning [141.48592718583245]
We propose a Counterfactual Reasoning Model, which mimics the counterfactual thinking by learning from few counterfactual samples.
In particular, we devise a generation module to generate representative counterfactual samples for each factual sample, and a retrospective module to retrospect the model prediction by comparing the counterfactual and factual samples.
arXiv Detail & Related papers (2021-06-06T06:36:52Z) - Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task.
The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them.
By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
arXiv Detail & Related papers (2020-11-13T10:53:27Z) - Explainable Recommender Systems via Resolving Learning Representations [57.24565012731325]
Explanations could help improve user experience and discover system defects.
We propose a novel explainable recommendation model through improving the transparency of the representation learning process.
arXiv Detail & Related papers (2020-08-21T05:30:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.