Context Dependence and Reliability in Autoregressive Language Models
- URL: http://arxiv.org/abs/2602.01378v1
- Date: Sun, 01 Feb 2026 18:25:44 GMT
- Title: Context Dependence and Reliability in Autoregressive Language Models
- Authors: Poushali Sengupta, Shashi Raj Pandey, Sabita Maharjan, Frank Eliassen,
- Abstract summary: In critical applications, it is vital to identify which context elements actually influence the output.<n>This work addresses the challenge of distinguishing essential context elements from correlated ones.<n>We introduce RISE, a method that quantifies the unique influence of each input relative to others, minimizing the impact of redundancies.
- Score: 4.9988239650406765
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) generate outputs by utilizing extensive context, which often includes redundant information from prompts, retrieved passages, and interaction history. In critical applications, it is vital to identify which context elements actually influence the output, as standard explanation methods struggle with redundancy and overlapping context. Minor changes in input can lead to unpredictable shifts in attribution scores, undermining interpretability and raising concerns about risks like prompt injection. This work addresses the challenge of distinguishing essential context elements from correlated ones. We introduce RISE (Redundancy-Insensitive Scoring of Explanation), a method that quantifies the unique influence of each input relative to others, minimizing the impact of redundancies and providing clearer, stable attributions. Experiments demonstrate that RISE offers more robust explanations than traditional methods, emphasizing the importance of conditional information for trustworthy LLM explanations and monitoring.
Related papers
- Learning to Extract Context for Context-Aware LLM Inference [60.376872353918394]
User prompts to large language models (LLMs) are often ambiguous or under-specified.<n> contextual cues shaped by user intentions, prior knowledge, and risk factors influence what constitutes an appropriate response.<n>We propose a framework that extracts and leverages such contextual information from the user prompt itself.
arXiv Detail & Related papers (2025-12-12T19:10:08Z) - Explaining multimodal LLMs via intra-modal token interactions [55.27436637894534]
Multimodal Large Language Models (MLLMs) have achieved remarkable success across diverse vision-language tasks, yet their internal decision-making mechanisms remain insufficiently understood.<n>We propose enhancing interpretability by leveraging intra-modal interaction.
arXiv Detail & Related papers (2025-09-26T14:39:13Z) - Context Engineering for Trustworthiness: Rescorla Wagner Steering Under Mixed and Inappropriate Contexts [55.70338710797578]
We introduce the Poisoned Context Testbed, pairing queries with real-world contexts containing relevant and inappropriate content.<n>Inspired by associative learning in animals, we adapt the Rescorla-Wagner (RW) model from neuroscience to quantify how competing contextual signals influence LLM outputs.<n>We introduce RW-Steering, a two-stage finetuning-based approach that enables the model to internally identify and ignore inappropriate signals.
arXiv Detail & Related papers (2025-09-02T00:40:34Z) - Reflect then Learn: Active Prompting for Information Extraction Guided by Introspective Confusion [41.79586757544166]
Large Language Models (LLMs) show remarkable potential for few-shot information extraction (IE)<n> Conventional selection strategies often fail to provide informative guidance, as they overlook a key source of model fallibility.<n>We introduce Active Prompting for Information Extraction (APIE), a novel active prompting framework guided by a principle we term introspective confusion.
arXiv Detail & Related papers (2025-08-10T02:27:41Z) - Hierarchical Interaction Summarization and Contrastive Prompting for Explainable Recommendations [9.082885521130617]
We propose a novel approach combining profile generation via hierarchical interaction summarization (PGHIS) with contrastive prompting for explanation generation (CPEG)<n>Our approach outperforms existing state-of-the-art methods, achieving a great improvement on metrics about explainability (e.g., 5% on GPTScore) and text quality.
arXiv Detail & Related papers (2025-07-08T14:45:47Z) - On the Loss of Context-awareness in General Instruction Fine-tuning [101.03941308894191]
We investigate the loss of context awareness after supervised fine-tuning.<n>We find that the performance decline is associated with a bias toward different roles learned during conversational instruction fine-tuning.<n>We propose a metric to identify context-dependent examples from general instruction fine-tuning datasets.
arXiv Detail & Related papers (2024-11-05T00:16:01Z) - Con-ReCall: Detecting Pre-training Data in LLMs via Contrastive Decoding [118.75567341513897]
Existing methods typically analyze target text in isolation or solely with non-member contexts.<n>We propose Con-ReCall, a novel approach that leverages the asymmetric distributional shifts induced by member and non-member contexts.
arXiv Detail & Related papers (2024-09-05T09:10:38Z) - An Effective Deployment of Diffusion LM for Data Augmentation in Low-Resource Sentiment Classification [2.0930389307057427]
Sentiment classification (SC) often suffers from low-resource challenges such as domain-specific contexts, imbalanced label distributions, and few-shot scenarios.
We propose Diffusion LM to capture in-domain knowledge and generate pseudo samples by reconstructing strong label-related tokens.
arXiv Detail & Related papers (2024-09-05T02:51:28Z) - TRACE: TRansformer-based Attribution using Contrastive Embeddings in LLMs [50.259001311894295]
We propose a novel TRansformer-based Attribution framework using Contrastive Embeddings called TRACE.
We show that TRACE significantly improves the ability to attribute sources accurately, making it a valuable tool for enhancing the reliability and trustworthiness of large language models.
arXiv Detail & Related papers (2024-07-06T07:19:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.