Focus-LIME: Surgical Interpretation of Long-Context Large Language Models via Proxy-Based Neighborhood Selection
- URL: http://arxiv.org/abs/2602.04607v1
- Date: Wed, 04 Feb 2026 14:34:30 GMT
- Title: Focus-LIME: Surgical Interpretation of Long-Context Large Language Models via Proxy-Based Neighborhood Selection
- Authors: Junhao Liu, Haonan Yu, Zhenyu Yan, Xin Zhang
- Abstract summary: Focus-LIME is a coarse-to-fine framework designed to restore the tractability of surgical interpretation. Our method makes surgical explanations practicable and provides faithful explanations to users.
- Score: 9.796641194900749
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As Large Language Models (LLMs) scale to handle massive context windows, achieving surgical feature-level interpretation is essential for high-stakes tasks like legal auditing and code debugging. However, existing local model-agnostic explanation methods face a critical limitation in these scenarios: feature-based methods suffer from attribution dilution due to high feature dimensionality and thus fail to provide faithful explanations. In this paper, we propose Focus-LIME, a coarse-to-fine framework designed to restore the tractability of surgical interpretation. Focus-LIME uses a proxy model to curate the perturbation neighborhood, allowing the target model to perform fine-grained attribution exclusively within the optimized context. Empirical evaluations on long-context benchmarks demonstrate that our method makes surgical explanations practicable and provides faithful explanations to users.
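The abstract's coarse-to-fine idea can be sketched in a toy form. The code below is a minimal illustration, not the paper's actual algorithm: all function names (`focus_lime`, `perturb`) and the curation heuristic (rank perturbations by how much a cheap proxy's score deviates from the full-context score) are assumptions, and the fine-grained stage uses a simple mean-difference surrogate in place of LIME's weighted ridge regression for brevity.

```python
import random

def perturb(n_features, n_samples, rng):
    """Random binary masks: 1 keeps a feature (e.g. a token span), 0 drops it."""
    return [[rng.randint(0, 1) for _ in range(n_features)]
            for _ in range(n_samples)]

def focus_lime(target_f, proxy_f, n_features, n_samples=200, keep=150, seed=0):
    """Toy coarse-to-fine attribution (hypothetical sketch, not the paper's method).

    Stage 1 (coarse): a cheap proxy model scores every perturbed context and
    we keep only the `keep` perturbations whose proxy output deviates most
    from the full-context baseline -- an assumed curation heuristic.
    Stage 2 (fine): the expensive target model is queried only on that
    curated neighborhood, and a per-feature mean-difference surrogate
    yields the attribution weights.
    """
    rng = random.Random(seed)
    masks = perturb(n_features, n_samples, rng)

    # Coarse stage: curate the neighborhood with the proxy model.
    base = proxy_f([1] * n_features)
    curated = sorted(masks, key=lambda m: abs(proxy_f(m) - base),
                     reverse=True)[:keep]

    # Fine stage: query the expensive target model only on the curated subset.
    ys = [target_f(m) for m in curated]

    # Mean-difference surrogate: attribution of feature j is the difference
    # in average target output between masks that keep j and masks that drop j.
    weights = []
    for j in range(n_features):
        on = [y for m, y in zip(curated, ys) if m[j] == 1]
        off = [y for m, y in zip(curated, ys) if m[j] == 0]
        w = (sum(on) / len(on) if on else 0.0) - \
            (sum(off) / len(off) if off else 0.0)
        weights.append(w)
    return weights
```

With a toy target whose output depends only on feature 2, the returned weight vector concentrates its mass on index 2, illustrating how attribution stays sharp when the target model is only evaluated inside the curated neighborhood.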
Related papers
- Who Judges the Judge? Evaluating LLM-as-a-Judge for French Medical open-ended QA [5.328379818938021]
We evaluate whether large language models (LLMs) can act as judges of semantic equivalence in French medical OEQA. Our results show that LLM-based judgments are strongly influenced by the model that generated the answer.
arXiv Detail & Related papers (2026-03-04T13:12:30Z) - Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models [122.58252919699122]
Mechanistic Interpretability (MI) has emerged as a vital approach to demystify the decision-making of Large Language Models (LLMs). We present a practical survey structured around the pipeline: "Awesomeinterventionable-MI-Survey".
arXiv Detail & Related papers (2026-01-20T14:23:23Z) - Connecting the Dots: Training-Free Visual Grounding via Agentic Reasoning [63.109585527799005]
GroundingAgent is a visual grounding framework that operates without task-specific fine-tuning. It achieves an average zero-shot grounding accuracy of 65.1% on widely-used benchmarks. It also offers strong interpretability, transparently illustrating each reasoning step.
arXiv Detail & Related papers (2025-11-24T03:11:08Z) - ImCoref-CeS: An Improved Lightweight Pipeline for Coreference Resolution with LLM-based Checker-Splitter Refinement [45.01372641622595]
We present ImCoref-CeS, a novel framework that integrates an enhanced supervised model with LLM-based reasoning. First, we present an improved CR method (ImCoref) to push the performance boundaries of the supervised neural method. We then employ an LLM acting as a multi-role Checker-Splitter agent to validate candidate mentions and coreference results.
arXiv Detail & Related papers (2025-10-11T14:48:08Z) - Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance.
Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z) - Unleashing the Potential of the Diffusion Model in Few-shot Semantic Segmentation [56.87049651707208]
Few-shot Semantic Segmentation has evolved into In-context Segmentation, becoming a crucial element in assessing generalist segmentation models.
Our initial focus lies in understanding how to facilitate interaction between the query image and the support image, resulting in the proposal of a KV fusion method within the self-attention framework.
Based on our analysis, we establish a simple and effective framework named DiffewS, maximally retaining the original Latent Diffusion Model's generative framework.
arXiv Detail & Related papers (2024-10-03T10:33:49Z) - Optimizing Latent Graph Representations of Surgical Scenes for Zero-Shot Domain Transfer [6.880129372917993]
We evaluate four object-centric approaches for domain generalization, establishing baseline performance.
We develop an optimized method specifically tailored for domain generalization, LG-DG, that includes a novel disentanglement loss function.
Our optimized approach, LG-DG, achieves an improvement of 9.28% over the best baseline approach.
arXiv Detail & Related papers (2024-03-11T17:36:11Z) - Spurious Feature Eraser: Stabilizing Test-Time Adaptation for Vision-Language Foundation Model [86.9619638550683]
Vision-language foundation models have exhibited remarkable success across a multitude of downstream tasks due to their scalability on extensive image-text paired data. However, these models display significant limitations when applied to downstream tasks, such as fine-grained image classification, as a result of "decision shortcuts".
arXiv Detail & Related papers (2024-03-01T09:01:53Z) - On the Tip of the Tongue: Analyzing Conceptual Representation in Large Language Models with Reverse-Dictionary Probe [36.65834065044746]
We use in-context learning to guide the models to generate the term for an object concept implied in a linguistic description.
Experiments suggest that the conceptual inference ability probed by the reverse-dictionary task predicts a model's general reasoning performance.
arXiv Detail & Related papers (2024-02-22T09:45:26Z) - Coherent Entity Disambiguation via Modeling Topic and Categorical Dependency [87.16283281290053]
Previous entity disambiguation (ED) methods adopt a discriminative paradigm, where prediction is made based on matching scores between mention context and candidate entities.
We propose CoherentED, an ED system equipped with novel designs aimed at enhancing the coherence of entity predictions.
We achieve new state-of-the-art results on popular ED benchmarks, with an average improvement of 1.3 F1 points.
arXiv Detail & Related papers (2023-11-06T16:40:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.