Towards Budget-Friendly Model-Agnostic Explanation Generation for Large Language Models
- URL: http://arxiv.org/abs/2505.12509v1
- Date: Sun, 18 May 2025 18:05:37 GMT
- Title: Towards Budget-Friendly Model-Agnostic Explanation Generation for Large Language Models
- Authors: Junhao Liu, Haonan Yu, Xin Zhang
- Abstract summary: We show that it is practical to generate faithful explanations for large-scale LLMs by sampling from budget-friendly models. Our analysis provides a new paradigm for model-agnostic explanation methods for LLMs by incorporating information from budget-friendly models.
- Score: 14.110188927768736
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With large language models (LLMs) becoming increasingly prevalent in various applications, the need to interpret their predictions has become a critical challenge. Because LLMs vary in architecture and some are closed-source, model-agnostic techniques show great promise, as they do not require access to a model's internal parameters. However, existing model-agnostic techniques must invoke an LLM many times to gather enough samples to generate faithful explanations, which leads to high economic costs. In this paper, we show through a series of empirical studies that it is practical to generate faithful explanations for large-scale LLMs by sampling from budget-friendly models. Moreover, we show that such proxy explanations also perform well on downstream tasks. Our analysis provides a new paradigm for model-agnostic explanation methods for LLMs by incorporating information from budget-friendly models.
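To make the core idea concrete, the sketch below shows one way a budget-friendly proxy could slot into a standard perturbation-based explainer. It is a minimal LIME-style illustration, not the authors' exact pipeline: the toy `query_proxy_model` scorer, the masking scheme, and the ridge surrogate are all illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge

def query_proxy_model(text: str) -> float:
    # Placeholder for a cheap model's API call (the "budget-friendly"
    # model); here a toy sentiment proxy that counts a positive keyword.
    return float(text.lower().count("good"))

def proxy_lime_explanation(tokens, num_samples=500, seed=0):
    """Fit a linear surrogate over token-presence masks.

    Perturbed inputs are scored with the budget-friendly proxy instead
    of the expensive target LLM; the surrogate's coefficients then act
    as proxy attributions for the target model's prediction.
    """
    rng = np.random.default_rng(seed)
    n = len(tokens)
    masks = rng.integers(0, 2, size=(num_samples, n))  # 1 = keep token
    masks[0] = 1  # always include the unperturbed input

    scores = np.array([
        query_proxy_model(" ".join(t for t, keep in zip(tokens, m) if keep))
        for m in masks
    ])

    # Weight samples by how close each mask is to the original input.
    weights = masks.sum(axis=1) / n
    surrogate = Ridge(alpha=1.0).fit(masks, scores, sample_weight=weights)

    # Each coefficient approximates that token's contribution.
    return dict(zip(tokens, surrogate.coef_))

print(proxy_lime_explanation("the movie was surprisingly good".split()))
```

Swapping `query_proxy_model` for a call to the large target LLM would recover ordinary LIME; the paper's finding is that sampling from a cheap model can yield comparably faithful explanations at a fraction of the cost.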
Related papers
- Learning to Explain: Prototype-Based Surrogate Models for LLM Classification [1.7373859011890633]
Large language models (LLMs) have demonstrated impressive performance on natural language tasks, but their decision-making processes remain largely opaque. We propose ProtoSurE, a prototype-based surrogate framework that provides faithful and human-understandable explanations.
arXiv Detail & Related papers (2025-05-25T04:25:28Z)
- Can LLMs Explain Themselves Counterfactually? [16.569180690291773]
Explanations are an important tool for gaining insight into the behavior of ML models. We study a specific type of self-explanation: self-generated counterfactual explanations (SCEs).
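For intuition, a self-generated counterfactual explanation can be elicited with a two-step prompt: ask the model to minimally edit the input so its own prediction flips, then re-classify the edit to verify the flip. The sketch below is an illustration of the SCE idea, not the paper's protocol; the `chat()` helper is hypothetical and stubbed with canned replies so the example runs.

```python
def chat(prompt: str) -> str:
    # Hypothetical wrapper around any chat-completion API; stubbed with
    # canned replies purely so this sketch runs end to end.
    if "minimally edit" in prompt.lower():
        return "The plot was dull and the acting was lifeless."
    return "negative"

def self_generated_counterfactual(text: str, label: str) -> dict:
    # Step 1: ask the model itself for a minimal label-flipping edit.
    cf = chat(
        f"You predicted '{label}' for the review below. Minimally edit it "
        f"so you would predict the opposite label. Return only the edited "
        f"review.\n\nReview: {text}"
    )
    # Step 2: verify the flip by re-classifying the edited text.
    new_label = chat(f"Classify this review as positive or negative:\n{cf}")
    return {"counterfactual": cf, "new_label": new_label,
            "flipped": new_label != label}

print(self_generated_counterfactual(
    "The plot was gripping and the acting superb.", "positive"))
```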
arXiv Detail & Related papers (2025-02-25T12:40:41Z)
- Applying Large Language Models in Knowledge Graph-based Enterprise Modeling: Challenges and Opportunities [0.0]
The use of large language models (LLMs) in enterprise modeling has recently begun to shift from academic research to industrial applications. In this paper we employ a knowledge graph-based approach to enterprise modeling and investigate the potential benefits of LLMs.
arXiv Detail & Related papers (2025-01-07T06:34:17Z)
- Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities.
In-Context Learning (ICL) and Parameter-Efficient Fine-Tuning (PEFT) are currently the two mainstream methods for adapting LLMs to downstream tasks.
We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z)
- Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities [89.40778301238642]
Model merging is an efficient empowerment technique in the machine learning community, yet the literature lacks a systematic and thorough review of these methods.
arXiv Detail & Related papers (2024-08-14T16:58:48Z)
- Data Science with LLMs and Interpretable Models [19.4969442162327]
Large language models (LLMs) are remarkably good at working with interpretable models.
We show that LLMs can describe, interpret, and debug Generalized Additive Models (GAMs).
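As a rough illustration of how that interaction could work: a GAM's prediction is a sum of per-feature shape functions, so serializing those curves as text gives an LLM everything it needs to describe or critique the model. The shape functions and the prompt format below are invented for the example, not taken from the paper.

```python
import numpy as np

# A toy GAM: health risk is the sum of per-feature shape functions.
def shape_age(x):  return 0.04 * (x - 45)               # risk rises with age
def shape_bmi(x):  return 0.10 * np.maximum(x - 30, 0)  # kicks in past BMI 30

def describe_gam_prompt() -> str:
    rows = []
    for name, fn, grid in [("age", shape_age, range(20, 81, 10)),
                           ("bmi", shape_bmi, range(18, 41, 4))]:
        # Serialize each shape function as (value, contribution) pairs.
        pts = ", ".join(f"({x}, {fn(x):+.2f})" for x in grid)
        rows.append(f"{name}: {pts}")
    return ("Below are the shape functions of a GAM predicting health risk, "
            "as (feature value, contribution) pairs. Describe each feature's "
            "effect and flag anything surprising.\n" + "\n".join(rows))

print(describe_gam_prompt())  # send the result to any LLM chat endpoint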
arXiv Detail & Related papers (2024-02-22T12:04:15Z)
- Retrieval-based Knowledge Transfer: An Effective Approach for Extreme Large Language Model Compression [64.07696663255155]
Large-scale pre-trained language models (LLMs) have demonstrated exceptional performance in various natural language processing (NLP) tasks.
However, the massive size of these models poses huge challenges for their deployment in real-world applications.
We introduce a novel compression paradigm called Retrieval-based Knowledge Transfer (RetriKT), which effectively transfers the knowledge of LLMs to extremely small-scale models.
arXiv Detail & Related papers (2023-10-24T07:58:20Z)
- In-Context Explainers: Harnessing LLMs for Explaining Black Box Models [28.396104334980492]
Large Language Models (LLMs) have demonstrated exceptional capabilities in complex tasks like machine translation, commonsense reasoning, and language understanding.
One of the primary reasons for the adaptability of LLMs in such diverse tasks is their in-context learning (ICL) capability, which allows them to perform well on new tasks by simply using a few task samples in the prompt.
We propose a novel framework, In-Context Explainers, comprising three approaches that exploit the ICL capabilities of LLMs to explain the predictions made by other predictive models.
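The flavor of this ICL approach can be sketched in a few lines: place (input, black-box prediction) demonstrations in the prompt, then ask the LLM to explain a new prediction. The prompt template below is an assumption made for illustration, not the paper's actual template.

```python
def icl_explainer_prompt(examples, query_text, query_pred) -> str:
    # Format a few (input, prediction) demonstrations from the black-box
    # model, then ask the LLM to explain a held-out prediction.
    demos = "\n".join(f"Input: {x}\nModel prediction: {y}" for x, y in examples)
    return (
        "You will see inputs and the predictions of a black-box classifier.\n"
        f"{demos}\n"
        f"Input: {query_text}\nModel prediction: {query_pred}\n"
        "Explain which parts of this input most likely drove the prediction."
    )

prompt = icl_explainer_prompt(
    [("cheap flights, click now", "spam"), ("meeting moved to 3pm", "not spam")],
    "win a free cruise today", "spam",
)
# `prompt` can be sent to any chat LLM endpoint.
print(prompt)
```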
arXiv Detail & Related papers (2023-10-09T15:31:03Z)
- Faithful Explanations of Black-box NLP Models Using LLM-generated Counterfactuals [67.64770842323966]
Causal explanations of predictions of NLP systems are essential to ensure safety and establish trust.
Existing methods often fall short of explaining model predictions effectively or efficiently.
We propose two approaches for counterfactual (CF) approximation.
arXiv Detail & Related papers (2023-10-01T07:31:04Z)
- Explainability for Large Language Models: A Survey [59.67574757137078]
Large language models (LLMs) have demonstrated impressive capabilities in natural language processing.
This paper introduces a taxonomy of explainability techniques and provides a structured overview of methods for explaining Transformer-based language models.
arXiv Detail & Related papers (2023-09-02T22:14:26Z)
- Evaluating and Explaining Large Language Models for Code Using Syntactic Structures [74.93762031957883]
This paper introduces ASTxplainer, an explainability method specific to Large Language Models for code.
At its core, ASTxplainer provides an automated method for aligning token predictions with AST nodes.
We perform an empirical evaluation on 12 popular LLMs for code using a curated dataset of the most popular GitHub projects.
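A rough sketch of the token-to-AST alignment idea (not ASTxplainer's actual implementation): parse the code, convert each node's line/column range to character offsets, and assign each generated token's span to the smallest enclosing node. The token spans are assumed to come from the code LLM's tokenizer.

```python
import ast

def align_tokens_to_ast(code: str, token_spans):
    tree = ast.parse(code)
    # Cumulative character offset of each line's start.
    lines = code.splitlines(keepends=True)
    starts = [0]
    for ln in lines[:-1]:
        starts.append(starts[-1] + len(ln))

    def node_range(node):
        # Convert a node's (line, col) range to absolute character offsets.
        if not hasattr(node, "lineno"):
            return None
        s = starts[node.lineno - 1] + node.col_offset
        e = starts[node.end_lineno - 1] + node.end_col_offset
        return s, e

    out = []
    for tok_start, tok_end in token_spans:
        best = None  # (node type name, span length) of smallest enclosing node
        for node in ast.walk(tree):
            r = node_range(node)
            if r and r[0] <= tok_start and tok_end <= r[1]:
                if best is None or (r[1] - r[0]) < best[1]:
                    best = (type(node).__name__, r[1] - r[0])
        out.append(best[0] if best else None)
    return out

code = "x = f(1) + 2"
# Hypothetical token spans, e.g. from a code LLM's tokenizer offsets.
print(align_tokens_to_ast(code, [(0, 1), (4, 5), (11, 12)]))
# -> ['Name', 'Name', 'Constant']
```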
arXiv Detail & Related papers (2023-08-07T18:50:57Z)