Interpretation Quality Score for Measuring the Quality of Interpretability Methods
- URL: http://arxiv.org/abs/2205.12254v1
- Date: Tue, 24 May 2022 17:57:55 GMT
- Title: Interpretation Quality Score for Measuring the Quality of Interpretability Methods
- Authors: Yuansheng Xie, Soroush Vosoughi, Saeed Hassanpour
- Abstract summary: There currently exists no widely-accepted metric to evaluate the quality of explanations generated by interpretability methods.
We propose a novel metric for quantifying the quality of explanations generated by interpretability methods.
We compute the metric on three NLP tasks using six interpretability methods and present our results.
- Score: 12.659475399995717
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning (ML) models have been applied to a wide range of natural
language processing (NLP) tasks in recent years. In addition to making accurate
decisions, the necessity of understanding how models make their decisions has
become apparent in many applications. To that end, many interpretability
methods that help explain the decision processes of ML models have been
developed. Yet, there currently exists no widely-accepted metric to evaluate
the quality of explanations generated by these methods. As a result, there
currently is no standard way of measuring to what degree an interpretability
method achieves an intended objective. Moreover, there is no accepted standard
of performance by which we can compare and rank existing
interpretability methods. In this paper, we propose a novel metric for
quantifying the quality of explanations generated by interpretability methods.
We compute the metric on three NLP tasks using six interpretability methods and
present our results.
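The abstract leaves the construction of the proposed metric unspecified. For orientation only, the sketch below shows a common deletion-based faithfulness proxy for explanation quality (mask the tokens an explanation ranks as most important and measure the drop in model confidence); this is a generic technique, not the paper's Interpretation Quality Score, and `model` and `explain` are hypothetical callables.

```python
# Generic deletion-based faithfulness proxy -- NOT the paper's metric.
# Assumptions: `model(tokens)` returns the probability of the predicted
# class; `explain(tokens)` returns one importance score per token.
import numpy as np

def deletion_score(model, explain, tokens, k=5, mask_token="[MASK]"):
    """Confidence drop after masking the k most important tokens.

    A larger drop means the explanation pointed at tokens the model
    actually relied on, i.e. a more faithful explanation.
    """
    base_prob = model(tokens)
    importances = np.asarray(explain(tokens))
    top = {int(i) for i in np.argsort(importances)[-k:]}
    masked = [mask_token if i in top else t for i, t in enumerate(tokens)]
    return base_prob - model(masked)
```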
Related papers
- Token-Level Density-Based Uncertainty Quantification Methods for Eliciting Truthfulness of Large Language Models [76.17975723711886]
Uncertainty quantification (UQ) is a prominent approach for eliciting truthful answers from large language models (LLMs).
In this work, we adapt Mahalanobis Distance (MD) - a well-established UQ technique in classification tasks - for text generation.
Our method extracts token embeddings from multiple layers of LLMs, computes MD scores for each token, and uses linear regression trained on these features to provide robust uncertainty scores.
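As a concrete picture of that pipeline, here is a minimal sketch; the single shared Gaussian per layer, the ridge term, and the `LinearRegression` head are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of token-level Mahalanobis-distance (MD) uncertainty features.
import numpy as np
from sklearn.linear_model import LinearRegression

def fit_gaussian(train_embs):
    """Fit mean and inverse covariance to (n_samples, dim) training embeddings."""
    mu = train_embs.mean(axis=0)
    cov = np.cov(train_embs, rowvar=False) + 1e-6 * np.eye(train_embs.shape[1])
    return mu, np.linalg.inv(cov)

def mahalanobis(emb, mu, inv_cov):
    """Squared MD of one token embedding from the training distribution."""
    d = emb - mu
    return float(d @ inv_cov @ d)

def md_features(token_embs_per_layer, gaussians):
    """One MD score per layer per token -> (n_tokens, n_layers) features."""
    feats = [[mahalanobis(e, mu, inv_cov) for e in layer_embs]
             for layer_embs, (mu, inv_cov) in zip(token_embs_per_layer, gaussians)]
    return np.array(feats).T

def train_uncertainty_head(features, token_labels):
    """Linear regression mapping MD features to a scalar uncertainty score."""
    return LinearRegression().fit(features, token_labels)
```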
arXiv Detail & Related papers (2025-02-20T10:25:13Z)
- BEExAI: Benchmark to Evaluate Explainable AI [0.9176056742068812]
We propose BEExAI, a benchmark tool that allows large-scale comparison of different post-hoc XAI methods.
We argue that the need for a reliable way of measuring the quality and correctness of explanations is becoming critical.
arXiv Detail & Related papers (2024-07-29T11:21:17Z)
- Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
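A minimal sketch of the underlying idea, assuming a hypothetical `generate` sampler: draw several explanation-plus-answer generations and use the spread of the final answers as the uncertainty signal (the paper's stability-weighted aggregation over explanations is more elaborate than this).

```python
# Entropy of sampled final answers as a crude confidence signal.
# `generate(prompt, temperature)` is a hypothetical function returning an
# object with an `.answer` field parsed from an explanation+answer generation.
from collections import Counter
from math import log

def answer_entropy(generate, prompt, n_samples=10):
    answers = [generate(prompt, temperature=1.0).answer for _ in range(n_samples)]
    probs = [count / n_samples for count in Counter(answers).values()]
    return -sum(p * log(p) for p in probs)  # 0.0 means all samples agree
```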
arXiv Detail & Related papers (2024-06-05T16:35:30Z)
- Self-Evaluation Improves Selective Generation in Large Language Models [54.003992911447696]
We reformulate open-ended generation tasks into token-level prediction tasks.
We instruct an LLM to self-evaluate its answers.
We benchmark a range of scoring methods based on self-evaluation.
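One common form of such self-evaluation, sketched here under assumptions: feed the model's own answer back as a multiple-choice question and read out the probability of the "yes" option; `logprob_of_choice` is a hypothetical helper over an LLM API, not a specific library call.

```python
# Multiple-choice self-evaluation sketch: score = model's belief that its
# own proposed answer is correct.
SELF_EVAL_TEMPLATE = (
    "Question: {question}\n"
    "Proposed answer: {answer}\n"
    "Is the proposed answer correct? (A) yes (B) no\n"
    "Answer:"
)

def self_eval_score(logprob_of_choice, question, answer):
    prompt = SELF_EVAL_TEMPLATE.format(question=question, answer=answer)
    # Higher log-probability of " A" = stronger self-assessed correctness.
    return logprob_of_choice(prompt, choice=" A")
```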
arXiv Detail & Related papers (2023-12-14T19:09:22Z)
- BLESS: Benchmarking Large Language Models on Sentence Simplification [55.461555829492866]
We present BLESS, a performance benchmark of the most recent state-of-the-art large language models (LLMs) on the task of text simplification (TS).
We assess a total of 44 models, differing in size, architecture, pre-training methods, and accessibility, on three test sets from different domains (Wikipedia, news, and medical) under a few-shot setting.
Our evaluation indicates that the best LLMs, despite not being trained on TS, perform comparably with state-of-the-art TS baselines.
arXiv Detail & Related papers (2023-10-24T12:18:17Z)
- A global analysis of metrics used for measuring performance in natural language processing [9.433496814327086]
We provide the first large-scale cross-sectional analysis of metrics used for measuring performance in natural language processing.
Results suggest that the large majority of natural language processing metrics currently used have properties that may result in an inadequate reflection of a model's performance.
arXiv Detail & Related papers (2022-04-25T11:41:50Z)
- Evaluation of post-hoc interpretability methods in time-series classification [0.6249768559720122]
We propose a framework with quantitative metrics to assess the performance of existing post-hoc interpretability methods.
We show that several drawbacks identified in the literature are addressed, namely dependence on human judgement, retraining, and shift in the data distribution when occluding samples.
The proposed methodology and quantitative metrics can be used to understand the reliability of the results that interpretability methods produce in practical applications.
arXiv Detail & Related papers (2022-02-11T14:55:56Z)
- On the Faithfulness Measurements for Model Interpretations [100.2730234575114]
Post-hoc interpretations aim to uncover how natural language processing (NLP) models make predictions.
To evaluate how faithful these interpretations are, we start with three criteria: the removal-based criterion, the sensitivity of interpretations, and the stability of interpretations.
Motivated by the desideratum of these faithfulness notions, we introduce a new class of interpretation methods that adopt techniques from the adversarial domain.
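As one illustration of these criteria, the stability criterion can be probed roughly as below; `attribute` and `perturb` are hypothetical callables (a saliency method and a label-preserving paraphraser), not the paper's implementation.

```python
# Rough stability probe: do attributions stay similar under small,
# label-preserving input perturbations?
import numpy as np

def stability(attribute, perturb, text, n=5):
    base = np.asarray(attribute(text))
    sims = []
    for _ in range(n):
        other = np.asarray(attribute(perturb(text)))
        k = min(len(base), len(other))  # align lengths if tokenization shifts
        a, b = base[:k], other[:k]
        sims.append(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))
    return float(np.mean(sims))  # near 1.0 = stable interpretations
```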
arXiv Detail & Related papers (2021-04-18T09:19:44Z)
- Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task.
The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them.
By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
arXiv Detail & Related papers (2020-11-13T10:53:27Z)
- On quantitative aspects of model interpretability [0.0]
We argue that such methods can be decomposed into two conceptual parts, namely the extractor and the actual explainability method.
We experimentally validate our metrics on different benchmark tasks and show how they can be used to guide a practitioner in the selection of the most appropriate method for the task at hand.
arXiv Detail & Related papers (2020-07-15T10:05:05Z)