Saliency Map Verbalization: Comparing Feature Importance Representations
from Model-free and Instruction-based Methods
- URL: http://arxiv.org/abs/2210.07222v3
- Date: Wed, 7 Jun 2023 09:29:04 GMT
- Authors: Nils Feldhus, Leonhard Hennig, Maximilian Dustin Nasert, Christopher
Ebert, Robert Schwarzenberg, Sebastian Möller
- Abstract summary: Saliency maps can explain a neural model's predictions by identifying important input features.
We formalize the underexplored task of translating saliency maps into natural language.
We compare two novel methods (search-based and instruction-based verbalizations) against conventional feature importance representations.
- Score: 6.018950511093273
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Saliency maps can explain a neural model's predictions by identifying
important input features. However, they are difficult for laypeople to
interpret, especially for instances with many features. In order to make them more
accessible, we formalize the underexplored task of translating saliency maps
into natural language and compare methods that address two key challenges of
this approach -- what and how to verbalize. In both automatic and human
evaluation setups, using token-level attributions from text classification
tasks, we compare two novel methods (search-based and instruction-based
verbalizations) against conventional feature importance representations
(heatmap visualizations and extractive rationales), measuring simulatability,
faithfulness, helpfulness and ease of understanding. Instructing GPT-3.5 to
generate saliency map verbalizations yields plausible explanations which
include associations, abstractive summarization and commonsense reasoning,
achieving by far the highest human ratings, but they do not faithfully capture
numeric information and are inconsistent in their interpretation of the task.
In comparison, our search-based, model-free verbalization approach
efficiently completes templated verbalizations, is faithful by design, but
falls short in helpfulness and simulatability. Our results suggest that
saliency map verbalization makes feature attribution explanations more
comprehensible and less cognitively challenging to humans than conventional
representations.
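The search-based, model-free approach described above completes templated verbalizations from token-level attributions. The following is a minimal sketch of that idea under stated assumptions: the function name, the template wording, and the top-k selection rule are illustrative choices, not the authors' actual implementation.

```python
def verbalize(tokens, scores, label, k=3):
    """Select the k highest-attribution tokens and fill a fixed
    natural-language template (a hypothetical template, for illustration)."""
    # Rank tokens by attribution score, highest first.
    ranked = sorted(zip(tokens, scores), key=lambda pair: pair[1], reverse=True)
    top = [tok for tok, _ in ranked[:k]]
    # Join the selected tokens into a readable list.
    if len(top) == 1:
        listed = f"'{top[0]}'"
    else:
        listed = ", ".join(f"'{t}'" for t in top[:-1]) + f" and '{top[-1]}'"
    return f"The words {listed} contributed most to the prediction '{label}'."

# Example: token-level attributions from a sentiment classifier.
tokens = ["the", "movie", "was", "surprisingly", "good"]
scores = [0.01, 0.12, 0.02, 0.35, 0.50]
print(verbalize(tokens, scores, "positive"))
# -> The words 'good', 'surprisingly' and 'movie' contributed most to the prediction 'positive'.
```

Because the output is built directly from the attribution scores, a verbalization of this kind is faithful by design, which matches the trade-off the abstract reports against the more fluent but less faithful GPT-3.5 outputs.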
Related papers
- Main Predicate and Their Arguments as Explanation Signals For Intent Classification [41.09752906121257]
We show that the main verb denotes the action, and the direct object indicates the domain of conversation, serving as explanation signals for intent.
We mark main predicates (primarily verbs) and their arguments as explanation signals in benchmark intent classification datasets ATIS and SNIPS.
We observe that models that work well for classification do not perform well in explainability metrics like plausibility and faithfulness.
arXiv Detail & Related papers (2025-02-03T11:39:26Z) - A linguistically-motivated evaluation methodology for unraveling model's abilities in reading comprehension tasks [10.181408678232055]
We introduce an evaluation methodology for reading comprehension tasks based on the intuition that certain examples consistently yield lower scores regardless of model size or architecture.
We capitalize on semantic frame annotation to characterize this complexity, and study seven complexity factors that may account for a model's difficulty.
arXiv Detail & Related papers (2025-01-29T11:05:20Z) - Discrete Subgraph Sampling for Interpretable Graph based Visual Question Answering [27.193336817953142]
We integrate different discrete subset sampling methods into a graph-based visual question answering system.
We show that the integrated methods effectively mitigate the performance trade-off between interpretability and answer accuracy.
We also conduct a human evaluation to assess the interpretability of the generated subgraphs.
arXiv Detail & Related papers (2024-12-11T10:18:37Z) - ImpScore: A Learnable Metric For Quantifying The Implicitness Level of Language [40.4052848203136]
Implicit language is essential for natural language processing systems to achieve precise text understanding and facilitate natural interactions with users.
This paper develops a scalar metric that quantifies the implicitness level of language without relying on external references.
We apply ImpScore to hate speech detection datasets, illustrating its utility and highlighting significant limitations in current large language models' ability to understand highly implicit content.
arXiv Detail & Related papers (2024-11-07T20:23:29Z) - Pixel Sentence Representation Learning [67.4775296225521]
In this work, we conceptualize the learning of sentence-level textual semantics as a visual representation learning process.
We employ visually-grounded text perturbation methods like typos and word order shuffling, resonating with human cognitive patterns, and enabling perturbation to be perceived as continuous.
Our approach is further bolstered by large-scale unsupervised topical alignment training and natural language inference supervision.
arXiv Detail & Related papers (2024-02-13T02:46:45Z) - Interpreting Language Models with Contrastive Explanations [99.7035899290924]
Language models must consider various features to predict a token, such as its part of speech, number, tense, or semantics.
Existing explanation methods conflate evidence for all these features into a single explanation, which is less interpretable for human understanding.
We show that contrastive explanations are quantifiably better than non-contrastive explanations in verifying major grammatical phenomena.
arXiv Detail & Related papers (2022-02-21T18:32:24Z) - Sentiment analysis in tweets: an assessment study from classical to
modern text representation models [59.107260266206445]
Short texts published on Twitter have earned significant attention as a rich source of information.
Their inherent characteristics, such as their informal and noisy linguistic style, remain challenging to many natural language processing (NLP) tasks.
This study presents an assessment of existing language models in distinguishing the sentiment expressed in tweets, using a rich collection of 22 datasets.
arXiv Detail & Related papers (2021-05-29T21:05:28Z) - Understanding Synonymous Referring Expressions via Contrastive Features [105.36814858748285]
We develop an end-to-end trainable framework to learn contrastive features on the image and object instance levels.
We conduct extensive experiments to evaluate the proposed algorithm on several benchmark datasets.
arXiv Detail & Related papers (2021-04-20T17:56:24Z) - Prototypical Representation Learning for Relation Extraction [56.501332067073065]
This paper aims to learn predictive, interpretable, and robust relation representations from distantly-labeled data.
We learn prototypes for each relation from contextual information to best explore the intrinsic semantics of relations.
Results on several relation learning tasks show that our model significantly outperforms the previous state-of-the-art relational models.
arXiv Detail & Related papers (2021-03-22T08:11:43Z) - Generating Hierarchical Explanations on Text Classification via Feature
Interaction Detection [21.02924712220406]
We build hierarchical explanations by detecting feature interactions.
Such explanations visualize how words and phrases are combined at different levels of the hierarchy.
Experiments show the effectiveness of the proposed method in providing explanations both faithful to models and interpretable to humans.
arXiv Detail & Related papers (2020-04-04T20:56:37Z) - Temporal Embeddings and Transformer Models for Narrative Text
Understanding [72.88083067388155]
We present two approaches to narrative text understanding for character relationship modelling.
The temporal evolution of these relations is described by dynamic word embeddings, that are designed to learn semantic changes over time.
A supervised learning approach based on the state-of-the-art transformer model BERT is used instead to detect static relations between characters.
arXiv Detail & Related papers (2020-03-19T14:23:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.