Generating Hierarchical Explanations on Text Classification via Feature
Interaction Detection
- URL: http://arxiv.org/abs/2004.02015v3
- Date: Mon, 18 May 2020 02:30:20 GMT
- Title: Generating Hierarchical Explanations on Text Classification via Feature
Interaction Detection
- Authors: Hanjie Chen, Guangtao Zheng, Yangfeng Ji
- Abstract summary: We build hierarchical explanations by detecting feature interactions.
Such explanations visualize how words and phrases are combined at different levels of the hierarchy.
Experiments show the effectiveness of the proposed method in providing explanations both faithful to models and interpretable to humans.
- Score: 21.02924712220406
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating explanations for neural networks has become crucial for their
real-world applications with respect to reliability and trustworthiness. In
natural language processing, existing methods usually provide important
features, i.e., words or phrases selected from an input text, as an
explanation, but ignore the interactions between them. This poses challenges for
humans to interpret an explanation and connect it to the model prediction. In this
work, we build hierarchical explanations by detecting feature interactions.
Such explanations visualize how words and phrases are combined at different
levels of the hierarchy, which can help users understand the decision-making of
black-box models. The proposed method is evaluated with three neural text
classifiers (LSTM, CNN, and BERT) on two benchmark datasets, via both automatic
and human evaluations. Experiments show the effectiveness of the proposed
method in providing explanations that are both faithful to models and
interpretable to humans.
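The abstract names the core technique (detecting feature interactions to build a hierarchy) without giving the algorithm here, so the following is only a minimal sketch of the general idea rather than the paper's method: adjacent spans are merged greedily, with the interaction between two spans approximated by how far their joint occlusion effect departs from the sum of their individual effects. The `predict` callable, the `[MASK]` token, and the toy cue-word scorer are placeholder assumptions.
```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]

def masked_prediction(tokens: List[str], keep: List[Span],
                      predict: Callable[[List[str]], float],
                      mask_token: str = "[MASK]") -> float:
    """Model score when only the spans in `keep` are left visible."""
    visible = set()
    for start, end in keep:
        visible.update(range(start, end))
    return predict([t if i in visible else mask_token for i, t in enumerate(tokens)])

def build_hierarchy(tokens: List[str],
                    predict: Callable[[List[str]], float]) -> List[Tuple[Span, Span, float]]:
    """Greedily merge the adjacent pair of spans with the strongest interaction."""
    spans: List[Span] = [(i, i + 1) for i in range(len(tokens))]
    merges = []
    while len(spans) > 1:
        best_k, best_score = 0, float("-inf")
        for k in range(len(spans) - 1):
            a, b = spans[k], spans[k + 1]
            joint = masked_prediction(tokens, [a, b], predict)
            separate = (masked_prediction(tokens, [a], predict)
                        + masked_prediction(tokens, [b], predict))
            interaction = joint - separate  # crude non-additivity measure
            if interaction > best_score:
                best_k, best_score = k, interaction
        a, b = spans[best_k], spans[best_k + 1]
        merges.append((a, b, best_score))
        spans[best_k:best_k + 2] = [(a[0], b[1])]
    return merges

if __name__ == "__main__":
    # Toy sentiment scorer so the sketch runs end to end (not a real classifier).
    cues = {"not": -1.5, "bad": -1.0, "boring": -1.0, "great": 1.0}
    predict = lambda toks: sum(cues.get(t, 0.0) for t in toks)
    for a, b, score in build_hierarchy("the film was not bad at all".split(), predict):
        print(f"merge {a} + {b}  interaction = {score:+.2f}")
```
The merge history read bottom-up is the hierarchy: spans such as "not" and "bad" combine early if their joint effect on the prediction cannot be explained by their individual effects.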
Related papers
- Faithful and Plausible Natural Language Explanations for Image Classification: A Pipeline Approach [10.54430941755474]
This paper proposes a post-hoc natural language explanation method that can be applied to any CNN-based classification system.
By analysing influential neurons and the corresponding activation maps, the method generates a faithful description of the classifier's decision process.
Experimental results show that the NLEs constructed by our method are significantly more plausible and faithful.
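The summary names the ingredients (influential neurons, their activation maps, a generated description) but not the exact pipeline. A heavily simplified stand-in for that recipe: score each neuron of the last layer by activation times class weight, map the top neurons to human-readable concepts, and fill a sentence template. The concept dictionary, the numbers, and the relevance proxy are invented placeholders, not the paper's method.
```python
import numpy as np

def verbalize(activations: np.ndarray, class_weights: np.ndarray,
              concepts: dict, predicted_class: str, top_k: int = 3) -> str:
    # activations: pooled activation per neuron for one image
    # class_weights: weight of each neuron for the predicted class
    contribution = activations * class_weights          # simple relevance proxy
    top = np.argsort(contribution)[::-1][:top_k]
    named = [concepts.get(int(i), f"neuron {i}") for i in top]
    return (f"The image was classified as '{predicted_class}' mainly because the "
            f"model detected {', '.join(named[:-1])} and {named[-1]}.")

print(verbalize(
    activations=np.array([0.1, 2.3, 0.0, 1.7, 0.9]),
    class_weights=np.array([0.2, 0.9, 0.1, 0.8, 0.3]),
    concepts={1: "striped fur", 3: "whiskers", 4: "pointed ears"},
    predicted_class="cat"))
```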
arXiv Detail & Related papers (2024-07-30T15:17:15Z)
- Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate the opacity of Transformer-based similarity models by leveraging improved explanations.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
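Second-order explanations attribute a similarity score to pairs of input features instead of single features. For a purely bilinear similarity s(x, x') = x^T W x' the decomposition is exact and easy to state; BiLRP propagates the same idea through the layers of a deep model. The sketch below covers only this simplified bilinear case, with random placeholder weights.
```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
W = rng.normal(size=(d, d))                 # bilinear similarity weights (placeholder)
x, x_prime = rng.normal(size=d), rng.normal(size=d)

# R[i, j]: contribution of the feature pair (x_i, x'_j) to the similarity score.
R = np.outer(x, x_prime) * W
score = x @ W @ x_prime

print("similarity score:", round(float(score), 4))
print("sum of pairwise contributions:", round(float(R.sum()), 4))  # matches the score exactly
top = np.unravel_index(np.abs(R).argmax(), R.shape)
print("strongest interacting feature pair:", top, "contribution:", round(float(R[top]), 4))
```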
arXiv Detail & Related papers (2024-05-10T17:11:31Z)
- Explaining Interactions Between Text Spans [50.70253702800355]
Reasoning over spans of tokens from different parts of the input is essential for natural language understanding.
We introduce SpanEx, a dataset of human span interaction explanations for two NLU tasks: natural language inference (NLI) and fact checking (FC).
We then investigate the decision-making processes of multiple fine-tuned large language models in terms of the employed connections between spans.
arXiv Detail & Related papers (2023-10-20T13:52:37Z)
- Textual Entailment Recognition with Semantic Features from Empirical Text Representation [60.31047947815282]
A text entails a hypothesis if and only if the truth of the hypothesis follows from the text.
In this paper, we propose a novel approach to identifying the textual entailment relationship between text and hypothesis.
We employ an element-wise Manhattan distance vector-based feature that can identify the semantic entailment relationship between the text-hypothesis pair.
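The feature itself is concrete enough to sketch: embed the text and the hypothesis, take the element-wise absolute difference (a Manhattan-distance vector), and feed it to a classifier. The averaged random word vectors and the logistic-regression head below are placeholder choices standing in for the paper's actual encoder and model.
```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def sentence_embedding(tokens, table, dim=8):
    vecs = [table[t] for t in tokens if t in table]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def entailment_feature(text, hypothesis, table):
    u = sentence_embedding(text.lower().split(), table)
    v = sentence_embedding(hypothesis.lower().split(), table)
    return np.abs(u - v)            # element-wise Manhattan distance vector

# Toy embedding table and toy training pairs, just so the sketch runs end to end.
rng = np.random.default_rng(0)
vocab = "a man is sleeping person awake the cat sits on mat animal outdoors".split()
table = {w: rng.normal(size=8) for w in vocab}
pairs = [("a man is sleeping", "a person is sleeping", 1),
         ("a man is sleeping", "a person is awake", 0),
         ("the cat sits on the mat", "an animal is on the mat", 1),
         ("the cat sits on the mat", "a man is outdoors", 0)]
X = np.stack([entailment_feature(t, h, table) for t, h, _ in pairs])
y = [label for _, _, label in pairs]
clf = LogisticRegression().fit(X, y)
print(clf.predict(X))               # entailment decisions from the distance feature alone
```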
arXiv Detail & Related papers (2022-10-18T10:03:51Z)
- Saliency Map Verbalization: Comparing Feature Importance Representations from Model-free and Instruction-based Methods [6.018950511093273]
Saliency maps can explain a neural model's predictions by identifying important input features.
We formalize the underexplored task of translating saliency maps into natural language.
We compare two novel methods (search-based and instruction-based verbalizations) against conventional feature importance representations.
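To make the task concrete, a minimal template-based verbalization of a saliency map might read off the most salient tokens and their share of the total saliency. This is a hedged baseline in the spirit of the task, not one of the paper's two methods; the tokens and scores are invented.
```python
def verbalize_saliency(tokens, scores, label, top_k=3):
    total = sum(abs(s) for s in scores) or 1.0
    ranked = sorted(zip(tokens, scores), key=lambda p: abs(p[1]), reverse=True)[:top_k]
    parts = [f"'{t}' ({abs(s) / total:.0%} of total saliency)" for t, s in ranked]
    return f"The model predicted '{label}' mostly because of " + ", ".join(parts) + "."

tokens = "the movie was surprisingly good despite the slow start".split()
scores = [0.01, 0.05, 0.02, 0.30, 0.45, 0.03, 0.01, 0.08, 0.05]
print(verbalize_saliency(tokens, scores, label="positive"))
```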
arXiv Detail & Related papers (2022-10-13T17:48:15Z)
- A Unified Understanding of Deep NLP Models for Text Classification [88.35418976241057]
We have developed a visual analysis tool, DeepNLPVis, to enable a unified understanding of NLP models for text classification.
The key idea is a mutual information-based measure, which provides quantitative explanations on how each layer of a model maintains the information of input words in a sample.
A multi-level visualization, which consists of a corpus-level, a sample-level, and a word-level visualization, supports the analysis from the overall training set to individual samples.
arXiv Detail & Related papers (2022-06-19T08:55:07Z)
- Hierarchical Interpretation of Neural Text Classification [31.95426448656938]
This paper proposes a novel Hierarchical INTerpretable neural text classifier, called Hint, which can automatically generate explanations of model predictions.
Experimental results on both review datasets and news datasets show that our proposed approach achieves text classification results on par with existing state-of-the-art text classifiers.
arXiv Detail & Related papers (2022-02-20T11:15:03Z)
- Explaining Neural Network Predictions on Sentence Pairs via Learning Word-Group Masks [21.16662651409811]
We propose the Group Mask (GMASK) method to implicitly detect word correlations by grouping correlated words from the input text pair together.
The proposed method is evaluated with two different model architectures (decomposable attention model and BERT) across four datasets.
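GMASK learns the word groups and their masks jointly; the crude stand-in below only illustrates the intended output structure, pairing each word of one sentence with its most similar word of the other by cosine similarity of random placeholder embeddings. An explainer would then score each group, e.g. by occlusion.
```python
import numpy as np

def group_words(emb_a, emb_b, words_a, words_b):
    """Group each word of sentence A with its most similar word in sentence B."""
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    sim = a @ b.T
    return [(words_a[i], words_b[int(sim[i].argmax())]) for i in range(len(words_a))]

rng = np.random.default_rng(0)
words_a, words_b = "a man plays guitar".split(), "someone performs music".split()
emb_a, emb_b = rng.normal(size=(len(words_a), 8)), rng.normal(size=(len(words_b), 8))
for pair in group_words(emb_a, emb_b, words_a, words_b):
    print(pair)   # each tuple is one candidate cross-sentence word group
```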
arXiv Detail & Related papers (2021-04-09T17:14:34Z)
- Contrastive Explanations for Model Interpretability [77.92370750072831]
We propose a methodology to produce contrastive explanations for classification models.
Our method is based on projecting model representation to a latent space.
Our findings shed light on the ability of label-contrastive explanations to provide a more accurate and finer-grained interpretability of a model's decision.
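A hedged sketch of the general recipe rather than the paper's exact construction: learn a direction in representation space that separates the predicted class (the fact) from an alternative class (the foil), and split a representation into the component along that direction and the remainder. The random representations and the logistic-regression direction are placeholder assumptions.
```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d = 16
reps_fact = rng.normal(loc=0.8, size=(50, d))   # placeholder representations of the fact class
reps_foil = rng.normal(loc=-0.8, size=(50, d))  # placeholder representations of the foil class

X = np.vstack([reps_fact, reps_foil])
y = np.array([1] * 50 + [0] * 50)
w = LogisticRegression().fit(X, y).coef_[0]
w = w / np.linalg.norm(w)                       # unit direction separating fact from foil

h = reps_fact[0]                                # representation to be explained
contrastive = (h @ w) * w                       # component that decides fact vs. foil
residual = h - contrastive                      # component irrelevant to this contrast

print("projection onto fact-vs-foil direction:", round(float(h @ w), 3))
print("top contrastive dimensions:", np.argsort(-np.abs(contrastive))[:3])
```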
arXiv Detail & Related papers (2021-03-02T00:36:45Z)
- ALICE: Active Learning with Contrastive Natural Language Explanations [69.03658685761538]
We propose Active Learning with Contrastive Explanations (ALICE) to improve data efficiency in learning.
ALICE first uses active learning to select the most informative pairs of label classes and elicits contrastive natural language explanations for them.
It then extracts knowledge from these explanations using a semantic parser.
arXiv Detail & Related papers (2020-09-22T01:02:07Z)
- LIMEtree: Interactively Customisable Explanations Based on Local Surrogate Multi-output Regression Trees [21.58324172085553]
We introduce a model-agnostic and post-hoc local explainability technique for black-box predictions called LIMEtree.
We validate our algorithm on a deep neural network trained for object detection in images and compare it against Local Interpretable Model-agnostic Explanations (LIME).
Our method comes with local fidelity guarantees and can produce a range of diverse explanation types.
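The surrogate itself is concrete: a multi-output regression tree fit on perturbations around one instance, predicting the black box's full probability vector at once. The sketch below uses a tabular random-forest classifier as a stand-in black box (the paper's experiments use an image model) and scikit-learn's DecisionTreeRegressor, which supports multi-output targets natively; it is an illustration of the surrogate idea, not the LIMEtree algorithm itself.
```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeRegressor

# Stand-in black box: a random forest on synthetic tabular data.
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           n_classes=3, n_clusters_per_class=1, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

x0 = X[0]                                                 # instance to explain
rng = np.random.default_rng(0)
Z = x0 + rng.normal(scale=0.5, size=(500, X.shape[1]))    # local perturbations
P = black_box.predict_proba(Z)                            # multi-output target: all class probabilities
weights = np.exp(-np.linalg.norm(Z - x0, axis=1) ** 2)    # nearer perturbations weigh more

# One shallow tree approximates every class probability at once near x0.
surrogate = DecisionTreeRegressor(max_depth=3, random_state=0)
surrogate.fit(Z, P, sample_weight=weights)

print("surrogate feature importances:", surrogate.feature_importances_.round(2))
print("black box probabilities :", black_box.predict_proba(x0.reshape(1, -1)).round(2))
print("surrogate probabilities :", surrogate.predict(x0.reshape(1, -1)).round(2))
```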
arXiv Detail & Related papers (2020-05-04T12:31:29Z)