Generating Hierarchical Explanations on Text Classification via Feature
Interaction Detection
- URL: http://arxiv.org/abs/2004.02015v3
- Date: Mon, 18 May 2020 02:30:20 GMT
- Title: Generating Hierarchical Explanations on Text Classification via Feature
Interaction Detection
- Authors: Hanjie Chen, Guangtao Zheng, Yangfeng Ji
- Abstract summary: We build hierarchical explanations by detecting feature interactions.
Such explanations visualize how words and phrases are combined at different levels of the hierarchy.
Experiments show the effectiveness of the proposed method in providing explanations both faithful to models and interpretable to humans.
- Score: 21.02924712220406
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating explanations for neural networks has become crucial for their
real-world applications with respect to reliability and trustworthiness. In
natural language processing, existing methods usually provide important
features, i.e., words or phrases selected from an input text, as an
explanation, but ignore the interactions between them. This poses challenges for
humans to interpret an explanation and connect it to the model prediction. In this
work, we build hierarchical explanations by detecting feature interactions.
Such explanations visualize how words and phrases are combined at different
levels of the hierarchy, which can help users understand the decision-making of
black-box models. The proposed method is evaluated with three neural text
classifiers (LSTM, CNN, and BERT) on two benchmark datasets, via both automatic
and human evaluations. Experiments show the effectiveness of the proposed
method in providing explanations that are both faithful to models and
interpretable to humans.
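The abstract names the core technique (detecting feature interactions to build a hierarchy) without giving the algorithm here, so the following is only a minimal sketch of the general idea rather than the paper's method: adjacent spans are merged greedily, with the interaction between two spans approximated by how far their joint occlusion effect departs from the sum of their individual effects. The `predict` callable, the `[MASK]` token, and the toy cue-word scorer are placeholder assumptions.
```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]

def masked_prediction(tokens: List[str], keep: List[Span],
                      predict: Callable[[List[str]], float],
                      mask_token: str = "[MASK]") -> float:
    """Model score when only the spans in `keep` are left visible."""
    visible = set()
    for start, end in keep:
        visible.update(range(start, end))
    return predict([t if i in visible else mask_token for i, t in enumerate(tokens)])

def build_hierarchy(tokens: List[str],
                    predict: Callable[[List[str]], float]) -> List[Tuple[Span, Span, float]]:
    """Greedily merge the adjacent pair of spans with the strongest interaction."""
    spans: List[Span] = [(i, i + 1) for i in range(len(tokens))]
    merges = []
    while len(spans) > 1:
        best_k, best_score = 0, float("-inf")
        for k in range(len(spans) - 1):
            a, b = spans[k], spans[k + 1]
            joint = masked_prediction(tokens, [a, b], predict)
            separate = (masked_prediction(tokens, [a], predict)
                        + masked_prediction(tokens, [b], predict))
            interaction = joint - separate  # crude non-additivity measure
            if interaction > best_score:
                best_k, best_score = k, interaction
        a, b = spans[best_k], spans[best_k + 1]
        merges.append((a, b, best_score))
        spans[best_k:best_k + 2] = [(a[0], b[1])]
    return merges

if __name__ == "__main__":
    # Toy sentiment scorer so the sketch runs end to end (not a real classifier).
    cues = {"not": -1.5, "bad": -1.0, "boring": -1.0, "great": 1.0}
    predict = lambda toks: sum(cues.get(t, 0.0) for t in toks)
    for a, b, score in build_hierarchy("the film was not bad at all".split(), predict):
        print(f"merge {a} + {b}  interaction = {score:+.2f}")
```
The merge history read bottom-up is the hierarchy: spans such as "not" and "bad" combine early if their joint effect on the prediction cannot be explained by their individual effects.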
Related papers
- Faithful and Plausible Natural Language Explanations for Image Classification: A Pipeline Approach [10.54430941755474]
This paper proposes a post-hoc natural language explanation method that can be applied to any CNN-based classification system.
By analysing influential neurons and the corresponding activation maps, the method generates a faithful description of the classifier's decision process.
Experimental results show that the NLEs constructed by our method are significantly more plausible and faithful.
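The summary names the ingredients (influential neurons, their activation maps, a generated description) but not the exact pipeline. A heavily simplified stand-in for that recipe: score each neuron of the last layer by activation times class weight, map the top neurons to human-readable concepts, and fill a sentence template. The concept dictionary, the numbers, and the relevance proxy are invented placeholders, not the paper's method.
```python
import numpy as np

def verbalize(activations: np.ndarray, class_weights: np.ndarray,
              concepts: dict, predicted_class: str, top_k: int = 3) -> str:
    # activations: pooled activation per neuron for one image
    # class_weights: weight of each neuron for the predicted class
    contribution = activations * class_weights          # simple relevance proxy
    top = np.argsort(contribution)[::-1][:top_k]
    named = [concepts.get(int(i), f"neuron {i}") for i in top]
    return (f"The image was classified as '{predicted_class}' mainly because the "
            f"model detected {', '.join(named[:-1])} and {named[-1]}.")

print(verbalize(
    activations=np.array([0.1, 2.3, 0.0, 1.7, 0.9]),
    class_weights=np.array([0.2, 0.9, 0.1, 0.8, 0.3]),
    concepts={1: "striped fur", 3: "whiskers", 4: "pointed ears"},
    predicted_class="cat"))
```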
arXiv Detail & Related papers (2024-07-30T15:17:15Z)
- Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate the opacity of Transformer-based similarity models by leveraging improved explanations.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
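Second-order explanations attribute a similarity score to pairs of input features instead of single features. For a purely bilinear similarity s(x, x') = x^T W x' the decomposition is exact and easy to state; BiLRP propagates the same idea through the layers of a deep model. The sketch below covers only this simplified bilinear case, with random placeholder weights.
```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
W = rng.normal(size=(d, d))                 # bilinear similarity weights (placeholder)
x, x_prime = rng.normal(size=d), rng.normal(size=d)

# R[i, j]: contribution of the feature pair (x_i, x'_j) to the similarity score.
R = np.outer(x, x_prime) * W
score = x @ W @ x_prime

print("similarity score:", round(float(score), 4))
print("sum of pairwise contributions:", round(float(R.sum()), 4))  # matches the score exactly
top = np.unravel_index(np.abs(R).argmax(), R.shape)
print("strongest interacting feature pair:", top, "contribution:", round(float(R[top]), 4))
```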
arXiv Detail & Related papers (2024-05-10T17:11:31Z)
- Explaining Interactions Between Text Spans [50.70253702800355]
Reasoning over spans of tokens from different parts of the input is essential for natural language understanding.
We introduce SpanEx, a dataset of human span interaction explanations for two NLU tasks: natural language inference (NLI) and fact checking (FC).
We then investigate the decision-making processes of multiple fine-tuned large language models in terms of the employed connections between spans.
arXiv Detail & Related papers (2023-10-20T13:52:37Z)
- Textual Entailment Recognition with Semantic Features from Empirical Text Representation [60.31047947815282]
A text entails a hypothesis if and only if the truth of the hypothesis follows from the text.
In this paper, we propose a novel approach to identifying the textual entailment relationship between text and hypothesis.
We employ an element-wise Manhattan distance vector-based feature that can identify the semantic entailment relationship between the text-hypothesis pair.
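The feature itself is concrete enough to sketch: embed the text and the hypothesis, take the element-wise absolute difference (a Manhattan-distance vector), and feed it to a classifier. The averaged random word vectors and the logistic-regression head below are placeholder choices standing in for the paper's actual encoder and model.
```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def sentence_embedding(tokens, table, dim=8):
    vecs = [table[t] for t in tokens if t in table]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def entailment_feature(text, hypothesis, table):
    u = sentence_embedding(text.lower().split(), table)
    v = sentence_embedding(hypothesis.lower().split(), table)
    return np.abs(u - v)            # element-wise Manhattan distance vector

# Toy embedding table and toy training pairs, just so the sketch runs end to end.
rng = np.random.default_rng(0)
vocab = "a man is sleeping person awake the cat sits on mat animal outdoors".split()
table = {w: rng.normal(size=8) for w in vocab}
pairs = [("a man is sleeping", "a person is sleeping", 1),
         ("a man is sleeping", "a person is awake", 0),
         ("the cat sits on the mat", "an animal is on the mat", 1),
         ("the cat sits on the mat", "a man is outdoors", 0)]
X = np.stack([entailment_feature(t, h, table) for t, h, _ in pairs])
y = [label for _, _, label in pairs]
clf = LogisticRegression().fit(X, y)
print(clf.predict(X))               # entailment decisions from the distance feature alone
```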
arXiv Detail & Related papers (2022-10-18T10:03:51Z)
- Saliency Map Verbalization: Comparing Feature Importance Representations from Model-free and Instruction-based Methods [6.018950511093273]
Saliency maps can explain a neural model's predictions by identifying important input features.
We formalize the underexplored task of translating saliency maps into natural language.
We compare two novel methods (search-based and instruction-based verbalizations) against conventional feature importance representations.
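To make the task concrete, a minimal template-based verbalization of a saliency map might read off the most salient tokens and their share of the total saliency. This is a hedged baseline in the spirit of the task, not one of the paper's two methods; the tokens and scores are invented.
```python
def verbalize_saliency(tokens, scores, label, top_k=3):
    total = sum(abs(s) for s in scores) or 1.0
    ranked = sorted(zip(tokens, scores), key=lambda p: abs(p[1]), reverse=True)[:top_k]
    parts = [f"'{t}' ({abs(s) / total:.0%} of total saliency)" for t, s in ranked]
    return f"The model predicted '{label}' mostly because of " + ", ".join(parts) + "."

tokens = "the movie was surprisingly good despite the slow start".split()
scores = [0.01, 0.05, 0.02, 0.30, 0.45, 0.03, 0.01, 0.08, 0.05]
print(verbalize_saliency(tokens, scores, label="positive"))
```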
arXiv Detail & Related papers (2022-10-13T17:48:15Z)
- A Unified Understanding of Deep NLP Models for Text Classification [88.35418976241057]
We have developed a visual analysis tool, DeepNLPVis, to enable a unified understanding of NLP models for text classification.
The key idea is a mutual information-based measure, which provides quantitative explanations on how each layer of a model maintains the information of input words in a sample.
A multi-level visualization, which consists of a corpus-level, a sample-level, and a word-level visualization, supports the analysis from the overall training set to individual samples.
arXiv Detail & Related papers (2022-06-19T08:55:07Z)
- Hierarchical Interpretation of Neural Text Classification [31.95426448656938]
This paper proposes a novel Hierarchical INTerpretable neural text classifier, called Hint, which can automatically generate explanations of model predictions.
Experimental results on both review datasets and news datasets show that our proposed approach achieves text classification results on par with existing state-of-the-art text classifiers.
arXiv Detail & Related papers (2022-02-20T11:15:03Z)
- Explaining Neural Network Predictions on Sentence Pairs via Learning Word-Group Masks [21.16662651409811]
We propose the Group Mask (GMASK) method to implicitly detect word correlations by grouping correlated words from the input text pair together.
The proposed method is evaluated with two different model architectures (decomposable attention model and BERT) across four datasets.
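GMASK learns the word groups and their masks jointly; the crude stand-in below only illustrates the intended output structure, pairing each word of one sentence with its most similar word of the other by cosine similarity of random placeholder embeddings. An explainer would then score each group, e.g. by occlusion.
```python
import numpy as np

def group_words(emb_a, emb_b, words_a, words_b):
    """Group each word of sentence A with its most similar word in sentence B."""
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    sim = a @ b.T
    return [(words_a[i], words_b[int(sim[i].argmax())]) for i in range(len(words_a))]

rng = np.random.default_rng(0)
words_a, words_b = "a man plays guitar".split(), "someone performs music".split()
emb_a, emb_b = rng.normal(size=(len(words_a), 8)), rng.normal(size=(len(words_b), 8))
for pair in group_words(emb_a, emb_b, words_a, words_b):
    print(pair)   # each tuple is one candidate cross-sentence word group
```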
arXiv Detail & Related papers (2021-04-09T17:14:34Z)
- Contrastive Explanations for Model Interpretability [77.92370750072831]
We propose a methodology to produce contrastive explanations for classification models.
Our method is based on projecting model representation to a latent space.
Our findings shed light on the ability of label-contrastive explanations to provide a more accurate and finer-grained interpretability of a model's decision.
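A hedged sketch of the general recipe rather than the paper's exact construction: learn a direction in representation space that separates the predicted class (the fact) from an alternative class (the foil), and split a representation into the component along that direction and the remainder. The random representations and the logistic-regression direction are placeholder assumptions.
```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d = 16
reps_fact = rng.normal(loc=0.8, size=(50, d))   # placeholder representations of the fact class
reps_foil = rng.normal(loc=-0.8, size=(50, d))  # placeholder representations of the foil class

X = np.vstack([reps_fact, reps_foil])
y = np.array([1] * 50 + [0] * 50)
w = LogisticRegression().fit(X, y).coef_[0]
w = w / np.linalg.norm(w)                       # unit direction separating fact from foil

h = reps_fact[0]                                # representation to be explained
contrastive = (h @ w) * w                       # component that decides fact vs. foil
residual = h - contrastive                      # component irrelevant to this contrast

print("projection onto fact-vs-foil direction:", round(float(h @ w), 3))
print("top contrastive dimensions:", np.argsort(-np.abs(contrastive))[:3])
```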
arXiv Detail & Related papers (2021-03-02T00:36:45Z)
- ALICE: Active Learning with Contrastive Natural Language Explanations [69.03658685761538]
We propose Active Learning with Contrastive Explanations (ALICE) to improve data efficiency in learning.
ALICE first uses active learning to select the most informative pairs of label classes and elicits contrastive natural language explanations for them.
It then extracts knowledge from these explanations using a semantic parser.
arXiv Detail & Related papers (2020-09-22T01:02:07Z)
- LIMEtree: Interactively Customisable Explanations Based on Local Surrogate Multi-output Regression Trees [21.58324172085553]
We introduce a model-agnostic and post-hoc local explainability technique for black-box predictions called LIMEtree.
We validate our algorithm on a deep neural network trained for object detection in images and compare it against Local Interpretable Model-agnostic Explanations (LIME).
Our method comes with local fidelity guarantees and can produce a range of diverse explanation types.
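The surrogate itself is concrete: a multi-output regression tree fit on perturbations around one instance, predicting the black box's full probability vector at once. The sketch below uses a tabular random-forest classifier as a stand-in black box (the paper's experiments use an image model) and scikit-learn's DecisionTreeRegressor, which supports multi-output targets natively; it is an illustration of the surrogate idea, not the LIMEtree algorithm itself.
```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeRegressor

# Stand-in black box: a random forest on synthetic tabular data.
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           n_classes=3, n_clusters_per_class=1, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

x0 = X[0]                                                 # instance to explain
rng = np.random.default_rng(0)
Z = x0 + rng.normal(scale=0.5, size=(500, X.shape[1]))    # local perturbations
P = black_box.predict_proba(Z)                            # multi-output target: all class probabilities
weights = np.exp(-np.linalg.norm(Z - x0, axis=1) ** 2)    # nearer perturbations weigh more

# One shallow tree approximates every class probability at once near x0.
surrogate = DecisionTreeRegressor(max_depth=3, random_state=0)
surrogate.fit(Z, P, sample_weight=weights)

print("surrogate feature importances:", surrogate.feature_importances_.round(2))
print("black box probabilities :", black_box.predict_proba(x0.reshape(1, -1)).round(2))
print("surrogate probabilities :", surrogate.predict(x0.reshape(1, -1)).round(2))
```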
arXiv Detail & Related papers (2020-05-04T12:31:29Z)