Related papers: A Diagnostic Study of Explainability Techniques for Text Classification

A Diagnostic Study of Explainability Techniques for Text Classification

URL: http://arxiv.org/abs/2009.13295v1
Date: Fri, 25 Sep 2020 12:01:53 GMT
Title: A Diagnostic Study of Explainability Techniques for Text Classification
Authors: Pepa Atanasova, Jakob Grue Simonsen, Christina Lioma, Isabelle Augenstein
Abstract summary: We develop a list of diagnostic properties for evaluating existing explainability techniques. We compare the saliency scores assigned by the explainability techniques with human annotations of salient input regions to find relations between a model's performance and the agreement of its rationales with human ones.
Score: 52.879658637466605
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent developments in machine learning have introduced models that approach human performance at the cost of increased architectural complexity. Efforts to make the rationales behind the models' predictions transparent have inspired an abundance of new explainability techniques. Provided with an already trained model, they compute saliency scores for the words of an input instance. However, there exists no definitive guide on (i) how to choose such a technique given a particular application task and model architecture, and (ii) the benefits and drawbacks of using each such technique. In this paper, we develop a comprehensive list of diagnostic properties for evaluating existing explainability techniques. We then employ the proposed list to compare a set of diverse explainability techniques on downstream text classification tasks and neural network architectures. We also compare the saliency scores assigned by the explainability techniques with human annotations of salient input regions to find relations between a model's performance and the agreement of its rationales with human ones. Overall, we find that the gradient-based explanations perform best across tasks and model architectures, and we present further insights into the properties of the reviewed explainability techniques.

Related papers

Interpretable Image Classification via Non-parametric Part Prototype Learning [14.390730075612248]
Classifying images with an interpretable decision-making process is a long-standing problem in computer vision. In recent years, Prototypical Part Networks has gained traction as an approach for self-explainable neural networks. We present a framework for part-based interpretable image classification that learns a set of semantically distinctive object parts for each class.
arXiv Detail & Related papers (2025-03-13T10:46:53Z)
How to Probe: Simple Yet Effective Techniques for Improving Post-hoc Explanations [69.72654127617058]
Post-hoc importance attribution methods are a popular tool for "explaining" Deep Neural Networks (DNNs) In this work we bring forward empirical evidence that challenges this very notion. We discover a strong dependency on and demonstrate that the training details of a pre-trained model's classification layer play a crucial role.
arXiv Detail & Related papers (2025-03-01T22:25:11Z)
CNN-based explanation ensembling for dataset, representation and explanations evaluation [1.1060425537315088]
We explore the potential of ensembling explanations generated by deep classification models using convolutional model. Through experimentation and analysis, we aim to investigate the implications of combining explanations to uncover a more coherent and reliable patterns of the model's behavior.
arXiv Detail & Related papers (2024-04-16T08:39:29Z)
Exploring the Trade-off Between Model Performance and Explanation Plausibility of Text Classifiers Using Human Rationales [3.242050660144211]
Saliency post-hoc explainability methods are important tools for understanding increasingly complex NLP models. We present a methodology for incorporating rationales, which are text annotations explaining human decisions, into text classification models.
arXiv Detail & Related papers (2024-04-03T22:39:33Z)
What Does Evaluation of Explainable Artificial Intelligence Actually Tell Us? A Case for Compositional and Contextual Validation of XAI Building Blocks [16.795332276080888]
We propose a fine-grained validation framework for explainable artificial intelligence systems. We recognise their inherent modular structure: technical building blocks, user-facing explanatory artefacts and social communication protocols.
arXiv Detail & Related papers (2024-03-19T13:45:34Z)
How Well Do Text Embedding Models Understand Syntax? [50.440590035493074]
The ability of text embedding models to generalize across a wide range of syntactic contexts remains under-explored. Our findings reveal that existing text embedding models have not sufficiently addressed these syntactic understanding challenges. We propose strategies to augment the generalization ability of text embedding models in diverse syntactic scenarios.
arXiv Detail & Related papers (2023-11-14T08:51:00Z)
A Recursive Bateson-Inspired Model for the Generation of Semantic Formal Concepts from Spatial Sensory Data [77.34726150561087]
This paper presents a new symbolic-only method for the generation of hierarchical concept structures from complex sensory data. The approach is based on Bateson's notion of difference as the key to the genesis of an idea or a concept. The model is able to produce fairly rich yet human-readable conceptual representations without training.
arXiv Detail & Related papers (2023-07-16T15:59:13Z)
Schema-aware Reference as Prompt Improves Data-Efficient Knowledge Graph Construction [57.854498238624366]
We propose a retrieval-augmented approach, which retrieves schema-aware Reference As Prompt (RAP) for data-efficient knowledge graph construction. RAP can dynamically leverage schema and knowledge inherited from human-annotated and weak-supervised data as a prompt for each sample.
arXiv Detail & Related papers (2022-10-19T16:40:28Z)
MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE) In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity. Experiments on public datasets validate the effectiveness with better validity, sparsity and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
Beyond Explaining: Opportunities and Challenges of XAI-Based Model Improvement [75.00655434905417]
Explainable Artificial Intelligence (XAI) is an emerging research field bringing transparency to highly complex machine learning (ML) models. This paper offers a comprehensive overview over techniques that apply XAI practically for improving various properties of ML models. We show empirically through experiments on toy and realistic settings how explanations can help improve properties such as model generalization ability or reasoning.
arXiv Detail & Related papers (2022-03-15T15:44:28Z)
Discriminative Attribution from Counterfactuals [64.94009515033984]
We present a method for neural network interpretability by combining feature attribution with counterfactual explanations. We show that this method can be used to quantitatively evaluate the performance of feature attribution methods in an objective manner.
arXiv Detail & Related papers (2021-09-28T00:53:34Z)
Quantifying Explainability in NLP and Analyzing Algorithms for Performance-Explainability Tradeoff [0.0]
We explore the current art of explainability and interpretability within a case study in clinical text classification. We demonstrate various visualization techniques for fully interpretable methods as well as model-agnostic post hoc attributions. We introduce a framework through which practitioners and researchers can assess the frontier between a model's predictive performance and the quality of its available explanations.
arXiv Detail & Related papers (2021-07-12T19:07:24Z)
Explaining Convolutional Neural Networks through Attribution-Based Input Sampling and Block-Wise Feature Aggregation [22.688772441351308]
Methods based on class activation mapping and randomized input sampling have gained great popularity. However, the attribution methods provide lower resolution and blurry explanation maps that limit their explanation power. In this work, we collect visualization maps from multiple layers of the model based on an attribution-based input sampling technique. We also propose a layer selection strategy that applies to the whole family of CNN-based models.
arXiv Detail & Related papers (2020-10-01T20:27:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.