Rationalization through Concepts
- URL: http://arxiv.org/abs/2105.04837v1
- Date: Tue, 11 May 2021 07:46:48 GMT
- Title: Rationalization through Concepts
- Authors: Diego Antognini and Boi Faltings
- Abstract summary: We present a novel self-interpretable model called ConRAT.
Inspired by how human explanations for high-level decisions are often based on key concepts, ConRAT infers which ones are described in the document.
Two regularizers drive ConRAT to build interpretable concepts.
- Score: 27.207067974031805
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automated predictions require explanations to be interpretable by humans. One
type of explanation is a rationale, i.e., a selection of input features such as
relevant text snippets from which the model computes the outcome. However, a
single overall selection does not provide a complete explanation, e.g., when a
decision weighs several aspects. To this end, we present a novel
self-interpretable model called ConRAT. Inspired by how human explanations for
high-level decisions are often based on key concepts, ConRAT extracts a set of
text snippets as concepts and infers which ones are described in the document.
Then, it explains the outcome with a linear aggregation of concepts. Two
regularizers drive ConRAT to build interpretable concepts. In addition, we
propose two techniques to boost the rationale and predictive performance
further. Experiments on both single- and multi-aspect sentiment classification
tasks show that ConRAT is the first to generate concepts that align with human
rationalization while using only the overall label. Further, it outperforms
state-of-the-art methods trained on each aspect label independently.
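
The architecture described in the abstract lends itself to a compact sketch: per-concept attention selects a snippet, a presence score decides whether the concept is described, and a linear layer aggregates the presences into the prediction. The following is a minimal, illustrative PyTorch reconstruction assuming pre-computed snippet embeddings; the class name, parameters, and dimensions are our own, and the paper's two concept regularizers and two performance-boosting techniques are omitted.

```python
# Minimal sketch of a ConRAT-style model (illustrative only, not the authors' code).
# Assumptions: the document is pre-encoded into snippet embeddings; each "concept"
# is realised as an attention query that picks a snippet; a per-concept presence
# score and a linear layer aggregate the concepts into the overall prediction.
import torch
import torch.nn as nn


class ConceptRationalizer(nn.Module):
    def __init__(self, emb_dim: int, n_concepts: int, n_classes: int):
        super().__init__()
        # One attention query per concept: decides which snippet instantiates it.
        self.concept_queries = nn.Parameter(torch.randn(n_concepts, emb_dim))
        # Scores whether a concept is actually described in the document.
        self.presence = nn.Linear(emb_dim, 1)
        # Linear aggregation of concept presences into the overall label.
        self.classifier = nn.Linear(n_concepts, n_classes)

    def forward(self, snippets: torch.Tensor):
        # snippets: (batch, n_snippets, emb_dim) snippet embeddings
        attn = torch.softmax(snippets @ self.concept_queries.T, dim=1)      # (B, S, K)
        concept_vecs = torch.einsum("bsk,bse->bke", attn, snippets)         # (B, K, E)
        presence = torch.sigmoid(self.presence(concept_vecs)).squeeze(-1)   # (B, K)
        logits = self.classifier(presence)                                  # (B, C)
        return logits, attn, presence


model = ConceptRationalizer(emb_dim=64, n_concepts=5, n_classes=2)
docs = torch.randn(3, 12, 64)          # 3 documents, 12 candidate snippets each
logits, attn, presence = model(docs)   # attn shows which snippet each concept selected
```

The attention weights serve as the rationale (which snippet realizes each concept), while the linear classifier over presence scores makes the contribution of each concept to the outcome directly readable.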
Related papers
- Explaining Explainability: Understanding Concept Activation Vectors [35.37586279472797] (2024-04-04)
Recent interpretability methods propose using concept-based explanations to translate internal representations of deep learning models into a language that humans are familiar with: concepts.
This requires understanding which concepts are present in the representation space of a neural network.
In this work, we investigate three properties of Concept Activation Vectors (CAVs), which are learnt using a probe dataset of concept exemplars.
We introduce tools designed to detect the presence of these properties, provide insight into how they affect the derived explanations, and provide recommendations to minimise their impact. (A minimal sketch of how such CAVs are learnt appears after this list.)
- Dynamic Clue Bottlenecks: Towards Interpretable-by-Design Visual Question Answering [58.64831511644917] (2023-05-24)
We introduce an interpretable-by-design model that factors model decisions into intermediate human-legible explanations.
We show that our inherently interpretable system can improve by 4.64% over a comparable black-box system on reasoning-focused questions.
- HOP, UNION, GENERATE: Explainable Multi-hop Reasoning without Rationale Supervision [118.0818807474809] (2023-05-23)
This work proposes a principled, probabilistic approach for training explainable multi-hop QA systems without rationale supervision.
Our approach performs multi-hop reasoning by explicitly modeling rationales as sets, enabling the model to capture interactions between documents and sentences within a document.
- Explanation Selection Using Unlabeled Data for Chain-of-Thought Prompting [80.9896041501715] (2023-02-09)
Explanations that have not been "tuned" for a task, such as off-the-shelf explanations written by nonexperts, may lead to mediocre performance.
This paper tackles the problem of how to optimize explanation-infused prompts in a black-box fashion.
- Overlooked factors in concept-based explanations: Dataset choice, concept learnability, and human capability [25.545486537295144] (2022-07-20)
Concept-based interpretability methods aim to explain deep neural network model predictions using a predefined set of semantic concepts.
Despite their popularity, they suffer from limitations that are not well understood or clearly articulated in the literature.
We analyze three commonly overlooked factors in concept-based explanations.
- Expressive Explanations of DNNs by Combining Concept Analysis with ILP [0.3867363075280543] (2021-05-16)
We use inherent features learned by the network to build a global, expressive, verbal explanation of the rationale of a feed-forward convolutional deep neural network (DNN).
We show that our explanation is faithful to the original black-box model.
- Contrastive Explanations for Model Interpretability [77.92370750072831] (2021-03-02)
We propose a methodology to produce contrastive explanations for classification models.
Our method is based on projecting model representation to a latent space.
Our findings shed light on the ability of label-contrastive explanations to provide a more accurate and finer-grained interpretability of a model's decision.
- Generating Commonsense Explanation by Extracting Bridge Concepts from Reasoning Paths [128.13034600968257] (2020-09-24)
We propose a method that first extracts the underlying concepts that serve as bridges in the reasoning chain.
To facilitate the reasoning process, we utilize external commonsense knowledge to build the connection between a statement and the bridge concepts.
We design a bridge concept extraction model that first scores the triples, routes the paths in the subgraph, and further selects bridge concepts with weak supervision.
- Exploring Explainable Selection to Control Abstractive Summarization [51.74889133688111] (2020-04-24)
We develop a novel framework that focuses on explainability.
A novel pair-wise matrix captures the sentence interactions, centrality, and attribute scores.
A sentence-deployed attention mechanism in the abstractor ensures the final summary emphasizes the desired content.
- Generating Hierarchical Explanations on Text Classification via Feature Interaction Detection [21.02924712220406] (2020-04-04)
We build hierarchical explanations by detecting feature interactions.
Such explanations visualize how words and phrases are combined at different levels of the hierarchy.
Experiments show the effectiveness of the proposed method in providing explanations both faithful to models and interpretable to humans.
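
As referenced in the Concept Activation Vectors entry above, CAVs are learnt from a probe dataset of concept exemplars. The sketch below is a rough illustration of that setup (not the authors' code): it fits a linear probe separating concept-exemplar activations from random activations and takes the probe's unit weight vector as the CAV. The synthetic activations and the stand-in gradient vector are assumptions made only to keep the example self-contained.

```python
# Minimal sketch of learning a Concept Activation Vector (CAV) from a probe
# dataset of concept exemplars. Illustrative only: synthetic activations stand
# in for a real network's internal representations.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
dim = 128                                           # width of the probed layer
concept_acts = rng.normal(0.5, 1.0, (200, dim))     # activations of concept exemplars
random_acts = rng.normal(0.0, 1.0, (200, dim))      # activations of random counterexamples

X = np.vstack([concept_acts, random_acts])
y = np.array([1] * 200 + [0] * 200)

probe = LogisticRegression(max_iter=1000).fit(X, y)
cav = probe.coef_[0] / np.linalg.norm(probe.coef_[0])  # unit normal of the decision boundary

# Conceptual sensitivity of a prediction: directional derivative of a class
# logit along the CAV, approximated here with a hypothetical gradient vector.
grad_of_logit = rng.normal(size=dim)        # stands in for d(logit)/d(activation)
sensitivity = float(grad_of_logit @ cav)    # > 0 means the concept pushes the class up
print(f"probe accuracy: {probe.score(X, y):.2f}, sensitivity: {sensitivity:+.3f}")
```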
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.