Learn to explain yourself, when you can: Equipping Concept Bottleneck
Models with the ability to abstain on their concept predictions
- URL: http://arxiv.org/abs/2211.11690v1
- Date: Mon, 21 Nov 2022 18:07:14 GMT
- Authors: Joshua Lockhart, Daniele Magazzeni, Manuela Veloso
- Abstract summary: We show how to equip a neural-network-based classifier with the ability to abstain from predicting concepts when the concept labeling component is uncertain.
Our model learns to provide rationales for its predictions, but only when it is sure the rationale is correct.
- Score: 21.94901195358998
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Concept Bottleneck Models (CBMs) of Koh et al. [2020] provide a means to
ensure that a neural-network-based classifier bases its predictions solely on
human-understandable concepts. The concept labels, or rationales as we refer to
them, are learned by the concept labeling component of the CBM. Another
component learns to predict the target classification label from these
predicted concept labels. Unfortunately, these models are heavily reliant on
human-provided concept labels for each datapoint. To enable CBMs to behave
robustly when these labels are not readily available, we show how to equip them
with the ability to abstain from predicting concepts when the concept labeling
component is uncertain. In other words, our model learns to provide rationales
for its predictions, but only when it is sure the rationale is correct.
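A minimal sketch of the abstention mechanism described in the abstract (the confidence threshold, the abstain sentinel, and the toy downstream scorer are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

ABSTAIN = -1.0              # sentinel meaning "no rationale offered for this concept"
CONFIDENCE_THRESHOLD = 0.9  # assumed cutoff; the paper learns when to abstain

def predict_concepts(concept_probs: np.ndarray) -> np.ndarray:
    """Binarize concept probabilities, abstaining when the labeler is uncertain.

    concept_probs: shape (n_concepts,), outputs of the concept labeling component.
    Returns a vector over {0.0, 1.0, ABSTAIN}.
    """
    confident = np.maximum(concept_probs, 1.0 - concept_probs) >= CONFIDENCE_THRESHOLD
    hard_labels = (concept_probs >= 0.5).astype(float)
    return np.where(confident, hard_labels, ABSTAIN)

def predict_target(rationale: np.ndarray, weights: np.ndarray, bias: float) -> int:
    """Toy downstream predictor: linear score over the observed concepts only."""
    observed = rationale != ABSTAIN
    score = float(rationale[observed] @ weights[observed]) + bias
    return int(score > 0)

# Example: the third concept is uncertain (p = 0.55), so the model abstains on it.
probs = np.array([0.97, 0.03, 0.55])
rationale = predict_concepts(probs)   # -> [1., 0., -1.]
label = predict_target(rationale, weights=np.array([1.0, -1.0, 0.5]), bias=-0.5)
print(rationale, label)
```

The target label is still produced; the offered rationale simply omits any concept the labeling component was unsure about.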
Related papers
- Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery [52.498055901649025]
Concept Bottleneck Models (CBMs) have been proposed to address the 'black-box' problem of deep neural networks.
We propose a novel CBM approach -- called Discover-then-Name-CBM (DN-CBM) -- that inverts the typical paradigm.
Our concept extraction strategy is efficient, since it is agnostic to the downstream task, and uses concepts already known to the model.
arXiv Detail & Related papers (2024-07-19T17:50:11Z)
- Semi-supervised Concept Bottleneck Models [9.875244481114489]
We propose a new framework called SSCBM (Semi-supervised Concept Bottleneck Model).
Our SSCBM is suitable for practical situations where annotated data is scarce.
With only 20% of the data labeled, SSCBM achieves 93.19% concept accuracy and 75.51% prediction accuracy (versus 79.82% in a fully supervised setting).
arXiv Detail & Related papers (2024-06-27T08:33:35Z)
- On the Concept Trustworthiness in Concept Bottleneck Models [39.928868605678744]
Concept Bottleneck Models (CBMs) break down the reasoning process into the input-to-concept mapping and the concept-to-label prediction.
Despite the transparency of the concept-to-label prediction, the mapping from the input to the intermediate concept remains a black box.
A pioneering metric, the concept trustworthiness score, is proposed to gauge whether the concepts are derived from relevant input regions.
An enhanced CBM is introduced, enabling concept predictions to be made specifically from distinct parts of the feature map.
arXiv Detail & Related papers (2024-03-21T12:24:53Z)
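A hedged sketch of the part-based idea in the entry above: each concept is predicted from its own spatial region of a convolutional feature map, so a concept prediction is traceable to a distinct part of the input. The fixed quadrant masks and linear heads below are assumptions for illustration; the paper's model learns these components.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, C, N_CONCEPTS = 7, 7, 32, 4
feature_map = rng.normal(size=(H, W, C))  # stand-in for a CNN's output

# One spatial mask per concept (fixed quadrants here; a real model learns them).
masks = np.zeros((N_CONCEPTS, H, W))
masks[0, :4, :4] = 1.0
masks[1, :4, 4:] = 1.0
masks[2, 4:, :4] = 1.0
masks[3, 4:, 4:] = 1.0

concept_weights = rng.normal(size=(N_CONCEPTS, C))  # per-concept linear heads

def predict_concepts(fmap: np.ndarray) -> np.ndarray:
    """Masked average-pool per concept, then a per-concept sigmoid head."""
    probs = np.empty(N_CONCEPTS)
    for k in range(N_CONCEPTS):
        region = (fmap * masks[k][..., None]).sum(axis=(0, 1)) / masks[k].sum()
        probs[k] = 1.0 / (1.0 + np.exp(-(region @ concept_weights[k])))
    return probs

print(predict_concepts(feature_map))  # one probability per concept, one per part
```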
- Can we Constrain Concept Bottleneck Models to Learn Semantically Meaningful Input Features? [0.6401548653313325]
Concept Bottleneck Models (CBMs) are regarded as inherently interpretable because they first predict a set of human-defined concepts.
Current literature suggests that concept predictions often rely on irrelevant input features.
In this paper, we demonstrate that CBMs can learn to map concepts to semantically meaningful input features.
arXiv Detail & Related papers (2024-02-01T10:18:43Z)
- Energy-Based Concept Bottleneck Models: Unifying Prediction, Concept Intervention, and Probabilistic Interpretations [15.23014992362639]
Concept bottleneck models (CBMs) have been successful in providing concept-based interpretations for black-box deep learning models.
We propose Energy-based Concept Bottleneck Models (ECBMs).
Our ECBMs use a set of neural networks to define the joint energy of candidate (input, concept, class) tuples.
arXiv Detail & Related papers (2024-01-25T12:46:37Z)
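A schematic reading of the ECBM summary above, offered as a hedged sketch: energy heads score candidate (input, concept, class) tuples, and inference minimizes total energy. The linear heads and the exhaustive search are stand-in assumptions, feasible only for tiny concept spaces.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
N_CONCEPTS, N_CLASSES, FEAT = 3, 2, 8

# Toy "networks": random linear heads standing in for the learned ones.
W_concept = rng.normal(size=(N_CONCEPTS, FEAT))     # input-vs-concept energy
W_class = rng.normal(size=(N_CLASSES, N_CONCEPTS))  # concept-vs-class energy

def energy(x: np.ndarray, c: np.ndarray, y: int) -> float:
    """Joint energy of an (input, concept, class) tuple; lower means better fit."""
    e_concept = np.sum((W_concept @ x - c) ** 2)  # do the concepts fit the input?
    e_class = np.sum((W_class[y] - c) ** 2)       # does the class fit the concepts?
    return float(e_concept + e_class)

def predict(x: np.ndarray):
    """Inference by exhaustive minimization over all (concept, class) candidates."""
    candidates = itertools.product(
        itertools.product([0.0, 1.0], repeat=N_CONCEPTS), range(N_CLASSES))
    c, y = min(candidates, key=lambda cy: energy(x, np.array(cy[0]), cy[1]))
    return np.array(c), y

concepts, label = predict(rng.normal(size=FEAT))
print(concepts, label)
```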
- Simple Mechanisms for Representing, Indexing and Manipulating Concepts [46.715152257557804]
We argue that a concept can be learned by looking at its moment statistics matrix to generate a concrete representation, or signature, of that concept.
When concepts are 'intersected', their signatures can be used to find a common theme across a number of related 'intersected' concepts.
arXiv Detail & Related papers (2023-10-18T17:54:29Z)
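One hedged reading of the moment-statistics idea above: represent a concept by the second-moment matrix of embeddings of its illustrating examples, take top eigenvectors as a compact signature, and compare signatures by subspace overlap to surface a common theme. All details below are assumptions for illustration.

```python
import numpy as np

def concept_signature(embeddings: np.ndarray, k: int = 1) -> np.ndarray:
    """Top-k eigenvectors of the moment matrix M = E[v v^T] as a signature."""
    M = embeddings.T @ embeddings / len(embeddings)  # (d, d) moment matrix
    _, eigvecs = np.linalg.eigh(M)                   # eigenvalues ascending
    return eigvecs[:, -k:]                           # (d, k) signature

def signature_overlap(sig_a: np.ndarray, sig_b: np.ndarray) -> float:
    """Subspace overlap in [0, 1]; high overlap suggests a shared theme."""
    s = np.linalg.svd(sig_a.T @ sig_b, compute_uv=False)
    return float(np.mean(s ** 2))

# Two concepts built around the same latent theme should overlap strongly.
rng = np.random.default_rng(0)
theme = rng.normal(size=(1, 16))
cat_embeddings = theme + 0.1 * rng.normal(size=(100, 16))
dog_embeddings = theme + 0.1 * rng.normal(size=(100, 16))
print(signature_overlap(concept_signature(cat_embeddings),
                        concept_signature(dog_embeddings)))  # close to 1
```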
- Towards learning to explain with concept bottleneck models: mitigating information leakage [19.52933192442871]
Concept bottleneck models perform classification by first predicting which of a list of human-provided concepts are true about a datapoint.
A downstream model uses these predicted concept labels to predict the target label.
The predicted concepts act as a rationale for the target prediction.
arXiv Detail & Related papers (2022-11-07T16:10:36Z)
- Concept Activation Regions: A Generalized Framework For Concept-Based Explanations [95.94432031144716]
Existing methods assume that the examples illustrating a concept are mapped in a fixed direction of the deep neural network's latent space.
In this work, we propose allowing concept examples to be scattered across different clusters in the DNN's latent space.
This concept activation region (CAR) formalism yields global concept-based explanations and local concept-based feature importance.
arXiv Detail & Related papers (2022-09-22T17:59:03Z)
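The CAR entry above relaxes the single-direction assumption of CAV-style methods. A hedged sketch of the region idea (the k-means centroids and the nearest-cluster rule are illustrative assumptions):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Concept examples scattered across two separate modes of the latent space.
concept_acts = np.vstack([rng.normal(loc=+3.0, size=(50, 8)),
                          rng.normal(loc=-3.0, size=(50, 8))])
random_acts = rng.normal(loc=0.0, size=(100, 8))  # non-concept baseline

pos = KMeans(n_clusters=2, n_init=10, random_state=0).fit(concept_acts)
neg = KMeans(n_clusters=2, n_init=10, random_state=0).fit(random_acts)

def in_concept_region(z: np.ndarray) -> bool:
    """True if z lies closer to a concept cluster than to any baseline cluster."""
    d_pos = np.min(np.linalg.norm(pos.cluster_centers_ - z, axis=1))
    d_neg = np.min(np.linalg.norm(neg.cluster_centers_ - z, axis=1))
    return d_pos < d_neg

print(in_concept_region(rng.normal(loc=-3.0, size=8)))  # likely True
```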
- Concept Gradient: Concept-based Interpretation Without Linear Assumption [77.96338722483226]
Concept Activation Vector (CAV) relies on learning a linear relation between some latent representation of a given model and concepts.
We propose Concept Gradient (CG), extending concept-based interpretation beyond linear concept functions.
We demonstrate that CG outperforms CAV in both toy examples and real-world datasets.
arXiv Detail & Related papers (2022-08-31T17:06:46Z)
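For contrast with CG, a hedged sketch of the linear CAV baseline it generalizes: fit a linear classifier separating concept examples from random examples in a layer's latent space, take its normal vector as the concept direction, and read concept sensitivity off a directional derivative. The shapes and data are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
concept_latents = rng.normal(loc=1.0, size=(200, 16))  # activations with concept
random_latents = rng.normal(loc=0.0, size=(200, 16))   # random counterexamples

X = np.vstack([concept_latents, random_latents])
y = np.array([1] * 200 + [0] * 200)

# The CAV is the (unit-normalized) normal vector of the separating hyperplane.
cav = LogisticRegression(max_iter=1000).fit(X, y).coef_[0]
cav /= np.linalg.norm(cav)

def concept_sensitivity(grad_output_wrt_latent: np.ndarray) -> float:
    """Directional derivative of the model output along the concept direction."""
    return float(grad_output_wrt_latent @ cav)

print(concept_sensitivity(rng.normal(size=16)))
```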
- Automatic Concept Extraction for Concept Bottleneck-based Video Classification [58.11884357803544]
We present an automatic Concept Discovery and Extraction module that rigorously composes a necessary and sufficient set of concept abstractions for concept-based video classification.
Our method elicits inherent complex concept abstractions in natural language to generalize concept-bottleneck methods to complex tasks.
arXiv Detail & Related papers (2022-06-21T06:22:35Z)
- Concept Bottleneck Models [79.91795150047804]
State-of-the-art models today do not typically support the manipulation of concepts like "the existence of bone spurs".
We revisit the classic idea of first predicting concepts that are provided at training time, and then using these concepts to predict the label.
On x-ray grading and bird identification, concept bottleneck models achieve competitive accuracy with standard end-to-end models.
arXiv Detail & Related papers (2020-07-09T07:47:28Z)