Can we Constrain Concept Bottleneck Models to Learn Semantically
Meaningful Input Features?
- URL: http://arxiv.org/abs/2402.00912v1
- Date: Thu, 1 Feb 2024 10:18:43 GMT
- Title: Can we Constrain Concept Bottleneck Models to Learn Semantically
Meaningful Input Features?
- Authors: Jack Furby, Daniel Cunnington, Dave Braines, Alun Preece
- Abstract summary: Concept Bottleneck Models (CBMs) are considered inherently interpretable because they first predict a set of human-defined concepts.
For inherent interpretability to be fully realised, we need to guarantee concepts are predicted based on semantically mapped input features.
We demonstrate that CBMs can learn concept representations with semantic mapping to input features by removing problematic concept correlations.
- Score: 0.6993232019625149
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Concept Bottleneck Models (CBMs) are considered inherently interpretable
because they first predict a set of human-defined concepts before using these
concepts to predict the output of a downstream task. For inherent
interpretability to be fully realised, and ensure trust in a model's output, we
need to guarantee concepts are predicted based on semantically mapped input
features. For example, one might expect the pixels representing a broken bone
in an image to be used for the prediction of a fracture. However, current
literature indicates this is not the case, as concept predictions are often
mapped to irrelevant input features. We hypothesise that this occurs when
concept annotations are inaccurate or how input features should relate to
concepts is unclear. In general, the effect of dataset labelling on concept
representations in CBMs remains an understudied area. Therefore, in this paper,
we examine how CBMs learn concepts from datasets with fine-grained concept
annotations. We demonstrate that CBMs can learn concept representations with
semantic mapping to input features by removing problematic concept
correlations, such as two concepts always appearing together. To support our
evaluation, we introduce a new synthetic image dataset based on a playing cards
domain, which we hope will serve as a benchmark for future CBM research. For
validation, we provide empirical evidence on a real-world dataset of chest
X-rays, to demonstrate semantically meaningful concepts can be learned in
real-world applications.
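To make the two-stage structure described in the abstract concrete, here is a minimal CBM sketch in PyTorch. It assumes a generic feature backbone, binary concepts, and joint training with a weighted concept loss; the module names and sizes are illustrative assumptions, not the authors' implementation.
```python
# Minimal Concept Bottleneck Model sketch (assumptions: PyTorch, binary concepts,
# single-label downstream task, jointly trained concept and task heads).
import torch
import torch.nn as nn

class ConceptBottleneckModel(nn.Module):
    def __init__(self, backbone: nn.Module, feat_dim: int, n_concepts: int, n_classes: int):
        super().__init__()
        self.backbone = backbone                              # x -> feature vector
        self.concept_head = nn.Linear(feat_dim, n_concepts)   # features -> concept logits
        self.task_head = nn.Linear(n_concepts, n_classes)     # concepts -> class logits

    def forward(self, x):
        feats = self.backbone(x)
        concept_logits = self.concept_head(feats)
        concept_probs = torch.sigmoid(concept_logits)  # the bottleneck: the task head sees only these
        class_logits = self.task_head(concept_probs)
        return concept_logits, class_logits

def joint_loss(concept_logits, class_logits, concept_targets, class_targets, lam=1.0):
    # Supervise both stages; `lam` trades off concept accuracy against task accuracy.
    concept_loss = nn.functional.binary_cross_entropy_with_logits(concept_logits, concept_targets)
    task_loss = nn.functional.cross_entropy(class_logits, class_targets)
    return task_loss + lam * concept_loss
```
The paper's question is whether `concept_head` ends up relying on the semantically relevant pixels for each concept (e.g. the broken-bone region for a fracture concept); as the abstract notes, standard training offers no such guarantee, especially when concepts are perfectly correlated in the annotations.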
Related papers
- Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery [52.498055901649025]
Concept Bottleneck Models (CBMs) have been proposed to address the 'black-box' problem of deep neural networks.
We propose a novel CBM approach -- called Discover-then-Name-CBM (DN-CBM) -- that inverts the typical paradigm.
Our concept extraction strategy is efficient, since it is agnostic to the downstream task, and uses concepts already known to the model.
arXiv Detail & Related papers (2024-07-19T17:50:11Z)
- On the Concept Trustworthiness in Concept Bottleneck Models [39.928868605678744]
Concept Bottleneck Models (CBMs) break down the reasoning process into the input-to-concept mapping and the concept-to-label prediction.
Despite the transparency of the concept-to-label prediction, the mapping from the input to the intermediate concept remains a black box.
A pioneering metric, referred to as concept trustworthiness score, is proposed to gauge whether the concepts are derived from relevant regions.
An enhanced CBM is introduced, enabling concept predictions to be made specifically from distinct parts of the feature map.
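A hedged sketch of the part-based idea in that summary: predict each concept from spatial locations of the feature map rather than from a single pooled vector, so one can check where the evidence for a concept came from. The pooling scheme and shapes below are illustrative assumptions, not the cited paper's design.
```python
# Sketch: per-concept predictions taken from locations of a conv feature map [B, C, H, W].
import torch
import torch.nn as nn

class SpatialConceptHead(nn.Module):
    def __init__(self, channels: int, n_concepts: int):
        super().__init__()
        # A 1x1 convolution yields one concept score per spatial location.
        self.score_map = nn.Conv2d(channels, n_concepts, kernel_size=1)

    def forward(self, feature_map):
        logits_map = self.score_map(feature_map)              # [B, K, H, W]
        flat = logits_map.flatten(2)                          # [B, K, H*W]
        concept_logits, peak_locations = flat.max(dim=2)      # each concept scored at its peak location
        return concept_logits, peak_locations, logits_map
```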
arXiv Detail & Related papers (2024-03-21T12:24:53Z)
- Energy-Based Concept Bottleneck Models: Unifying Prediction, Concept Intervention, and Probabilistic Interpretations [16.33960472610483]
Concept bottleneck models (CBMs) have been successful in providing concept-based interpretations for black-box deep learning models.
We propose Energy-based Concept Bottleneck Models (ECBMs).
Our ECBMs use a set of neural networks to define the joint energy of candidate (input, concept, class) tuples.
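The "joint energy" phrasing can be illustrated with a small, hypothetical energy head: a network that scores how compatible an (input, concept, class) combination is, with lower energy meaning more compatible. This is a generic sketch of an energy function, not the ECBM architecture.
```python
# Sketch: scalar energy over (input features, concept vector, one-hot class) triples.
import torch
import torch.nn as nn

class EnergyHead(nn.Module):
    def __init__(self, feat_dim: int, n_concepts: int, n_classes: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + n_concepts + n_classes, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),            # scalar energy; lower = more compatible triple
        )

    def forward(self, feats, concepts, class_onehot):
        triple = torch.cat([feats, concepts, class_onehot], dim=-1)
        return self.net(triple).squeeze(-1)
```
Under this view, prediction means searching for the lowest-energy class (and concept values), and a concept intervention corresponds to clamping some concept entries before that search.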
arXiv Detail & Related papers (2024-01-25T12:46:37Z)
- Do Concept Bottleneck Models Obey Locality? [14.77558378567965]
Concept-based methods explain model predictions using human-understandable concepts.
"Localities" involve using only relevant features when predicting a concept's value.
CBMs may not capture localities, even when independent concepts are localised to non-overlapping feature subsets.
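Read operationally, locality means that perturbing features outside a concept's own region should not move that concept's prediction. A hypothetical probe along those lines (the masking and noise scheme are assumptions) could look like this:
```python
# Sketch: measure how much a concept prediction reacts to features *outside* its region.
# `concept_model(x)` is assumed to return concept logits [B, K]; `mask` is 1 on relevant pixels.
import torch

@torch.no_grad()
def locality_gap(concept_model, x, mask, concept_idx, noise_scale=0.5):
    base = torch.sigmoid(concept_model(x))[:, concept_idx]
    noise = noise_scale * torch.randn_like(x) * (1 - mask)    # perturb only irrelevant pixels
    perturbed = torch.sigmoid(concept_model(x + noise))[:, concept_idx]
    return (perturbed - base).abs().mean()                    # a large gap signals a locality violation
```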
arXiv Detail & Related papers (2024-01-02T16:05:23Z)
- Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z)
- Implicit Concept Removal of Diffusion Models [92.55152501707995]
Text-to-image (T2I) diffusion models often inadvertently generate unwanted concepts such as watermarks and unsafe images.
We present Geom-Erasing, a novel concept removal method based on geometric-driven control.
arXiv Detail & Related papers (2023-10-09T17:13:10Z)
- Learn to explain yourself, when you can: Equipping Concept Bottleneck Models with the ability to abstain on their concept predictions [21.94901195358998]
We show how to equip a neural network-based classifier with the ability to abstain from predicting concepts when the concept labeling component is uncertain.
Our model learns to provide rationales for its predictions, but only whenever it is sure the rationale is correct.
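The simplest way to picture abstention is a confidence threshold on each concept probability; the cited model learns when to abstain and to justify itself, so the fixed threshold here is only an illustrative stand-in.
```python
# Sketch: abstain on concept predictions whose confidence falls below a threshold (assumed fixed).
import torch

def concepts_with_abstention(concept_logits, threshold=0.9):
    probs = torch.sigmoid(concept_logits)
    confidence = torch.maximum(probs, 1 - probs)   # distance from the 0.5 decision point
    hard_predictions = (probs > 0.5).float()
    abstain = confidence < threshold               # True where the model should stay silent
    return hard_predictions, abstain
```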
arXiv Detail & Related papers (2022-11-21T18:07:14Z)
- Concept Activation Regions: A Generalized Framework For Concept-Based Explanations [95.94432031144716]
Existing methods assume that the examples illustrating a concept are mapped in a fixed direction of the deep neural network's latent space.
In this work, we propose allowing concept examples to be scattered across different clusters in the DNN's latent space.
This concept activation region (CAR) formalism yields global concept-based explanations and local concept-based feature importance.
arXiv Detail & Related papers (2022-09-22T17:59:03Z)
- Concept Gradient: Concept-based Interpretation Without Linear Assumption [77.96338722483226]
Concept Activation Vector (CAV) relies on learning a linear relation between some latent representation of a given model and concepts.
We proposed Concept Gradient (CG), extending concept-based interpretation beyond linear concept functions.
We demonstrated CG outperforms CAV in both toy examples and real world datasets.
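For context on the CAV baseline mentioned above: a concept activation vector is typically obtained by fitting a linear classifier that separates a concept's example activations from random activations at some layer, then taking the weight vector as the concept direction. A small sketch (the layer choice and classifier settings are assumptions):
```python
# Sketch: compute a Concept Activation Vector from [N, D] layer activations.
import numpy as np
from sklearn.linear_model import LogisticRegression

def compute_cav(concept_acts: np.ndarray, random_acts: np.ndarray) -> np.ndarray:
    X = np.concatenate([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    cav = clf.coef_[0]
    return cav / np.linalg.norm(cav)   # unit vector pointing towards the concept in latent space
```
Concept Gradient, as summarised above, drops the assumption that this relation must be linear.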
arXiv Detail & Related papers (2022-08-31T17:06:46Z)
- Post-hoc Concept Bottleneck Models [11.358495577593441]
Concept Bottleneck Models (CBMs) map the inputs onto a set of interpretable concepts and use the concepts to make predictions.
CBMs are restrictive in practice as they require concept labels in the training data to learn the bottleneck and do not leverage strong pretrained models.
We show that we can turn any neural network into a Post-hoc CBM (PCBM) without sacrificing model performance while still retaining interpretability benefits.
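The post-hoc recipe can be pictured as: freeze a pretrained backbone, map its embeddings to concept scores, and fit an interpretable (e.g. linear) predictor on those scores. The concrete layers below are assumptions for illustration, not the PCBM implementation.
```python
# Sketch: wrap a frozen pretrained backbone with a concept layer and a linear predictor.
import torch
import torch.nn as nn

class PostHocCBM(nn.Module):
    def __init__(self, backbone: nn.Module, feat_dim: int, n_concepts: int, n_classes: int):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad_(False)                            # the backbone stays fixed
        self.to_concepts = nn.Linear(feat_dim, n_concepts)     # e.g. projections onto concept directions
        self.classifier = nn.Linear(n_concepts, n_classes)     # interpretable final layer

    def forward(self, x):
        with torch.no_grad():
            feats = self.backbone(x)
        concepts = self.to_concepts(feats)
        return concepts, self.classifier(concepts)
```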
arXiv Detail & Related papers (2022-05-31T00:29:26Z)
- Concept Bottleneck Models [79.91795150047804]
State-of-the-art models today do not typically support the manipulation of concepts like "the existence of bone spurs".
We revisit the classic idea of first predicting concepts that are provided at training time, and then using these concepts to predict the label.
On x-ray grading and bird identification, concept bottleneck models achieve competitive accuracy with standard end-to-end models.
arXiv Detail & Related papers (2020-07-09T07:47:28Z)