Diverse Concept Proposals for Concept Bottleneck Models
- URL: http://arxiv.org/abs/2412.18059v1
- Date: Tue, 24 Dec 2024 00:12:34 GMT
- Title: Diverse Concept Proposals for Concept Bottleneck Models
- Authors: Katrina Brown, Marton Havasi, Finale Doshi-Velez
- Abstract summary: Concept bottleneck models are interpretable predictive models that are often used in domains where model trust is a key priority, such as healthcare.
Our proposed approach identifies a number of predictive concepts that explain the data.
By offering multiple alternative explanations, we allow the human expert to choose the one that best aligns with their expectation.
- Score: 23.395270888378594
- Abstract: Concept bottleneck models are interpretable predictive models that are often used in domains where model trust is a key priority, such as healthcare. They identify a small number of human-interpretable concepts in the data, which they then use to make predictions. Learning relevant concepts from data proves to be a challenging task: the most predictive concepts may not align with expert intuition, and when this happens interpretability fails with no recourse. Our proposed approach identifies a number of predictive concepts that explain the data. By offering multiple alternative explanations, we allow the human expert to choose the one that best aligns with their expectations. To demonstrate our method, we show that it is able to discover all possible concept representations on a synthetic dataset. On EHR data, our model was able to identify 4 out of the 5 pre-defined concepts without supervision.
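The core idea, learning several alternative concept representations that are each predictive but differ from one another, can be illustrated with a small sketch. The code below is not the authors' implementation; it assumes a simplified setup in which multiple concept-bottleneck heads are trained jointly on the same inputs and a hypothetical decorrelation penalty pushes their concept activations apart. All class names, the penalty, and the toy data are illustrative assumptions.

```python
# Minimal sketch (not the paper's method): several concept-bottleneck heads
# are trained to stay predictive while a decorrelation penalty encourages
# each head to settle on a different concept representation.
import torch
import torch.nn as nn

class ConceptBottleneckHead(nn.Module):
    def __init__(self, n_features: int, n_concepts: int, n_classes: int):
        super().__init__()
        self.to_concepts = nn.Linear(n_features, n_concepts)  # x -> concepts
        self.to_label = nn.Linear(n_concepts, n_classes)      # concepts -> y

    def forward(self, x):
        c = torch.sigmoid(self.to_concepts(x))  # concept activations in [0, 1]
        return c, self.to_label(c)

def decorrelation_penalty(c_a, c_b):
    """Penalize correlation between two heads' concept activations."""
    a = (c_a - c_a.mean(0)) / (c_a.std(0) + 1e-6)
    b = (c_b - c_b.mean(0)) / (c_b.std(0) + 1e-6)
    corr = (a.T @ b) / a.shape[0]               # cross-correlation matrix
    return corr.pow(2).mean()

def training_step(heads, x, y, lam=0.1):
    ce = nn.CrossEntropyLoss()
    concepts, loss = [], 0.0
    for head in heads:
        c, logits = head(x)
        concepts.append(c)
        loss = loss + ce(logits, y)             # every head must stay predictive
    for i in range(len(concepts)):              # pairwise diversity penalty
        for j in range(i + 1, len(concepts)):
            loss = loss + lam * decorrelation_penalty(concepts[i], concepts[j])
    return loss

if __name__ == "__main__":
    torch.manual_seed(0)
    x = torch.randn(64, 20)                     # toy batch: 20 features
    y = torch.randint(0, 2, (64,))              # binary labels
    heads = [ConceptBottleneckHead(20, 5, 2) for _ in range(3)]
    opt = torch.optim.Adam([p for h in heads for p in h.parameters()], lr=1e-2)
    opt.zero_grad()
    loss = training_step(heads, x, y)
    loss.backward()
    opt.step()
    print(f"combined loss: {loss.item():.3f}")
```

In this toy version, each head plays the role of one "concept proposal", and a human expert would inspect the heads after training and keep the one whose concepts match their intuition.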
Related papers
- An Axiomatic Approach to Model-Agnostic Concept Explanations [67.84000759813435]
We propose an approach to concept explanations that satisfy three natural axioms: linearity, recursivity, and similarity.
We then establish connections with previous concept explanation methods, offering insight into their varying semantic meanings.
arXiv Detail & Related papers (2024-01-12T20:53:35Z) - Estimation of Concept Explanations Should be Uncertainty Aware [39.598213804572396]
We study a specific kind of explanation, called Concept Explanations, where the goal is to interpret a model using human-understandable concepts.
Although popular for their easy interpretation, concept explanations are known to be noisy.
We propose an uncertainty-aware Bayesian estimation method to address these issues, which readily improved the quality of explanations.
arXiv Detail & Related papers (2023-12-13T11:17:27Z) - Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z) - Uncovering Unique Concept Vectors through Latent Space Decomposition [0.0]
Concept-based explanations have emerged as a superior approach that is more interpretable than feature attribution estimates.
We propose a novel post-hoc unsupervised method that automatically uncovers the concepts learned by deep models during training.
Our experiments reveal that the majority of our concepts are readily understandable to humans, exhibit coherency, and bear relevance to the task at hand.
arXiv Detail & Related papers (2023-07-13T17:21:54Z) - Statistically Significant Concept-based Explanation of Image Classifiers via Model Knockoffs [22.576922942465142]
Concept-based explanations may produce false positives, mistakenly treating unrelated concepts as important for the prediction task.
We propose a method that uses a deep learning model to learn the image concepts and then uses Knockoff samples to select the concepts that are important for prediction.
arXiv Detail & Related papers (2023-05-27T05:40:05Z) - Concept Gradient: Concept-based Interpretation Without Linear Assumption [77.96338722483226]
Concept Activation Vector (CAV) relies on learning a linear relation between some latent representation of a given model and concepts.
We proposed Concept Gradient (CG), extending concept-based interpretation beyond linear concept functions (a toy contrast with CAV's linear probe is sketched after this list).
We demonstrated CG outperforms CAV in both toy examples and real world datasets.
arXiv Detail & Related papers (2022-08-31T17:06:46Z) - Overlooked factors in concept-based explanations: Dataset choice, concept learnability, and human capability [25.545486537295144]
Concept-based interpretability methods aim to explain deep neural network model predictions using a predefined set of semantic concepts.
Despite their popularity, they suffer from limitations that are not well understood or clearly articulated in the literature.
We analyze three commonly overlooked factors in concept-based explanations.
arXiv Detail & Related papers (2022-07-20T01:59:39Z) - Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z) - A Minimalist Dataset for Systematic Generalization of Perception, Syntax, and Semantics [131.93113552146195]
We present a new dataset, Handwritten arithmetic with INTegers (HINT), to examine machines' capability of learning generalizable concepts.
In HINT, machines are tasked with learning how concepts are perceived from raw signals such as images.
We undertake extensive experiments with various sequence-to-sequence models, including RNNs, Transformers, and GPT-3.
arXiv Detail & Related papers (2021-03-02T01:32:54Z) - Learning Interpretable Concept-Based Models with Human Feedback [36.65337734891338]
We propose an approach for learning a set of transparent concept definitions in high-dimensional data that relies on users labeling concept features.
Our method produces concepts that both align with users' intuitive sense of what a concept means, and facilitate prediction of the downstream label by a transparent machine learning model.
arXiv Detail & Related papers (2020-12-04T23:41:05Z) - Concept Bottleneck Models [79.91795150047804]
State-of-the-art models today do not typically support the manipulation of concepts like "the existence of bone spurs".
We revisit the classic idea of first predicting concepts that are provided at training time, and then using these concepts to predict the label (a minimal sketch of this two-stage setup appears after this list).
On x-ray grading and bird identification, concept bottleneck models achieve competitive accuracy with standard end-to-end models.
arXiv Detail & Related papers (2020-07-09T07:47:28Z)
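The two-stage architecture described in the last entry above can be written down compactly. The sketch below is an illustrative reconstruction, not the paper's released code: a linear concept predictor and a linear label predictor are trained jointly on toy data with concept annotations, and a single predicted concept is then overridden at test time to mimic the concept manipulation the entry mentions. The dimensions and toy data are assumptions.

```python
# Minimal sketch of a concept bottleneck model (not the paper's code):
# stage 1 predicts human-labeled concepts from the input, stage 2 predicts
# the label from those concepts, and a predicted concept can be edited
# (e.g. "bone spurs present") before the label is recomputed.
import torch
import torch.nn as nn

n_features, n_concepts, n_classes = 30, 5, 4

concept_net = nn.Linear(n_features, n_concepts)   # x -> concept logits
label_net = nn.Linear(n_concepts, n_classes)      # concepts -> label logits

# Toy training data with concept annotations available at training time.
torch.manual_seed(0)
x = torch.randn(128, n_features)
c_true = (torch.randn(128, n_concepts) > 0).float()   # binary concept labels
y_true = torch.randint(0, n_classes, (128,))

opt = torch.optim.Adam(
    list(concept_net.parameters()) + list(label_net.parameters()), lr=1e-2
)
bce, ce = nn.BCEWithLogitsLoss(), nn.CrossEntropyLoss()

for _ in range(200):
    opt.zero_grad()
    c_logits = concept_net(x)
    y_logits = label_net(torch.sigmoid(c_logits))
    loss = bce(c_logits, c_true) + ce(y_logits, y_true)   # joint bottleneck loss
    loss.backward()
    opt.step()

# Test-time concept intervention: force concept 0 "on" and re-predict.
with torch.no_grad():
    c_hat = torch.sigmoid(concept_net(x[:1]))
    print("label before intervention:", label_net(c_hat).argmax(-1).item())
    c_hat[0, 0] = 1.0
    print("label after intervention: ", label_net(c_hat).argmax(-1).item())
```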
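Similarly, the Concept Gradient entry above contrasts CAV's linear concept probe with nonlinear concept functions. The toy sketch below is an illustrative reconstruction under simplifying assumptions, not the paper's implementation: it fits a linear probe (whose weight vector plays the role of a CAV) and a small MLP concept function on the same activations, then takes the MLP's input gradient at a point, the quantity a gradient-based method can use in place of a single fixed linear direction.

```python
# Toy contrast: a fixed linear concept direction (CAV-style) vs. the input
# gradient of a nonlinear concept function. Illustrative assumptions only.
import torch
import torch.nn as nn

torch.manual_seed(0)
d_latent = 16
h = torch.randn(200, d_latent)                      # latent activations of some model
concept = ((h[:, 0] * h[:, 1]) > 0).float()         # a concept that is NOT linear in h
bce = nn.BCEWithLogitsLoss()

# CAV-style: fit a linear probe; its weight vector is the concept direction.
probe = nn.Linear(d_latent, 1)
opt = torch.optim.Adam(probe.parameters(), lr=1e-2)
for _ in range(300):
    opt.zero_grad()
    bce(probe(h).squeeze(-1), concept).backward()
    opt.step()
cav = probe.weight.detach().squeeze(0)              # one fixed direction for all inputs

# Nonlinear concept function g(h): its input gradient varies with h.
g = nn.Sequential(nn.Linear(d_latent, 32), nn.Tanh(), nn.Linear(32, 1))
opt_g = torch.optim.Adam(g.parameters(), lr=1e-2)
for _ in range(300):
    opt_g.zero_grad()
    bce(g(h).squeeze(-1), concept).backward()
    opt_g.step()

h0 = h[:1].clone().requires_grad_(True)
g(h0).sum().backward()                              # populates h0.grad
print("CAV direction (first 4 dims):   ", cav[:4].tolist())
print("concept gradient at h0 (first 4):", h0.grad[0, :4].tolist())
```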
This list is automatically generated from the titles and abstracts of the papers on this site.