Learning Interpretable Concept-Based Models with Human Feedback
- URL: http://arxiv.org/abs/2012.02898v1
- Date: Fri, 4 Dec 2020 23:41:05 GMT
- Title: Learning Interpretable Concept-Based Models with Human Feedback
- Authors: Isaac Lage, Finale Doshi-Velez
- Abstract summary: We propose an approach for learning a set of transparent concept definitions in high-dimensional data that relies on users labeling concept features.
Our method produces concepts that both align with users' intuitive sense of what a concept means, and facilitate prediction of the downstream label by a transparent machine learning model.
- Score: 36.65337734891338
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning models that first learn a representation of a domain in
terms of human-understandable concepts, then use it to make predictions, have
been proposed to facilitate interpretation and interaction with models trained
on high-dimensional data. However, these methods have important limitations: the
way they define concepts is not inherently interpretable, and they assume that
concept labels either exist for individual instances or can easily be acquired
from users. These limitations are particularly acute for high-dimensional
tabular features. We propose an approach for learning a set of transparent
concept definitions in high-dimensional tabular data that relies on users
labeling concept features instead of individual instances. Our method produces
concepts that both align with users' intuitive sense of what a concept means,
and facilitate prediction of the downstream label by a transparent machine
learning model. This ensures that the full model is transparent and intuitive,
and as predictive as possible given this constraint. We demonstrate with
simulated user feedback on real prediction problems, including one in a
clinical domain, that this kind of direct feedback is much more efficient at
learning solutions that align with ground truth concept definitions than
alternative transparent approaches that rely on labeling instances or other
existing interaction mechanisms, while maintaining similar predictive
performance.
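The abstract describes the approach only at a high level, so the following is a minimal sketch, not the paper's actual algorithm, of the structure it outlines: transparent concept definitions over tabular features, derived from feature-level user feedback, feeding a transparent downstream predictor. The concept names, the feature-to-concept mapping, the aggregation rule, and the data are all illustrative assumptions.
```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic high-dimensional tabular data: 500 rows, 20 binary features.
X = rng.integers(0, 2, size=(500, 20)).astype(float)

# Hypothetical user feedback: which feature indices belong to each concept.
# In the paper this feedback is elicited from users; here it is hard-coded.
concept_features = {
    "infection_signs": [0, 1, 2, 3],
    "kidney_injury": [4, 5, 6],
    "hemodynamic_instability": [7, 8, 9, 10],
}

def concept_scores(X, concept_features):
    """Transparent concept definitions: each concept score is the fraction of its
    user-labeled features that are active, so the definition can be read off directly."""
    return np.column_stack([X[:, idx].mean(axis=1) for idx in concept_features.values()])

C = concept_scores(X, concept_features)

# Simulated downstream label driven by the first two concepts (for illustration only).
y = (C[:, 0] + C[:, 1] + 0.1 * rng.standard_normal(len(C)) > 0.8).astype(int)

# Transparent downstream model: a small logistic regression over the concept scores,
# so both stages of the pipeline remain inspectable.
clf = LogisticRegression().fit(C, y)
print(dict(zip(concept_features, clf.coef_[0].round(2))))
```
Both stages stay interpretable by construction: the concept definitions are simple aggregates over user-named feature groups, and the downstream model exposes one coefficient per concept.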
Related papers
- Sample-efficient Learning of Concepts with Theoretical Guarantees: from Data to Concepts without Interventions [7.3784937557132855]
Concept-based models (CBMs) learn interpretable concepts from high-dimensional data (e.g., images), which are then used to predict labels.
An important issue in CBMs is concept leakage, i.e., spurious information in the learned concepts, which effectively leads to learning "wrong" concepts.
We describe a framework that provides theoretical guarantees on the correctness of the learned concepts and on the number of required labels.
arXiv Detail & Related papers (2025-02-10T15:01:56Z) - Diverse Concept Proposals for Concept Bottleneck Models [23.395270888378594]
Concept bottleneck models are interpretable predictive models that are often used in domains where model trust is a key priority, such as healthcare.
Our proposed approach identifies a number of predictive concepts that explain the data.
By offering multiple alternative explanations, we allow the human expert to choose the one that best aligns with their expectation.
arXiv Detail & Related papers (2024-12-24T00:12:34Z) - Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery [52.498055901649025]
Concept Bottleneck Models (CBMs) have been proposed to address the 'black-box' problem of deep neural networks.
We propose a novel CBM approach -- called Discover-then-Name-CBM (DN-CBM) -- that inverts the typical paradigm.
Our concept extraction strategy is efficient, since it is agnostic to the downstream task, and uses concepts already known to the model.
arXiv Detail & Related papers (2024-07-19T17:50:11Z) - Improving Intervention Efficacy via Concept Realignment in Concept Bottleneck Models [57.86303579812877]
Concept Bottleneck Models (CBMs) ground image classification on human-understandable concepts to allow for interpretable model decisions.
Existing approaches often require numerous human interventions per image to achieve strong performance.
We introduce a trainable concept realignment intervention module, which leverages concept relations to realign concept assignments post-intervention.
arXiv Detail & Related papers (2024-05-02T17:59:01Z) - Can we Constrain Concept Bottleneck Models to Learn Semantically Meaningful Input Features? [0.6401548653313325]
Concept Bottleneck Models (CBMs) are regarded as inherently interpretable because they first predict a set of human-defined concepts.
Current literature suggests that concept predictions often rely on irrelevant input features.
In this paper, we demonstrate that CBMs can learn to map concepts to semantically meaningful input features.
arXiv Detail & Related papers (2024-02-01T10:18:43Z) - Beyond Concept Bottleneck Models: How to Make Black Boxes Intervenable? [8.391254800873599]
We introduce a method to perform concept-based interventions on pretrained neural networks, which are not interpretable by design.
We formalise the notion of intervenability as a measure of the effectiveness of concept-based interventions and leverage this definition to fine-tune black boxes.
arXiv Detail & Related papers (2024-01-24T16:02:14Z) - Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z) - Concept Gradient: Concept-based Interpretation Without Linear Assumption [77.96338722483226]
Concept Activation Vector (CAV) relies on learning a linear relation between some latent representation of a given model and concepts.
We propose Concept Gradient (CG), extending concept-based interpretation beyond linear concept functions; a minimal CAV-style sketch follows this list for context.
We demonstrate that CG outperforms CAV in both toy examples and real-world datasets.
arXiv Detail & Related papers (2022-08-31T17:06:46Z) - Discovering Concepts in Learned Representations using Statistical Inference and Interactive Visualization [0.76146285961466]
Concept discovery is important for bridging the gap between non-deep learning experts and model end-users.
Current approaches include hand-crafting concept datasets and then converting them to latent space directions.
In this study, we offer two further approaches to guide user discovery of meaningful concepts: one based on multiple hypothesis testing, and another on interactive visualization.
arXiv Detail & Related papers (2022-02-09T22:29:48Z) - Translational Concept Embedding for Generalized Compositional Zero-shot Learning [73.60639796305415]
Generalized compositional zero-shot learning aims to learn composed concepts of attribute-object pairs in a zero-shot fashion.
This paper introduces a new approach, termed translational concept embedding, to solve these two difficulties in a unified framework.
arXiv Detail & Related papers (2021-12-20T21:27:51Z)
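For context on the CAV baseline mentioned in the Concept Gradient entry above, here is a minimal, illustrative sketch of the idea: a linear probe fit on latent activations whose weight vector serves as the concept direction. The activations below are synthetic placeholders, and this is not the Concept Gradient method itself, which replaces the linear probe with a nonlinear concept function and uses its gradient.
```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Placeholder latent activations from some layer of a trained model (64-dim),
# for 200 examples of a concept and 200 random counterexamples.
concept_acts = rng.normal(loc=0.5, scale=1.0, size=(200, 64))
random_acts = rng.normal(loc=0.0, scale=1.0, size=(200, 64))

Z = np.vstack([concept_acts, random_acts])
labels = np.concatenate([np.ones(200), np.zeros(200)])

# The CAV is the (normalized) weight vector of a linear classifier trained to
# separate concept activations from random ones in activation space.
probe = LogisticRegression(max_iter=1000).fit(Z, labels)
cav = probe.coef_[0] / np.linalg.norm(probe.coef_[0])

# Concept sensitivity is then assessed by projecting onto this direction
# (here simply activations; TCAV projects gradients of the model output).
print(concept_acts[0] @ cav, random_acts[0] @ cav)
```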
This list is automatically generated from the titles and abstracts of the papers on this site.