Towards Concept-based Interpretability of Skin Lesion Diagnosis using Vision-Language Models
- URL: http://arxiv.org/abs/2311.14339v2
- Date: Wed, 6 Mar 2024 14:23:38 GMT
- Authors: Cristiano Patrício, Luís F. Teixeira, João C. Neves
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Concept-based models naturally lend themselves to the development of
inherently interpretable skin lesion diagnosis, as medical experts make
decisions based on a set of visual patterns of the lesion. Nevertheless, the
development of these models depends on the existence of concept-annotated
datasets, whose availability is scarce due to the specialized knowledge and
expertise required in the annotation process. In this work, we show that
vision-language models can be used to alleviate the dependence on a large
number of concept-annotated samples. In particular, we propose an embedding
learning strategy to adapt CLIP to the downstream task of skin lesion
classification using concept-based descriptions as textual embeddings. Our
experiments reveal that vision-language models not only attain better accuracy
when using concepts as textual embeddings, but also require a smaller number of
concept-annotated samples to attain comparable performance to approaches
specifically devised for automatic concept generation.
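The classification rule implied by the abstract can be illustrated schematically: each class is represented by the text embeddings of its concept-based descriptions, and an image is assigned to the class whose concept embeddings are most similar to its image embedding. The sketch below is a simplified illustration, not the paper's exact formulation: random vectors stand in for CLIP image/text encoder outputs, and the prototype averaging and cosine-similarity rule are assumptions.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    # Project embeddings onto the unit sphere so dot products equal cosine similarity.
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def concept_prototypes(concept_embeddings_per_class):
    # Average each class's concept-description embeddings (e.g. "irregular
    # pigment network", "blue-whitish veil") into one prototype per class.
    return l2_normalize(np.stack([emb.mean(axis=0) for emb in concept_embeddings_per_class]))

def classify(image_embedding, prototypes):
    # Cosine similarity between the image embedding and each class prototype;
    # the most similar class wins.
    sims = prototypes @ l2_normalize(image_embedding)
    return int(np.argmax(sims)), sims

rng = np.random.default_rng(0)
dim = 16
# Two hypothetical classes (e.g. melanoma vs. nevus), three concept
# descriptions each; random unit vectors stand in for CLIP text embeddings.
class_concepts = [l2_normalize(rng.normal(size=(3, dim))) for _ in range(2)]
protos = concept_prototypes(class_concepts)

# A (stand-in) image embedding close to class 0's concepts is assigned class 0.
image_emb = class_concepts[0].mean(axis=0) + 0.05 * rng.normal(size=dim)
pred, sims = classify(image_emb, protos)
print(pred)  # expected: 0
```

In a real pipeline, `class_concepts` would come from CLIP's text encoder applied to expert-written concept descriptions, and `image_emb` from its image encoder.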
Related papers
- A Two-Step Concept-Based Approach for Enhanced Interpretability and Trust in Skin Lesion Diagnosis [6.6635650150737815]
Concept Bottleneck Models (CBMs) offer inherent interpretability by constraining the final disease prediction on a set of human-understandable concepts.
We introduce a novel two-step methodology that addresses both of these challenges.
By simulating the two stages of a CBM, we utilize a pretrained Vision Language Model (VLM) to automatically predict clinical concepts, and a Large Language Model (LLM) to generate disease diagnoses.
arXiv Detail & Related papers (2024-11-08T14:52:42Z)
- Concept Complement Bottleneck Model for Interpretable Medical Image Diagnosis [8.252227380729188]
We propose a concept complement bottleneck model for interpretable medical image diagnosis.
We use concept-specific adapters to mine concept differences and to score each concept in its own attention channel.
Our model outperforms the state-of-the-art competitors in concept detection and disease diagnosis tasks.
arXiv Detail & Related papers (2024-10-20T16:52:09Z)
- Aligning Human Knowledge with Visual Concepts Towards Explainable Medical Image Classification [8.382606243533942]
We introduce a simple yet effective framework, Explicd, towards Explainable language-informed criteria-based diagnosis.
By leveraging a pretrained vision-language model, Explicd injects these criteria into the embedding space as knowledge anchors.
The final diagnostic outcome is determined based on the similarity scores between the encoded visual concepts and the textual criteria embeddings.
arXiv Detail & Related papers (2024-06-08T23:23:28Z)
- MICA: Towards Explainable Skin Lesion Diagnosis via Multi-Level Image-Concept Alignment [4.861768967055006]
We propose a multi-modal explainable disease diagnosis framework that semantically aligns medical images and clinically relevant concepts at multiple levels.
Our method, while preserving model interpretability, attains high performance and label efficiency for concept detection and disease diagnosis.
arXiv Detail & Related papers (2024-01-16T17:45:01Z)
- Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z)
- Robust and Interpretable Medical Image Classifiers via Concept Bottleneck Models [49.95603725998561]
We propose a new paradigm to build robust and interpretable medical image classifiers with natural language concepts.
Specifically, we first query clinical concepts from GPT-4, then transform latent image features into explicit concepts with a vision-language model.
arXiv Detail & Related papers (2023-10-04T21:57:09Z)
- Coherent Concept-based Explanations in Medical Image and Its Application to Skin Lesion Diagnosis [0.0]
Existing deep learning approaches for melanoma skin lesion diagnosis are deemed black-box models.
We propose an inherently interpretable framework to improve the interpretability of concept-based models.
Our method outperforms existing black-box and concept-based models for skin lesion classification.
arXiv Detail & Related papers (2023-04-10T13:32:04Z)
- Learnable Visual Words for Interpretable Image Recognition [70.85686267987744]
We propose the Learnable Visual Words (LVW) to interpret the model prediction behaviors with two novel modules.
The semantic visual words learning relaxes the category-specific constraint, enabling the general visual words shared across different categories.
Our experiments on six visual benchmarks demonstrate the superior effectiveness of our proposed LVW in both accuracy and model interpretation.
arXiv Detail & Related papers (2022-05-22T03:24:45Z)
- Translational Concept Embedding for Generalized Compositional Zero-shot Learning [73.60639796305415]
Generalized compositional zero-shot learning means to learn composed concepts of attribute-object pairs in a zero-shot fashion.
This paper introduces a new approach, termed translational concept embedding, to solve these two difficulties in a unified framework.
arXiv Detail & Related papers (2021-12-20T21:27:51Z)
- Deep Co-Attention Network for Multi-View Subspace Learning [73.3450258002607]
We propose a deep co-attention network for multi-view subspace learning.
It aims to extract both the common information and the complementary information in an adversarial setting.
In particular, it uses a novel cross reconstruction loss and leverages the label information to guide the construction of the latent representation.
arXiv Detail & Related papers (2021-02-15T18:46:44Z)
- Concept Bottleneck Models [79.91795150047804]
State-of-the-art models today do not typically support the manipulation of concepts like "the existence of bone spurs".
We revisit the classic idea of first predicting concepts that are provided at training time, and then using these concepts to predict the label.
On x-ray grading and bird identification, concept bottleneck models achieve competitive accuracy with standard end-to-end models.
arXiv Detail & Related papers (2020-07-09T07:47:28Z)
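The two-stage idea behind concept bottleneck models, which several entries above build on, can be sketched in a few lines: first predict the human-understandable concepts from the input, then predict the label from the concepts alone, so the intermediate concept scores can be inspected or overridden. The weights below are hand-picked toy values for illustration, not a trained model.

```python
import numpy as np

def predict_concepts(x, W_c):
    # Stage 1: map raw input features to concept scores (e.g. "bone spur
    # present"), squashed into [0, 1] with a sigmoid.
    return 1.0 / (1.0 + np.exp(-(x @ W_c)))

def predict_label(concepts, w_y, threshold=0.5):
    # Stage 2: the label depends only on the concepts, which is what makes
    # the bottleneck interpretable and allows concept-level intervention.
    return int(concepts @ w_y > threshold)

# Toy setup: 3 input features -> 2 concepts -> binary label.
W_c = np.array([[ 2.0, -1.0],
                [ 0.0,  3.0],
                [-1.0,  0.5]])
w_y = np.array([0.7, 0.6])

x = np.array([1.0, 1.0, 0.0])
c = predict_concepts(x, W_c)   # concept scores in [0, 1]
y = predict_label(c, w_y)
print(c, y)
```

A trained CBM would learn `W_c` from concept annotations and `w_y` from labels; the point of the sketch is only the structural separation of the two stages.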
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.