Label-free Concept Based Multiple Instance Learning for Gigapixel Histopathology
- URL: http://arxiv.org/abs/2501.02922v1
- Date: Mon, 06 Jan 2025 11:03:04 GMT
- Title: Label-free Concept Based Multiple Instance Learning for Gigapixel Histopathology
- Authors: Susu Sun, Leslie Tessier, Frédérique Meeuwsen, Clément Grisi, Dominique van Midden, Geert Litjens, Christian F. Baumgartner
- Abstract summary: Multiple Instance Learning (MIL) methods allow for gigapixel Whole-Slide Image (WSI) analysis with only slide-level annotations. We propose a novel inherently interpretable WSI-classification approach that uses human-understandable pathology concepts to generate explanations.
- Score: 3.3353153750791718
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Multiple Instance Learning (MIL) methods allow for gigapixel Whole-Slide Image (WSI) analysis with only slide-level annotations. Interpretability is crucial for safely deploying such algorithms in high-stakes medical domains. Traditional MIL methods offer explanations by highlighting salient regions. However, such spatial heatmaps provide limited insights for end users. To address this, we propose a novel inherently interpretable WSI-classification approach that uses human-understandable pathology concepts to generate explanations. Our proposed Concept MIL model leverages recent advances in vision-language models to directly predict pathology concepts based on image features. The model's predictions are obtained through a linear combination of the concepts identified on the top-K patches of a WSI, enabling inherent explanations by tracing each concept's influence on the prediction. In contrast to traditional concept-based interpretable models, our approach eliminates the need for costly human annotations by leveraging the vision-language model. We validate our method on two widely used pathology datasets: Camelyon16 and PANDA. On both datasets, Concept MIL achieves AUC and accuracy scores over 0.9, putting it on par with state-of-the-art models. We further find that 87.1% (Camelyon16) and 85.3% (PANDA) of the top 20 patches fall within the tumor region. A user study shows that the concepts identified by our model align with the concepts used by pathologists, making it a promising strategy for human-interpretable WSI classification.
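For readers who want the shape of the method, below is a minimal, hypothetical sketch of the pipeline the abstract describes: a vision-language model scores each patch against a fixed set of concept prompts, the top-K patches are pooled, and a linear layer over the pooled concept scores produces the slide prediction. The concept list, embedding size, and patch-relevance heuristic are illustrative assumptions, not the paper's exact choices.

```python
import torch
import torch.nn.functional as F

# Hypothetical stand-ins: in the paper, a pathology vision-language
# model supplies patch and concept-prompt embeddings; random tensors
# and an illustrative concept list are used here.
EMBED_DIM, N_PATCHES, TOP_K = 512, 1000, 20
concept_prompts = ["tumor cells", "lymphocytes", "normal epithelium",
                   "stroma", "necrosis"]

torch.manual_seed(0)
patch_embeds = F.normalize(torch.randn(N_PATCHES, EMBED_DIM), dim=-1)
concept_embeds = F.normalize(torch.randn(len(concept_prompts), EMBED_DIM), dim=-1)

# 1) Label-free concept scores: cosine similarity between patch image
#    embeddings and concept text embeddings.
concept_scores = patch_embeds @ concept_embeds.T        # (N_PATCHES, n_concepts)

# 2) Keep the top-K patches; maximum concept activation is used as a
#    simple relevance proxy here (an assumption, not the paper's rule).
topk_idx = concept_scores.max(dim=-1).values.topk(TOP_K).indices
slide_concepts = concept_scores[topk_idx].mean(dim=0)   # pooled (n_concepts,)

# 3) Linear head over concept scores; weights would be learned from
#    slide-level labels. Each concept's contribution is directly readable.
head = torch.nn.Linear(len(concept_prompts), 1)
logit = head(slide_concepts)
for name, c in zip(concept_prompts, (head.weight.squeeze(0) * slide_concepts).tolist()):
    print(f"{name:>18s}: {c:+.3f}")
print("slide logit:", logit.item())
```

Because the prediction is a linear combination of concept scores, the per-concept contribution printed above is the explanation; nothing post hoc is needed.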
Related papers
- PathSegDiff: Pathology Segmentation using Diffusion model representations [63.20694440934692]
We propose PathSegDiff, a novel approach for histopathology image segmentation that leverages Latent Diffusion Models (LDMs) as pre-trained feature extractors.
Our method utilizes a pathology-specific LDM, guided by a self-supervised encoder, to extract rich semantic information from H&E stained histopathology images.
Our experiments demonstrate significant improvements over traditional methods on the BCSS and GlaS datasets.
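The summary names a general pattern: a frozen pretrained network used as a feature extractor for a segmentation head. The sketch below shows only that hook-based pattern, with a toy encoder standing in for the pathology-specific LDM, which is not reproduced here.

```python
import torch
import torch.nn as nn

# Toy stand-in encoder; PathSegDiff instead taps a pathology-specific
# latent diffusion model, which is not reproduced here.
encoder = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
)

# Tap intermediate activations with forward hooks -- the usual way a
# frozen pretrained network is used as a feature extractor.
features = []
for layer in encoder:
    if isinstance(layer, nn.Conv2d):
        layer.register_forward_hook(lambda mod, inp, out: features.append(out))

seg_head = nn.Conv2d(64, 2, kernel_size=1)  # e.g. tumor vs. background logits

x = torch.randn(1, 3, 256, 256)             # one H&E patch (toy input)
with torch.no_grad():
    encoder(x)
    logits = seg_head(features[-1])         # (1, 2, 64, 64) coarse mask logits
print([tuple(f.shape) for f in features], tuple(logits.shape))
```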
arXiv Detail & Related papers (2025-04-09T14:58:21Z) - Prototype-Based Multiple Instance Learning for Gigapixel Whole Slide Image Classification [3.9817342822800916]
ProtoMIL is an inherently interpretable MIL model for whole slide image (WSI) analysis.
Our approach employs a sparse autoencoder to discover human-interpretable concepts from the image feature space.
ProtoMIL allows users to perform model interventions by altering the input concepts.
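The sparse-autoencoder step and the kind of intervention the summary mentions are sketched below under illustrative assumptions (feature dimension, dictionary width, L1 weight are not the paper's values).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseAutoencoder(nn.Module):
    """Toy sparse autoencoder over patch features; the L1 penalty on
    the ReLU code pushes each latent unit toward one reusable concept."""
    def __init__(self, d_in=512, d_hidden=2048):
        super().__init__()
        self.enc = nn.Linear(d_in, d_hidden)
        self.dec = nn.Linear(d_hidden, d_in)

    def forward(self, x):
        z = F.relu(self.enc(x))          # non-negative, sparse concept code
        return self.dec(z), z

torch.manual_seed(0)
sae = SparseAutoencoder()
opt = torch.optim.Adam(sae.parameters(), lr=1e-3)
feats = torch.randn(256, 512)            # stand-in patch features from a WSI encoder

for _ in range(100):                     # reconstruction + sparsity objective
    recon, z = sae(feats)
    loss = F.mse_loss(recon, feats) + 1e-3 * z.abs().mean()
    opt.zero_grad(); loss.backward(); opt.step()

# A concept intervention in the spirit of the summary: silence one
# latent unit and observe how the decoded features move.
_, z = sae(feats)
z_edited = z.clone(); z_edited[:, 0] = 0.0
print(float((sae.dec(z) - sae.dec(z_edited)).norm()))
```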
arXiv Detail & Related papers (2025-03-11T12:44:03Z) - Interactive Medical Image Analysis with Concept-based Similarity Reasoning [32.38056136570339]
The Concept-based Similarity Reasoning network (CSR) provides patch-level prototypes with intrinsic concept interpretation.
CSR improves upon prior state-of-the-art interpretable methods by up to 4.5% across three biomedical datasets.
arXiv Detail & Related papers (2025-03-10T02:52:47Z) - Show and Tell: Visually Explainable Deep Neural Nets via Spatially-Aware Concept Bottleneck Models [5.985204759362746]
We present a unified framework for transforming any vision neural network into a spatially and conceptually interpretable model.
We name this method the "Spatially-Aware and Label-Free Concept Bottleneck Model" (SALF-CBM).
arXiv Detail & Related papers (2025-02-27T14:27:55Z) - Sparks of Explainability: Recent Advancements in Explaining Large Vision Models [6.1642231492615345]
This thesis explores advanced approaches to improve explainability in computer vision by analyzing and modeling the features exploited by deep neural networks.
It evaluates attribution methods, notably saliency maps, by introducing a metric based on algorithmic stability and an approach utilizing Sobol indices.
Two hypotheses are examined: aligning models with human reasoning and adopting a conceptual explainability approach.
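As a concrete instance of the attribution methods the thesis evaluates, here is vanilla gradient saliency in a few lines; the untrained model and random input are stand-ins.

```python
import torch
import torchvision.models as models

# Vanilla gradient saliency: the absolute input-gradient of the top
# class score. Untrained ResNet-18 and random input are stand-ins.
model = models.resnet18(weights=None).eval()
x = torch.randn(1, 3, 224, 224, requires_grad=True)

logits = model(x)
logits.max().backward()                       # top-class logit (batch of 1)
saliency = x.grad.abs().max(dim=1).values     # (1, 224, 224) heatmap
print(saliency.shape, float(saliency.max()))
```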
arXiv Detail & Related papers (2025-02-03T04:49:32Z) - A Two-Step Concept-Based Approach for Enhanced Interpretability and Trust in Skin Lesion Diagnosis [6.6635650150737815]
Concept Bottleneck Models (CBMs) offer inherent interpretability by constraining the final disease prediction on a set of human-understandable concepts.
We introduce a novel two-step methodology that addresses both of these challenges.
By simulating the two stages of a CBM, we utilize a pretrained Vision Language Model (VLM) to automatically predict clinical concepts, and a Large Language Model (LLM) to generate disease diagnoses.
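A schematic of that two-step pipeline follows; `vlm_concept_scores` and `llm_diagnose` are hypothetical placeholders for real VLM and LLM calls, not APIs from the paper.

```python
# Schematic only: both functions below are hypothetical stand-ins
# for real VLM / LLM calls.
def vlm_concept_scores(image_path: str) -> dict[str, float]:
    # Step 1 (concept stage): a pretrained VLM would score clinical
    # concepts for the image; fixed values here for illustration.
    return {"asymmetry": 0.8, "irregular border": 0.7, "blue-white veil": 0.1}

def llm_diagnose(concepts: dict[str, float]) -> str:
    # Step 2 (label stage): an LLM maps the concept profile to a
    # diagnosis; a trivial rule stands in for the LLM call.
    prompt = "Concepts observed: " + ", ".join(
        f"{name} ({score:.1f})" for name, score in concepts.items() if score > 0.5
    )
    return f"[{prompt}] -> assessment: suspicious, refer for biopsy"

print(llm_diagnose(vlm_concept_scores("lesion_001.jpg")))
```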
arXiv Detail & Related papers (2024-11-08T14:52:42Z) - WEEP: A method for spatial interpretation of weakly supervised CNN models in computational pathology [0.36096289461554343]
We propose a novel method, Wsi rEgion sElection aPproach (WEEP), for model interpretation.
We demonstrate WEEP on a binary classification task in the area of breast cancer computational pathology.
arXiv Detail & Related papers (2024-03-22T14:32:02Z) - Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z) - Robust and Interpretable Medical Image Classifiers via Concept Bottleneck Models [49.95603725998561]
We propose a new paradigm to build robust and interpretable medical image classifiers with natural language concepts.
Specifically, we first query clinical concepts from GPT-4, then transform latent image features into explicit concepts with a vision-language model.
arXiv Detail & Related papers (2023-10-04T21:57:09Z) - LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z) - Rethinking Semi-Supervised Medical Image Segmentation: A Variance-Reduction Perspective [51.70661197256033]
We propose ARCO, a semi-supervised contrastive learning framework with stratified group theory for medical image segmentation.
We first propose building ARCO through the concept of variance-reduced estimation and show that certain variance-reduction techniques are particularly beneficial in pixel/voxel-level segmentation tasks.
We experimentally validate our approaches on eight benchmarks, i.e., five 2D/3D medical and three semantic segmentation datasets, with different label settings.
arXiv Detail & Related papers (2023-02-03T13:50:25Z) - MAUVE Scores for Generative Models: Theory and Practice [95.86006777961182]
We present MAUVE, a family of comparison measures between pairs of distributions such as those encountered in the generative modeling of text or images.
We find that MAUVE can quantify the gaps between the distributions of human-written text and those of modern neural language models.
We demonstrate in the vision domain that MAUVE can identify known properties of generated images on par with or better than existing metrics.
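MAUVE ships as the `mauve-text` package; a minimal call looks roughly like the following (signature as documented in the project's README; verify against your installed version).

```python
# Assumes the authors' `mauve-text` package (pip install mauve-text);
# featurization downloads a GPT-2 model on first use.
import mauve

# Toy inputs: real comparisons need a few hundred diverse samples
# per side for the estimate to be meaningful.
human_text = [f"Case {i}: the biopsy shows benign tissue." for i in range(64)]
model_text = [f"Biopsy {i} appears to show benign tissue." for i in range(64)]

out = mauve.compute_mauve(
    p_text=human_text,   # distribution P, e.g. human-written text
    q_text=model_text,   # distribution Q, e.g. model generations
    device_id=-1,        # -1 = CPU; pass a GPU id to speed up featurization
    verbose=False,
)
print(out.mauve)  # scalar in (0, 1]; higher means P and Q are closer
```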
arXiv Detail & Related papers (2022-12-30T07:37:40Z) - Deep Co-Attention Network for Multi-View Subspace Learning [73.3450258002607]
We propose a deep co-attention network for multi-view subspace learning.
It aims to extract both the common information and the complementary information in an adversarial setting.
In particular, it uses a novel cross reconstruction loss and leverages the label information to guide the construction of the latent representation.
arXiv Detail & Related papers (2021-02-15T18:46:44Z) - Using StyleGAN for Visual Interpretability of Deep Learning Models on Medical Images [0.7874708385247352]
We propose a new interpretability method that can be used to understand the predictions of any black-box model on images.
A StyleGAN is trained on medical images to provide a mapping between latent vectors and images.
After identifying a direction in the latent space along which the classifier's prediction changes, we shift the latent representation of an input image along this direction to produce a series of new synthetic images with changed predictions.
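A toy end-to-end version of the idea, with small random networks standing in for the StyleGAN and the classifier, and a difference-of-means latent direction standing in for the paper's search procedure:

```python
import torch
import torch.nn as nn

# Toy stand-ins: small random networks replace the StyleGAN generator
# and the black-box classifier used in the paper.
torch.manual_seed(0)
G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 3 * 32 * 32))
clf = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 1))

def predict(z):
    return torch.sigmoid(clf(G(z).view(-1, 3, 32, 32)))

with torch.no_grad():
    # Difference-of-means direction separating high- from low-prediction
    # samples (one simple choice; the paper's procedure may differ).
    z = torch.randn(512, 64)
    p = predict(z).squeeze(1)
    direction = z[p > p.median()].mean(0) - z[p <= p.median()].mean(0)
    direction /= direction.norm()

    # Walk one latent along the direction: the synthetic image changes
    # and the classifier's prediction shifts with it.
    z0 = torch.randn(1, 64)
    for alpha in (-3.0, 0.0, 3.0):
        print(f"alpha={alpha:+.1f} -> p={predict(z0 + alpha * direction).item():.3f}")
```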
arXiv Detail & Related papers (2021-01-19T11:13:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.