Dual Attention Model with Reinforcement Learning for Classification of Histology Whole-Slide Images
- URL: http://arxiv.org/abs/2302.09682v2
- Date: Thu, 21 Nov 2024 16:29:08 GMT
- Title: Dual Attention Model with Reinforcement Learning for Classification of Histology Whole-Slide Images
- Authors: Manahil Raza, Ruqayya Awan, Raja Muhammad Saad Bashir, Talha Qaiser, Nasir M. Rajpoot
- Abstract summary: Digital whole slide images (WSIs) are generally captured at microscopic resolution and encompass extensive spatial data.
We propose a novel dual attention approach, consisting of two main components, both inspired by the visual examination process of a pathologist.
We show that the proposed model achieves performance better than or comparable to the state-of-the-art methods while processing less than 10% of the WSI at the highest magnification.
- Score: 8.404881822414898
- Abstract: Digital whole slide images (WSIs) are generally captured at microscopic resolution and encompass extensive spatial data. Directly feeding these images to deep learning models is computationally intractable due to memory constraints, while downsampling the WSIs risks incurring information loss. Alternatively, splitting the WSIs into smaller patches may result in a loss of important contextual information. In this paper, we propose a novel dual attention approach, consisting of two main components, both inspired by the visual examination process of a pathologist: The first soft attention model processes a low magnification view of the WSI to identify relevant regions of interest (ROIs), followed by a custom sampling method to extract diverse and spatially distinct image tiles from the selected ROIs. The second component, the hard attention classification model, further extracts a sequence of multi-resolution glimpses from each tile for classification. Since hard attention is non-differentiable, we train this component using reinforcement learning to predict the location of the glimpses. This approach allows the model to focus on essential regions instead of processing the entire tile, thereby aligning with a pathologist's way of diagnosis. The two components are trained in an end-to-end fashion using a joint loss function to demonstrate the efficacy of the model. The proposed model was evaluated on two WSI-level classification problems: Human epidermal growth factor receptor 2 scoring on breast cancer histology images and prediction of Intact/Loss status of two Mismatch Repair biomarkers from colorectal cancer histology images. We show that the proposed model achieves performance better than or comparable to the state-of-the-art methods while processing less than 10% of the WSI at the highest magnification and reducing the time required to infer the WSI-level label by more than 75%.
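The abstract describes the two components only at a high level. The sketch below illustrates, in PyTorch, one way such a pipeline could be wired together; all class names, layer sizes, and the Gaussian glimpse policy are illustrative assumptions rather than the authors' implementation, and the joint loss pairs cross-entropy with a baseline-subtracted REINFORCE term because the glimpse sampling is non-differentiable.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SoftAttentionROI(nn.Module):
    """Hypothetical stand-in for the first component: scores low-magnification
    grid cells and returns the top-k candidate tile locations."""
    def __init__(self, in_ch=3, feat=32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_ch, feat, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.score = nn.Conv2d(feat, 1, 1)  # one attention score per spatial cell

    def forward(self, thumbnail, k=8):
        a = self.score(self.backbone(thumbnail))        # (B, 1, H', W')
        a = a.flatten(1).softmax(dim=-1)                # soft attention over grid cells
        return a, a.topk(k, dim=-1).indices             # indices of candidate tiles


class GlimpsePolicy(nn.Module):
    """Hypothetical hard-attention agent: samples the next glimpse location from a
    Gaussian policy and classifies the tile after a few glimpses."""
    def __init__(self, glimpse=32, hidden=128, n_classes=3):
        super().__init__()
        self.encode = nn.Sequential(nn.Flatten(), nn.Linear(3 * glimpse * glimpse, hidden), nn.ReLU())
        self.rnn = nn.GRUCell(hidden, hidden)
        self.loc_head = nn.Linear(hidden, 2)            # next (x, y) in [-1, 1]
        self.cls_head = nn.Linear(hidden, n_classes)
        self.glimpse = glimpse

    def extract_glimpse(self, tile, loc):
        # Crop a small window around the current location via an affine grid.
        B, _, H, W = tile.shape
        g = self.glimpse
        theta = torch.zeros(B, 2, 3, device=tile.device)
        theta[:, 0, 0] = g / W
        theta[:, 1, 1] = g / H
        theta[:, :, 2] = loc
        grid = F.affine_grid(theta, (B, 3, g, g), align_corners=False)
        return F.grid_sample(tile, grid, align_corners=False)

    def forward(self, tile, n_glimpses=4):
        B = tile.size(0)
        h = tile.new_zeros(B, self.rnn.hidden_size)
        loc = tile.new_zeros(B, 2)
        log_probs = []
        for _ in range(n_glimpses):
            h = self.rnn(self.encode(self.extract_glimpse(tile, loc)), h)
            mu = torch.tanh(self.loc_head(h))
            dist = torch.distributions.Normal(mu, 0.1)
            raw = dist.sample()                          # hard attention: sampled, not backprop-able
            log_probs.append(dist.log_prob(raw).sum(-1))
            loc = raw.clamp(-1, 1)
        return self.cls_head(h), torch.stack(log_probs, dim=1)


def joint_loss(logits, log_probs, labels):
    """Cross-entropy on the prediction plus REINFORCE on the glimpse locations,
    rewarding correct tile-level predictions (illustrative formulation)."""
    ce = F.cross_entropy(logits, labels)
    reward = (logits.argmax(-1) == labels).float().unsqueeze(1)   # 0/1 reward per tile
    reinforce = -(log_probs * (reward - reward.mean())).mean()    # baseline-subtracted
    return ce + reinforce
```

The REINFORCE term is what lets the gradient reach the glimpse-location head even though the sampled locations themselves block backpropagation, which is the reason the abstract cites reinforcement learning for the hard attention component.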
Related papers
- Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection.
Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels.
Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
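As a rough illustration of the residual-adapter idea (the paper's exact adapter design and its insertion points in the CLIP encoder are not specified here), a bottleneck adapter added residually to frozen multi-level features might look like the following; the zero-initialised up-projection keeps the adapted features equal to the pre-trained ones at the start of training.

```python
import torch
import torch.nn as nn


class ResidualAdapter(nn.Module):
    """Bottleneck MLP added residually on top of frozen encoder features
    (a generic sketch, not the paper's exact adapter)."""
    def __init__(self, dim, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        nn.init.zeros_(self.up.weight)   # start as identity so pre-trained features are preserved
        nn.init.zeros_(self.up.bias)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))


# Attach one adapter per feature level of a frozen encoder; only the adapters train.
frozen_features = [torch.randn(4, 196, 768) for _ in range(3)]   # dummy multi-level features
adapters = nn.ModuleList([ResidualAdapter(768) for _ in frozen_features])
adapted = [adapter(f) for adapter, f in zip(adapters, frozen_features)]
```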
arXiv Detail & Related papers (2024-03-19T09:28:19Z)
- Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework (DEC-Seg) for semi-supervised medical image segmentation.
arXiv Detail & Related papers (2023-12-26T12:56:31Z)
- Active Learning Enhances Classification of Histopathology Whole Slide Images with Attention-based Multiple Instance Learning [48.02011627390706]
We train an attention-based MIL and calculate a confidence metric for every image in the dataset to select the most uncertain WSIs for expert annotation.
With a novel attention-guiding loss, this leads to an accuracy boost for the trained models with only a few regions annotated per class.
In the future, this may serve as an important contribution for training MIL models in the clinically relevant context of cancer classification in histopathology.
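A minimal sketch of this selection loop, assuming attention-based MIL pooling over pre-extracted patch features and predictive entropy as the confidence metric (the paper's actual metric and architecture may differ):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionMIL(nn.Module):
    """Simple attention pooling over patch embeddings (illustrative sketch)."""
    def __init__(self, dim=512, n_classes=2):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(dim, 128), nn.Tanh(), nn.Linear(128, 1))
        self.cls = nn.Linear(dim, n_classes)

    def forward(self, patches):                         # patches: (N_patches, dim) for one WSI
        a = torch.softmax(self.attn(patches), dim=0)    # attention weight per patch
        slide_feat = (a * patches).sum(dim=0)           # weighted average = slide embedding
        return self.cls(slide_feat), a


def predictive_entropy(logits):
    """One possible confidence metric: entropy of the slide-level prediction."""
    p = F.softmax(logits, dim=-1)
    return -(p * p.clamp_min(1e-8).log()).sum()


# Active-learning step: rank unlabelled slides by uncertainty, send the top ones for annotation.
model = AttentionMIL()
unlabelled = [torch.randn(200, 512) for _ in range(5)]            # dummy patch features per WSI
scores = [predictive_entropy(model(w)[0]).item() for w in unlabelled]
to_annotate = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:2]
```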
arXiv Detail & Related papers (2023-03-02T15:18:58Z)
- Rethinking Semi-Supervised Medical Image Segmentation: A Variance-Reduction Perspective [51.70661197256033]
We propose ARCO, a semi-supervised contrastive learning framework with stratified group theory for medical image segmentation.
We first propose building ARCO through the concept of variance-reduced estimation and show that certain variance-reduction techniques are particularly beneficial in pixel/voxel-level segmentation tasks.
We experimentally validate our approaches on eight benchmarks, i.e., five 2D/3D medical and three semantic segmentation datasets, with different label settings.
arXiv Detail & Related papers (2023-02-03T13:50:25Z)
- Enhanced Sharp-GAN For Histopathology Image Synthesis [63.845552349914186]
Histopathology image synthesis aims to address the data shortage issue in training deep learning approaches for accurate cancer detection.
We propose a novel approach that enhances the quality of synthetic images by using nuclei topology and contour regularization.
The proposed approach outperforms Sharp-GAN in all four image quality metrics on two datasets.
arXiv Detail & Related papers (2023-01-24T17:54:01Z)
- Hierarchical Transformer for Survival Prediction Using Multimodality Whole Slide Images and Genomics [63.76637479503006]
Learning good representations of gigapixel whole slide pathology images (WSIs) for downstream tasks is critical.
This paper proposes a hierarchical-based multimodal transformer framework that learns a hierarchical mapping between pathology images and corresponding genes.
Our architecture requires fewer GPU resources compared with benchmark methods while maintaining better WSI representation ability.
arXiv Detail & Related papers (2022-11-29T23:47:56Z)
- Joint localization and classification of breast tumors on ultrasound images using a novel auxiliary attention-based framework [7.6620616780444974]
We propose a novel joint localization and classification model based on the attention mechanism and disentangled semi-supervised learning strategy.
The proposed modularized framework allows flexible network replacement to be generalized for various applications.
arXiv Detail & Related papers (2022-10-11T20:14:13Z)
- Dynamic Sub-Cluster-Aware Network for Few-Shot Skin Disease Classification [31.539129126161978]
This paper introduces a novel approach called the Sub-Cluster-Aware Network (SCAN) that enhances accuracy in diagnosing rare skin diseases.
The key insight motivating the design of SCAN is the observation that skin disease images within a class often exhibit multiple sub-clusters.
We evaluate the proposed approach on two public datasets for few-shot skin disease classification.
arXiv Detail & Related papers (2022-07-03T16:06:04Z)
- Mixed-UNet: Refined Class Activation Mapping for Weakly-Supervised Semantic Segmentation with Multi-scale Inference [28.409679398886304]
We develop a novel model named Mixed-UNet, which has two parallel branches in the decoding phase.
We evaluate the designed Mixed-UNet against several prevalent deep learning-based segmentation approaches on a dataset collected from a local hospital as well as on public datasets.
arXiv Detail & Related papers (2022-05-06T08:37:02Z)
- Pay Attention with Focus: A Novel Learning Scheme for Classification of Whole Slide Images [8.416553728391309]
We propose a novel two-stage approach to analyze whole slide images (WSIs).
First, we extract a set of representative patches (called a mosaic) from a WSI.
Each patch of a mosaic is encoded to a feature vector using a deep network.
In the second stage, a set of encoded patch-level features from a WSI is used to compute the primary diagnosis probability.
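A minimal sketch of this two-stage scheme, assuming a ResNet-18 patch encoder from torchvision and simple mean pooling of the mosaic features (both are stand-ins; the paper's encoder and aggregation are not specified here):

```python
import torch
import torch.nn as nn
from torchvision import models

# Stage 1: encode each mosaic patch with a CNN (illustrative choice of backbone).
encoder = models.resnet18(weights=None)
encoder.fc = nn.Identity()                      # 512-d feature per patch

mosaic = torch.randn(32, 3, 224, 224)           # 32 representative patches from one WSI
with torch.no_grad():
    patch_feats = encoder(mosaic)               # (32, 512)

# Stage 2: aggregate the patch features and map them to a primary diagnosis probability.
head = nn.Sequential(nn.Linear(512, 128), nn.ReLU(), nn.Linear(128, 1), nn.Sigmoid())
slide_prob = head(patch_feats.mean(dim=0))      # mean pooling is one simple aggregation choice
```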
arXiv Detail & Related papers (2021-06-11T21:59:02Z)
- An End-to-End Breast Tumour Classification Model Using Context-Based Patch Modelling - A BiLSTM Approach for Image Classification [19.594639581421422]
We integrate this contextual relationship together with the feature-based correlation among the patches extracted from the tumorous region.
We trained and tested our model on two datasets, microscopy images and WSI tumour regions.
We found that BiLSTMs with CNN features perform much better at modelling patches within an end-to-end image classification network.
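A minimal sketch of the described idea, assuming pre-extracted CNN features per patch and a BiLSTM whose final state yields the image-level prediction (layer sizes and patch ordering are illustrative, not the authors' exact model):

```python
import torch
import torch.nn as nn


class PatchBiLSTMClassifier(nn.Module):
    """CNN features per patch fed to a BiLSTM; the last hidden state classifies the image."""
    def __init__(self, feat_dim=512, hidden=256, n_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.cls = nn.Linear(2 * hidden, n_classes)

    def forward(self, patch_feats):             # (B, n_patches, feat_dim), patches in spatial order
        out, _ = self.lstm(patch_feats)
        return self.cls(out[:, -1])             # use the final step's output for the image-level label


logits = PatchBiLSTMClassifier()(torch.randn(2, 16, 512))   # 2 images, 16 patches each
```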
arXiv Detail & Related papers (2021-06-05T10:43:58Z)