ClassifyViStA: WCE Classification with Visual understanding through Segmentation and Attention
- URL: http://arxiv.org/abs/2412.18591v1
- Date: Tue, 24 Dec 2024 18:45:14 GMT
- Title: ClassifyViStA: WCE Classification with Visual understanding through Segmentation and Attention
- Authors: S. Balasubramanian, Ammu Abhishek, Yedu Krishna, Darshan Gera
- Abstract summary: We propose ClassifyViStA, an AI-based framework designed for the automated detection and classification of bleeding and non-bleeding frames from WCE videos.
The model is built upon an ensemble of ResNet18 and VGG16 architectures to enhance classification performance.
Our approach not only automates the detection of GI bleeding but also provides an interpretable solution that can ease the burden on healthcare professionals.
- Score: 3.887356044145916
- Abstract: Gastrointestinal (GI) bleeding is a serious medical condition that presents significant diagnostic challenges, particularly in settings with limited access to healthcare resources. Wireless Capsule Endoscopy (WCE) has emerged as a powerful diagnostic tool for visualizing the GI tract, but it requires time-consuming manual analysis by experienced gastroenterologists, which is prone to human error and inefficient given the increasing number of patients. To address this challenge, we propose ClassifyViStA, an AI-based framework designed for the automated detection and classification of bleeding and non-bleeding frames from WCE videos. The model consists of a standard classification path, augmented by two specialized branches: an implicit attention branch and a segmentation branch. The attention branch focuses on the bleeding regions, while the segmentation branch generates accurate segmentation masks, which are used for classification and interpretability. The model is built upon an ensemble of ResNet18 and VGG16 architectures to enhance classification performance. For bleeding region detection, we implement a Soft Non-Maximum Suppression (Soft NMS) approach with YOLOv8, which improves the handling of overlapping bounding boxes, resulting in more accurate and nuanced detections. The system's interpretability is enhanced by using the segmentation masks to explain the classification results, offering insights into the decision-making process similar to the way a gastroenterologist identifies bleeding regions. Our approach not only automates the detection of GI bleeding but also provides an interpretable solution that can ease the burden on healthcare professionals and improve diagnostic efficiency. Our code is available at ClassifyViStA.
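The abstract mentions Soft NMS for handling overlapping bounding boxes but gives no details. The following is a minimal, generic sketch of Gaussian Soft NMS in plain Python, not the authors' implementation. The function names, the `(x1, y1, x2, y2)` box format, the Gaussian decay, and the `sigma` and `score_threshold` values are all assumptions made for illustration. Instead of discarding every box that overlaps the top-scoring one, Soft NMS lowers its score in proportion to the overlap.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft NMS (illustrative; parameter values are assumptions).

    Returns a list of (box, score) pairs in the order they were selected.
    """
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        # Select the box with the highest current score.
        best_idx = max(range(len(remaining)), key=lambda i: remaining[i][1])
        best_box, best_score = remaining.pop(best_idx)
        kept.append((best_box, best_score))
        # Decay the scores of the other boxes according to their overlap.
        rescored = []
        for box, score in remaining:
            overlap = iou(best_box, box)
            decayed = score * math.exp(-(overlap ** 2) / sigma)
            # Drop boxes whose score has decayed below the threshold.
            if decayed >= score_threshold:
                rescored.append((box, decayed))
        remaining = rescored
    return kept
```

With this scheme, a box that heavily overlaps a stronger detection is kept with a reduced score rather than removed outright, which is the behavior that distinguishes Soft NMS from hard NMS.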
Related papers
- Divide and Conquer: Grounding a Bleeding Areas in Gastrointestinal Image with Two-Stage Model [7.1083241462091165]
This study proposes a two-stage framework that decouples classification and grounding to address the inherent challenges posed by traditional Multi-Task Learning models.
Experimental results demonstrate significant improvements in classification accuracy and segmentation precision.
arXiv Detail & Related papers (2024-12-21T18:18:12Z) - Agent Aggregator with Mask Denoise Mechanism for Histopathology Whole Slide Image Analysis [6.708196053187949]
Histopathology analysis is the gold standard for medical diagnosis. Accurate classification of whole slide images (WSIs) and region-of-interest (ROI) localization can assist pathologists in diagnosis.
In weakly supervised learning, multiple instance learning (MIL) presents a promising approach for WSI classification.
We propose AMD-MIL, an agent aggregator with a mask denoise mechanism.
arXiv Detail & Related papers (2024-09-18T03:02:19Z) - Multi-task Explainable Skin Lesion Classification [54.76511683427566]
We propose a few-shot-based approach for skin lesions that generalizes well with little labelled data.
The proposed approach comprises a fusion of a segmentation network that acts as an attention module and a classification network.
arXiv Detail & Related papers (2023-10-11T05:49:47Z) - Class Attention to Regions of Lesion for Imbalanced Medical Image
Recognition [59.28732531600606]
We propose a framework named Class Attention to REgions of the lesion (CARE) to handle data imbalance issues.
The CARE framework needs bounding boxes to represent the lesion regions of rare diseases.
Results show that the CARE variants with automated bounding box generation are comparable to the original CARE framework.
arXiv Detail & Related papers (2023-07-19T15:19:02Z) - Weakly Supervised Intracranial Hemorrhage Segmentation using Head-Wise
Gradient-Infused Self-Attention Maps from a Swin Transformer in Categorical
Learning [0.6269243524465492]
Intracranial hemorrhage (ICH) is a life-threatening medical emergency that requires timely diagnosis and accurate treatment.
Deep learning techniques have emerged as the leading approach for medical image analysis and processing.
We introduce a novel weakly supervised method for ICH segmentation, utilizing a Swin transformer trained on an ICH classification task with categorical labels.
arXiv Detail & Related papers (2023-04-11T00:17:34Z) - Interpretable Diabetic Retinopathy Diagnosis based on Biomarker
Activation Map [2.6170980960630037]
We introduce a novel biomarker activation map (BAM) framework based on generative adversarial learning.
A data set of 456 macular scans was graded as non-referable or referable DR based on current clinical standards.
The generated BAMs highlighted known pathologic features including nonperfusion area and retinal fluid.
arXiv Detail & Related papers (2022-12-13T00:45:46Z) - Fuzzy Attention Neural Network to Tackle Discontinuity in Airway
Segmentation [67.19443246236048]
Airway segmentation is crucial for the examination, diagnosis, and prognosis of lung diseases.
Some small-sized airway branches (e.g., bronchus and terminal bronchioles) significantly aggravate the difficulty of automatic segmentation.
This paper presents an efficient method for airway segmentation, comprising a novel fuzzy attention neural network and a comprehensive loss function.
arXiv Detail & Related papers (2022-09-05T16:38:13Z) - Preservation of High Frequency Content for Deep Learning-Based Medical
Image Classification [74.84221280249876]
An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists.
We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information.
arXiv Detail & Related papers (2022-05-08T15:29:54Z) - A Teacher-Student Framework for Semi-supervised Medical Image
Segmentation From Mixed Supervision [62.4773770041279]
We develop a semi-supervised learning framework in a teacher-student fashion for organ and lesion segmentation.
We show our model is robust to the quality of the bounding boxes and achieves performance comparable to fully supervised learning methods.
arXiv Detail & Related papers (2020-10-23T07:58:20Z) - Multi-Task Neural Networks with Spatial Activation for Retinal Vessel
Segmentation and Artery/Vein Classification [49.64863177155927]
We propose a multi-task deep neural network with a spatial activation mechanism to segment the full retinal vessels, arteries, and veins simultaneously.
The proposed network achieves a pixel-wise accuracy of 95.70% for vessel segmentation and an A/V classification accuracy of 94.50%, which is state-of-the-art performance for both tasks.
arXiv Detail & Related papers (2020-07-18T05:46:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.