Enhanced Multi-Class Classification of Gastrointestinal Endoscopic Images with Interpretable Deep Learning Model
- URL: http://arxiv.org/abs/2503.00780v1
- Date: Sun, 02 Mar 2025 08:07:50 GMT
- Title: Enhanced Multi-Class Classification of Gastrointestinal Endoscopic Images with Interpretable Deep Learning Model
- Authors: Astitva Kamble, Vani Bandodkar, Saakshi Dharmadhikary, Veena Anand, Pradyut Kumar Sanki, Mei X. Wu, Biswabandhu Jana
- Abstract summary: This research introduces a novel approach to enhance classification accuracy using 8,000 labeled endoscopic images from the Kvasir dataset. The proposed architecture eliminates reliance on data augmentation while preserving moderate model complexity. The model achieves a test accuracy of 94.25%, alongside precision and recall of 94.29% and 94.24% respectively.
- Score: 0.7349657385817541
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Endoscopy serves as an essential procedure for evaluating the gastrointestinal (GI) tract and plays a pivotal role in identifying GI-related disorders. Recent advancements in deep learning have demonstrated substantial progress in detecting abnormalities through intricate models and data augmentation methods. This research introduces a novel approach to enhance classification accuracy using 8,000 labeled endoscopic images from the Kvasir dataset, categorized into eight distinct classes. Leveraging EfficientNetB3 as the backbone, the proposed architecture eliminates reliance on data augmentation while preserving moderate model complexity. The model achieves a test accuracy of 94.25%, alongside precision and recall of 94.29% and 94.24% respectively. Furthermore, Local Interpretable Model-agnostic Explanation (LIME) saliency maps are employed to enhance interpretability by identifying the critical regions in the images that influenced model predictions. Overall, this work highlights the importance of AI in advancing medical imaging by combining high classification accuracy with interpretability.
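The pipeline the abstract describes is an ImageNet-pretrained EfficientNetB3 backbone feeding a small softmax head over the eight Kvasir classes, trained without data augmentation, with LIME applied post hoc to highlight the image regions behind each prediction. The sketch below is a minimal illustration of that setup, not the authors' code; the input resolution, head layout, optimizer settings, and LIME parameters are assumptions chosen for readability.

```python
# Minimal sketch (assumed details, not the authors' released code) of an
# EfficientNetB3-based 8-class Kvasir classifier with a LIME explanation.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models
from lime import lime_image

NUM_CLASSES = 8          # the Kvasir dataset's eight labeled classes
IMG_SIZE = (300, 300)    # EfficientNetB3's native input resolution (assumed here)

def build_model() -> tf.keras.Model:
    """ImageNet-pretrained EfficientNetB3 backbone with a small softmax head."""
    backbone = tf.keras.applications.EfficientNetB3(
        include_top=False, weights="imagenet", input_shape=IMG_SIZE + (3,))
    x = layers.GlobalAveragePooling2D()(backbone.output)
    x = layers.Dropout(0.3)(x)  # assumed regularization; no data augmentation used
    outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)
    model = models.Model(backbone.input, outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_model()
# model.fit(train_ds, validation_data=val_ds, epochs=30)   # data pipeline omitted

def predict_fn(batch):
    """Prediction function LIME calls on batches of perturbed images."""
    batch = tf.keras.applications.efficientnet.preprocess_input(
        np.asarray(batch, dtype=np.float32))
    return model.predict(batch, verbose=0)

explainer = lime_image.LimeImageExplainer()
# image = ...  # a single test image resized to IMG_SIZE, values in [0, 255]
# explanation = explainer.explain_instance(
#     image.astype("double"), predict_fn, top_labels=1, num_samples=1000)
# _, mask = explanation.get_image_and_mask(
#     explanation.top_labels[0], positive_only=True, num_features=5)
# `mask` marks the superpixels that most influenced the predicted class.
```

LIME perturbs superpixels of the input and fits a local linear surrogate around the model's prediction, so the returned mask marks the regions that most influenced the predicted class, which is the kind of saliency map the abstract describes.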
Related papers
- Gradient Attention Map Based Verification of Deep Convolutional Neural Networks with Application to X-ray Image Datasets [1.0208529247755187]
We propose a comprehensive verification framework that evaluates model suitability through multiple complementary strategies.
First, we introduce a Gradient Attention Map (GAM)-based approach that analyzes attention patterns using Grad-CAM.
Second, we extend verification to early convolutional feature maps, capturing structural mis-alignments missed by attention alone.
Third, we incorporate an additional garbage class into the classification model to explicitly reject out-of-distribution inputs.
arXiv Detail & Related papers (2025-04-29T23:41:37Z) - DCAT: Dual Cross-Attention Fusion for Disease Classification in Radiological Images with Uncertainty Estimation [0.0]
This paper proposes a novel dual cross-attention fusion model for medical image analysis.
It addresses key challenges in feature integration and interpretability.
The proposed model achieved AUC of 99.75%, 100%, 99.93% and 98.69% and AUPR of 99.81%, 100%, 99.97%, and 96.36% on Covid-19, Tuberculosis, Pneumonia Chest X-ray images and Retinal OCT images respectively.
arXiv Detail & Related papers (2025-03-14T20:28:20Z) - Breast Tumor Classification Using EfficientNet Deep Learning Model [5.062500255359342]
We introduce an intensive data augmentation pipeline and cost-sensitive learning, improving representation and ensuring that the model does not overly favor majority classes. Our results underscore significant improvements in the binary classification performance, achieving an exceptional recall increase for benign cases.
arXiv Detail & Related papers (2024-11-26T20:38:33Z) - Integrating Deep Feature Extraction and Hybrid ResNet-DenseNet Model for Multi-Class Abnormality Detection in Endoscopic Images [0.9374652839580183]
The aim is to automate the identification of ten GI abnormality classes, including angioectasia, bleeding, and ulcers.
The proposed model achieves an overall accuracy of 94% across a well-structured dataset.
arXiv Detail & Related papers (2024-10-24T06:10:31Z) - Domain-Adaptive Pre-training of Self-Supervised Foundation Models for Medical Image Classification in Gastrointestinal Endoscopy [0.024999074238880488]
Video capsule endoscopy has transformed gastrointestinal endoscopy (GIE) diagnostics by offering a non-invasive method for capturing detailed images of the gastrointestinal tract. However, its potential is limited by the sheer volume of images generated during the imaging procedure, which can take anywhere from 6-8 hours and often produce up to 1 million images.
arXiv Detail & Related papers (2024-10-21T22:52:25Z) - Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report Generation [47.250147322130545]
Image-to-text radiology report generation aims to automatically produce radiology reports that describe the findings in medical images.
Most existing methods focus solely on the image data, disregarding the other patient information accessible to radiologists.
We present a novel multi-modal deep neural network framework for generating chest X-ray reports by integrating structured patient data, such as vital signs and symptoms, alongside unstructured clinical notes.
arXiv Detail & Related papers (2023-11-18T14:37:53Z) - Classification of lung cancer subtypes on CT images with synthetic pathological priors [41.75054301525535]
Cross-scale associations exist in the image patterns between the same case's CT images and its pathological images.
We propose a self-generating hybrid feature network (SGHF-Net) for accurately classifying lung cancer subtypes on CT images.
arXiv Detail & Related papers (2023-08-09T02:04:05Z) - MedFMC: A Real-world Dataset and Benchmark For Foundation Model Adaptation in Medical Image Classification [41.16626194300303]
Foundation models, often pre-trained with large-scale data, have achieved paramount success in jump-starting various vision and language applications.
Recent advances further enable adapting foundation models in downstream tasks efficiently using only a few training samples.
Yet, the application of such learning paradigms in medical image analysis remains scarce due to the shortage of publicly accessible data and benchmarks.
arXiv Detail & Related papers (2023-06-16T01:46:07Z) - SSL-CPCD: Self-supervised learning with composite pretext-class discrimination for improved generalisability in endoscopic image analysis [3.1542695050861544]
Deep learning-based supervised methods are widely popular in medical image analysis.
However, they require a large amount of training data and face issues in generalisability to unseen datasets.
We propose to explore patch-level instance-group discrimination and penalisation of inter-class variation using additive angular margin.
arXiv Detail & Related papers (2023-05-31T21:28:08Z) - Generative models improve fairness of medical classifiers under distribution shifts [49.10233060774818]
We show that learning realistic augmentations automatically from data is possible in a label-efficient manner using generative models.
We demonstrate that these learned augmentations can surpass heuristic ones by making models more robust and statistically fair in- and out-of-distribution.
arXiv Detail & Related papers (2023-04-18T18:15:38Z) - Performance of GAN-based augmentation for deep learning COVID-19 image classification [57.1795052451257]
The biggest challenge in the application of deep learning to the medical domain is the availability of training data.
Data augmentation is a typical methodology used in machine learning when confronted with a limited data set.
In this work, a StyleGAN2-ADA model of Generative Adversarial Networks is trained on the limited COVID-19 chest X-ray image set.
arXiv Detail & Related papers (2023-04-18T15:39:58Z) - Variational Knowledge Distillation for Disease Classification in Chest X-Rays [102.04931207504173]
We propose variational knowledge distillation (VKD), a new probabilistic inference framework for disease classification based on X-rays.
We demonstrate the effectiveness of our method on three public benchmark datasets with paired X-ray images and EHRs.
arXiv Detail & Related papers (2021-03-19T14:13:56Z) - A Multi-resolution Model for Histopathology Image Classification and Localization with Multiple Instance Learning [9.36505887990307]
We propose a multi-resolution multiple instance learning model that leverages saliency maps to detect suspicious regions for fine-grained grade prediction.
The model is developed on a large-scale prostate biopsy dataset containing 20,229 slides from 830 patients.
The model achieved 92.7% accuracy, 81.8% Cohen's Kappa for benign, low grade (i.e. Grade group 1) and high grade (i.e. Grade group >= 2) prediction, an area under the receiver operating characteristic curve (AUROC) of 98.2% and an average precision (AP) of 97.4%.
arXiv Detail & Related papers (2020-11-05T06:42:39Z) - Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits unlabeled data by encouraging prediction consistency for a given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)