RRTS Dataset: A Benchmark Colonoscopy Dataset from Resource-Limited Settings for Computer-Aided Diagnosis Research
- URL: http://arxiv.org/abs/2511.06769v1
- Date: Mon, 10 Nov 2025 06:51:41 GMT
- Title: RRTS Dataset: A Benchmark Colonoscopy Dataset from Resource-Limited Settings for Computer-Aided Diagnosis Research
- Authors: Ridoy Chandra Shil, Ragib Abid, Tasnia Binte Mamun, Samiul Based Shuvo, Masfique Ahmed Bhuiyan, Jahid Ferdous,
- Abstract summary: We introduce a dataset of colonoscopy images collected using Olympus 170 and Pentax i-Scan series endoscopes. The dataset contains 1,288 images with polyps from 164 patients with corresponding ground-truth masks and 1,657 polyp-free images from 31 patients. Performance was lower than on curated datasets, reflecting the real-world difficulty of images with artifacts and variable quality.
- Score: 0.1407206493229022
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Background and Objective: Colorectal cancer prevention relies on early detection of polyps during colonoscopy. Existing public datasets, such as CVC-ClinicDB and Kvasir-SEG, provide valuable benchmarks but are limited by small sample sizes, curated image selection, or lack of real-world artifacts. There remains a need for datasets that capture the complexity of clinical practice, particularly in resource-constrained settings. Methods: We introduce a dataset, BUET Polyp Dataset (BPD), of colonoscopy images collected using Olympus 170 and Pentax i-Scan series endoscopes under routine clinical conditions. The dataset contains images with corresponding expert-annotated binary masks, reflecting diverse challenges such as motion blur, specular highlights, stool artifacts, blood, and low-light frames. Annotations were manually reviewed by clinical experts to ensure quality. To demonstrate baseline performance, we provide benchmark results for classification using VGG16, ResNet50, and InceptionV3, and for segmentation using UNet variants with VGG16, ResNet34, and InceptionV4 backbones. Results: The dataset comprises 1,288 images with polyps from 164 patients with corresponding ground-truth masks and 1,657 polyp-free images from 31 patients. Benchmarking experiments achieved up to 90.8% accuracy for binary classification (VGG16) and a maximum Dice score of 0.64 with InceptionV4-UNet for segmentation. Performance was lower than on curated datasets, reflecting the real-world difficulty of images with artifacts and variable quality.
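The segmentation results above are reported as Dice scores (e.g., a maximum of 0.64 for InceptionV4-UNet). As a point of reference, the metric can be computed from a predicted binary mask and its ground-truth mask as in the following minimal NumPy sketch (not the authors' evaluation code):

```python
import numpy as np

def dice_score(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-7) -> float:
    """Dice similarity coefficient between two binary masks:
    2|A ∩ B| / (|A| + |B|), with eps guarding the empty-mask case."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    return (2.0 * intersection + eps) / (pred.sum() + gt.sum() + eps)

# Toy 4x4 example: prediction overlaps 3 of the 3 ground-truth pixels
# but also marks one extra pixel.
pred = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
gt = np.array([[1, 1, 0, 0],
               [1, 0, 0, 0],
               [0, 0, 0, 0],
               [0, 0, 0, 0]])
print(round(dice_score(pred, gt), 3))  # 2*3 / (4+3) ≈ 0.857
```

A per-dataset score is then typically the mean of this quantity over all test images, which is how a single figure such as 0.64 summarizes segmentation quality.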
Related papers
- Automated Cervical Os Segmentation for Camera-Guided, Speculum-Free Screening [38.85521544870542]
This study evaluates deep learning methods for real-time segmentation of the cervical os in transvaginal endoscopic images. EndoViT/DPT, a vision transformer pre-trained on surgical video, achieved the highest Dice score (0.50 ± 0.31) and detection rate (0.87 ± 0.33). These results establish a foundation for integrating automated os recognition into speculum-free cervical screening devices to support non-expert use.
arXiv Detail & Related papers (2025-09-12T14:19:27Z)
- AlphaDent: A dataset for automated tooth pathology detection [98.1937495272719]
This dataset is based on DSLR camera photographs of the teeth of 295 patients and contains over 1,200 images. The article provides a detailed description of the dataset and the labeling format. The results obtained show high-quality predictions.
arXiv Detail & Related papers (2025-07-30T09:34:43Z)
- Domain-Adaptive Pre-training of Self-Supervised Foundation Models for Medical Image Classification in Gastrointestinal Endoscopy [0.024999074238880488]
Video capsule endoscopy has transformed gastrointestinal endoscopy (GIE) diagnostics by offering a non-invasive method for capturing detailed images of the gastrointestinal tract. However, its potential is limited by the sheer volume of images generated during the imaging procedure, which can take anywhere from 6-8 hours and often produces up to 1 million images.
arXiv Detail & Related papers (2024-10-21T22:52:25Z)
- Towards a Benchmark for Colorectal Cancer Segmentation in Endorectal Ultrasound Videos: Dataset and Model Development [59.74920439478643]
In this paper, we collect and annotate the first benchmark dataset that covers diverse ERUS scenarios.
Our ERUS-10K dataset comprises 77 videos and 10,000 high-resolution annotated frames.
We introduce a benchmark model for colorectal cancer segmentation, named the Adaptive Sparse-context TRansformer (ASTR).
arXiv Detail & Related papers (2024-08-19T15:04:42Z) - A quality assurance framework for real-time monitoring of deep learning
segmentation models in radiotherapy [3.5752677591512487]
This work uses cardiac substructure segmentation as an example task to establish a quality assurance framework.
A benchmark dataset consisting of Computed Tomography (CT) images along with manual cardiac delineations of 241 patients was collected.
An image domain shift detector was developed using a trained denoising autoencoder (DAE) and two hand-engineered features.
A regression model was trained to predict per-patient segmentation accuracy, measured by the Dice similarity coefficient (DSC).
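The final step of that framework, predicting a per-patient DSC from image-level features, can be illustrated with a small sketch. The feature names and data here are hypothetical stand-ins (e.g., a DAE reconstruction error plus two hand-engineered statistics), not the paper's implementation; the point is only the regression-on-quality-metric pattern:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-patient features: column 0 could stand in for DAE
# reconstruction error, columns 1-2 for hand-engineered image statistics.
features = rng.normal(size=(100, 3))          # 100 patients, 3 features
true_weights = np.array([-0.1, 0.05, -0.02])  # synthetic ground-truth relation
dsc = 0.85 + features @ true_weights + rng.normal(scale=0.01, size=100)

# Fit a linear model DSC ≈ X·w via ordinary least squares (bias column appended).
X = np.hstack([features, np.ones((100, 1))])
w, *_ = np.linalg.lstsq(X, dsc, rcond=None)

# Predict segmentation accuracy for an unseen patient's features.
new_patient = np.array([0.5, -1.0, 0.2, 1.0])  # last entry is the bias term
predicted_dsc = float(new_patient @ w)
print(round(predicted_dsc, 3))
```

A predicted DSC below some threshold would then flag that patient's auto-segmentation for manual review, which is the real-time QA use case the paper describes.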
arXiv Detail & Related papers (2023-05-19T14:51:05Z)
- Vision-Language Modelling For Radiological Imaging and Reports In The Low Data Regime [70.04389979779195]
This paper explores training medical vision-language models (VLMs) where the visual and language inputs are embedded into a common space.
We explore several candidate methods to improve low-data performance, including adapting generic pre-trained models to novel image and text domains.
Using text-to-image retrieval as a benchmark, we evaluate the performance of these methods with variable sized training datasets of paired chest X-rays and radiological reports.
arXiv Detail & Related papers (2023-03-30T18:20:00Z)
- Optimising Chest X-Rays for Image Analysis by Identifying and Removing Confounding Factors [49.005337470305584]
During the COVID-19 pandemic, the sheer volume of imaging performed in an emergency setting for COVID-19 diagnosis has resulted in a wide variability of clinical CXR acquisitions.
The variable quality of clinically-acquired CXRs within publicly available datasets could have a profound effect on algorithm performance.
We propose a simple and effective step-wise approach to pre-processing a COVID-19 chest X-ray dataset to remove undesired biases.
arXiv Detail & Related papers (2022-08-22T13:57:04Z)
- Learning from few examples: Classifying sex from retinal images via deep learning [3.9146761527401424]
We showcase results for the performance of DL on small datasets to classify patient sex from fundus images.
Our models, developed using approximately 2500 fundus images, achieved test AUC scores of up to 0.72.
This corresponds to a mere 25% decrease in performance despite a nearly 1000-fold decrease in the dataset size.
arXiv Detail & Related papers (2022-07-20T02:47:29Z)
- COVIDx CXR-3: A Large-Scale, Open-Source Benchmark Dataset of Chest X-ray Images for Computer-Aided COVID-19 Diagnostics [69.55060769611916]
The use of chest X-ray (CXR) imaging as a complementary screening strategy to RT-PCR testing is increasing.
Many visual perception models have been proposed for COVID-19 screening based on CXR imaging.
We introduce COVIDx CXR-3, a large-scale benchmark dataset of CXR images for supporting COVID-19 computer vision research.
arXiv Detail & Related papers (2022-06-08T04:39:44Z)
- ERS: a novel comprehensive endoscopy image dataset for machine learning, compliant with the MST 3.0 specification [0.0]
The article presents a new multi-label comprehensive image dataset from flexible endoscopy, colonoscopy and capsule endoscopy, named ERS.
The dataset contains around 6,000 precisely labeled and 115,000 approximately labeled frames from endoscopy videos, precise and 22,600 approximate segmentation masks, and 1.23 million unlabeled frames from flexible and capsule endoscopy videos.
arXiv Detail & Related papers (2022-01-21T15:39:45Z)
- Chest x-ray automated triage: a semiologic approach designed for clinical implementation, exploiting different types of labels through a combination of four Deep Learning architectures [83.48996461770017]
This work presents a Deep Learning method based on the late fusion of different convolutional architectures.
We built four training datasets combining images from public chest x-ray datasets and our institutional archive.
We trained four different Deep Learning architectures and combined their outputs with a late fusion strategy, obtaining a unified tool.
arXiv Detail & Related papers (2020-12-23T14:38:35Z)
- Kvasir-Instrument: Diagnostic and therapeutic tool segmentation dataset in gastrointestinal endoscopy [1.7579113628094125]
Gastrointestinal (GI) pathologies are periodically screened, biopsied, and resected using surgical tools.
This dataset consists of 590 annotated frames containing GI procedure tools such as snares, balloons, and biopsy forceps.
arXiv Detail & Related papers (2020-10-23T18:14:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.