Reliable Mislabel Detection for Video Capsule Endoscopy Data
- URL: http://arxiv.org/abs/2602.06938v1
- Date: Fri, 06 Feb 2026 18:33:12 GMT
- Title: Reliable Mislabel Detection for Video Capsule Endoscopy Data
- Authors: Julia Werner, Julius Oexle, Oliver Bause, Maxime Le Floch, Franz Brinkmann, Hannah Tolle, Jochen Hampe, Oliver Bringmann
- Abstract summary: We introduce a framework for mislabel detection in medical datasets. This is validated on the two largest, publicly available datasets for Video Capsule Endoscopy. Our results show that the proposed framework successfully detects incorrectly labeled data and results in an improved anomaly detection performance.
- Score: 0.6746617619581846
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The classification performance of deep neural networks relies strongly on access to large, accurately annotated datasets. In medical imaging, however, obtaining such datasets is particularly challenging since annotations must be provided by specialized physicians, which severely limits the pool of annotators. Furthermore, class boundaries are often ambiguous or difficult to define, which further complicates machine learning-based classification. In this paper, we address this problem and introduce a framework for mislabel detection in medical datasets. The framework is validated on the two largest publicly available datasets for Video Capsule Endoscopy, an important imaging procedure for examining the gastrointestinal tract based on a video stream of low-resolution images. In addition, potentially mislabeled samples identified by our pipeline were reviewed and re-annotated by three experienced gastroenterologists. Our results show that the proposed framework successfully detects incorrectly labeled data and that cleaning the datasets improves anomaly detection performance compared to current baselines.
Related papers
- Diffusion-Based Data Augmentation for Medical Image Segmentation [2.841725244360927]
DiffAug is a novel framework that combines text-guided diffusion-based generation and automatic segmentation validation. Our framework achieves state-of-the-art performance with 8-10% Dice improvements over baselines.
arXiv Detail & Related papers (2025-08-25T09:49:27Z) - Demographic-aware fine-grained classification of pediatric wrist fractures [4.309673738288069]
Computer vision presents a promising avenue, contingent upon the availability of extensive datasets. This study addresses the problem using a multifaceted approach: framing it as a fine-grained recognition task, fusing patient metadata with X-rays, and leveraging weights from a separate fine-grained dataset. Results show that combining a fine-grained transformer approach, fine-grained pre-training, and metadata integration improves diagnostic accuracy by 2% on a small custom-curated dataset and by over 10% on a larger fracture dataset.
arXiv Detail & Related papers (2025-07-17T10:03:57Z) - Multi-task Explainable Skin Lesion Classification [54.76511683427566]
We propose a few-shot-based approach for skin lesions that generalizes well with few labelled data.
The proposed approach comprises a fusion of a segmentation network that acts as an attention module and a classification network.
arXiv Detail & Related papers (2023-10-11T05:49:47Z) - GastroVision: A Multi-class Endoscopy Image Dataset for Computer Aided
Gastrointestinal Disease Detection [6.231109933741383]
This dataset includes different anatomical landmarks, pathological abnormalities, polyp removal cases and normal findings from the GI tract.
It was annotated and verified by experienced GI endoscopists.
We believe our dataset can facilitate the development of AI-based algorithms for GI disease detection and classification.
arXiv Detail & Related papers (2023-07-16T19:36:03Z) - Weakly Supervised Learning Significantly Reduces the Number of Labels
Required for Intracranial Hemorrhage Detection on Head CT [7.713240800142863]
Machine learning pipelines, in particular those based on deep learning (DL) models, require large amounts of labeled data.
This work studies the question of what kind of labels should be collected for the problem of intracranial hemorrhage detection in brain CT.
We find that strong supervision (i.e., learning with local image-level annotations) and weak supervision (i.e., learning with only global examination-level labels) achieve comparable performance.
arXiv Detail & Related papers (2022-11-29T04:42:41Z) - Data-Efficient Vision Transformers for Multi-Label Disease
Classification on Chest Radiographs [55.78588835407174]
Vision Transformers (ViTs) have not been applied to this task despite their high classification performance on generic images.
ViTs do not rely on convolutions but on patch-based self-attention and in contrast to CNNs, no prior knowledge of local connectivity is present.
Our results show that while the performance between ViTs and CNNs is on par with a small benefit for ViTs, DeiTs outperform the former if a reasonably large data set is available for training.
arXiv Detail & Related papers (2022-08-17T09:07:45Z) - Anatomy-Guided Weakly-Supervised Abnormality Localization in Chest
X-rays [17.15666977702355]
We propose an Anatomy-Guided chest X-ray Network (AGXNet) to address weak annotation issues.
Our framework consists of a cascade of two networks, one responsible for identifying anatomical abnormalities and the second responsible for pathological observations.
Our results on the MIMIC-CXR dataset demonstrate the effectiveness of AGXNet in disease and anatomical abnormality localization.
arXiv Detail & Related papers (2022-06-25T18:33:27Z) - Self-Supervised Learning as a Means To Reduce the Need for Labeled Data
in Medical Image Analysis [64.4093648042484]
We use a dataset of chest X-ray images with bounding box labels for 13 different classes of anomalies.
We show that it is possible to achieve similar performance to a fully supervised model in terms of mean average precision and accuracy with only 60% of the labeled data.
arXiv Detail & Related papers (2022-06-01T09:20:30Z) - Label-Assemble: Leveraging Multiple Datasets with Partial Labels [68.46767639240564]
"Label-Assemble" aims to unleash the full potential of partial labels from an assembly of public datasets.
We discovered that learning from negative examples facilitates both computer-aided disease diagnosis and detection.
arXiv Detail & Related papers (2021-09-25T02:48:17Z) - Weakly-Supervised Cross-Domain Adaptation for Endoscopic Lesions
Segmentation [79.58311369297635]
We propose a new weakly-supervised lesions transfer framework, which can explore transferable domain-invariant knowledge across different datasets.
A Wasserstein quantified transferability framework is developed to highlight wide-range transferable contextual dependencies.
A novel self-supervised pseudo label generator is designed to equally provide confident pseudo pixel labels for both hard-to-transfer and easy-to-transfer target samples.
arXiv Detail & Related papers (2020-12-08T02:26:03Z) - Semi-supervised Medical Image Classification with Relation-driven
Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging prediction consistency for a given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z) - Semi-supervised lung nodule retrieval [2.055949720959582]
A content based image retrieval (CBIR) system provides as its output a set of images, ranked by similarity to the query image.
Ground truth on similarity between dataset elements (e.g. between nodules) is not readily available, thus greatly challenging machine learning methods.
The current study suggests a semi-supervised approach that involves two steps: 1) automatic annotation of a given partially labeled dataset; 2) learning a semantic similarity metric space based on the predicted annotations.
The proposed system is demonstrated in lung nodule retrieval using the LIDC dataset, and shows that it is feasible to learn embedding from predicted ratings.
arXiv Detail & Related papers (2020-05-04T19:26:14Z) - Do Public Datasets Assure Unbiased Comparisons for Registration
Evaluation? [96.53940048041248]
We use the variogram to screen the manually annotated landmarks in two datasets used to benchmark registration in image-guided neurosurgeries.
Using variograms, we identified potentially problematic cases and had them examined by experienced radiologists.
arXiv Detail & Related papers (2020-03-20T20:04:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.