Temporally-Aware Supervised Contrastive Learning for Polyp Counting in Colonoscopy
- URL: http://arxiv.org/abs/2507.02493v1
- Date: Thu, 03 Jul 2025 09:55:48 GMT
- Title: Temporally-Aware Supervised Contrastive Learning for Polyp Counting in Colonoscopy
- Authors: Luca Parolari, Andrea Cherubini, Lamberto Ballan, Carlo Biffi,
- Abstract summary: Existing methods for polyp counting rely on self-supervised learning.<n>We introduce a paradigm shift by proposing a supervised contrastive loss that incorporates temporally-aware soft targets.<n>Results demonstrate a 2.2x reduction in fragmentation rate compared to prior approaches.
- Score: 5.7522869823664005
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated polyp counting in colonoscopy is a crucial step toward automated procedure reporting and quality control, aiming to enhance the cost-effectiveness of colonoscopy screening. Counting polyps in a procedure involves detecting and tracking polyps, and then clustering tracklets that belong to the same polyp entity. Existing methods for polyp counting rely on self-supervised learning and primarily leverage visual appearance, neglecting temporal relationships in both tracklet feature learning and clustering stages. In this work, we introduce a paradigm shift by proposing a supervised contrastive loss that incorporates temporally-aware soft targets. Our approach captures intra-polyp variability while preserving inter-polyp discriminability, leading to more robust clustering. Additionally, we improve tracklet clustering by integrating a temporal adjacency constraint, reducing false positive re-associations between visually similar but temporally distant tracklets. We train and validate our method on publicly available datasets and evaluate its performance with a leave-one-out cross-validation strategy. Results demonstrate a 2.2x reduction in fragmentation rate compared to prior approaches. Our results highlight the importance of temporal awareness in polyp counting, establishing a new state-of-the-art. Code is available at https://github.com/lparolari/temporally-aware-polyp-counting.
Related papers
- Towards Polyp Counting In Full-Procedure Colonoscopy Videos [5.7522869823664005]
A major challenge lies in the automated identification, tracking, and re-association (ReID) of polyps tracklets across full-procedure colonoscopy videos.<n>In this work, we leverage the REAL-Colon dataset, the first open-access dataset providing full-procedure videos.<n>We re-implement previously proposed SimCLR-based methods for learning representations of polyp tracklets.<n>Our approach achieves state-of-the-art performance, with a polyp fragmentation rate of 6.30 and a false positive rate (FPR) below 5% on the REAL-Colon dataset.
arXiv Detail & Related papers (2025-02-14T10:02:38Z) - TSdetector: Temporal-Spatial Self-correction Collaborative Learning for Colonoscopy Video Detection [19.00902297385955]
We propose a novel Temporal-Spatial self-correction detector (TSdetector), which integrates temporal-level consistency learning and spatial-level reliability learning to detect objects continuously.
The experimental results on three publicly available polyp video dataset show that TSdetector achieves the highest polyp detection rate and outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-09-30T06:19:29Z) - SSTFB: Leveraging self-supervised pretext learning and temporal self-attention with feature branching for real-time video polyp segmentation [4.027361638728112]
We propose a video polyp segmentation method that performs self-supervised learning as an auxiliary task and a spatial-temporal self-attention mechanism for improved representation learning.
Our experimental results demonstrate an improvement with respect to several state-of-the-art (SOTA) methods.
Our ablation study confirms that the choice of the proposed joint end-to-end training improves network accuracy by over 3% and nearly 10% on both the Dice similarity coefficient and intersection-over-union.
arXiv Detail & Related papers (2024-06-14T17:33:11Z) - Consisaug: A Consistency-based Augmentation for Polyp Detection in Endoscopy Image Analysis [3.716941460306804]
We introduce Consisaug, an innovative and effective methodology to augment data that leverages deep learning.
We implement our Consisaug on five public polyp datasets and at three backbones, and the results show the effectiveness of our method.
arXiv Detail & Related papers (2024-04-17T13:09:44Z) - ECC-PolypDet: Enhanced CenterNet with Contrastive Learning for Automatic
Polyp Detection [88.4359020192429]
Existing methods either involve computationally expensive context aggregation or lack prior modeling of polyps, resulting in poor performance in challenging cases.
In this paper, we propose the Enhanced CenterNet with Contrastive Learning (ECC-PolypDet), a two-stage training & end-to-end inference framework.
Box-assisted Contrastive Learning (BCL) during training to minimize the intra-class difference and maximize the inter-class difference between foreground polyps and backgrounds, enabling our model to capture concealed polyps.
In the fine-tuning stage, we introduce the IoU-guided Sample Re-weighting
arXiv Detail & Related papers (2024-01-10T07:03:41Z) - Validating polyp and instrument segmentation methods in colonoscopy through Medico 2020 and MedAI 2021 Challenges [58.32937972322058]
"Medico automatic polyp segmentation (Medico 2020)" and "MedAI: Transparency in Medical Image (MedAI 2021)" competitions.
We present a comprehensive summary and analyze each contribution, highlight the strength of the best-performing methods, and discuss the possibility of clinical translations of such methods into the clinic.
arXiv Detail & Related papers (2023-07-30T16:08:45Z) - Learning to Mask and Permute Visual Tokens for Vision Transformer Pre-Training [55.12082817901671]
We propose a new self-supervised pre-training approach, named Masked and Permuted Vision Transformer (MaPeT)<n>MaPeT employs autoregressive and permuted predictions to capture intra-patch dependencies.<n>Our results demonstrate that MaPeT achieves competitive performance on ImageNet, compared to baselines and competitors under the same model setting.
arXiv Detail & Related papers (2023-06-12T18:12:19Z) - Robust Detection of Lead-Lag Relationships in Lagged Multi-Factor Models [61.10851158749843]
Key insights can be obtained by discovering lead-lag relationships inherent in the data.
We develop a clustering-driven methodology for robust detection of lead-lag relationships in lagged multi-factor models.
arXiv Detail & Related papers (2023-05-11T10:30:35Z) - Real-time landmark detection for precise endoscopic submucosal
dissection via shape-aware relation network [51.44506007844284]
We propose a shape-aware relation network for accurate and real-time landmark detection in endoscopic submucosal dissection surgery.
We first devise an algorithm to automatically generate relation keypoint heatmaps, which intuitively represent the prior knowledge of spatial relations among landmarks.
We then develop two complementary regularization schemes to progressively incorporate the prior knowledge into the training process.
arXiv Detail & Related papers (2021-11-08T07:57:30Z) - PraNet: Parallel Reverse Attention Network for Polyp Segmentation [155.93344756264824]
We propose a parallel reverse attention network (PraNet) for accurate polyp segmentation in colonoscopy images.
We first aggregate the features in high-level layers using a parallel partial decoder (PPD)
In addition, we mine the boundary cues using a reverse attention (RA) module, which is able to establish the relationship between areas and boundary cues.
arXiv Detail & Related papers (2020-06-13T08:13:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.