Colo-SCRL: Self-Supervised Contrastive Representation Learning for
Colonoscopic Video Retrieval
- URL: http://arxiv.org/abs/2303.15671v1
- Date: Tue, 28 Mar 2023 01:27:23 GMT
- Title: Colo-SCRL: Self-Supervised Contrastive Representation Learning for
Colonoscopic Video Retrieval
- Authors: Qingzhong Chen, Shilun Cai, Crystal Cai, Zefang Yu, Dahong Qian,
Suncheng Xiang
- Abstract summary: We construct a large-scale colonoscopic dataset named Colo-Pair for medical practice.
Based on this dataset, a simple yet effective training method called Colo-SCRL is proposed for more robust representation learning.
It aims to refine general knowledge from colonoscopies through masked autoencoder-based reconstruction and momentum contrast to improve retrieval performance.
- Score: 2.868043986903368
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Colonoscopic video retrieval, which is a critical part of polyp treatment,
has great clinical significance for the prevention and treatment of colorectal
cancer. However, retrieval models trained on action recognition datasets
usually produce unsatisfactory retrieval results on colonoscopic datasets due
to the large domain gap between them. To seek a solution to this problem, we
construct a large-scale colonoscopic dataset named Colo-Pair for medical
practice. Based on this dataset, a simple yet effective training method called
Colo-SCRL is proposed for more robust representation learning. It aims to
refine general knowledge from colonoscopies through masked autoencoder-based
reconstruction and momentum contrast to improve retrieval performance. To the
best of our knowledge, this is the first attempt to employ the contrastive
learning paradigm for medical video retrieval. Empirical results show that our
method significantly outperforms current state-of-the-art methods in the
colonoscopic video retrieval task.
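The abstract names momentum contrast as one of the two training signals. A minimal MoCo-style sketch of that component is below; the encoder shapes, hyperparameters (`m`, `tau`), and function names are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def momentum_update(query_enc, key_enc, m=0.999):
    """EMA update of the key encoder from the query encoder's weights."""
    for q_param, k_param in zip(query_enc.parameters(), key_enc.parameters()):
        k_param.data = m * k_param.data + (1.0 - m) * q_param.data

def info_nce(q, k_pos, queue, tau=0.07):
    """InfoNCE loss: each query q matches its positive key k_pos
    against a queue of negative keys."""
    q = F.normalize(q, dim=1)            # (B, D)
    k_pos = F.normalize(k_pos, dim=1)    # (B, D)
    l_pos = (q * k_pos).sum(dim=1, keepdim=True)   # (B, 1) positive logits
    l_neg = q @ F.normalize(queue, dim=1).t()      # (B, K) negative logits
    logits = torch.cat([l_pos, l_neg], dim=1) / tau
    labels = torch.zeros(q.size(0), dtype=torch.long)  # positives at index 0
    return F.cross_entropy(logits, labels)
```

In the MoCo paradigm the key encoder is never updated by backpropagation, only by the exponential moving average above, which keeps the negative queue consistent across iterations.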
Related papers
- Frontiers in Intelligent Colonoscopy [96.57251132744446]
This study investigates the frontiers of intelligent colonoscopy techniques and their prospective implications for multimodal medical applications.
We assess the current data-centric and model-centric landscapes through four tasks for colonoscopic scene perception.
To embrace the coming multimodal era, we establish three foundational initiatives: a large-scale multimodal instruction tuning dataset ColonINST, a colonoscopy-designed multimodal language model ColonGPT, and a multimodal benchmark.
arXiv Detail & Related papers (2024-10-22T17:57:12Z)
- SSTFB: Leveraging self-supervised pretext learning and temporal self-attention with feature branching for real-time video polyp segmentation [4.027361638728112]
We propose a video polyp segmentation method that performs self-supervised learning as an auxiliary task and a spatial-temporal self-attention mechanism for improved representation learning.
Our experimental results demonstrate an improvement with respect to several state-of-the-art (SOTA) methods.
Our ablation study confirms that the proposed joint end-to-end training improves network accuracy by over 3% on the Dice similarity coefficient and by nearly 10% on intersection-over-union.
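The two overlap metrics cited above have standard definitions for binary segmentation masks; a short sketch follows. This is the textbook formulation, not SSTFB's evaluation code, and the smoothing constant `eps` is an assumption.

```python
import numpy as np

def dice(pred, target, eps=1e-7):
    """Dice similarity coefficient: 2|A∩B| / (|A| + |B|)."""
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def iou(pred, target, eps=1e-7):
    """Intersection-over-union (Jaccard index): |A∩B| / |A∪B|."""
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (inter + eps) / (union + eps)
```

Dice is always at least as large as IoU on the same masks, which is why papers typically report both.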
arXiv Detail & Related papers (2024-06-14T17:33:11Z)
- Consisaug: A Consistency-based Augmentation for Polyp Detection in Endoscopy Image Analysis [3.716941460306804]
We introduce Consisaug, an innovative and effective methodology to augment data that leverages deep learning.
We implement our Consisaug on five public polyp datasets and at three backbones, and the results show the effectiveness of our method.
arXiv Detail & Related papers (2024-04-17T13:09:44Z)
- REAL-Colon: A dataset for developing real-world AI applications in colonoscopy [1.8590283101866463]
We introduce the REAL-Colon (Real-world multi-center Endoscopy Annotated video Library) dataset.
It is a compilation of 2.7M native video frames from sixty full-resolution, real-world colonoscopy recordings across multiple centers.
The dataset contains 350k bounding-box annotations, each created under the supervision of expert gastroenterologists.
arXiv Detail & Related papers (2024-03-04T16:11:41Z)
- Towards Discriminative Representation with Meta-learning for Colonoscopic Polyp Re-Identification [2.78481408391119]
Colonoscopic Polyp Re-Identification aims to match the same polyp from a large gallery with images from different views taken using different cameras.
Traditional object ReID methods that directly adopt CNN models trained on the ImageNet dataset produce unsatisfactory retrieval performance.
We propose a simple but effective training method named Colo-ReID, which can help our model learn more general and discriminative knowledge.
arXiv Detail & Related papers (2023-08-02T04:10:14Z)
- Pseudo-label Guided Cross-video Pixel Contrast for Robotic Surgical Scene Segmentation with Limited Annotations [72.15956198507281]
We propose PGV-CL, a novel pseudo-label guided cross-video contrast learning method to boost scene segmentation.
We extensively evaluate our method on a public robotic surgery dataset EndoVis18 and a public cataract dataset CaDIS.
arXiv Detail & Related papers (2022-07-20T05:42:19Z)
- GAN Inversion for Data Augmentation to Improve Colonoscopy Lesion Classification [3.0100246737240877]
We show that synthetic colonoscopy images generated by Generative Adversarial Network (GAN) inversion can be used as training data to improve the lesion classification performance of deep learning models.
This approach inverts pairs of images with the same label into a semantically rich and disentangled latent space, then manipulates the latent representations to produce new synthetic images with the same label.
We also generate realistic-looking synthetic lesion images by interpolating between original training images to increase the variety of lesion shapes in the training dataset.
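The interpolation step described above can be sketched in a few lines: blend the inverted latent codes of two same-label images to obtain a new latent, which the generator then decodes into a synthetic image. The GAN generator itself is assumed and not shown; the function name and `alpha` parameter are illustrative, not the paper's code.

```python
import numpy as np

def interpolate_latents(z_a, z_b, alpha=0.5):
    """Linear interpolation between the latent codes of two images
    with the same lesion label; alpha controls the blend weight."""
    return alpha * z_a + (1.0 - alpha) * z_b
```

Because both source latents carry the same label, points on the line between them are expected (under the disentanglement assumption) to decode to plausible lesions of that label with intermediate shape.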
arXiv Detail & Related papers (2022-05-04T23:15:45Z)
- Open-Set Recognition of Breast Cancer Treatments [91.3247063132127]
Open-set recognition generalizes a classification task by classifying test samples either as one of the known classes from training or as "unknown".
We apply an existing Gaussian mixture variational autoencoder model, which achieves state-of-the-art results on image datasets, to breast cancer patient data.
Not only do we obtain more accurate and robust classification results, with a 24.5% average F1 increase compared to a recent method, but we also reexamine open-set recognition in terms of deployability to a clinical setting.
arXiv Detail & Related papers (2022-01-09T04:35:55Z)
- Colorectal Polyp Classification from White-light Colonoscopy Images via Domain Alignment [57.419727894848485]
A computer-aided diagnosis system is required to assist accurate diagnosis from colonoscopy images.
Most previous studies attempt to develop models for polyp differentiation using Narrow-Band Imaging (NBI) or other enhanced images.
We propose a novel framework based on a teacher-student architecture for the accurate colorectal polyp classification.
arXiv Detail & Related papers (2021-08-05T09:31:46Z)
- Colonoscopy Polyp Detection: Domain Adaptation From Medical Report Images to Real-time Videos [76.37907640271806]
We propose an Image-video-joint polyp detection network (Ivy-Net) to address the domain gap between colonoscopy images from historical medical reports and real-time videos.
Experiments on the collected dataset demonstrate that our Ivy-Net achieves the state-of-the-art result on colonoscopy video.
arXiv Detail & Related papers (2020-12-31T10:33:09Z)
- Explaining Clinical Decision Support Systems in Medical Imaging using Cycle-Consistent Activation Maximization [112.2628296775395]
Clinical decision support using deep neural networks has become a topic of steadily growing interest.
Clinicians are often hesitant to adopt the technology because its underlying decision-making process is considered opaque and difficult to comprehend.
We propose a novel decision explanation scheme based on CycleGAN activation maximization, which generates high-quality visualizations of classifier decisions even on smaller data sets.
arXiv Detail & Related papers (2020-10-09T14:39:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.