Exploring the Effect of Dataset Diversity in Self-Supervised Learning for Surgical Computer Vision
- URL: http://arxiv.org/abs/2407.17904v2
- Date: Fri, 26 Jul 2024 08:33:23 GMT
- Title: Exploring the Effect of Dataset Diversity in Self-Supervised Learning for Surgical Computer Vision
- Authors: Tim J. M. Jaspers, Ronald L. P. D. de Jong, Yasmina Al Khalil, Tijn Zeelenberg, Carolus H. J. Kusters, Yiping Li, Romy C. van Jaarsveld, Franciscus H. A. Bakker, Jelle P. Ruurda, Willem M. Brinkman, Peter H. N. De With, Fons van der Sommen
- Abstract summary: The impact of surgical computer vision remains limited compared to other medical fields like pathology and radiology.
Recent advancements in self-supervised learning have demonstrated superior performance.
This study investigates the role of dataset diversity in SSL for surgical computer vision.
- Score: 5.782979506525853
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Over the past decade, computer vision applications in minimally invasive surgery have rapidly increased. Despite this growth, the impact of surgical computer vision remains limited compared to other medical fields like pathology and radiology, primarily due to the scarcity of representative annotated data. Whereas transfer learning from large annotated datasets such as ImageNet has been conventionally the norm to achieve high-performing models, recent advancements in self-supervised learning (SSL) have demonstrated superior performance. In medical image analysis, in-domain SSL pretraining has already been shown to outperform ImageNet-based initialization. Although unlabeled data in the field of surgical computer vision is abundant, the diversity within this data is limited. This study investigates the role of dataset diversity in SSL for surgical computer vision, comparing procedure-specific datasets against a more heterogeneous general surgical dataset across three different downstream surgical applications. The obtained results show that using solely procedure-specific data can lead to substantial improvements of 13.8%, 9.5%, and 36.8% compared to ImageNet pretraining. However, extending this data with more heterogeneous surgical data further increases performance by an additional 5.0%, 5.2%, and 2.5%, suggesting that increasing diversity within SSL data is beneficial for model performance. The code and pretrained model weights are made publicly available at https://github.com/TimJaspers0801/SurgeNet.
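As a rough reading aid for the numbers in the abstract, the sketch below adds the gain from procedure-specific SSL pretraining to the further gain from heterogeneous surgical data, for each of the three downstream applications. The abstract does not say whether these percentages are absolute or relative, so treating them as directly additive is an assumption, and the variable and function names are illustrative only and do not come from the SurgeNet repository.

```python
# Gains quoted in the abstract, in the order the three downstream tasks are listed.
# ASSUMPTION: both sets of figures share one unit and can be added directly;
# the abstract does not state whether they are absolute or relative.
procedure_specific_gain = [13.8, 9.5, 36.8]  # vs. ImageNet pretraining
added_diversity_gain = [5.0, 5.2, 2.5]       # extra gain from heterogeneous data


def cumulative_gains(base, extra):
    """Element-wise sum of two gain lists, rounded to one decimal place."""
    return [round(b + e, 1) for b, e in zip(base, extra)]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


total_gain = cumulative_gains(procedure_specific_gain, added_diversity_gain)
mean_extra_gain = mean(added_diversity_gain)

for task_index, gain in enumerate(total_gain, start=1):
    print(f"task {task_index}: cumulative gain over ImageNet = {gain}%")
print(f"mean extra gain from added diversity = {mean_extra_gain:.2f}%")
```

Under this additive reading, the third task gains the most from procedure-specific data but the least from added diversity, so the benefit of diversity is not uniform across tasks.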
Related papers
- Scaling up self-supervised learning for improved surgical foundation models [7.188884777849523]
This study introduces SurgeNetXL, a novel surgical foundation model that sets a new benchmark in surgical computer vision.
SurgeNetXL achieves consistent top-tier performance across six datasets spanning four surgical procedures and three tasks.
These findings pave the way for improved generalizability and robustness in data-scarce scenarios.
(2025-01-16T10:07:44Z)
- Handling Geometric Domain Shifts in Semantic Segmentation of Surgical RGB and Hyperspectral Images [67.66644395272075]
We present the first analysis of state-of-the-art semantic segmentation models when faced with geometric out-of-distribution data.
We propose an augmentation technique called "Organ Transplantation" to enhance generalizability.
Our augmentation technique improves SOA model performance by up to 67 % for RGB data and 90 % for HSI data, achieving performance at the level of in-distribution performance on real OOD test data.
(2024-08-27T19:13:15Z)
- Comparative Analysis of ImageNet Pre-Trained Deep Learning Models and DINOv2 in Medical Imaging Classification [7.205610366609243]
In this paper, we performed a glioma grading task using three clinical modalities of brain MRI data.
We compared the performance of various pre-trained deep learning models, including those based on ImageNet and DINOv2.
Our findings indicate that in our clinical dataset, DINOv2's performance was not as strong as ImageNet-based pre-trained models.
(2024-02-12T11:49:08Z)
- Additional Look into GAN-based Augmentation for Deep Learning COVID-19 Image Classification [57.1795052451257]
We study the dependence of the GAN-based augmentation performance on dataset size with a focus on small samples.
We train StyleGAN2-ADA with both sets and then, after validating the quality of the generated images, use the trained GANs as one of the augmentation approaches in multi-class classification problems.
The GAN-based augmentation approach is found to be comparable with classical augmentation in the case of medium and large datasets but underperforms in the case of smaller datasets.
(2024-01-26T08:28:13Z)
- Jumpstarting Surgical Computer Vision [2.585559512929966]
We develop recommendations for pretraining dataset composition through over 300 experiments.
We outperform state-of-the-art pre-trainings on two public benchmarks for phase recognition.
(2023-12-10T18:54:16Z)
- Transfer learning from a sparsely annotated dataset of 3D medical images [4.477071833136902]
This study explores the use of transfer learning to improve the performance of deep convolutional neural networks for organ segmentation in medical imaging.
A base segmentation model was trained on a large and sparsely annotated dataset; its weights were used for transfer learning on four new down-stream segmentation tasks.
The results showed that transfer learning from the base model was beneficial when small datasets were available.
(2023-11-08T21:31:02Z)
- Computational Pathology at Health System Scale -- Self-Supervised Foundation Models from Three Billion Images [30.618749295623363]
This project aims to train the largest academic foundation model and benchmark the most prominent self-supervised learning algorithms by pre-training.
We collected the largest pathology dataset to date, consisting of over 3 billion images from over 423 thousand microscopy slides.
Our results demonstrate that pre-training on pathology data is beneficial for downstream performance compared to pre-training on natural images.
(2023-10-10T21:40:19Z)
- Realistic Data Enrichment for Robust Image Segmentation in Histopathology [2.248423960136122]
We propose a new approach, based on diffusion models, which can enrich an imbalanced dataset with plausible examples from underrepresented groups.
Our method can simply expand limited clinical datasets, making them suitable for training machine learning pipelines.
(2023-04-19T09:52:50Z)
- Performance of GAN-based augmentation for deep learning COVID-19 image classification [57.1795052451257]
The biggest challenge in the application of deep learning to the medical domain is the availability of training data.
Data augmentation is a typical methodology used in machine learning when confronted with a limited data set.
In this work, a StyleGAN2-ADA model of Generative Adversarial Networks is trained on the limited COVID-19 chest X-ray image set.
(2023-04-18T15:39:58Z)
- Semantic segmentation of surgical hyperspectral images under geometric domain shifts [69.91792194237212]
We present the first analysis of state-of-the-art semantic segmentation networks in the presence of geometric out-of-distribution (OOD) data.
We also address generalizability with a dedicated augmentation technique termed "Organ Transplantation".
Our scheme improves on the SOA DSC by up to 67 % (RGB) and 90 % (HSI) and renders performance on par with in-distribution performance on real OOD test data.
(2023-03-20T09:50:07Z)
- Dissecting Self-Supervised Learning Methods for Surgical Computer Vision [51.370873913181605]
Self-Supervised Learning (SSL) methods have begun to gain traction in the general computer vision community.
The effectiveness of SSL methods in more complex and impactful domains, such as medicine and surgery, remains limited and unexplored.
We present an extensive analysis of the performance of these methods on the Cholec80 dataset for two fundamental and popular tasks in surgical context understanding, phase recognition and tool presence detection.
(2022-07-01T14:17:11Z)
- On the Robustness of Pretraining and Self-Supervision for a Deep Learning-based Analysis of Diabetic Retinopathy [70.71457102672545]
We compare the impact of different training procedures for diabetic retinopathy grading.
We investigate different aspects such as quantitative performance, statistics of the learned feature representations, interpretability and robustness to image distortions.
Our results indicate that models from ImageNet pretraining report a significant increase in performance, generalization and robustness to image distortions.
(2021-06-25T08:32:45Z)