OCT Data is All You Need: How Vision Transformers with and without Pre-training Benefit Imaging
- URL: http://arxiv.org/abs/2502.12379v1
- Date: Mon, 17 Feb 2025 23:31:57 GMT
- Title: OCT Data is All You Need: How Vision Transformers with and without Pre-training Benefit Imaging
- Authors: Zihao Han, Philippe De Wilde
- Abstract summary: We investigate the impact of ImageNet-based pre-training on Vision Transformer (ViT) performance for OCT image classification across different dataset sizes. Results suggest that while pre-training can accelerate convergence and potentially offer better performance in smaller datasets, training from scratch may achieve comparable or even superior accuracy when sufficient OCT data is available.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optical Coherence Tomography (OCT) provides high-resolution cross-sectional images useful for diagnosing various diseases, but their distinct characteristics from natural images raise questions about whether large-scale pre-training on datasets like ImageNet is always beneficial. In this paper, we investigate the impact of ImageNet-based pre-training on Vision Transformer (ViT) performance for OCT image classification across different dataset sizes. Our experiments cover four-category retinal pathologies (CNV, DME, Drusen, Normal). Results suggest that while pre-training can accelerate convergence and potentially offer better performance in smaller datasets, training from scratch may achieve comparable or even superior accuracy when sufficient OCT data is available. Our findings highlight the importance of matching domain characteristics in pre-training and call for further study on large-scale OCT-specific pre-training.
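The comparison described in the abstract (ViT with and without ImageNet pre-training, evaluated across several training-set sizes) can be laid out as an experiment grid. The following is a minimal, standard-library-only Python sketch, not the authors' code: the names `stratified_subset` and `experiment_grid`, the example fractions, and the per-class stratified sampling are all assumptions introduced for illustration; the paper's actual subset sizes and sampling scheme are not given in this listing.

```python
import random
from collections import defaultdict
from itertools import product

# The four categories named in the abstract.
CLASSES = ("CNV", "DME", "Drusen", "Normal")


def stratified_subset(labels, fraction, seed=0):
    """Return sorted sample indices covering `fraction` of each class.

    Hypothetical helper: keeps the class balance of the full dataset
    when shrinking the training set (at least one sample per class).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    chosen = []
    for label in sorted(by_class):
        members = by_class[label]
        k = max(1, round(len(members) * fraction))
        chosen.extend(rng.sample(members, k))
    return sorted(chosen)


def experiment_grid(fractions=(0.01, 0.1, 1.0)):
    """Pair every dataset fraction with both initialisation choices.

    The fractions are placeholders, not values taken from the paper.
    """
    return [
        {"fraction": f, "pretrained": p}
        for f, p in product(fractions, (True, False))
    ]


if __name__ == "__main__":
    # Toy label list standing in for a real OCT dataset.
    labels = [c for c in CLASSES for _ in range(100)]
    for run in experiment_grid():
        subset = stratified_subset(labels, run["fraction"])
        print(run, "training samples:", len(subset))
```

Each grid entry would then drive one training run, with `pretrained` selecting between ImageNet-initialised and randomly initialised ViT weights in whatever deep learning framework is used.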
Related papers
- CAPRI-CT: Causal Analysis and Predictive Reasoning for Image Quality Optimization in Computed Tomography [2.422970122886921]
CAPRI-CT is a causal-aware deep learning framework for Causal Analysis and Predictive Reasoning for Image Quality Optimization in CT imaging. It integrates image data with acquisition metadata to model the underlying causal relationships that influence image quality. It is trained and validated using an ensemble learning approach, achieving strong predictive performance.
arXiv Detail & Related papers (2025-07-23T11:23:02Z) - SegBook: A Simple Baseline and Cookbook for Volumetric Medical Image Segmentation [20.026663367994356]
Large amounts of full-body CT images provide the opportunity to pre-train powerful models.
It remains unclear in which conditions these pre-trained models can be transferred to various downstream medical segmentation tasks.
We collected 87 public datasets varying in modality, target, and sample size to evaluate the transfer ability of full-body CT pre-trained models.
arXiv Detail & Related papers (2024-11-21T19:00:01Z) - Imaging foundation model for universal enhancement of non-ideal measurement CT [23.678515579203694]
Non-ideal measurement computed tomography (NICT) sacrifices optimal imaging standards for new advantages in CT imaging.
With the reduction of imaging standards, the image quality has also been reduced, limiting the clinical acceptability.
We propose a multi-scale integrated Transformer AMPlifier (TAMP) to bridge the image quality degradation with minimal data cost.
arXiv Detail & Related papers (2024-10-02T14:25:02Z) - Enhancing Retinal Disease Classification from OCTA Images via Active Learning Techniques [0.8035416719640156]
Eye diseases are common in older Americans and can lead to decreased vision and blindness.
Recent advancements in imaging technologies allow clinicians to capture high-quality images of the retinal blood vessels via Optical Coherence Tomography Angiography (OCTA).
OCTA provides detailed vascular imaging as compared to the solely structural information obtained by common OCT imaging.
arXiv Detail & Related papers (2024-07-21T23:24:49Z) - Deep learning network to correct axial and coronal eye motion in 3D OCT retinal imaging [65.47834983591957]
We propose deep learning based neural networks to correct axial and coronal motion artifacts in OCT based on a single scan.
The experimental result shows that the proposed method can effectively correct motion artifacts and achieve smaller error than other methods.
arXiv Detail & Related papers (2023-05-27T03:55:19Z) - Performance of GAN-based augmentation for deep learning COVID-19 image classification [57.1795052451257]
The biggest challenge in the application of deep learning to the medical domain is the availability of training data.
Data augmentation is a typical methodology used in machine learning when confronted with a limited data set.
In this work, a StyleGAN2-ADA model of Generative Adversarial Networks is trained on the limited COVID-19 chest X-ray image set.
arXiv Detail & Related papers (2023-04-18T15:39:58Z) - Vision-Language Modelling For Radiological Imaging and Reports In The Low Data Regime [70.04389979779195]
This paper explores training medical vision-language models (VLMs) where the visual and language inputs are embedded into a common space.
We explore several candidate methods to improve low-data performance, including adapting generic pre-trained models to novel image and text domains.
Using text-to-image retrieval as a benchmark, we evaluate the performance of these methods with variable sized training datasets of paired chest X-rays and radiological reports.
arXiv Detail & Related papers (2023-03-30T18:20:00Z) - Data-Efficient Vision Transformers for Multi-Label Disease Classification on Chest Radiographs [55.78588835407174]
Vision Transformers (ViTs) have not been applied to this task despite their high classification performance on generic images.
ViTs do not rely on convolutions but on patch-based self-attention and in contrast to CNNs, no prior knowledge of local connectivity is present.
Our results show that while the performance between ViTs and CNNs is on par with a small benefit for ViTs, DeiTs outperform the former if a reasonably large data set is available for training.
arXiv Detail & Related papers (2022-08-17T09:07:45Z) - SD-LayerNet: Semi-supervised retinal layer segmentation in OCT using disentangled representation with anatomical priors [4.2663199451998475]
We introduce a semi-supervised paradigm into the retinal layer segmentation task.
In particular, a novel fully differentiable approach is used for converting surface position regression into a pixel-wise structured segmentation.
In parallel, we propose a set of anatomical priors to improve network training when a limited amount of labeled data is available.
arXiv Detail & Related papers (2022-07-01T14:30:59Z) - Negligible effect of brain MRI data preprocessing for tumor segmentation [36.89606202543839]
We conduct experiments on three publicly available datasets and evaluate the effect of different preprocessing steps in deep neural networks.
Our results demonstrate that most popular standardization steps add no value to the network performance.
We suggest that image intensity normalization approaches do not contribute to model accuracy because of the reduction of signal variance with image standardization.
arXiv Detail & Related papers (2022-04-11T17:29:36Z) - ROSE: A Retinal OCT-Angiography Vessel Segmentation Dataset and New Model [41.444917622855606]
We release a dedicated OCT-A SEgmentation dataset (ROSE), which consists of 229 OCT-A images with vessel annotations at either centerline-level or pixel level.
Secondly, we propose a novel Split-based Coarse-to-Fine vessel segmentation network (SCF-Net), with the ability to detect thick and thin vessels separately.
In the SCF-Net, a split-based coarse segmentation (SCS) module is first introduced to produce a preliminary confidence map of vessels, and a split-based refinement (SRN) module is then used to optimize the shape/contour of
arXiv Detail & Related papers (2020-07-10T06:54:19Z) - Data Consistent CT Reconstruction from Insufficient Data with Learned Prior Images [70.13735569016752]
We investigate the robustness of deep learning in CT image reconstruction by showing false negative and false positive lesion cases.
We propose a data consistent reconstruction (DCR) method to improve their image quality, which combines the advantages of compressed sensing and deep learning.
The efficacy of the proposed method is demonstrated in cone-beam CT with truncated data, limited-angle data and sparse-view data, respectively.
arXiv Detail & Related papers (2020-05-20T13:30:49Z) - Pathological Retinal Region Segmentation From OCT Images Using Geometric Relation Based Augmentation [84.7571086566595]
We propose improvements over previous GAN-based medical image synthesis methods by jointly encoding the intrinsic relationship of geometry and shape.
The proposed method outperforms state-of-the-art segmentation methods on the public RETOUCH dataset having images captured from different acquisition procedures.
arXiv Detail & Related papers (2020-03-31T11:50:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.