Block Expanded DINORET: Adapting Natural Domain Foundation Models for Retinal Imaging Without Catastrophic Forgetting
- URL: http://arxiv.org/abs/2409.17332v1
- Date: Wed, 25 Sep 2024 20:17:16 GMT
- Title: Block Expanded DINORET: Adapting Natural Domain Foundation Models for Retinal Imaging Without Catastrophic Forgetting
- Authors: Jay Zoellin, Colin Merk, Mischa Buob, Amr Saad, Samuel Giesser, Tahm Spitznagel, Ferhat Turgut, Rui Santos, Yukun Zhou, Sigfried Wagner, Pearse A. Keane, Yih Chung Tham, Delia Cabrera DeBuc, Matthias D. Becker, Gabor M. Somfai,
- Abstract summary: We adapted the DINOv2 vision transformer for retinal imaging classification tasks using self-supervised learning.
We generated two novel foundation models termed DINORET and BE DINORET.
Our few-shot learning studies indicated that DINORET and BE DINORET outperform RETFound in terms of data-efficiency.
- Score: 1.2573191100165562
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Integrating deep learning into medical imaging is poised to greatly advance diagnostic methods but it faces challenges with generalizability. Foundation models, based on self-supervised learning, address these issues and improve data efficiency. Natural domain foundation models show promise for medical imaging, but systematic research evaluating domain adaptation, especially using self-supervised learning and parameter-efficient fine-tuning, remains underexplored. Additionally, little research addresses the issue of catastrophic forgetting during fine-tuning of foundation models. We adapted the DINOv2 vision transformer for retinal imaging classification tasks using self-supervised learning and generated two novel foundation models termed DINORET and BE DINORET. Publicly available color fundus photographs were employed for model development and subsequent fine-tuning for diabetic retinopathy staging and glaucoma detection. We introduced block expansion as a novel domain adaptation strategy and assessed the models for catastrophic forgetting. Models were benchmarked to RETFound, a state-of-the-art foundation model in ophthalmology. DINORET and BE DINORET demonstrated competitive performance on retinal imaging tasks, with the block expanded model achieving the highest scores on most datasets. Block expansion successfully mitigated catastrophic forgetting. Our few-shot learning studies indicated that DINORET and BE DINORET outperform RETFound in terms of data-efficiency. This study highlights the potential of adapting natural domain vision models to retinal imaging using self-supervised learning and block expansion. BE DINORET offers robust performance without sacrificing previously acquired capabilities. Our findings suggest that these methods could enable healthcare institutions to develop tailored vision models for their patient populations, enhancing global healthcare inclusivity.
Related papers
- A Disease-Specific Foundation Model Using Over 100K Fundus Images: Release and Validation for Abnormality and Multi-Disease Classification on Downstream Tasks [0.0]
We developed a Fundus-Specific Pretrained Model (Image+Fundus), a supervised artificial intelligence model trained to detect abnormalities in fundus images.
A total of 57,803 images were used to develop this pretrained model, which achieved superior performance across various downstream tasks.
arXiv Detail & Related papers (2024-08-16T15:03:06Z) - Enhance Eye Disease Detection using Learnable Probabilistic Discrete Latents in Machine Learning Architectures [1.6000489723889526]
Ocular diseases, including diabetic retinopathy and glaucoma, present a significant public health challenge.
Deep learning models have emerged as powerful tools for analysing medical images, such as retina imaging.
Challenges persist in model relibability and uncertainty estimation, which are critical for clinical decision-making.
arXiv Detail & Related papers (2024-01-21T04:14:54Z) - Evaluating General Purpose Vision Foundation Models for Medical Image Analysis: An Experimental Study of DINOv2 on Radiology Benchmarks [5.8941124219471055]
DINOv2 is an open-source foundation model pre-trained with self-supervised learning on 142 million curated natural images.
This study comprehensively evaluates the performance DINOv2 for radiology.
arXiv Detail & Related papers (2023-12-04T21:47:10Z) - Enhancing and Adapting in the Clinic: Source-free Unsupervised Domain
Adaptation for Medical Image Enhancement [34.11633495477596]
We propose an algorithm for source-free unsupervised domain adaptive medical image enhancement (SAME)
A structure-preserving enhancement network is first constructed to learn a robust source model from synthesized training data.
A pseudo-label picker is developed to boost the knowledge distillation of enhancement tasks.
arXiv Detail & Related papers (2023-12-03T10:01:59Z) - Diffusion Models for Image Restoration and Enhancement -- A
Comprehensive Survey [96.99328714941657]
We present a comprehensive review of recent diffusion model-based methods on image restoration.
We classify and emphasize the innovative designs using diffusion models for both IR and blind/real-world IR.
We propose five potential and challenging directions for the future research of diffusion model-based IR.
arXiv Detail & Related papers (2023-08-18T08:40:38Z) - LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical
Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z) - USIM-DAL: Uncertainty-aware Statistical Image Modeling-based Dense
Active Learning for Super-resolution [47.38982697349244]
Dense regression is a widely used approach in computer vision for tasks such as image super-resolution, enhancement, depth estimation, etc.
We propose incorporating active learning into dense regression models to address this problem.
Active learning allows models to select the most informative samples for labeling, reducing the overall annotation cost while improving performance.
arXiv Detail & Related papers (2023-05-27T16:33:43Z) - Bridging Synthetic and Real Images: a Transferable and Multiple
Consistency aided Fundus Image Enhancement Framework [61.74188977009786]
We propose an end-to-end optimized teacher-student framework to simultaneously conduct image enhancement and domain adaptation.
We also propose a novel multi-stage multi-attention guided enhancement network (MAGE-Net) as the backbones of our teacher and student network.
arXiv Detail & Related papers (2023-02-23T06:16:15Z) - Continual Learning with Bayesian Model based on a Fixed Pre-trained
Feature Extractor [55.9023096444383]
Current deep learning models are characterised by catastrophic forgetting of old knowledge when learning new classes.
Inspired by the process of learning new knowledge in human brains, we propose a Bayesian generative model for continual learning.
arXiv Detail & Related papers (2022-04-28T08:41:51Z) - ROCT-Net: A new ensemble deep convolutional model with improved spatial
resolution learning for detecting common diseases from retinal OCT images [0.0]
This paper presents a new enhanced deep ensemble convolutional neural network for detecting retinal diseases from OCT images.
Our model generates rich and multi-resolution features by employing the learning architectures of two robust convolutional models.
Our experiments on two datasets and comparing our model with some other well-known deep convolutional neural networks have proven that our architecture can increase the classification accuracy up to 5%.
arXiv Detail & Related papers (2022-03-03T17:51:01Z) - Retinopathy of Prematurity Stage Diagnosis Using Object Segmentation and
Convolutional Neural Networks [68.96150598294072]
Retinopathy of Prematurity (ROP) is an eye disorder primarily affecting premature infants with lower weights.
It causes proliferation of vessels in the retina and could result in vision loss and, eventually, retinal detachment, leading to blindness.
In recent years, there has been a significant effort to automate the diagnosis using deep learning.
This paper builds upon the success of previous models and develops a novel architecture, which combines object segmentation and convolutional neural networks (CNN)
Our proposed system first trains an object segmentation model to identify the demarcation line at a pixel level and adds the resulting mask as an additional "color" channel in
arXiv Detail & Related papers (2020-04-03T14:07:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.