Full end-to-end diagnostic workflow automation of 3D OCT via foundation model-driven AI for retinal diseases
- URL: http://arxiv.org/abs/2602.03302v1
- Date: Tue, 03 Feb 2026 09:28:51 GMT
- Title: Full end-to-end diagnostic workflow automation of 3D OCT via foundation model-driven AI for retinal diseases
- Authors: Jinze Zhang, Jian Zhong, Li Lin, Jiaxiong Li, Ke Ma, Naiyang Li, Meng Li, Yuan Pan, Zeyu Meng, Mengyun Zhou, Shang Huang, Shilong Yu, Zhengyu Duan, Sutong Li, Honghui Xia, Juping Liu, Dan Liang, Yantao Wei, Xiaoying Tang, Jin Yuan, Peng Xiao,
- Abstract summary: We present the Full-process OCT-based Clinical Utility System (FOCUS), a foundation model-driven framework enabling end-to-end automation of 3D OCT retinal disease diagnosis. FOCUS was trained and tested on 3,300 patients (40,672 slices) and externally validated on 1,345 patients (18,498 slices) across four centers of different tiers and diverse OCT devices.
- Score: 14.269884626099625
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Optical coherence tomography (OCT) has revolutionized retinal disease diagnosis with its high-resolution, three-dimensional imaging, yet full diagnostic automation in clinical practice remains constrained by multi-stage workflows and conventional single-slice, single-task AI models. We present the Full-process OCT-based Clinical Utility System (FOCUS), a foundation model-driven framework enabling end-to-end automation of 3D OCT retinal disease diagnosis. FOCUS sequentially performs image quality assessment with EfficientNetV2-S, followed by abnormality detection and multi-disease classification using a fine-tuned Vision Foundation Model. Crucially, FOCUS leverages a unified adaptive aggregation method to intelligently integrate 2D slice-level predictions into a comprehensive 3D patient-level diagnosis. Trained and tested on 3,300 patients (40,672 slices), and externally validated on 1,345 patients (18,498 slices) across four centers of different tiers and diverse OCT devices, FOCUS achieved high F1 scores for quality assessment (99.01%), abnormality detection (97.46%), and patient-level diagnosis (94.39%). Real-world validation across centers also showed stable performance (F1: 90.22%-95.24%). In human-machine comparisons, FOCUS matched expert performance in abnormality detection (F1: 95.47% vs 90.91%) and multi-disease diagnosis (F1: 93.49% vs 91.35%) while demonstrating better efficiency. FOCUS automates the image-to-diagnosis pipeline, representing a critical advance towards unmanned ophthalmology and a validated blueprint for autonomous screening to enhance the accessibility and efficiency of population-scale retinal care.
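The aggregation step described in the abstract, integrating 2D slice-level predictions into a 3D patient-level diagnosis, can be illustrated with a minimal sketch. FOCUS's actual adaptive aggregation rule is not specified here, so a quality-gated, confidence-weighted vote is assumed; the function name, threshold, and weighting scheme below are hypothetical stand-ins, not the paper's method.

```python
import numpy as np

def aggregate_patient_diagnosis(slice_probs, quality_scores, quality_threshold=0.5):
    """Aggregate 2D slice-level class probabilities into a patient-level diagnosis.

    slice_probs:    (n_slices, n_classes) per-slice class probabilities
    quality_scores: (n_slices,) per-slice quality scores in [0, 1]

    Slices failing the quality gate (the EfficientNetV2-S stage's role) are
    dropped; the rest are weighted by their top-class confidence (an assumed
    rule, not FOCUS's published aggregation method).
    """
    slice_probs = np.asarray(slice_probs, dtype=float)
    quality = np.asarray(quality_scores, dtype=float)

    keep = quality >= quality_threshold
    if not keep.any():
        raise ValueError("no slices passed the quality gate")
    probs = slice_probs[keep]

    weights = probs.max(axis=1)  # confidence of each slice's top class
    patient_probs = (weights[:, None] * probs).sum(axis=0) / weights.sum()
    return int(patient_probs.argmax()), patient_probs
```

A usage example: three slices, one of which is rejected for low quality, yield a single patient-level label plus a normalized class distribution.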
Related papers
- Automated Classification of First-Trimester Fetal Heart Views Using Ultrasound-Specific Self-Supervised Learning [0.205246094017924]
We evaluate a self-supervised ultrasound foundation model, USF-MAE, for first-trimester fetal heart view classification. USF-MAE is pretrained with masked autoencoding on more than 370,000 unlabelled ultrasound images. It achieved the highest performance across all evaluation metrics, with 90.57% accuracy, 91.15% precision, 90.57% recall, and a 90.71% F1-score.
arXiv Detail & Related papers (2025-12-30T22:24:26Z) - HistoART: Histopathology Artifact Detection and Reporting Tool [37.31105955164019]
Whole Slide Imaging (WSI) is widely used to digitize tissue specimens for detailed, high-resolution examination. WSI remains vulnerable to artifacts introduced during slide preparation and scanning. We propose and compare three robust artifact detection approaches for WSIs.
arXiv Detail & Related papers (2025-06-23T17:22:19Z) - APTOS-2024 challenge report: Generation of synthetic 3D OCT images from fundus photographs [42.58128666405841]
The Asia Pacific Tele-Ophthalmology Society organized a challenge titled Artificial Intelligence-based OCT Generation from Fundus Images. This paper details the challenge framework (referred to as the APTOS-2024 Challenge), including the benchmark dataset. The challenge attracted 342 participating teams, with 42 preliminary submissions and 9 finalists.
arXiv Detail & Related papers (2025-06-09T08:29:37Z) - A Clinician-Friendly Platform for Ophthalmic Image Analysis Without Technical Barriers [51.45596445363302]
GlobeReady is a clinician-friendly AI platform that enables fundus disease diagnosis without retraining, fine-tuning, or the need for technical expertise. We demonstrate high accuracy across imaging modalities: 93.9-98.5% for 11 fundus diseases using color fundus photographs (CFPs) and 87.2-92.7% for 15 fundus diseases using optical coherence tomography (OCT) scans. By leveraging training-free local feature augmentation, the GlobeReady platform effectively mitigates domain shifts across centers and populations.
arXiv Detail & Related papers (2025-04-22T14:17:22Z) - Latent Diffusion Autoencoders: Toward Efficient and Meaningful Unsupervised Representation Learning in Medical Imaging [41.446379453352534]
The Latent Diffusion Autoencoder (LDAE) is a novel encoder-decoder diffusion-based framework for efficient and meaningful unsupervised learning in medical imaging. This study focuses on Alzheimer's disease (AD), using brain MRI from the ADNI database as a case study.
arXiv Detail & Related papers (2025-04-11T15:37:46Z) - Breast Cancer Histopathology Classification using CBAM-EfficientNetV2 with Transfer Learning [0.0]
This study introduces a novel approach leveraging EfficientNetV2 models to improve feature extraction and focus on relevant tissue regions. The EfficientNetV2-XL with CBAM achieved outstanding performance, reaching a peak accuracy of 99.01 percent and an F1-score of 98.31 percent at 400X magnification.
arXiv Detail & Related papers (2024-10-29T17:56:05Z) - Enhancing Diagnostic Reliability of Foundation Model with Uncertainty Estimation in OCT Images [41.002573031087856]
We developed a foundation model with uncertainty estimation (FMUE) to detect 11 retinal conditions on optical coherence tomography (OCT).
FMUE achieved a higher F1 score (96.76%) than two state-of-the-art algorithms, RETFound and UIOS, and improved further to 98.44% with a thresholding strategy.
Our model is superior to two ophthalmologists, with a higher F1 score (95.17% vs. 61.93% and 71.72%).
arXiv Detail & Related papers (2024-06-18T03:04:52Z) - Adaptive Multiscale Retinal Diagnosis: A Hybrid Trio-Model Approach for Comprehensive Fundus Multi-Disease Detection Leveraging Transfer Learning and Siamese Networks [0.0]
The WHO has declared that more than 2.2 billion people worldwide suffer from visual disorders, such as media haze, glaucoma, and drusen.
At least 1 billion of these cases could have been prevented or successfully treated, yet they remain unaddressed due to poverty, a lack of specialists, inaccurate ocular fundus diagnoses by ophthalmologists, or the presence of a rare disease.
To address this, this research developed the Hybrid Trio-Network Model Algorithm for accurately diagnosing 12 distinct common and rare eye diseases.
arXiv Detail & Related papers (2024-05-28T03:06:10Z) - Spatial-aware Transformer-GRU Framework for Enhanced Glaucoma Diagnosis from 3D OCT Imaging [3.093890460224435]
We present a novel deep learning framework that leverages the diagnostic value of 3D Optical Coherence Tomography (OCT) imaging for automated glaucoma detection. We integrate a Vision Transformer pre-trained on retinal data for rich slice-wise feature extraction with a bidirectional Gated Recurrent Unit for capturing inter-slice spatial dependencies.
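The slice-then-sequence design in this entry (per-slice features from a pre-trained Vision Transformer, then a bidirectional GRU across the slice axis) can be sketched in miniature. The GRU below is a hand-rolled, bias-free NumPy version with random stand-in weights and assumed feature/hidden dimensions; it illustrates the data flow, not the paper's implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def run_gru(seq, Wz, Uz, Wr, Ur, Wh, Uh):
    """Run a plain (bias-free) GRU over a sequence of slice feature
    vectors and return the final hidden state."""
    h = np.zeros(Uz.shape[0])
    for x in seq:
        z = sigmoid(x @ Wz + h @ Uz)          # update gate
        r = sigmoid(x @ Wr + h @ Ur)          # reset gate
        h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)
        h = (1 - z) * h + z * h_tilde
    return h

def classify_volume(slice_feats, params_fwd, params_bwd, W_out):
    """Bidirectional pass over the slice axis: one GRU reads the slices
    front-to-back, another back-to-front; the concatenated final states
    feed a linear classification head."""
    h_f = run_gru(slice_feats, *params_fwd)
    h_b = run_gru(slice_feats[::-1], *params_bwd)
    logits = np.concatenate([h_f, h_b]) @ W_out
    return int(np.argmax(logits))
```

In the paper's setting, `slice_feats` would be the ViT embeddings of each OCT slice; here any (n_slices, feat_dim) array demonstrates the mechanics.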
arXiv Detail & Related papers (2024-03-08T22:25:15Z) - Uncertainty-inspired Open Set Learning for Retinal Anomaly
Identification [71.06194656633447]
We establish an uncertainty-inspired open-set (UIOS) model, trained with fundus images of 9 retinal conditions.
With a thresholding strategy, our UIOS model achieved F1 scores of 99.55%, 97.01%, and 91.91% on the internal testing sets.
UIOS correctly predicted high uncertainty scores, which would prompt the need for a manual check, on datasets of non-target-category retinal diseases, low-quality fundus images, and non-fundus images.
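The thresholding strategy UIOS describes can be illustrated generically: a sample whose predicted uncertainty exceeds a threshold is routed to manual review instead of receiving an automatic label. The entropy-based uncertainty measure and the threshold value below are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def triage(probs, threshold=0.5):
    """Return the predicted class index if the normalized entropy of the
    probability vector is below `threshold`; otherwise flag the sample
    for manual review (the open-set / out-of-distribution path)."""
    p = np.asarray(probs, dtype=float)
    entropy = -(p * np.log(p + 1e-12)).sum() / np.log(len(p))  # in [0, 1]
    if entropy > threshold:
        return "manual check"
    return int(p.argmax())
```

A confident prediction passes through with its label; a near-uniform one is deferred to a human reader.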
arXiv Detail & Related papers (2023-04-08T10:47:41Z) - Advancing COVID-19 Diagnosis with Privacy-Preserving Collaboration in Artificial Intelligence [79.038671794961]
We launch the Unified CT-COVID AI Diagnostic Initiative (UCADI), in which the AI model can be trained in a distributed fashion and executed independently at each host institution.
Our study is based on 9,573 chest computed tomography (CT) scans from 3,336 patients collected from 23 hospitals located in China and the UK.
arXiv Detail & Related papers (2021-11-18T00:43:41Z) - COVID-MTL: Multitask Learning with Shift3D and Random-weighted Loss for Automated Diagnosis and Severity Assessment of COVID-19 [39.57518533765393]
There is an urgent need for automated methods to assist in the accurate and effective assessment of COVID-19.
We present an end-to-end multitask learning framework (COVID-MTL) capable of automated, simultaneous detection (against both radiology and NAT results) and severity assessment of COVID-19.
arXiv Detail & Related papers (2020-12-10T08:30:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.