Related papers: UOPSL: Unpaired OCT Predilection Sites Learning for Fundus Image Diagnosis Augmentation

URL: http://arxiv.org/abs/2509.08624v1
Date: Wed, 10 Sep 2025 14:19:59 GMT
Title: UOPSL: Unpaired OCT Predilection Sites Learning for Fundus Image Diagnosis Augmentation
Authors: Zhihao Zhao, Yinzheng Zhao, Junjie Yang, Xiangtong Yao, Quanmin Liang, Daniel Zapp, Kai Huang, Nassir Navab, M. Ali Nasseri
Abstract summary: We propose a novel unpaired multimodal framework, UOPSL, that utilizes extensive OCT-derived spatial priors to dynamically identify predilection sites. Our approach bridges unpaired fundus and OCTs via extended disease text descriptions. Experiments conducted on 9 diverse datasets across 28 critical categories demonstrate that our framework outperforms existing benchmarks.
Score: 47.08936359575974
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Significant advancements in AI-driven multimodal medical image diagnosis have led to substantial improvements in ophthalmic disease identification in recent years. However, acquiring paired multimodal ophthalmic images remains prohibitively expensive. While fundus photography is simple and cost-effective, the limited availability of OCT data and inherent modality imbalance hinder further progress. Conventional approaches that rely solely on fundus or textual features often fail to capture fine-grained spatial information, as each imaging modality provides distinct cues about lesion predilection sites. In this study, we propose a novel unpaired multimodal framework, UOPSL, that utilizes extensive OCT-derived spatial priors to dynamically identify predilection sites, enhancing fundus image-based disease recognition. Our approach bridges unpaired fundus and OCTs via extended disease text descriptions. Initially, we employ contrastive learning on a large corpus of unpaired OCT and fundus images while simultaneously learning the predilection sites matrix in the OCT latent space. Through extensive optimization, this matrix captures lesion localization patterns within the OCT feature space. During the fine-tuning or inference phase of the downstream classification task based solely on fundus images, where paired OCT data is unavailable, we eliminate OCT input and utilize the predilection sites matrix to assist in fundus image classification learning. Extensive experiments conducted on 9 diverse datasets across 28 critical categories demonstrate that our framework outperforms existing benchmarks.
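Illustrative sketch (not the authors' code): the abstract describes two ideas, contrastive alignment of images to disease text descriptions and a learned matrix that is reused at inference without OCT input. The toy Python below shows one common way such pieces could look. It assumes an InfoNCE-style loss and a simple attention lookup over the matrix rows; the function names `info_nce` and `apply_site_prior`, the temperature value, and the additive fusion are all hypothetical choices made for illustration and are not taken from the paper.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def info_nce(image_embs, text_embs, temperature=0.07):
    """Mean cross-entropy of matching image i to text i among all texts.

    Assumption: image i and text i describe the same disease, so the
    text acts as the anchor that links otherwise unpaired modalities.
    """
    images = [l2_normalize(v) for v in image_embs]
    texts = [l2_normalize(v) for v in text_embs]
    loss = 0.0
    for i, img in enumerate(images):
        probs = softmax([dot(img, t) / temperature for t in texts])
        loss -= math.log(probs[i])
    return loss / len(images)


def apply_site_prior(fundus_feat, site_matrix):
    """Hypothetical lookup: attend over the rows of a learned matrix and
    add the weighted sum to the fundus feature. No OCT input is needed."""
    weights = softmax([dot(fundus_feat, row) for row in site_matrix])
    prior = [
        sum(w * row[j] for w, row in zip(weights, site_matrix))
        for j in range(len(fundus_feat))
    ]
    return [f + p for f, p in zip(fundus_feat, prior)]


if __name__ == "__main__":
    texts = [[1.0, 0.0], [0.0, 1.0]]
    aligned = [[0.9, 0.1], [0.1, 0.9]]
    print(info_nce(aligned, texts))
    print(apply_site_prior([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))
```

In this toy setup the loss is small when each image embedding points toward its own text embedding and large when the pairing is swapped, which is the behaviour a contrastive objective is meant to reward.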

Related papers

A Semantically Enhanced Generative Foundation Model Improves Pathological Image Synthesis [82.01597026329158]
We introduce a Correlation-Regulated Alignment Framework for Tissue Synthesis (CRAFTS) for pathology-specific text-to-image synthesis. CRAFTS incorporates a novel alignment mechanism that suppresses semantic drift to ensure biological accuracy. This model generates diverse pathological images spanning 30 cancer types, with quality rigorously validated by objective metrics and pathologist evaluations.
arXiv Detail & Related papers (2025-12-15T10:22:43Z)
Bayesian Deep Learning Approaches for Uncertainty-Aware Retinal OCT Image Segmentation for Multiple Sclerosis [0.0]
Optical Coherence Tomography (OCT) provides valuable insights in ophthalmology, cardiology, and neurology. One critical task for ophthalmologists using OCT is delineation of retinal layers within scans. Previous efforts to automate delineation using deep learning face challenges in uptake from clinicians and statisticians.
arXiv Detail & Related papers (2025-05-17T15:56:17Z)
MultiEYE: Dataset and Benchmark for OCT-Enhanced Retinal Disease Recognition from Fundus Images [4.885485496458059]
We present the first large multi-modal multi-class dataset for eye disease diagnosis, MultiEYE. We propose an OCT-assisted Conceptual Distillation Approach (OCT-CoDA) to extract disease-related knowledge from OCT images. Our proposed OCT-CoDA demonstrates remarkable results and interpretability, showing great potential for clinical application.
arXiv Detail & Related papers (2024-12-12T16:08:43Z)
Enhancing Retinal Disease Classification from OCTA Images via Active Learning Techniques [0.8035416719640156]
Eye diseases are common in older Americans and can lead to decreased vision and blindness. Recent advancements in imaging technologies allow clinicians to capture high-quality images of the retinal blood vessels via Optical Coherence Tomography Angiography (OCTA). OCTA provides detailed vascular imaging, as compared to the solely structural information obtained by common OCT imaging.
arXiv Detail & Related papers (2024-07-21T23:24:49Z)
Fundus-Enhanced Disease-Aware Distillation Model for Retinal Disease Classification from OCT Images [6.72159216082989]
We propose a fundus-enhanced disease-aware distillation model for retinal disease classification from OCT images. Our framework enhances the OCT model during training by utilizing unpaired fundus images. Our proposed approach outperforms single-modal, multi-modal, and state-of-the-art distillation methods for retinal disease classification.
arXiv Detail & Related papers (2023-08-01T05:13:02Z)
Disruptive Autoencoders: Leveraging Low-level features for 3D Medical Image Pre-training [51.16994853817024]
This work focuses on designing an effective pre-training framework for 3D radiology images. We introduce Disruptive Autoencoders, a pre-training framework that attempts to reconstruct the original image from disruptions created by a combination of local masking and low-level perturbations. The proposed pre-training framework is tested across multiple downstream tasks and achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-07-31T17:59:42Z)
The Whole Pathological Slide Classification via Weakly Supervised Learning [7.313528558452559]
We introduce two pathological priors: nuclear disease of cells and spatial correlation of pathological tiles. We propose a data augmentation method that utilizes stain separation during extractor training. We then describe the spatial relationships between the tiles using an adjacency matrix. By integrating these two views, we designed a multi-instance framework for analyzing H&E-stained tissue images.
arXiv Detail & Related papers (2023-07-12T16:14:23Z)
LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets. We have collected approximately 1.3 million medical images from 55 publicly available datasets. LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
Explaining Clinical Decision Support Systems in Medical Imaging using Cycle-Consistent Activation Maximization [112.2628296775395]
Clinical decision support using deep neural networks has become a topic of steadily growing interest. Clinicians are often hesitant to adopt the technology because its underlying decision-making process is considered to be opaque and difficult to comprehend. We propose a novel decision explanation scheme based on CycleGAN activation which generates high-quality visualizations of classifier decisions even in smaller data sets.
arXiv Detail & Related papers (2020-10-09T14:39:27Z)
Modeling and Enhancing Low-quality Retinal Fundus Images [167.02325845822276]
Low-quality fundus images increase uncertainty in clinical observation and lead to the risk of misdiagnosis. We propose a clinically oriented fundus enhancement network (cofe-Net) to suppress global degradation factors. Experiments on both synthetic and real images demonstrate that our algorithm effectively corrects low-quality fundus images without losing retinal details.
arXiv Detail & Related papers (2020-05-12T08:01:16Z)
Pathological Retinal Region Segmentation From OCT Images Using Geometric Relation Based Augmentation [84.7571086566595]
We propose improvements over previous GAN-based medical image synthesis methods by jointly encoding the intrinsic relationship of geometry and shape. The proposed method outperforms state-of-the-art segmentation methods on the public RETOUCH dataset having images captured from different acquisition procedures.
arXiv Detail & Related papers (2020-03-31T11:50:43Z)

This list is automatically generated from the titles and abstracts of the papers on this site.