DMS-Net: Dual-Modal Multi-Scale Siamese Network for Binocular Fundus Image Classification
- URL: http://arxiv.org/abs/2504.18046v1
- Date: Fri, 25 Apr 2025 03:27:28 GMT
- Title: DMS-Net: Dual-Modal Multi-Scale Siamese Network for Binocular Fundus Image Classification
- Authors: Guohao Huo, Zibo Lin, Zitong Wang, Ruiting Dai, Hao Tang
- Abstract summary: Ophthalmic diseases pose a significant global health challenge, yet traditional diagnosis methods often fail to account for binocular pathological correlations. We propose DMS-Net, a dual-modal multi-scale Siamese network for binocular fundus image classification. Our framework leverages weight-shared Siamese ResNet-152 backbones to extract deep semantic features from paired fundus images.
- Score: 8.010725085988296
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ophthalmic diseases pose a significant global health challenge, yet traditional diagnosis methods and existing single-eye deep learning approaches often fail to account for binocular pathological correlations. To address this, we propose DMS-Net, a dual-modal multi-scale Siamese network for binocular fundus image classification. Our framework leverages weight-shared Siamese ResNet-152 backbones to extract deep semantic features from paired fundus images. To tackle challenges such as lesion boundary ambiguity and scattered pathological distributions, we introduce a Multi-Scale Context-Aware Module (MSCAM) that integrates adaptive pooling and attention mechanisms for multi-resolution feature aggregation. Additionally, a Dual-Modal Feature Fusion (DMFF) module enhances cross-modal interaction through spatial-semantic recalibration and bidirectional attention, effectively combining global context and local edge features. Evaluated on the ODIR-5K dataset, DMS-Net achieves state-of-the-art performance with 80.5% accuracy, 86.1% recall, and 83.8% Cohen's kappa, demonstrating superior capability in detecting symmetric pathologies and advancing clinical decision-making for ocular diseases.
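The abstract specifies the overall architecture but not its internals. Below is a minimal PyTorch sketch of that structure for orientation only: the MSCAM and DMFF bodies shown here (multi-scale adaptive pooling with channel attention, and bidirectional cross-attention) are plausible stand-ins assumed for illustration, not the paper's exact designs, and the module names, pool sizes, and 8-class ODIR-5K head are likewise assumptions.

```python
# Minimal structural sketch of DMS-Net. MSCAM/DMFF internals are assumed
# stand-ins consistent with the abstract, not the paper's exact designs.
import torch
import torch.nn as nn
from torchvision.models import resnet152

class MSCAM(nn.Module):
    """Multi-Scale Context-Aware Module (assumed design): adaptive pooling
    at several resolutions feeding a channel-attention reweighting."""
    def __init__(self, channels, pool_sizes=(1, 2, 4)):
        super().__init__()
        self.pools = nn.ModuleList(nn.AdaptiveAvgPool2d(s) for s in pool_sizes)
        self.attn = nn.Sequential(
            nn.Linear(channels * sum(s * s for s in pool_sizes), channels),
            nn.ReLU(inplace=True),
            nn.Linear(channels, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                       # x: (B, C, H, W)
        ctx = torch.cat([p(x).flatten(1) for p in self.pools], dim=1)
        w = self.attn(ctx)                       # (B, C) channel weights
        return x.mean(dim=(2, 3)) * w            # reweighted global descriptor

class DMFF(nn.Module):
    """Dual-Modal Feature Fusion (assumed design): bidirectional
    cross-attention between left- and right-eye feature vectors."""
    def __init__(self, dim, heads=8):
        super().__init__()
        self.l2r = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.r2l = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, f_l, f_r):                 # each: (B, C)
        l, r = f_l.unsqueeze(1), f_r.unsqueeze(1)
        l_fused, _ = self.l2r(l, r, r)            # left eye attends to right
        r_fused, _ = self.r2l(r, l, l)            # right eye attends to left
        return torch.cat([l_fused.squeeze(1), r_fused.squeeze(1)], dim=1)

class DMSNet(nn.Module):
    def __init__(self, num_classes=8):            # ODIR-5K's 8 categories
        super().__init__()
        backbone = resnet152(weights=None)         # pretrained weights in practice
        self.encoder = nn.Sequential(*list(backbone.children())[:-2])
        self.mscam = MSCAM(channels=2048)
        self.dmff = DMFF(dim=2048)
        self.head = nn.Linear(2048 * 2, num_classes)

    def forward(self, left, right):               # paired fundus images
        f_l = self.mscam(self.encoder(left))      # same encoder instance for
        f_r = self.mscam(self.encoder(right))     # both eyes: shared weights
        return self.head(self.dmff(f_l, f_r))

model = DMSNet()
logits = model(torch.randn(2, 3, 224, 224), torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 8])
```

The structural point the sketch preserves is weight sharing: both eyes pass through the same encoder instance, so the Siamese branches stay tied by construction while MSCAM aggregates multi-resolution context and DMFF mediates the cross-eye interaction.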
Related papers
- Towards a Multimodal MRI-Based Foundation Model for Multi-Level Feature Exploration in Segmentation, Molecular Subtyping, and Grading of Glioma [0.2796197251957244]
The Multi-Task S-UNETR (MTSUNET) model is a novel foundation-based framework built on the BrainSegFounder model. It simultaneously performs glioma segmentation, histological subtyping and neuroimaging subtyping. It shows significant potential for advancing noninvasive, personalized glioma management by improving predictive accuracy and interpretability.
arXiv Detail & Related papers (2025-03-10T01:27:09Z)
- Multimodal Outer Arithmetic Block Dual Fusion of Whole Slide Images and Omics Data for Precision Oncology [6.418265127069878]
We propose the use of omic embeddings during early and late fusion to capture complementary information from local (patch-level) to global (slide-level) interactions. This dual fusion strategy enhances interpretability and classification performance, highlighting its potential for clinical diagnostics.
arXiv Detail & Related papers (2024-11-26T13:25:53Z)
- Serp-Mamba: Advancing High-Resolution Retinal Vessel Segmentation with Selective State-Space Model [45.682311387979944]
We propose the first Serpentine Mamba (Serp-Mamba) network to address this challenging task.
We first devise a Serpentine Interwoven Adaptive (SIA) scan mechanism, which scans UWF-SLO images along curved vessel structures in a snake-like crawling manner.
Second, we propose an Ambiguity-Driven Dual Recalibration module to address the category imbalance problem intensified by high-resolution images.
arXiv Detail & Related papers (2024-09-06T15:40:47Z)
- SDR-Former: A Siamese Dual-Resolution Transformer for Liver Lesion Classification Using 3D Multi-Phase Imaging [59.78761085714715]
This study proposes a novel Siamese Dual-Resolution Transformer (SDR-Former) framework for liver lesion classification.
The proposed framework has been validated through comprehensive experiments on two clinical datasets.
To support the scientific community, we are releasing our extensive multi-phase MR dataset for liver lesion analysis to the public.
arXiv Detail & Related papers (2024-02-27T06:32:56Z)
- Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image segmentation (DEC-Seg).
arXiv Detail & Related papers (2023-12-26T12:56:31Z)
- Affinity Feature Strengthening for Accurate, Complete and Robust Vessel Segmentation [48.638327652506284]
Vessel segmentation is crucial in many medical image applications, such as detecting coronary stenoses, retinal vessel diseases and brain aneurysms.
We present a novel approach, the affinity feature strengthening network (AFN), which jointly models geometry and refines pixel-wise segmentation features using a contrast-insensitive, multiscale affinity approach.
arXiv Detail & Related papers (2022-11-12T05:39:17Z)
- RetiFluidNet: A Self-Adaptive and Multi-Attention Deep Convolutional Network for Retinal OCT Fluid Segmentation [3.57686754209902]
Quantification of retinal fluids is necessary for OCT-guided treatment management.
A new convolutional neural architecture named RetiFluidNet is proposed for multi-class retinal fluid segmentation.
The model benefits from hierarchical representation learning of textural, contextual, and edge features.
arXiv Detail & Related papers (2022-09-26T07:18:00Z)
- Superresolution and Segmentation of OCT scans using Multi-Stage adversarial Guided Attention Training [18.056525121226862]
We propose the multi-stage & multi-discriminatory generative adversarial network (MultiSDGAN) to translate OCT scans into high-resolution segmentation labels.
We evaluate and compare various combinations of channel and spatial attention within the MultiSDGAN architecture to extract more powerful feature maps.
Our results demonstrate relative improvements of 21.44% and 19.45% on the Dice coefficient and SSIM, respectively.
arXiv Detail & Related papers (2022-06-10T00:26:55Z)
- InDuDoNet+: A Model-Driven Interpretable Dual Domain Network for Metal Artifact Reduction in CT Images [53.4351366246531]
We construct a novel interpretable dual domain network, termed InDuDoNet+, into which the CT imaging process is finely embedded.
We analyze the CT values among different tissues and merge these prior observations into a prior network for our InDuDoNet+, which significantly improves its generalization performance.
arXiv Detail & Related papers (2021-12-23T15:52:37Z)
- Multi-Modal Multi-Instance Learning for Retinal Disease Recognition [10.294738095942812]
We aim to build a deep neural network that recognizes multiple vision-threatening diseases for a given case.
As both data acquisition and manual labeling are extremely expensive in the medical domain, the network has to be relatively lightweight.
arXiv Detail & Related papers (2021-09-25T08:16:47Z)
- Cross-Modality Brain Tumor Segmentation via Bidirectional Global-to-Local Unsupervised Domain Adaptation [61.01704175938995]
In this paper, we propose a novel Bidirectional Global-to-Local (BiGL) adaptation framework under a UDA scheme.
Specifically, a bidirectional image synthesis and segmentation module is proposed to segment the brain tumor.
The proposed method outperforms several state-of-the-art unsupervised domain adaptation methods by a large margin.
arXiv Detail & Related papers (2021-05-17T10:11:45Z)
- Few-shot Medical Image Segmentation using a Global Correlation Network with Discriminative Embedding [60.89561661441736]
We propose a novel method for few-shot medical image segmentation.
We construct our few-shot image segmentor using a deep convolutional network trained episodically.
We enhance the discriminability of the deep embedding to encourage clustering of the feature domains of the same class.
arXiv Detail & Related papers (2020-12-10T04:01:07Z)