A Novel Hybrid Approach for Retinal Vessel Segmentation with Dynamic Long-Range Dependency and Multi-Scale Retinal Edge Fusion Enhancement
- URL: http://arxiv.org/abs/2504.13553v1
- Date: Fri, 18 Apr 2025 08:41:35 GMT
- Title: A Novel Hybrid Approach for Retinal Vessel Segmentation with Dynamic Long-Range Dependency and Multi-Scale Retinal Edge Fusion Enhancement
- Authors: Yihao Ouyang, Xunheng Kuang, Mengjia Xiong, Zhida Wang, Yuanquan Wang,
- Abstract summary: Existing methods struggle with challenges such as multi-scale vessel variability, complex curvatures, and ambiguous boundaries.<n>We propose a novel hybrid framework that synergistically integrates CNNs and Mamba for high-precision retinal vessel segmentation.<n>Our approach achieves state-of-the-art performance, particularly in maintaining vascular continuity and effectively segmenting vessels in low-contrast regions.
- Score: 0.3611754783778107
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate retinal vessel segmentation provides essential structural information for ophthalmic image analysis. However, existing methods struggle with challenges such as multi-scale vessel variability, complex curvatures, and ambiguous boundaries. While Convolutional Neural Networks (CNNs), Transformer-based models and Mamba-based architectures have advanced the field, they often suffer from vascular discontinuities or edge feature ambiguity. To address these limitations, we propose a novel hybrid framework that synergistically integrates CNNs and Mamba for high-precision retinal vessel segmentation. Our approach introduces three key innovations: 1) The proposed High-Resolution Edge Fuse Network is a high-resolution preserving hybrid segmentation framework that combines a multi-scale backbone with the Multi-scale Retina Edge Fusion (MREF) module to enhance edge features, ensuring accurate and robust vessel segmentation. 2) The Dynamic Snake Visual State Space block combines Dynamic Snake Convolution with Mamba to adaptively capture vessel curvature details and long-range dependencies. An improved eight-directional 2D Snake-Selective Scan mechanism and a dynamic weighting strategy enhance the perception of complex vascular topologies. 3) The MREF module enhances boundary precision through multi-scale edge feature aggregation, suppressing noise while emphasizing critical vessel structures across scales. Experiments on three public datasets demonstrate that our method achieves state-of-the-art performance, particularly in maintaining vascular continuity and effectively segmenting vessels in low-contrast regions. This work provides a robust method for clinical applications requiring accurate retinal vessel analysis. The code is available at https://github.com/frank-oy/HREFNet.
Related papers
- Self-Supervised Z-Slice Augmentation for 3D Bio-Imaging via Knowledge Distillation [65.46249968484794]
ZAugNet is a fast, accurate, and self-supervised deep learning method for enhancing z-resolution in biological images.<n>By performing nonlinear distances between consecutive slices, ZAugNet effectively doubles resolution with each iteration.<n>ZAugNet+ is an extended version enabling continuous prediction at arbitrary distances.
arXiv Detail & Related papers (2025-03-05T17:50:35Z) - Divide-and-Conquer: Confluent Triple-Flow Network for RGB-T Salient Object Detection [70.84835546732738]
RGB-Thermal Salient Object Detection aims to pinpoint prominent objects within aligned pairs of visible and thermal infrared images.<n>Traditional encoder-decoder architectures may not have adequately considered the robustness against noise originating from defective modalities.<n>We propose the ConTriNet, a robust Confluent Triple-Flow Network employing a Divide-and-Conquer strategy.
arXiv Detail & Related papers (2024-12-02T14:44:39Z) - Edge-Enhanced Dilated Residual Attention Network for Multimodal Medical Image Fusion [13.029564509505676]
Multimodal medical image fusion is a crucial task that combines complementary information from different imaging modalities into a unified representation.
While deep learning methods have significantly advanced fusion performance, some of the existing CNN-based methods fall short in capturing fine-grained multiscale and edge features.
We propose a novel CNN-based architecture that addresses these limitations by introducing a Dilated Residual Attention Network Module for effective multiscale feature extraction.
arXiv Detail & Related papers (2024-11-18T18:11:53Z) - TransUNext: towards a more advanced U-shaped framework for automatic vessel segmentation in the fundus image [19.16680702780529]
We propose a more advanced U-shaped architecture for a hybrid Transformer and CNN: TransUNext.
The Global Multi-Scale Fusion (GMSF) module is further introduced to upgrade skip-connections, fuse high-level semantic and low-level detailed information, and eliminate high- and low-level semantic differences.
arXiv Detail & Related papers (2024-11-05T01:44:22Z) - Serp-Mamba: Advancing High-Resolution Retinal Vessel Segmentation with Selective State-Space Model [45.682311387979944]
We propose the first Serpentine Mamba (Serp-Mamba) network to address this challenging task.
We first devise a Serpentine Interwoven Adaptive (SIA) scan mechanism, which scans UWF-SLO images along curved vessel structures in a snake-like crawling manner.
Second, we propose an Ambiguity-Driven Dual Recalibration module to address the category imbalance problem intensified by high-resolution images.
arXiv Detail & Related papers (2024-09-06T15:40:47Z) - Enhancing Retinal Vascular Structure Segmentation in Images With a Novel
Design Two-Path Interactive Fusion Module Model [6.392575673488379]
We introduce Swin-Res-Net, a specialized module designed to enhance the precision of retinal vessel segmentation.
Swin-Res-Net utilizes the Swin transformer which uses shifted windows with displacement for partitioning.
Our proposed architecture produces outstanding results, either meeting or surpassing those of other published models.
arXiv Detail & Related papers (2024-03-03T01:36:11Z) - SDR-Former: A Siamese Dual-Resolution Transformer for Liver Lesion
Classification Using 3D Multi-Phase Imaging [59.78761085714715]
This study proposes a novel Siamese Dual-Resolution Transformer (SDR-Former) framework for liver lesion classification.
The proposed framework has been validated through comprehensive experiments on two clinical datasets.
To support the scientific community, we are releasing our extensive multi-phase MR dataset for liver lesion analysis to the public.
arXiv Detail & Related papers (2024-02-27T06:32:56Z) - Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image (DEC-Seg)
arXiv Detail & Related papers (2023-12-26T12:56:31Z) - Affinity Feature Strengthening for Accurate, Complete and Robust Vessel
Segmentation [48.638327652506284]
Vessel segmentation is crucial in many medical image applications, such as detecting coronary stenoses, retinal vessel diseases and brain aneurysms.
We present a novel approach, the affinity feature strengthening network (AFN), which jointly models geometry and refines pixel-wise segmentation features using a contrast-insensitive, multiscale affinity approach.
arXiv Detail & Related papers (2022-11-12T05:39:17Z) - RetiFluidNet: A Self-Adaptive and Multi-Attention Deep Convolutional
Network for Retinal OCT Fluid Segmentation [3.57686754209902]
Quantification of retinal fluids is necessary for OCT-guided treatment management.
New convolutional neural architecture named RetiFluidNet is proposed for multi-class retinal fluid segmentation.
Model benefits from hierarchical representation learning of textural, contextual, and edge features.
arXiv Detail & Related papers (2022-09-26T07:18:00Z) - DepthFormer: Exploiting Long-Range Correlation and Local Information for
Accurate Monocular Depth Estimation [50.08080424613603]
Long-range correlation is essential for accurate monocular depth estimation.
We propose to leverage the Transformer to model this global context with an effective attention mechanism.
Our proposed model, termed DepthFormer, surpasses state-of-the-art monocular depth estimation methods with prominent margins.
arXiv Detail & Related papers (2022-03-27T05:03:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.