WMKA-Net: A Weighted Multi-Kernel Attention Network for Retinal Vessel Segmentation
- URL: http://arxiv.org/abs/2504.14888v4
- Date: Mon, 29 Sep 2025 13:20:44 GMT
- Title: WMKA-Net: A Weighted Multi-Kernel Attention Network for Retinal Vessel Segmentation
- Authors: Xinran Xu, Yuliang Ma, Sifu Cai, Ming Meng, Qiang Lv, Ruoyan Shi,
- Abstract summary: This study proposes a dual-stage solution to address the issues of insufficient multi-scale feature fusion, disruption of contextual continuity, and noise interference. The first stage employs a Reversible Multi-Scale Fusion Module (RMS) that uses hierarchical adaptive convolution to dynamically merge cross-scale features from capillaries to main vessels. The second stage introduces a Vascular-Oriented Attention Mechanism, which models long-distance vascular continuity through an axial pathway.
- Score: 0.48536814705421105
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Retinal vessel segmentation is crucial for intelligent ophthalmic diagnosis, yet it faces three major challenges: insufficient multi-scale feature fusion, disruption of contextual continuity, and noise interference. This study proposes a dual-stage solution to address these issues. The first stage employs a Reversible Multi-Scale Fusion Module (RMS) that uses hierarchical adaptive convolution to dynamically merge cross-scale features from capillaries to main vessels, self-adaptively calibrating feature biases. The second stage introduces a Vascular-Oriented Attention Mechanism, which models long-distance vascular continuity through an axial pathway and enhances the capture of topological key nodes, such as bifurcation points, via a dedicated bifurcation attention pathway. The synergistic operation of these two pathways effectively restores the continuity of vascular structures and improves the segmentation accuracy of complex vascular networks. Systematic experiments on the DRIVE, STARE, and CHASE-DB1 datasets demonstrate that WMKA-Net achieves an accuracy of 0.9909, sensitivity of 0.9198, and specificity of 0.9953, significantly outperforming existing methods. This model provides an efficient, precise, and robust intelligent solution for the early screening of diabetic retinopathy.
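The three figures reported in the abstract (accuracy, sensitivity, specificity) are standard pixel-wise metrics for binary vessel masks. The sketch below shows how they are usually computed from true/false positives and negatives. It is an illustration only, not the authors' evaluation code: the function name `vessel_metrics` and the toy masks are hypothetical, and the listing does not state evaluation details such as thresholding or field-of-view masking.

```python
def vessel_metrics(pred, truth):
    """Pixel-wise metrics for binary masks given as nested lists (1 = vessel)."""
    tp = tn = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # vessel pixel correctly detected
            elif not p and not t:
                tn += 1      # background pixel correctly rejected
            elif p and not t:
                fp += 1      # background pixel wrongly marked as vessel
            else:
                fn += 1      # vessel pixel missed
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        # Sensitivity: fraction of true vessel pixels that were found.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: fraction of background pixels left as background.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical 2x4 masks, for illustration only.
truth = [[1, 1, 0, 0],
         [0, 1, 0, 0]]
pred = [[1, 0, 0, 0],
        [0, 1, 1, 0]]
metrics = vessel_metrics(pred, truth)
```

Because vessel pixels are a small minority of a fundus image, accuracy and specificity tend to be high even for weak models, which is why sensitivity is usually read alongside them.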
Related papers
- Context-Aware Asymmetric Ensembling for Interpretable Retinopathy of Prematurity Screening via Active Query and Vascular Attention [1.8420107091891775]
Retinopathy of Prematurity (ROP) is among the major causes of preventable childhood blindness. Current deep learning models depend heavily on large private datasets and passive multimodal fusion. We propose the Context-Aware Asymmetric Ensemble Model (CAA Ensemble) that simulates clinical reasoning through two specialized streams.
arXiv Detail & Related papers (2026-02-05T02:06:26Z) - A Novel Attention-Augmented Wavelet YOLO System for Real-time Brain Vessel Segmentation on Transcranial Color-coded Doppler [49.03919553747297]
We propose an AI-powered, real-time Circle of Willis (CoW) auto-segmentation system capable of efficiently capturing cerebral arteries. No prior studies have explored AI-driven cerebrovascular segmentation using Transcranial Color-coded Doppler (TCCD). The proposed AAW-YOLO demonstrated strong performance in segmenting both ipsilateral and contralateral CoW vessels.
arXiv Detail & Related papers (2025-08-19T14:41:22Z) - A Semantic Segmentation Algorithm for Pleural Effusion Based on DBIF-AUNet [22.657295396752023]
Pleural effusion semantic segmentation can significantly enhance the accuracy and timeliness of clinical diagnosis and treatment. Existing methods often struggle with diverse image variations and complex edges. We propose the Dual-Branch Interactive Fusion Attention model (DBIF-AUNet) to address these challenges.
arXiv Detail & Related papers (2025-08-08T10:14:51Z) - Topology-Constrained Learning for Efficient Laparoscopic Liver Landmark Detection [46.2391319253146]
Liver landmarks provide crucial anatomical guidance to the surgeon during laparoscopic liver surgery. TopoNet is a novel topology-constrained learning framework for laparoscopic liver landmark detection. Our framework adopts a snake-CNN dual-path encoder to simultaneously capture detailed RGB texture information and depth-informed topological structures.
arXiv Detail & Related papers (2025-07-01T07:35:36Z) - MSCA-Net:Multi-Scale Context Aggregation Network for Infrared Small Target Detection [0.0]
This paper proposes a novel network architecture named MSCA-Net, which integrates three key components. MSEDA employs a multi-scale feature fusion attention mechanism to adaptively aggregate information across different scales. PCBAM captures the correlation between global and local features through a correlation matrix-based strategy.
arXiv Detail & Related papers (2025-03-21T14:42:31Z) - Reflecting Topology Consistency and Abnormality via Learnable Attentions for Airway Labeling [19.269806092729468]
Airway anatomical labeling is crucial for clinicians to identify and navigate complex bronchial structures during bronchoscopy. Previous methods are prone to generating inconsistent predictions. This paper proposes a novel method that enhances topological consistency and improves the detection of abnormal airway branches.
arXiv Detail & Related papers (2024-10-31T12:04:30Z) - KaLDeX: Kalman Filter based Linear Deformable Cross Attention for Retina Vessel Segmentation [46.57880203321858]
We propose a novel network (KaLDeX) for vascular segmentation leveraging a Kalman filter based linear deformable cross attention (LDCA) module.
Our approach is based on two key components: Kalman filter (KF) based linear deformable convolution (LD) and cross-attention (CA) modules.
The proposed method is evaluated on retinal fundus image datasets (DRIVE, CHASE_DB1, and STARE) as well as the 3mm and 6mm subsets of the OCTA-500 dataset.
arXiv Detail & Related papers (2024-10-28T16:00:42Z) - PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection [59.355022416218624]
The integration of point and voxel representations is becoming more common in LiDAR-based 3D object detection.
We propose a novel two-stage 3D object detector, called Point-Voxel Attention Fusion Network (PVAFN).
PVAFN uses a multi-pooling strategy to integrate both multi-scale and region-specific information effectively.
arXiv Detail & Related papers (2024-08-26T19:43:01Z) - DA-Flow: Dual Attention Normalizing Flow for Skeleton-based Video Anomaly Detection [52.74152717667157]
We propose a lightweight module called Dual Attention Module (DAM) for capturing cross-dimension interaction relationships in spatio-temporal skeletal data.
It employs a frame attention mechanism to identify the most significant frames and a skeleton attention mechanism to capture broader relationships across fixed partitions with minimal parameters and FLOPs.
arXiv Detail & Related papers (2024-06-05T06:18:03Z) - Hybrid Convolutional and Attention Network for Hyperspectral Image Denoising [54.110544509099526]
Hyperspectral image (HSI) denoising is critical for the effective analysis and interpretation of hyperspectral data.
We propose a hybrid convolution and attention network (HCANet) to enhance HSI denoising.
Experimental results on mainstream HSI datasets demonstrate the rationality and effectiveness of the proposed HCANet.
arXiv Detail & Related papers (2024-03-15T07:18:43Z) - MCA: Moment Channel Attention Networks [10.780493635885225]
We investigate the statistical moments of feature maps within a neural network.
Our findings highlight the critical role of high-order moments in enhancing model capacity.
We propose the Moment Channel Attention (MCA) framework, which efficiently incorporates multiple levels of moment-based information.
arXiv Detail & Related papers (2024-03-04T04:02:59Z) - FS-Net: Full Scale Network and Adaptive Threshold for Improving Extraction of Micro-Retinal Vessel Structures [4.507779218329283]
Segmenting retinal vessels presents unique challenges. Recent neural network approaches struggle to balance local and global properties. We propose a comprehensive micro-vessel extraction mechanism based on an encoder-decoder neural network architecture.
arXiv Detail & Related papers (2023-11-14T10:32:17Z) - MAF-Net: Multiple attention-guided fusion network for fundus vascular
image segmentation [1.3295074739915493]
We propose a multiple attention-guided fusion network (MAF-Net) to accurately detect blood vessels in retinal fundus images.
Traditional UNet-based models may lose partial information due to explicitly modeling long-distance dependencies.
We show that our method produces satisfactory results compared to some state-of-the-art methods.
arXiv Detail & Related papers (2023-05-05T15:22:20Z) - Fuzzy Attention Neural Network to Tackle Discontinuity in Airway
Segmentation [67.19443246236048]
Airway segmentation is crucial for the examination, diagnosis, and prognosis of lung diseases.
Some small-sized airway branches (e.g., bronchus and terminal bronchioles) significantly aggravate the difficulty of automatic segmentation.
This paper presents an efficient method for airway segmentation, comprising a novel fuzzy attention neural network and a comprehensive loss function.
arXiv Detail & Related papers (2022-09-05T16:38:13Z) - Multiple Time Series Fusion Based on LSTM An Application to CAP A Phase
Classification Using EEG [56.155331323304]
Deep learning based electroencephalogram channels' feature level fusion is carried out in this work.
Channel selection, fusion, and classification procedures were optimized by two optimization algorithms.
arXiv Detail & Related papers (2021-12-18T14:17:49Z) - A Discriminative Channel Diversification Network for Image
Classification [21.049734250642974]
We propose a light-weight and effective attention module, called channel diversification block, to enhance the global context.
Unlike other channel attention mechanisms, the proposed module focuses on the most discriminative features.
Experiments on CIFAR-10, SVHN, and Tiny-ImageNet datasets demonstrate that the proposed module improves the performance of the baseline networks by a margin of 3% on average.
arXiv Detail & Related papers (2021-12-10T23:00:53Z) - Real-time landmark detection for precise endoscopic submucosal
dissection via shape-aware relation network [51.44506007844284]
We propose a shape-aware relation network for accurate and real-time landmark detection in endoscopic submucosal dissection surgery.
We first devise an algorithm to automatically generate relation keypoint heatmaps, which intuitively represent the prior knowledge of spatial relations among landmarks.
We then develop two complementary regularization schemes to progressively incorporate the prior knowledge into the training process.
arXiv Detail & Related papers (2021-11-08T07:57:30Z) - Encoder Fusion Network with Co-Attention Embedding for Referring Image
Segmentation [87.01669173673288]
We propose an encoder fusion network (EFN), which transforms the visual encoder into a multi-modal feature learning network.
A co-attention mechanism is embedded in the EFN to realize the parallel update of multi-modal features.
Experimental results on four benchmark datasets demonstrate that the proposed approach achieves state-of-the-art performance without any post-processing.
arXiv Detail & Related papers (2021-05-05T02:27:25Z) - Multi-stage Attention ResU-Net for Semantic Segmentation of
Fine-Resolution Remote Sensing Images [9.398340832493457]
We propose a Linear Attention Mechanism (LAM) to address this issue.
LAM approximates dot-product attention at a lower computational cost.
We design a Multi-stage Attention ResU-Net for semantic segmentation from fine-resolution remote sensing images.
arXiv Detail & Related papers (2020-11-29T07:24:21Z) - Multi-Task Neural Networks with Spatial Activation for Retinal Vessel
Segmentation and Artery/Vein Classification [49.64863177155927]
We propose a multi-task deep neural network with a spatial activation mechanism to segment the full retinal vessel tree, arteries, and veins simultaneously.
The proposed network achieves pixel-wise accuracy of 95.70% for vessel segmentation, and A/V classification accuracy of 94.50%, which is the state-of-the-art performance for both tasks.
arXiv Detail & Related papers (2020-07-18T05:46:47Z) - Global Context-Aware Progressive Aggregation Network for Salient Object
Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features.
We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-03-02T04:26:10Z) - Hybrid Multiple Attention Network for Semantic Segmentation in Aerial
Images [24.35779077001839]
We propose a novel attention-based framework named Hybrid Multiple Attention Network (HMANet) to adaptively capture global correlations.
We introduce a simple yet effective region shuffle attention (RSA) module to reduce feature redundancy and improve the efficiency of the self-attention mechanism.
arXiv Detail & Related papers (2020-01-09T07:47:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.