Wavelet-based Global-Local Interaction Network with Cross-Attention for Multi-View Diabetic Retinopathy Detection
- URL: http://arxiv.org/abs/2503.19329v1
- Date: Tue, 25 Mar 2025 03:44:57 GMT
- Title: Wavelet-based Global-Local Interaction Network with Cross-Attention for Multi-View Diabetic Retinopathy Detection
- Authors: Yongting Hu, Yuxin Lin, Chengliang Liu, Xiaoling Luo, Xiaoyan Dou, Qihao Xu, Yong Xu,
- Abstract summary: We propose a novel method to overcome the challenges of difficult lesion information learning and inadequate multi-view fusion.<n> Specifically, we introduce a two-branch network to obtain both local lesion features and their global dependencies.<n>We present a cross-view fusion module to improve multi-view fusion and reduce redundancy.
- Score: 11.977013678444273
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-view diabetic retinopathy (DR) detection has recently emerged as a promising method to address the issue of incomplete lesions faced by single-view DR. However, it is still challenging due to the variable sizes and scattered locations of lesions. Furthermore, existing multi-view DR methods typically merge multiple views without considering the correlations and redundancies of lesion information across them. Therefore, we propose a novel method to overcome the challenges of difficult lesion information learning and inadequate multi-view fusion. Specifically, we introduce a two-branch network to obtain both local lesion features and their global dependencies. The high-frequency component of the wavelet transform is used to exploit lesion edge information, which is then enhanced by global semantic to facilitate difficult lesion learning. Additionally, we present a cross-view fusion module to improve multi-view fusion and reduce redundancy. Experimental results on large public datasets demonstrate the effectiveness of our method. The code is open sourced on https://github.com/HuYongting/WGLIN.
Related papers
- Robust Multimodal Learning for Ophthalmic Disease Grading via Disentangled Representation [30.697291934309206]
multimodal data is rare in real-world applications due to a lack of medical equipment and concerns about data privacy.<n>Traditional deep learning methods typically address these issues by learning representations in latent space.<n>Authors propose the Essence-Point and Disentangle Representation Learning (EDRL) strategy, which integrates a self-distillation mechanism into an end-to-end framework.
arXiv Detail & Related papers (2025-03-07T10:58:38Z) - Scale-aware Super-resolution Network with Dual Affinity Learning for
Lesion Segmentation from Medical Images [50.76668288066681]
We present a scale-aware super-resolution network to adaptively segment lesions of various sizes from low-resolution medical images.
Our proposed network achieved consistent improvements compared to other state-of-the-art methods.
arXiv Detail & Related papers (2023-05-30T14:25:55Z) - Multi-Level Global Context Cross Consistency Model for Semi-Supervised
Ultrasound Image Segmentation with Diffusion Model [0.0]
We propose a framework that uses images generated by a Latent Diffusion Model (LDM) as unlabeled images for semi-supervised learning.
Our approach enables the effective transfer of probability distribution knowledge to the segmentation network, resulting in improved segmentation accuracy.
arXiv Detail & Related papers (2023-05-16T14:08:24Z) - Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images [55.83984261827332]
In this paper, we propose a novel reliable multi-scale wavelet-enhanced transformer network.
We develop a novel segmentation backbone that integrates a wavelet-enhanced feature extractor network and a multi-scale transformer module.
Our proposed method achieves better segmentation accuracy with a high degree of reliability as compared to other state-of-the-art segmentation approaches.
arXiv Detail & Related papers (2022-12-01T07:32:56Z) - RTNet: Relation Transformer Network for Diabetic Retinopathy
Multi-lesion Segmentation [10.643730843316948]
We find that certain lesions are closed to specific vessels and present relative patterns to each other.
A self-attention transformer exploits global dependencies among lesion features, while a cross-attention transformer allows interactions between lesion and vessel features.
By integrating the above blocks of dual-branches, our network segments the four kinds of lesions simultaneously.
arXiv Detail & Related papers (2022-01-26T16:19:04Z) - FetReg: Placental Vessel Segmentation and Registration in Fetoscopy
Challenge Dataset [57.30136148318641]
Fetoscopy laser photocoagulation is a widely used procedure for the treatment of Twin-to-Twin Transfusion Syndrome (TTTS)
This may lead to increased procedural time and incomplete ablation, resulting in persistent TTTS.
Computer-assisted intervention may help overcome these challenges by expanding the fetoscopic field of view through video mosaicking and providing better visualization of the vessel network.
We present a large-scale multi-centre dataset for the development of generalized and robust semantic segmentation and video mosaicking algorithms for the fetal environment with a focus on creating drift-free mosaics from long duration fetoscopy videos.
arXiv Detail & Related papers (2021-06-10T17:14:27Z) - Cross-Modality Brain Tumor Segmentation via Bidirectional
Global-to-Local Unsupervised Domain Adaptation [61.01704175938995]
In this paper, we propose a novel Bidirectional Global-to-Local (BiGL) adaptation framework under a UDA scheme.
Specifically, a bidirectional image synthesis and segmentation module is proposed to segment the brain tumor.
The proposed method outperforms several state-of-the-art unsupervised domain adaptation methods by a large margin.
arXiv Detail & Related papers (2021-05-17T10:11:45Z) - Universal Lesion Detection by Learning from Multiple Heterogeneously
Labeled Datasets [23.471903581482668]
We learn a multi-head multi-task lesion detector using all datasets and generate lesion proposals on DeepLesion.
We discover suspicious but unannotated lesions using knowledge transfer from single-type lesion detectors.
Our method outperforms the current state-of-the-art approach by 29% in the metric of average sensitivity.
arXiv Detail & Related papers (2020-05-28T02:56:00Z) - Pseudo-Labeling for Small Lesion Detection on Diabetic Retinopathy
Images [12.49381528673824]
Diabetic retinopathy (DR) is a primary cause of blindness in working-age people worldwide.
About 3 to 4 million people with diabetes become blind because of DR every year.
Diagnosis of DR through color fundus images is a common approach to mitigate such problem.
arXiv Detail & Related papers (2020-03-26T17:13:48Z) - Robust Multimodal Brain Tumor Segmentation via Feature Disentanglement
and Gated Fusion [71.87627318863612]
We propose a novel multimodal segmentation framework which is robust to the absence of imaging modalities.
Our network uses feature disentanglement to decompose the input modalities into the modality-specific appearance code.
We validate our method on the important yet challenging multimodal brain tumor segmentation task with the BRATS challenge dataset.
arXiv Detail & Related papers (2020-02-22T14:32:04Z) - MS-Net: Multi-Site Network for Improving Prostate Segmentation with
Heterogeneous MRI Data [75.73881040581767]
We propose a novel multi-site network (MS-Net) for improving prostate segmentation by learning robust representations.
Our MS-Net improves the performance across all datasets consistently, and outperforms state-of-the-art methods for multi-site learning.
arXiv Detail & Related papers (2020-02-09T14:11:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.