RWKV-UNet: Improving UNet with Long-Range Cooperation for Effective Medical Image Segmentation
- URL: http://arxiv.org/abs/2501.08458v1
- Date: Tue, 14 Jan 2025 22:03:00 GMT
- Title: RWKV-UNet: Improving UNet with Long-Range Cooperation for Effective Medical Image Segmentation
- Authors: Juntao Jiang, Jiangning Zhang, Weixuan Liu, Muxuan Gao, Xiaobin Hu, Xiaoxiao Yan, Feiyue Huang, Yong Liu
- Abstract summary: We propose RWKV-UNet, a novel model that integrates the RWKV structure into the U-Net architecture. This integration enhances the model's ability to capture long-range dependencies and improve contextual understanding. We show that RWKV-UNet achieves state-of-the-art performance on various types of medical image segmentation.
- Score: 39.11918061481855
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In recent years, there have been significant advancements in deep learning for medical image analysis, especially with convolutional neural networks (CNNs) and transformer models. However, CNNs face limitations in capturing long-range dependencies, while transformers suffer from high computational complexity. To address this, we propose RWKV-UNet, a novel model that integrates the RWKV (Receptance Weighted Key Value) structure into the U-Net architecture. This integration enhances the model's ability to capture long-range dependencies and improve contextual understanding, which is crucial for accurate medical image segmentation. We build a strong encoder with our inverted residual RWKV (IR-RWKV) blocks, which combine CNNs and RWKVs. We also propose a Cross-Channel Mix (CCM) module to improve skip connections with multi-scale feature fusion, achieving global channel information integration. Experiments on benchmark datasets, including Synapse, ACDC, BUSI, CVC-ClinicDB, CVC-ColonDB, Kvasir-SEG, ISIC 2017, and GLAS, show that RWKV-UNet achieves state-of-the-art performance on various types of medical image segmentation. Additionally, smaller variants, RWKV-UNet-S and RWKV-UNet-T, balance accuracy and computational efficiency, making them suitable for broader clinical applications.
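The abstract contrasts the high computational cost of transformers with RWKV's ability to capture long-range dependencies. As a rough illustration only, the sketch below shows the single-channel WKV recurrence from the original sequence-model RWKV formulation, computed once in linear time with a running state and once by explicit quadratic summation. It is not the IR-RWKV block or any code from this paper. The function names, the scalar decay `w`, and the current-token bonus `u` are assumptions chosen for the example, and the max-shift trick normally used for numerical stability is left out for brevity.

```python
import math


def wkv_recurrent(k, v, w, u):
    """Linear-time WKV for one channel (illustrative sketch).

    k, v: lists of floats of equal length T (keys and values).
    w:    non-negative decay rate applied to past tokens.
    u:    bonus added to the key of the current token.
    """
    num, den = 0.0, 0.0  # running weighted sum of past values / past weights
    out = []
    for kt, vt in zip(k, v):
        cur = math.exp(u + kt)  # weight of the current token
        out.append((num + cur * vt) / (den + cur))
        # Decay the accumulated past by exp(-w), then absorb the current token.
        num = math.exp(-w) * num + math.exp(kt) * vt
        den = math.exp(-w) * den + math.exp(kt)
    return out


def wkv_direct(k, v, w, u):
    """Quadratic-time reference that evaluates the same sums explicitly."""
    out = []
    for t in range(len(k)):
        num = sum(math.exp(-(t - 1 - i) * w + k[i]) * v[i] for i in range(t))
        den = sum(math.exp(-(t - 1 - i) * w + k[i]) for i in range(t))
        cur = math.exp(u + k[t])
        out.append((num + cur * v[t]) / (den + cur))
    return out
```

Both functions return the same sequence up to floating-point error. The recurrent form keeps only two scalars of state per channel, which is why its cost grows linearly with sequence length rather than quadratically.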
Related papers
- Exploring Real&Synthetic Dataset and Linear Attention in Image Restoration [47.26304397935705]
Image restoration aims to recover high-quality images from degraded inputs. Existing methods lack a unified training benchmark for iterations and configurations. We introduce a large-scale IR dataset called ReSyn, which employs a novel image filtering method based on image complexity.
arXiv Detail & Related papers (2024-12-05T02:11:51Z) - TBConvL-Net: A Hybrid Deep Learning Architecture for Robust Medical Image Segmentation [6.013821375459473]
We introduce a novel deep learning architecture for medical image segmentation.
Our proposed model shows consistent improvement over the state of the art on ten publicly available datasets.
arXiv Detail & Related papers (2024-09-05T09:14:03Z) - CSWin-UNet: Transformer UNet with Cross-Shaped Windows for Medical Image Segmentation [22.645013853519]
CSWin-UNet is a novel U-shaped segmentation method that incorporates the CSWin self-attention mechanism into the UNet.
Our empirical evaluations on diverse datasets, including synapse multi-organ CT, cardiac MRI, and skin lesions, demonstrate that CSWin-UNet maintains low model complexity while delivering high segmentation accuracy.
arXiv Detail & Related papers (2024-07-25T14:25:17Z) - Restore-RWKV: Efficient and Effective Medical Image Restoration with RWKV [15.585071228529731]
We propose Restore-RWKV, the first RWKV-based model for medical image restoration. We present a recurrent WKV (Re-WKV) attention mechanism that captures global dependencies with linear computational complexity. Experiments demonstrate that the resulting Restore-RWKV achieves SOTA performance across a range of medical image restoration tasks.
arXiv Detail & Related papers (2024-07-14T12:22:05Z) - Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures [99.20299078655376]
This paper introduces Vision-RWKV, a model adapted from the RWKV model used in the NLP field.
Our model is designed to efficiently handle sparse inputs and demonstrate robust global processing capabilities.
Our evaluations demonstrate that VRWKV surpasses ViT's performance in image classification and has significantly faster speeds and lower memory usage.
arXiv Detail & Related papers (2024-03-04T18:46:20Z) - Transformer-CNN Fused Architecture for Enhanced Skin Lesion Segmentation [0.0]
Convolutional neural networks (CNNs) have greatly advanced medical image segmentation.
CNNs have been found to struggle with learning long-range dependencies and capturing global context.
We propose a hybrid architecture that combines the ability of transformers to capture global dependencies with the ability of CNNs to capture low-level spatial details.
arXiv Detail & Related papers (2024-01-10T18:36:14Z) - BRAU-Net++: U-Shaped Hybrid CNN-Transformer Network for Medical Image Segmentation [11.986549780782724]
We propose a hybrid yet effective CNN-Transformer network, named BRAU-Net++, for an accurate medical image segmentation task.
Specifically, BRAU-Net++ uses bi-level routing attention as the core building block to design our u-shaped encoder-decoder structure.
Our proposed approach surpasses other state-of-the-art methods including its baseline: BRAU-Net.
arXiv Detail & Related papers (2024-01-01T10:49:09Z) - Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework for semi-supervised medical image segmentation (DEC-Seg).
arXiv Detail & Related papers (2023-12-26T12:56:31Z) - CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation [95.51455777713092]
Convolutional neural networks (CNNs) have been the de facto standard for 3D medical image segmentation.
We propose a novel framework that efficiently bridges a Convolutional neural network and a Transformer (CoTr) for accurate 3D medical image segmentation.
arXiv Detail & Related papers (2021-03-04T13:34:22Z) - TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation [78.01570371790669]
Medical image segmentation is an essential prerequisite for developing healthcare systems.
On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard.
We propose TransUNet, which combines the merits of Transformers and U-Net, as a strong alternative for medical image segmentation.
arXiv Detail & Related papers (2021-02-08T16:10:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.