U-RWKV: Lightweight medical image segmentation with direction-adaptive RWKV
- URL: http://arxiv.org/abs/2507.11415v1
- Date: Tue, 15 Jul 2025 15:40:17 GMT
- Title: U-RWKV: Lightweight medical image segmentation with direction-adaptive RWKV
- Authors: Hongbo Ye, Fenghe Tang, Peiang Zhao, Zhen Huang, Dexin Zhao, Minghao Bian, S. Kevin Zhou
- Abstract summary: We propose U-RWKV, a novel framework for efficient long-range modeling at O(N) computational cost. The framework introduces two key innovations: the Direction-Adaptive RWKV Module and the Stage-Adaptive Squeeze-and-Excitation Module. Experiments demonstrate that U-RWKV achieves state-of-the-art segmentation performance with high computational efficiency.
- Score: 13.528706926224114
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Achieving equity in healthcare accessibility requires lightweight yet high-performance solutions for medical image segmentation, particularly in resource-limited settings. Existing methods like U-Net and its variants often suffer from limited global Effective Receptive Fields (ERFs), hindering their ability to capture long-range dependencies. To address this, we propose U-RWKV, a novel framework leveraging the Receptance Weighted Key Value (RWKV) architecture, which achieves efficient long-range modeling at O(N) computational cost. The framework introduces two key innovations: the Direction-Adaptive RWKV Module (DARM) and the Stage-Adaptive Squeeze-and-Excitation Module (SASE). DARM employs Dual-RWKV and QuadScan mechanisms to aggregate contextual cues across images, mitigating directional bias while preserving global context and maintaining high computational efficiency. SASE dynamically adapts its architecture to different feature extraction stages, balancing high-resolution detail preservation and semantic relationship capture. Experiments demonstrate that U-RWKV achieves state-of-the-art segmentation performance with high computational efficiency, offering a practical solution for democratizing advanced medical imaging technologies in resource-constrained environments. The code is available at https://github.com/hbyecoding/U-RWKV.
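The abstract sketches two architectural ideas: DARM aggregates context along multiple scan directions with RWKV-style sequence mixing, and SASE applies squeeze-and-excitation that adapts across encoder stages. The authors' implementation lives in the linked repository; the PyTorch sketch below is only an illustration of one plausible reading of those ideas, with a placeholder mixer standing in for the RWKV block, and the module names, the four-direction flattening order, and the reduction ratio all assumed rather than taken from the paper.

```python
# Illustrative sketch only -- not the authors' U-RWKV implementation.
# A generic linear layer stands in for the RWKV sequence mixer; "QuadScan"
# here means flattening the feature map along four scan directions and
# fusing the results, which is one plausible reading of the abstract.
import torch
import torch.nn as nn


class PlaceholderMixer(nn.Module):
    """Stand-in for an O(N) sequence mixer such as an RWKV block (assumption)."""
    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):          # x: (B, N, C)
        return self.proj(x)


class QuadScanBlock(nn.Module):
    """Run a sequence mixer over four flattening directions and average them."""
    def __init__(self, dim: int):
        super().__init__()
        self.mixers = nn.ModuleList([PlaceholderMixer(dim) for _ in range(4)])

    def forward(self, x):          # x: (B, C, H, W)
        b, c, h, w = x.shape
        outs = []
        for i, mixer in enumerate(self.mixers):
            t = x.transpose(2, 3) if i % 2 else x          # row-major vs column-major scan
            t = torch.flip(t, dims=[-1]) if i >= 2 else t  # forward vs reverse scan
            seq = t.flatten(2).transpose(1, 2)             # (B, N, C)
            y = mixer(seq).transpose(1, 2).reshape(b, c, *t.shape[2:])
            y = torch.flip(y, dims=[-1]) if i >= 2 else y
            y = y.transpose(2, 3) if i % 2 else y
            outs.append(y)
        return torch.stack(outs).mean(dim=0)


class SEBlock(nn.Module):
    """Standard squeeze-and-excitation block; the stage-adaptive variant in the
    paper reportedly changes its design per encoder stage (details assumed)."""
    def __init__(self, dim: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(dim, dim // reduction), nn.ReLU(inplace=True),
            nn.Linear(dim // reduction, dim), nn.Sigmoid(),
        )

    def forward(self, x):          # x: (B, C, H, W)
        w = self.fc(x).unsqueeze(-1).unsqueeze(-1)
        return x * w


if __name__ == "__main__":
    feat = torch.randn(2, 32, 64, 64)
    out = SEBlock(32)(QuadScanBlock(32)(feat))
    print(out.shape)  # torch.Size([2, 32, 64, 64])
```

Averaging the four directional passes is one simple fusion choice; the paper's DARM may fuse directions differently.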
Related papers
- URWKV: Unified RWKV Model with Multi-state Perspective for Low-light Image Restoration [22.746234919635018]
We introduce a Unified Receptance Weighted Key Value (URWKV) model with a multi-state perspective. We customize the core URWKV block to perceive and analyze complex degradations by leveraging multiple intra- and inter-stage states. In comparison to state-of-the-art models, our URWKV model achieves superior performance on various benchmarks.
arXiv Detail & Related papers (2025-05-29T04:17:09Z)
- RWKV-UNet: Improving UNet with Long-Range Cooperation for Effective Medical Image Segmentation [39.11918061481855]
We propose RWKV-UNet, a novel model that integrates the RWKV structure into the U-Net architecture. This integration enhances the model's ability to capture long-range dependencies and improves contextual understanding. We show that RWKV-UNet achieves state-of-the-art performance on various types of medical image segmentation.
arXiv Detail & Related papers (2025-01-14T22:03:00Z)
- Exploring Real&Synthetic Dataset and Linear Attention in Image Restoration [47.26304397935705]
Image restoration aims to recover high-quality images from degraded inputs. Existing methods lack a unified training benchmark for iterations and configurations. We introduce a large-scale IR dataset called ReSyn, which employs a novel image filtering method based on image complexity.
arXiv Detail & Related papers (2024-12-05T02:11:51Z)
- Restore-RWKV: Efficient and Effective Medical Image Restoration with RWKV [15.585071228529731]
We propose Restore-RWKV, the first RWKV-based model for medical image restoration. We present a recurrent WKV (Re-WKV) attention mechanism that captures global dependencies with linear computational complexity (a background sketch of the underlying WKV recurrence follows this entry). Experiments demonstrate that the resulting Restore-RWKV achieves SOTA performance across a range of medical image restoration tasks.
arXiv Detail & Related papers (2024-07-14T12:22:05Z)
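Restore-RWKV, like U-RWKV above, leans on the linear-complexity token mixing of RWKV. As background only, the sketch below implements the standard RWKV-4 "WKV" recurrence in its sequential form, where a per-channel running numerator and denominator give O(N) time and constant state in the sequence length; the Re-WKV and Dual-RWKV variants used in these papers differ, and the numerical-stability tricks used in practice are omitted.

```python
# Background sketch: the standard RWKV(-4) WKV recurrence in its O(N)
# sequential form. Not the exact Re-WKV / Dual-RWKV mechanisms above.
import torch


def wkv_recurrence(k, v, w, u):
    """k, v: (T, C) key/value sequences; w: (C,) positive decay; u: (C,) bonus.

    Returns a (T, C) output. One pass over the sequence with a constant-size
    state per channel, hence linear time and constant memory in T.
    """
    T, C = k.shape
    num = torch.zeros(C)            # running sum of exp(k_i) * v_i, decayed
    den = torch.zeros(C)            # running sum of exp(k_i), decayed
    decay = torch.exp(-w)           # per-channel decay factor in (0, 1)
    out = torch.empty(T, C)
    for t in range(T):
        e_cur = torch.exp(u + k[t])                 # current token gets a "bonus"
        out[t] = (num + e_cur * v[t]) / (den + e_cur)
        e_k = torch.exp(k[t])
        num = decay * num + e_k * v[t]              # update decayed state
        den = decay * den + e_k
    return out


if __name__ == "__main__":
    T, C = 16, 8
    y = wkv_recurrence(torch.randn(T, C), torch.randn(T, C),
                       torch.rand(C), torch.zeros(C))
    print(y.shape)  # torch.Size([16, 8])
```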
- LeRF: Learning Resampling Function for Adaptive and Efficient Image Interpolation [64.34935748707673]
Recent deep neural networks (DNNs) have made impressive progress in performance by introducing learned data priors.
We propose a novel method of Learning Resampling (termed LeRF) which takes advantage of both the structural priors learned by DNNs and the locally continuous assumption.
LeRF assigns spatially varying resampling functions to input image pixels and learns to predict the shapes of these resampling functions with a neural network.
arXiv Detail & Related papers (2024-07-13T16:09:45Z)
- Diffusion-RWKV: Scaling RWKV-Like Architectures for Diffusion Models [33.372947082734946]
This paper introduces a series of architectures adapted from the RWKV model used in NLP, with the requisite modifications for diffusion models applied to image generation tasks.
Our model is designed to efficiently handle patchified inputs in a sequence with extra conditions, while also scaling up effectively.
Its distinctive advantage is its reduced spatial aggregation complexity, which makes it well suited to processing high-resolution images.
arXiv Detail & Related papers (2024-04-06T02:54:35Z)
- Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures [96.00848293994463]
This paper introduces Vision-RWKV, a model adapted from the RWKV model used in the NLP field. Our model is designed to efficiently handle sparse inputs and demonstrate robust global processing capabilities. Our evaluations demonstrate that VRWKV surpasses ViT's performance in image classification and has significantly faster speeds and lower memory usage.
arXiv Detail & Related papers (2024-03-04T18:46:20Z)
- Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach [58.57026686186709]
We introduce the Convolutional Transformer layer (ConvFormer) and propose a ConvFormer-based Super-Resolution network (CFSR).
CFSR inherits the advantages of both convolution-based and transformer-based approaches.
Experiments demonstrate that CFSR strikes an optimal balance between computational cost and performance.
arXiv Detail & Related papers (2024-01-11T03:08:00Z)
- Image-specific Convolutional Kernel Modulation for Single Image Super-resolution [85.09413241502209]
We propose a novel image-specific convolutional kernel modulation (IKM) method.
We exploit the global contextual information of the image or feature maps to generate attention weights that adaptively modulate the convolutional kernels (a rough sketch of this idea follows this entry).
Experiments on single image super-resolution show that the proposed method achieves superior performance over state-of-the-art methods.
arXiv Detail & Related papers (2021-11-16T11:05:10Z)
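The IKM entry above describes predicting attention weights from global context and using them to modulate convolutional kernels. The sketch below is an illustrative reading of that idea only, not the cited paper's design: global average pooling feeds a small MLP whose output rescales each output-channel kernel before the convolution is applied.

```python
# Illustrative sketch of context-driven kernel modulation; the actual IKM
# design in the cited paper may differ substantially.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModulatedConv(nn.Module):
    """3x3 convolution whose kernels are rescaled per output channel by
    attention weights predicted from the input's global context."""

    def __init__(self, in_ch: int, out_ch: int, reduction: int = 4):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, 3, 3) * 0.05)
        self.bias = nn.Parameter(torch.zeros(out_ch))
        self.attn = nn.Sequential(                      # global context -> per-kernel weight
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_ch, in_ch // reduction), nn.ReLU(inplace=True),
            nn.Linear(in_ch // reduction, out_ch), nn.Sigmoid(),
        )

    def forward(self, x):                               # x: (B, C_in, H, W)
        b = x.shape[0]
        a = self.attn(x)                                 # (B, C_out)
        # Modulate the shared kernels per sample: (B, C_out, C_in, 3, 3)
        w = self.weight.unsqueeze(0) * a.view(b, -1, 1, 1, 1)
        # Grouped-conv trick to apply a different kernel set to each sample.
        out = F.conv2d(x.reshape(1, -1, *x.shape[2:]),
                       w.reshape(-1, *self.weight.shape[1:]),
                       padding=1, groups=b)
        out = out.reshape(b, -1, *out.shape[2:])
        return out + self.bias.view(1, -1, 1, 1)


if __name__ == "__main__":
    y = ModulatedConv(16, 32)(torch.randn(2, 16, 48, 48))
    print(y.shape)  # torch.Size([2, 32, 48, 48])
```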
- Video Face Super-Resolution with Motion-Adaptive Feedback Cell [90.73821618795512]
Video super-resolution (VSR) methods have recently achieved remarkable success owing to the development of deep convolutional neural networks (CNNs).
In this paper, we propose a Motion-Adaptive Feedback Cell (MAFC), a simple but effective block that efficiently captures motion compensation and feeds it back to the network in an adaptive way.
arXiv Detail & Related papers (2020-02-15T13:14:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and accepts no responsibility for any consequences arising from its use.