Med-URWKV: Pure RWKV With ImageNet Pre-training For Medical Image Segmentation
- URL: http://arxiv.org/abs/2506.10858v1
- Date: Thu, 12 Jun 2025 16:19:18 GMT
- Title: Med-URWKV: Pure RWKV With ImageNet Pre-training For Medical Image Segmentation
- Authors: Zhenhuan Zhou,
- Abstract summary: Medical image segmentation is a fundamental and key technology in computer-aided diagnosis and treatment.<n>In this paper, we propose Med-URWKV, a pure RWKV-based architecture built upon the U-Net framework.<n>We show that Med-URWKV achieves comparable or even superior segmentation performance compared to other carefully optimized RWKV models trained from scratch.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Medical image segmentation is a fundamental and key technology in computer-aided diagnosis and treatment. Previous methods can be broadly classified into three categories: convolutional neural network (CNN) based, Transformer based, and hybrid architectures that combine both. However, each of them has its own limitations, such as restricted receptive fields in CNNs or the computational overhead caused by the quadratic complexity of Transformers. Recently, the Receptance Weighted Key Value (RWKV) model has emerged as a promising alternative for various vision tasks, offering strong long-range modeling capabilities with linear computational complexity. Some studies have also adapted RWKV to medical image segmentation tasks, achieving competitive performance. However, most of these studies focus on modifications to the Vision-RWKV (VRWKV) mechanism and train models from scratch, without exploring the potential advantages of leveraging pre-trained VRWKV models for medical image segmentation tasks. In this paper, we propose Med-URWKV, a pure RWKV-based architecture built upon the U-Net framework, which incorporates ImageNet-based pretraining to further explore the potential of RWKV in medical image segmentation tasks. To the best of our knowledge, Med-URWKV is the first pure RWKV segmentation model in the medical field that can directly reuse a large-scale pre-trained VRWKV encoder. Experimental results on seven datasets demonstrate that Med-URWKV achieves comparable or even superior segmentation performance compared to other carefully optimized RWKV models trained from scratch. This validates the effectiveness of using a pretrained VRWKV encoder in enhancing model performance. The codes will be released.
Related papers
- RWKV-UNet: Improving UNet with Long-Range Cooperation for Effective Medical Image Segmentation [39.11918061481855]
We propose RWKV-UNet, a novel model that integrates the RWKV structure into the U-Net architecture.<n>This integration enhances the model's ability to capture long-range dependencies and improve contextual understanding.<n>We show that RWKV-UNet achieves state-of-the-art performance on various types of medical image segmentation.
arXiv Detail & Related papers (2025-01-14T22:03:00Z) - Efficient MedSAMs: Segment Anything in Medical Images on Laptop [69.28565867103542]
We organized the first international competition dedicated to promptable medical image segmentation.<n>The top teams developed lightweight segmentation foundation models and implemented an efficient inference pipeline.<n>The best-performing algorithms have been incorporated into the open-source software with a user-friendly interface to facilitate clinical adoption.
arXiv Detail & Related papers (2024-12-20T17:33:35Z) - Embeddings are all you need! Achieving High Performance Medical Image Classification through Training-Free Embedding Analysis [0.0]
Developing artificial intelligence (AI) and machine learning (ML) models for medical imaging typically involves extensive training and testing on large datasets.<n>We investigated the feasibility of replacing conventional training procedures with an embedding-based approach.
arXiv Detail & Related papers (2024-12-12T16:59:37Z) - Evaluating Pre-trained Convolutional Neural Networks and Foundation Models as Feature Extractors for Content-based Medical Image Retrieval [0.37478492878307323]
Content-based medical image retrieval (CBMIR) depends on image features, which can be extracted automatically or semi-automatically.<n>In this study, we used several pre-trained feature extractors from well-known pre-trained convolutional neural networks (CNNs) and pre-trained foundation models.<n>Our results show that, overall, for the 2D datasets, foundation models deliver superior performance by a large margin compared to CNNs.<n>Our findings confirm that while using larger image sizes (especially for 2D datasets) yields slightly better performance, competitive CBMIR performance can still be achieved even with smaller image
arXiv Detail & Related papers (2024-09-14T13:07:30Z) - Restore-RWKV: Efficient and Effective Medical Image Restoration with RWKV [15.585071228529731]
We propose Restore-RWKV, the first RWKV-based model for medical image restoration.<n>We present a recurrent WKV (Re-WKV) attention mechanism that captures global dependencies with linear computational complexity.<n>Experiments demonstrate that the resulting Restore-RWKV achieves SOTA performance across a range of medical image restoration tasks.
arXiv Detail & Related papers (2024-07-14T12:22:05Z) - Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures [96.00848293994463]
This paper introduces Vision-RWKV, a model adapted from the RWKV model used in the NLP field.<n>Our model is designed to efficiently handle sparse inputs and demonstrate robust global processing capabilities.<n>Our evaluations demonstrate that VRWKV surpasses ViT's performance in image classification and has significantly faster speeds and lower memory usage.
arXiv Detail & Related papers (2024-03-04T18:46:20Z) - Disruptive Autoencoders: Leveraging Low-level features for 3D Medical
Image Pre-training [51.16994853817024]
This work focuses on designing an effective pre-training framework for 3D radiology images.
We introduce Disruptive Autoencoders, a pre-training framework that attempts to reconstruct the original image from disruptions created by a combination of local masking and low-level perturbations.
The proposed pre-training framework is tested across multiple downstream tasks and achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-07-31T17:59:42Z) - Pick the Best Pre-trained Model: Towards Transferability Estimation for
Medical Image Segmentation [20.03177073703528]
Transfer learning is a critical technique in training deep neural networks for the challenging medical image segmentation task.
We propose a new Transferability Estimation (TE) method for medical image segmentation.
Our method surpasses all current algorithms for transferability estimation in medical image segmentation.
arXiv Detail & Related papers (2023-07-22T01:58:18Z) - LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical
Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z) - MedSegDiff-V2: Diffusion based Medical Image Segmentation with
Transformer [53.575573940055335]
We propose a novel Transformer-based Diffusion framework, called MedSegDiff-V2.
We verify its effectiveness on 20 medical image segmentation tasks with different image modalities.
arXiv Detail & Related papers (2023-01-19T03:42:36Z) - Attentive Symmetric Autoencoder for Brain MRI Segmentation [56.02577247523737]
We propose a novel Attentive Symmetric Auto-encoder based on Vision Transformer (ViT) for 3D brain MRI segmentation tasks.
In the pre-training stage, the proposed auto-encoder pays more attention to reconstruct the informative patches according to the gradient metrics.
Experimental results show that our proposed attentive symmetric auto-encoder outperforms the state-of-the-art self-supervised learning methods and medical image segmentation models.
arXiv Detail & Related papers (2022-09-19T09:43:19Z) - Retinopathy of Prematurity Stage Diagnosis Using Object Segmentation and
Convolutional Neural Networks [68.96150598294072]
Retinopathy of Prematurity (ROP) is an eye disorder primarily affecting premature infants with lower weights.
It causes proliferation of vessels in the retina and could result in vision loss and, eventually, retinal detachment, leading to blindness.
In recent years, there has been a significant effort to automate the diagnosis using deep learning.
This paper builds upon the success of previous models and develops a novel architecture, which combines object segmentation and convolutional neural networks (CNN)
Our proposed system first trains an object segmentation model to identify the demarcation line at a pixel level and adds the resulting mask as an additional "color" channel in
arXiv Detail & Related papers (2020-04-03T14:07:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.