Related papers: MedVSR: Medical Video Super-Resolution with Cross State-Space Propagation

MedVSR: Medical Video Super-Resolution with Cross State-Space Propagation

URL: http://arxiv.org/abs/2509.21265v1
Date: Thu, 25 Sep 2025 14:56:59 GMT
Title: MedVSR: Medical Video Super-Resolution with Cross State-Space Propagation
Authors: Xinyu Liu, Guolei Sun, Cheng Wang, Yixuan Yuan, Ender Konukoglu,
Abstract summary: Low-resolution (LR) medical videos present unique challenges for video super-resolution (VSR) models.<n>We propose MedVSR, a tailored framework for medical VSR.<n>We show that MedVSR significantly outperforms existing VSR models in reconstruction performance and efficiency.
Score: 63.38824041721275
License: http://creativecommons.org/licenses/by/4.0/
Abstract: High-resolution (HR) medical videos are vital for accurate diagnosis, yet are hard to acquire due to hardware limitations and physiological constraints. Clinically, the collected low-resolution (LR) medical videos present unique challenges for video super-resolution (VSR) models, including camera shake, noise, and abrupt frame transitions, which result in significant optical flow errors and alignment difficulties. Additionally, tissues and organs exhibit continuous and nuanced structures, but current VSR models are prone to introducing artifacts and distorted features that can mislead doctors. To this end, we propose MedVSR, a tailored framework for medical VSR. It first employs Cross State-Space Propagation (CSSP) to address the imprecise alignment by projecting distant frames as control matrices within state-space models, enabling the selective propagation of consistent and informative features to neighboring frames for effective alignment. Moreover, we design an Inner State-Space Reconstruction (ISSR) module that enhances tissue structures and reduces artifacts with joint long-range spatial feature learning and large-kernel short-range information aggregation. Experiments across four datasets in diverse medical scenarios, including endoscopy and cataract surgeries, show that MedVSR significantly outperforms existing VSR models in reconstruction performance and efficiency. Code released at https://github.com/CUHK-AIM-Group/MedVSR.

Related papers

Rethinking Diffusion Model-Based Video Super-Resolution: Leveraging Dense Guidance from Aligned Features [51.5076190312734]
Video Super-Resolution approaches suffer from error accumulation, spatial artifacts, and a trade-off between perceptual quality and fidelity.<n>We propose a novelly Guided diffusion model with Aligned Features for Video Super-Resolution (DGAF-VSR)<n>Experiments on synthetic and real-world datasets demonstrate that DGAF-VSR surpasses state-of-the-art methods in key aspects of VSR.
arXiv Detail & Related papers (2025-11-21T03:40:45Z)
HRVVS: A High-resolution Video Vasculature Segmentation Network via Hierarchical Autoregressive Residual Priors [18.951871257229055]
We introduce a high quality frame-by-frame annotated hepatic vasculature dataset containing 35 long hepatectomy videos and 11442 high-resolution frames.<n>We propose a novel high-resolution video vasculature segmentation network, dubbed as HRVVS.<n>Our proposed HRVVS significantly outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2025-07-30T09:57:38Z)
FaRMamba: Frequency-based learning and Reconstruction aided Mamba for Medical Segmentation [3.5790602918760586]
Vision Mamba employs one-dimensional causal state-space recurrence to efficiently model global dependencies.<n>Its patch tokenization and 1D serialization disrupt local pixel adjacency and impose a low-pass filtering effect.<n>We propose FaRMamba, a novel extension that explicitly addresses LHICD and 2D-SSD through two complementary modules.
arXiv Detail & Related papers (2025-07-26T20:41:53Z)
UltraVSR: Achieving Ultra-Realistic Video Super-Resolution with Efficient One-Step Diffusion Space [46.43409853027655]
Diffusion models have shown great potential in generating realistic image detail.<n>Adapting these models to video super-resolution (VSR) remains challenging due to their inherentity and lack of temporal modeling.<n>We propose UltraVSR, a novel framework that enables ultra-realistic and temporally-coherent VSR through an efficient one-step diffusion space.
arXiv Detail & Related papers (2025-05-26T13:19:27Z)
Decoupling Multi-Contrast Super-Resolution: Pairing Unpaired Synthesis with Implicit Representations [6.255537948555454]
Multi-Contrast Super-Resolution techniques can boost the quality of their low-resolution counterparts.<n>Existing MCSR methods often assume fixed resolution settings and all require large, perfectly paired training datasets.<n>We propose a novel Modular Multi-Contrast Super-Resolution framework that eliminates the need for paired training data and supports arbitrary upscaling.
arXiv Detail & Related papers (2025-05-09T07:48:52Z)
RWKV-UNet: Improving UNet with Long-Range Cooperation for Effective Medical Image Segmentation [70.79072961974141]
We propose RWKV-UNet, a novel model that integrates the RWKV structure into the U-Net architecture.<n>This integration enhances the model's ability to capture long-range dependencies and to improve contextual understanding.<n> Experiments on 11 benchmark datasets show that the RWKV-UNet achieves state-of-the-art performance on various types of medical image segmentation tasks.
arXiv Detail & Related papers (2025-01-14T22:03:00Z)
A Unified Model for Compressed Sensing MRI Across Undersampling Patterns [69.19631302047569]
We propose a unified MRI reconstruction model robust to various measurement undersampling patterns and image resolutions.<n>Our model improves SSIM by 11% and PSNR by 4 dB over a state-of-the-art CNN (End-to-End VarNet) with 600$times$ faster inference than diffusion methods.
arXiv Detail & Related papers (2024-10-05T20:03:57Z)
FLex: Joint Pose and Dynamic Radiance Fields Optimization for Stereo Endoscopic Videos [79.50191812646125]
Reconstruction of endoscopic scenes is an important asset for various medical applications, from post-surgery analysis to educational training. We adress the challenging setup of a moving endoscope within a highly dynamic environment of deforming tissue. We propose an implicit scene separation into multiple overlapping 4D neural radiance fields (NeRFs) and a progressive optimization scheme jointly optimizing for reconstruction and camera poses from scratch. This improves the ease-of-use and allows to scale reconstruction capabilities in time to process surgical videos of 5,000 frames and more; an improvement of more than ten times compared to the state of the art while being agnostic to external tracking information
arXiv Detail & Related papers (2024-03-18T19:13:02Z)
K-Space-Aware Cross-Modality Score for Synthesized Neuroimage Quality Assessment [71.27193056354741]
The problem of how to assess cross-modality medical image synthesis has been largely unexplored. We propose a new metric K-CROSS to spur progress on this challenging problem. K-CROSS uses a pre-trained multi-modality segmentation network to predict the lesion location.
arXiv Detail & Related papers (2023-07-10T01:26:48Z)
On Sensitivity and Robustness of Normalization Schemes to Input Distribution Shifts in Automatic MR Image Diagnosis [58.634791552376235]
Deep Learning (DL) models have achieved state-of-the-art performance in diagnosing multiple diseases using reconstructed images as input. DL models are sensitive to varying artifacts as it leads to changes in the input data distribution between the training and testing phases. We propose to use other normalization techniques, such as Group Normalization and Layer Normalization, to inject robustness into model performance against varying image artifacts.
arXiv Detail & Related papers (2023-06-23T03:09:03Z)
Cross-Modal Causal Intervention for Medical Report Generation [107.76649943399168]
Radiology Report Generation (RRG) is essential for computer-aided diagnosis and medication guidance.<n> generating accurate lesion descriptions remains challenging due to spurious correlations from visual-linguistic biases.<n>We propose a two-stage framework named CrossModal Causal Representation Learning (CMCRL)<n> Experiments on IU-Xray and MIMIC-CXR show that our CMCRL pipeline significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-03-16T07:23:55Z)

This list is automatically generated from the titles and abstracts of the papers in this site.