Physics-Driven Autoregressive State Space Models for Medical Image Reconstruction
- URL: http://arxiv.org/abs/2412.09331v2
- Date: Tue, 08 Jul 2025 11:21:13 GMT
- Title: Physics-Driven Autoregressive State Space Models for Medical Image Reconstruction
- Authors: Bilal Kabas, Fuat Arslan, Valiyeh A. Nezhad, Saban Ozturk, Emine U. Saritas, Tolga Çukur
- Abstract summary: We introduce a physics-driven autoregressive state-space model (MambaRoll) for medical image reconstruction. MambaRoll consistently outperforms state-of-the-art data-driven and physics-driven methods.
- Score: 5.208643222679356
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Medical image reconstruction from undersampled acquisitions is an ill-posed problem involving inversion of the imaging operator linking measurement and image domains. Physics-driven (PD) models have gained prominence in reconstruction tasks due to their desirable performance and generalization. These models jointly promote data fidelity and artifact suppression, typically by combining data-consistency mechanisms with learned network modules. Artifact suppression depends on the network's ability to disentangle artifacts from true tissue signals, both of which can exhibit contextual structure across diverse spatial scales. Convolutional neural networks (CNNs) are strong in capturing local correlations, albeit relatively insensitive to non-local context. While transformers promise to alleviate this limitation, practical implementations frequently involve design compromises to reduce computational cost by balancing local and non-local sensitivity, occasionally resulting in performance comparable to or trailing that of CNNs. To enhance contextual sensitivity without incurring high complexity, we introduce a novel physics-driven autoregressive state-space model (MambaRoll) for medical image reconstruction. In each cascade of its unrolled architecture, MambaRoll employs a physics-driven state-space module (PD-SSM) to aggregate contextual features efficiently at a given spatial scale, and autoregressively predicts finer-scale feature maps conditioned on coarser-scale features to capture multi-scale context. Learning across scales is further enhanced via a deep multi-scale decoding (DMSD) loss tailored to the autoregressive prediction task. Demonstrations on accelerated MRI and sparse-view CT reconstructions show that MambaRoll consistently outperforms state-of-the-art data-driven and physics-driven methods based on CNN, transformer, and SSM backbones.
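The abstract describes an unrolled architecture in which each cascade combines a learned module with a data-consistency mechanism. As a rough illustration of that general pattern only (not the authors' MambaRoll implementation), the sketch below assumes single-coil Cartesian MRI with a binary sampling mask. The function names `data_consistency`, `unrolled_reconstruction`, and `smooth` are hypothetical, and `smooth` is a simple neighbour-averaging filter standing in for a trained network such as the PD-SSM.

```python
import numpy as np


def data_consistency(image, measured_kspace, mask):
    """Overwrite the k-space samples of `image` at acquired locations with the measured values."""
    kspace = np.fft.fft2(image, norm="ortho")
    kspace = np.where(mask, measured_kspace, kspace)
    return np.fft.ifft2(kspace, norm="ortho")


def unrolled_reconstruction(measured_kspace, mask, refine, num_cascades=3):
    """Alternate a refinement step with data consistency for a fixed number of cascades."""
    # Zero-filled reconstruction as the starting point.
    image = np.fft.ifft2(np.where(mask, measured_kspace, 0), norm="ortho")
    for _ in range(num_cascades):
        image = refine(image)  # placeholder for a learned module
        image = data_consistency(image, measured_kspace, mask)
    return image


def smooth(image):
    """Hypothetical stand-in for a trained network: weighted 4-neighbour averaging."""
    return 0.5 * image + 0.125 * (
        np.roll(image, 1, 0) + np.roll(image, -1, 0)
        + np.roll(image, 1, 1) + np.roll(image, -1, 1)
    )


# Toy example: random "image", random undersampling pattern (both assumed for illustration).
rng = np.random.default_rng(0)
truth = rng.standard_normal((16, 16))
mask = rng.random((16, 16)) < 0.4
measured = np.fft.fft2(truth, norm="ortho") * mask
recon = unrolled_reconstruction(measured, mask, smooth)
```

Because every cascade ends with the data-consistency step, the k-space of `recon` matches the measurements at all sampled locations, while the refinement step alone determines the unsampled ones.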
Related papers
- Compressive Imaging Reconstruction via Tensor Decomposed Multi-Resolution Grid Encoding [50.54887630778593]
Compressive imaging (CI) reconstruction aims to recover high-dimensional images from compressed low-dimensional measurements. Existing unsupervised representations may struggle to achieve a desired balance between representation ability and efficiency. We propose Decomposed multi-resolution Grid encoding (GridTD), an unsupervised continuous representation framework for CI reconstruction.
arXiv Detail & Related papers (2025-07-10T12:36:20Z) - Self-Consistent Nested Diffusion Bridge for Accelerated MRI Reconstruction [22.589087990596887]
We focus on the underexplored task of magnitude-image-based MRI reconstruction.
Recent advancements in diffusion models, particularly denoising diffusion probabilistic models, have demonstrated strong capabilities in modeling image priors.
We propose a novel Self-Consistent Nested Diffusion Bridge (SC-NDB) framework that models accelerated MRI reconstruction.
arXiv Detail & Related papers (2024-12-13T09:35:34Z) - Deep-Unrolling Multidimensional Harmonic Retrieval Algorithms on Neuromorphic Hardware [78.17783007774295]
This paper explores the potential of conversion-based neuromorphic algorithms for highly accurate and energy-efficient single-snapshot multidimensional harmonic retrieval. A novel method for converting the complex-valued convolutional layers and activations into spiking neural networks (SNNs) is developed. The converted SNNs achieve almost five-fold power efficiency at moderate performance loss compared to the original CNNs.
arXiv Detail & Related papers (2024-12-05T09:41:33Z) - MambaClinix: Hierarchical Gated Convolution and Mamba-Based U-Net for Enhanced 3D Medical Image Segmentation [6.673169053236727]
We propose MambaClinix, a novel U-shaped architecture for medical image segmentation.
MambaClinix integrates a hierarchical gated convolutional network with Mamba in an adaptive stage-wise framework.
Our results show that MambaClinix achieves high segmentation accuracy while maintaining low model complexity.
arXiv Detail & Related papers (2024-09-19T07:51:14Z) - Cross-Scan Mamba with Masked Training for Robust Spectral Imaging [51.557804095896174]
We propose the Cross-Scanning Mamba, named CS-Mamba, that employs a Spatial-Spectral SSM for global-local balanced context encoding. Experimental results show that our CS-Mamba achieves state-of-the-art performance and that the masked training method can better reconstruct smooth features to improve the visual quality.
arXiv Detail & Related papers (2024-08-01T15:14:10Z) - Efficient Visual State Space Model for Image Deblurring [83.57239834238035]
Convolutional neural networks (CNNs) and Vision Transformers (ViTs) have achieved excellent performance in image restoration.
We propose a simple yet effective visual state space model (EVSSM) for image deblurring.
arXiv Detail & Related papers (2024-05-23T09:13:36Z) - IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model [7.842507196763463]
IRSRMamba is a novel framework integrating wavelet transform feature modulation for multi-scale adaptation.
IRSRMamba outperforms state-of-the-art methods in PSNR, SSIM, and perceptual quality.
This work establishes Mamba-based architectures as a promising direction for high-fidelity IR image enhancement.
arXiv Detail & Related papers (2024-05-16T07:49:24Z) - Look-Around Before You Leap: High-Frequency Injected Transformer for Image Restoration [46.96362010335177]
In this paper, we propose HIT, a simple yet effective High-frequency Injected Transformer for image restoration.
Specifically, we design a window-wise injection module (WIM), which incorporates abundant high-frequency details into the feature map, to provide reliable references for restoring high-quality images.
In addition, we introduce a spatial enhancement unit (SEU) to preserve essential spatial relationships that may be lost due to the computations carried out across channel dimensions in the bidirectional interaction module (BIM).
arXiv Detail & Related papers (2024-03-30T08:05:00Z) - Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on a Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z) - Steerable Conditional Diffusion for Out-of-Distribution Adaptation in Medical Image Reconstruction [75.91471250967703]
We introduce a novel sampling framework called Steerable Conditional Diffusion.
This framework adapts the diffusion model, concurrently with image reconstruction, based solely on the information provided by the available measurement.
We achieve substantial enhancements in out-of-distribution performance across diverse imaging modalities.
arXiv Detail & Related papers (2023-08-28T08:47:06Z) - Physics-Driven Turbulence Image Restoration with Stochastic Refinement [80.79900297089176]
Image distortion by atmospheric turbulence is a critical problem in long-range optical imaging systems.
Fast and physics-grounded simulation tools have been introduced to help the deep-learning models adapt to real-world turbulence conditions.
This paper proposes the Physics-integrated Restoration Network (PiRN) to help the network disentangle the stochasticity from the degradation and the underlying image.
arXiv Detail & Related papers (2023-07-20T05:49:21Z) - Hierarchical Integration Diffusion Model for Realistic Image Deblurring [71.76410266003917]
Diffusion models (DMs) have been introduced in image deblurring and exhibited promising performance.
We propose the Hierarchical Integration Diffusion Model (HI-Diff) for realistic image deblurring.
Experiments on synthetic and real-world blur datasets demonstrate that our HI-Diff outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-05-22T12:18:20Z) - Stable Deep MRI Reconstruction using Generative Priors [13.400444194036101]
We propose a novel deep neural network based regularizer which is trained in a generative setting on reference magnitude images only.
The results demonstrate competitive performance, on par with state-of-the-art end-to-end deep learning methods.
arXiv Detail & Related papers (2022-10-25T08:34:29Z) - Multi-head Cascaded Swin Transformers with Attention to k-space Sampling Pattern for Accelerated MRI Reconstruction [16.44971774468092]
We propose a physics-based, stand-alone (convolution-free) transformer model, the Multi-head Cascaded Swin Transformers (McSTRA), for accelerated MRI reconstruction.
Our model significantly outperforms state-of-the-art MRI reconstruction methods both visually and quantitatively.
arXiv Detail & Related papers (2022-07-18T07:21:56Z) - Adaptive Diffusion Priors for Accelerated MRI Reconstruction [0.9895793818721335]
Deep MRI reconstruction is commonly performed with conditional models that de-alias undersampled acquisitions to recover images consistent with fully-sampled data.
Unconditional models instead learn generative image priors decoupled from the operator to improve reliability against domain shifts related to the imaging operator.
Here we propose the first adaptive diffusion prior for MRI reconstruction, AdaDiff, to improve performance and reliability against domain shifts.
arXiv Detail & Related papers (2022-07-12T22:45:08Z) - Scale-Equivariant Unrolled Neural Networks for Data-Efficient Accelerated MRI Reconstruction [33.82162420709648]
We propose modeling the proximal operators of unrolled neural networks with scale-equivariant convolutional neural networks.
Our approach demonstrates strong improvements over the state-of-the-art unrolled neural networks under the same memory constraints.
arXiv Detail & Related papers (2022-04-21T23:29:52Z) - HUMUS-Net: Hybrid unrolled multi-scale network architecture for accelerated MRI reconstruction [38.0542877099235]
HUMUS-Net is a hybrid architecture that combines the beneficial implicit bias and efficiency of convolutions with the power of Transformer blocks in an unrolled and multi-scale network.
Our network establishes new state of the art on the largest publicly available MRI dataset, the fastMRI dataset.
arXiv Detail & Related papers (2022-03-15T19:26:29Z) - ResViT: Residual vision transformers for multi-modal medical image synthesis [0.0]
We propose a novel generative adversarial approach for medical image synthesis, ResViT, to combine local precision of convolution operators with contextual sensitivity of vision transformers.
Our results indicate the superiority of ResViT against competing methods in terms of qualitative observations and quantitative metrics.
arXiv Detail & Related papers (2021-06-30T12:57:37Z) - Limited-angle tomographic reconstruction of dense layered objects by dynamical machine learning [68.9515120904028]
Limited-angle tomography of strongly scattering quasi-transparent objects is a challenging, highly ill-posed problem.
Regularizing priors are necessary to reduce artifacts by improving the condition of such problems.
We devised a recurrent neural network (RNN) architecture with a novel split-convolutional gated recurrent unit (SC-GRU) as the building block.
arXiv Detail & Related papers (2020-07-21T11:48:22Z) - Normalizing Flows with Multi-Scale Autoregressive Priors [131.895570212956]
We introduce channel-wise dependencies in the latent space of normalizing flows through multi-scale autoregressive priors (mAR).
Our mAR prior for models with split coupling flow layers (mAR-SCF) can better capture dependencies in complex multimodal data.
We show that mAR-SCF allows for improved image generation quality, with gains in FID and Inception scores compared to state-of-the-art flow-based models.
arXiv Detail & Related papers (2020-04-08T09:07:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.