Related papers: Multimodal-Boost: Multimodal Medical Image Super-Resolution using Multi-Attention Network with Wavelet Transform

URL: http://arxiv.org/abs/2110.11684v1
Date: Fri, 22 Oct 2021 10:13:46 GMT
Title: Multimodal-Boost: Multimodal Medical Image Super-Resolution using Multi-Attention Network with Wavelet Transform
Authors: Farah Deeba, Fayaz Ali Dharejo, Muhammad Zawish, Yuanchun Zhou, Kapal Dev, Sunder Ali Khowaja, and Nawab Muhammad Faseeh Qureshi
Abstract summary: Loss of corresponding image resolution degrades the overall performance of medical image diagnosis. Deep learning based single image super resolution (SISR) algorithms have revolutionized the overall diagnosis framework. This work proposes a generative adversarial network (GAN) with deep multi-attention modules to learn high-frequency information from low-frequency data.
Score: 5.416279158834623
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Multimodal medical images are widely used by clinicians and physicians to analyze and retrieve complementary information from high-resolution images in a non-invasive manner. The loss of corresponding image resolution degrades the overall performance of medical image diagnosis. Deep learning based single image super resolution (SISR) algorithms have revolutionized the overall diagnosis framework by continually improving the architectural components and training strategies associated with convolutional neural networks (CNN) on low-resolution images. However, existing work falls short in two ways: i) the SR output exhibits poor texture details and often has blurred edges; ii) most models have been developed for a single modality and hence require modification to adapt to a new one. This work addresses (i) by proposing a generative adversarial network (GAN) with deep multi-attention modules to learn high-frequency information from low-frequency data. Existing GAN-based approaches have yielded good SR results; however, the texture details of their SR output have been experimentally confirmed to be deficient, particularly for medical images. The integration of wavelet transform (WT) and GANs in our proposed SR model addresses the aforementioned limitation concerning textons. The WT divides the low-resolution (LR) image into multiple frequency bands, while the transferred GAN utilizes multiple attention and upsample blocks to predict high-frequency components. Moreover, we present a learning technique for training a domain-specific classifier as a perceptual loss function. Combining multi-attention GAN loss with a perceptual loss function results in reliable and efficient performance. Applying the same model to medical images from diverse modalities is challenging; our work addresses (ii) by training on and applying the model to several modalities via transfer learning.
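Illustration (not from the paper): the abstract states that the WT divides the LR image into multiple frequency bands. The sketch below shows what such a split can look like for a single decomposition level. It assumes a Haar wavelet and NumPy; the abstract does not name the wavelet family or the number of levels, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical.

```python
import numpy as np


def haar_dwt2(image):
    """One level of a 2-D orthonormal Haar transform (assumed wavelet).

    Returns four sub-bands, each half the height and width of the input:
    an approximation band and three detail bands.
    """
    x = np.asarray(image, dtype=np.float64)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2-D array with even height and width")

    # The four pixels of every non-overlapping 2x2 block.
    a = x[0::2, 0::2]  # top-left
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right

    approx = (a + b + c + d) / 2.0    # low-pass in both directions
    row_diff = (a + b - c - d) / 2.0  # top row minus bottom row
    col_diff = (a - b + c - d) / 2.0  # left column minus right column
    diag = (a - b - c + d) / 2.0      # diagonal difference
    return approx, row_diff, col_diff, diag


def haar_idwt2(approx, row_diff, col_diff, diag):
    """Invert haar_dwt2 and rebuild the full-resolution image."""
    h, w = approx.shape
    x = np.empty((2 * h, 2 * w), dtype=np.float64)
    x[0::2, 0::2] = (approx + row_diff + col_diff + diag) / 2.0
    x[0::2, 1::2] = (approx + row_diff - col_diff - diag) / 2.0
    x[1::2, 0::2] = (approx - row_diff + col_diff - diag) / 2.0
    x[1::2, 1::2] = (approx - row_diff - col_diff + diag) / 2.0
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    lr_image = rng.random((8, 8))  # stand-in for a low-resolution slice
    bands = haar_dwt2(lr_image)
    print([band.shape for band in bands])
    print(np.allclose(haar_idwt2(*bands), lr_image))
```

The approximation band carries the low-frequency content, and the three detail bands carry the edges and texture that a super-resolution model would need to predict. Labels such as LH and HL for the detail bands vary between libraries, so the sketch names them by what they compute.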

Related papers

Resolution-Independent Neural Operators for Multi-Rate Sparse-View CT [67.14700058302016]
Deep learning methods achieve high-fidelity reconstructions but often overfit to a fixed acquisition setup. We propose Computed Tomography neural Operator (CTO), a unified CT reconstruction framework that extends to continuous function space. CTO enables consistent multi-sampling-rate and cross-resolution performance, with on average >4 dB PSNR gain over CNNs.
arXiv Detail & Related papers (2025-12-13T08:31:46Z)
Nexus-INR: Diverse Knowledge-guided Arbitrary-Scale Multimodal Medical Image Super-Resolution [14.992795611397579]
Arbitrary-resolution super-resolution provides crucial flexibility for medical image analysis by adapting to diverse spatial resolutions. Traditional CNN-based methods are inherently ill-suited for ARSR, as they are typically designed for fixed upsampling factors. We propose Nexus-INR, a Diverse Knowledge-guided ARSR framework, which employs varied information and downstream tasks to achieve high-quality, adaptive-resolution medical image super-resolution.
arXiv Detail & Related papers (2025-08-05T04:44:35Z)
Frequency-enhanced Multi-granularity Context Network for Efficient Vertebrae Segmentation [33.99418884128739]
We introduce a Frequency-enhanced Multi-granularity Context Network (FMC-Net) to improve vertebrae segmentation accuracy. For the high-frequency components, we apply a High-frequency Feature Refinement (HFR) to amplify the prominence of key features. For the low-frequency components, we use a Multi-granularity State Space Model (MG-SSM) to aggregate feature representations with different receptive fields.
arXiv Detail & Related papers (2025-06-29T04:53:02Z)
A Unified Model for Compressed Sensing MRI Across Undersampling Patterns [69.19631302047569]
Deep neural networks have shown great potential for reconstructing high-fidelity images from undersampled measurements. Our model is based on neural operators, a discretization-agnostic architecture. Our inference speed is also 1,400x faster than diffusion methods.
arXiv Detail & Related papers (2024-10-05T20:03:57Z)
Applying Conditional Generative Adversarial Networks for Imaging Diagnosis [3.881664394416534]
This study introduces an innovative application of Conditional Generative Adversarial Networks (C-GAN) integrated with Stacked Hourglass Networks (SHGN). We address the problem of overfitting, common in deep learning models applied to complex imaging datasets, by augmenting data through rotation and scaling. A hybrid loss function combining L1 and L2 reconstruction losses, enriched with adversarial training, is introduced to refine segmentation processes in intravascular ultrasound (IVUS) imaging.
arXiv Detail & Related papers (2024-07-17T23:23:09Z)
NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation [55.51412454263856]
This paper proposes to directly modulate the generation process of diffusion models using fMRI signals. By training with about 67,000 fMRI-image pairs from various individuals, our model enjoys superior fMRI-to-image decoding capacity.
arXiv Detail & Related papers (2024-03-27T02:42:52Z)
Dual Arbitrary Scale Super-Resolution for Multi-Contrast MRI [23.50915512118989]
Multi-contrast Super-Resolution (SR) reconstruction is promising to yield SR images with higher quality. Radiologists are accustomed to zooming the MR images at arbitrary scales rather than using a fixed scale. We propose an implicit neural representations based dual-arbitrary multi-contrast MRI super-resolution method, called Dual-ArbNet.
arXiv Detail & Related papers (2023-07-05T14:43:26Z)
On Sensitivity and Robustness of Normalization Schemes to Input Distribution Shifts in Automatic MR Image Diagnosis [58.634791552376235]
Deep Learning (DL) models have achieved state-of-the-art performance in diagnosing multiple diseases using reconstructed images as input. DL models are sensitive to varying artifacts, as these lead to changes in the input data distribution between the training and testing phases. We propose to use other normalization techniques, such as Group Normalization and Layer Normalization, to inject robustness into model performance against varying image artifacts.
arXiv Detail & Related papers (2023-06-23T03:09:03Z)
Convolutional neural network based on sparse graph attention mechanism for MRI super-resolution [0.34410212782758043]
Medical image super-resolution (SR) reconstruction using deep learning techniques can enhance lesion analysis and assist doctors in improving diagnostic efficiency and accuracy. Existing deep learning-based SR methods rely on convolutional neural networks (CNNs), which inherently limit the expressive capabilities of these models. We propose an A-network that utilizes multiple convolution operator feature extraction modules (MCO) for extracting image features.
arXiv Detail & Related papers (2023-05-29T06:14:22Z)
Transformer-empowered Multi-scale Contextual Matching and Aggregation for Multi-contrast MRI Super-resolution [55.52779466954026]
Multi-contrast super-resolution (SR) reconstruction is promising to yield SR images with higher quality. Existing methods lack effective mechanisms to match and fuse these features for better reconstruction. We propose a novel network to address these problems by developing a set of innovative Transformer-empowered multi-scale contextual matching and aggregation techniques.
arXiv Detail & Related papers (2022-03-26T01:42:59Z)
Multi-modal Aggregation Network for Fast MR Imaging [85.25000133194762]
We propose a novel Multi-modal Aggregation Network, named MANet, which is capable of discovering complementary representations from a fully sampled auxiliary modality. In our MANet, the representations from the fully sampled auxiliary and undersampled target modalities are learned independently through a specific network. Our MANet follows a hybrid domain learning framework, which allows it to simultaneously recover the frequency signal in the $k$-space domain.
arXiv Detail & Related papers (2021-10-15T13:16:59Z)
ResViT: Residual vision transformers for multi-modal medical image synthesis [0.0]
We propose a novel generative adversarial approach for medical image synthesis, ResViT, to combine local precision of convolution operators with contextual sensitivity of vision transformers. Our results indicate the superiority of ResViT against competing methods in terms of qualitative observations and quantitative metrics.
arXiv Detail & Related papers (2021-06-30T12:57:37Z)
Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks. We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network. Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.