Multimodal-Boost: Multimodal Medical Image Super-Resolution using
Multi-Attention Network with Wavelet Transform
- URL: http://arxiv.org/abs/2110.11684v1
- Date: Fri, 22 Oct 2021 10:13:46 GMT
- Title: Multimodal-Boost: Multimodal Medical Image Super-Resolution using
Multi-Attention Network with Wavelet Transform
- Authors: Farah Deeba, Fayaz Ali Dharejo, Muhammad Zawish, Yuanchun Zhou, Kapal
Dev, Sunder Ali Khowaja, and Nawab Muhammad Faseeh Qureshi
- Abstract summary: Loss of corresponding image resolution degrades the overall performance of medical image diagnosis.
Deep learning based single image super resolution (SISR) algorithms has revolutionized the overall diagnosis framework.
This work proposes generative adversarial network (GAN) with deep multi-attention modules to learn high-frequency information from low-frequency data.
- Score: 5.416279158834623
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multimodal medical images are widely used by clinicians and physicians to
analyze and retrieve complementary information from high-resolution images in a
non-invasive manner. The loss of corresponding image resolution degrades the
overall performance of medical image diagnosis. Deep learning based single
image super resolution (SISR) algorithms has revolutionized the overall
diagnosis framework by continually improving the architectural components and
training strategies associated with convolutional neural networks (CNN) on
low-resolution images. However, existing work lacks in two ways: i) the SR
output produced exhibits poor texture details, and often produce blurred edges,
ii) most of the models have been developed for a single modality, hence,
require modification to adapt to a new one. This work addresses (i) by
proposing generative adversarial network (GAN) with deep multi-attention
modules to learn high-frequency information from low-frequency data. Existing
approaches based on the GAN have yielded good SR results; however, the texture
details of their SR output have been experimentally confirmed to be deficient
for medical images particularly. The integration of wavelet transform (WT) and
GANs in our proposed SR model addresses the aforementioned limitation
concerning textons. The WT divides the LR image into multiple frequency bands,
while the transferred GAN utilizes multiple attention and upsample blocks to
predict high-frequency components. Moreover, we present a learning technique
for training a domain-specific classifier as a perceptual loss function.
Combining multi-attention GAN loss with a perceptual loss function results in a
reliable and efficient performance. Applying the same model for medical images
from diverse modalities is challenging, our work addresses (ii) by training and
performing on several modalities via transfer learning.
Related papers
- Unifying Subsampling Pattern Variations for Compressed Sensing MRI with Neural Operators [72.79532467687427]
Compressed Sensing MRI reconstructs images of the body's internal anatomy from undersampled and compressed measurements.
Deep neural networks have shown great potential for reconstructing high-quality images from highly undersampled measurements.
We propose a unified model that is robust to different subsampling patterns and image resolutions in CS-MRI.
arXiv Detail & Related papers (2024-10-05T20:03:57Z) - Applying Conditional Generative Adversarial Networks for Imaging Diagnosis [3.881664394416534]
This study introduces an innovative application of Conditional Generative Adversarial Networks (C-GAN) integrated with Stacked Hourglass Networks (SHGN)
We address the problem of overfitting, common in deep learning models applied to complex imaging datasets, by augmenting data through rotation and scaling.
A hybrid loss function combining L1 and L2 reconstruction losses, enriched with adversarial training, is introduced to refine segmentation processes in intravascular ultrasound (IVUS) imaging.
arXiv Detail & Related papers (2024-07-17T23:23:09Z) - NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation [55.51412454263856]
This paper proposes to directly modulate the generation process of diffusion models using fMRI signals.
By training with about 67,000 fMRI-image pairs from various individuals, our model enjoys superior fMRI-to-image decoding capacity.
arXiv Detail & Related papers (2024-03-27T02:42:52Z) - Dual Arbitrary Scale Super-Resolution for Multi-Contrast MRI [23.50915512118989]
Multi-contrast Super-Resolution (SR) reconstruction is promising to yield SR images with higher quality.
radiologists are accustomed to zooming the MR images at arbitrary scales rather than using a fixed scale.
We propose an implicit neural representations based dual-arbitrary multi-contrast MRI super-resolution method, called Dual-ArbNet.
arXiv Detail & Related papers (2023-07-05T14:43:26Z) - On Sensitivity and Robustness of Normalization Schemes to Input
Distribution Shifts in Automatic MR Image Diagnosis [58.634791552376235]
Deep Learning (DL) models have achieved state-of-the-art performance in diagnosing multiple diseases using reconstructed images as input.
DL models are sensitive to varying artifacts as it leads to changes in the input data distribution between the training and testing phases.
We propose to use other normalization techniques, such as Group Normalization and Layer Normalization, to inject robustness into model performance against varying image artifacts.
arXiv Detail & Related papers (2023-06-23T03:09:03Z) - Convolutional neural network based on sparse graph attention mechanism
for MRI super-resolution [0.34410212782758043]
Medical image super-resolution (SR) reconstruction using deep learning techniques can enhance lesion analysis and assist doctors in improving diagnostic efficiency and accuracy.
Existing deep learning-based SR methods rely on convolutional neural networks (CNNs), which inherently limit the expressive capabilities of these models.
We propose an A-network that utilizes multiple convolution operator feature extraction modules (MCO) for extracting image features.
arXiv Detail & Related papers (2023-05-29T06:14:22Z) - Transformer-empowered Multi-scale Contextual Matching and Aggregation
for Multi-contrast MRI Super-resolution [55.52779466954026]
Multi-contrast super-resolution (SR) reconstruction is promising to yield SR images with higher quality.
Existing methods lack effective mechanisms to match and fuse these features for better reconstruction.
We propose a novel network to address these problems by developing a set of innovative Transformer-empowered multi-scale contextual matching and aggregation techniques.
arXiv Detail & Related papers (2022-03-26T01:42:59Z) - Multi-modal Aggregation Network for Fast MR Imaging [85.25000133194762]
We propose a novel Multi-modal Aggregation Network, named MANet, which is capable of discovering complementary representations from a fully sampled auxiliary modality.
In our MANet, the representations from the fully sampled auxiliary and undersampled target modalities are learned independently through a specific network.
Our MANet follows a hybrid domain learning framework, which allows it to simultaneously recover the frequency signal in the $k$-space domain.
arXiv Detail & Related papers (2021-10-15T13:16:59Z) - ResViT: Residual vision transformers for multi-modal medical image
synthesis [0.0]
We propose a novel generative adversarial approach for medical image synthesis, ResViT, to combine local precision of convolution operators with contextual sensitivity of vision transformers.
Our results indicate the superiority of ResViT against competing methods in terms of qualitative observations and quantitative metrics.
arXiv Detail & Related papers (2021-06-30T12:57:37Z) - Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration task.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.