Related papers: Benchmarking GANs, Diffusion Models, and Flow Matching for T1w-to-T2w MRI Translation

Benchmarking GANs, Diffusion Models, and Flow Matching for T1w-to-T2w MRI Translation

URL: http://arxiv.org/abs/2507.14575v1
Date: Sat, 19 Jul 2025 10:58:02 GMT
Title: Benchmarking GANs, Diffusion Models, and Flow Matching for T1w-to-T2w MRI Translation
Authors: Andrea Moschetto, Lemuel Puglisi, Alec Sargood, Pierluigi Dell'Acqua, Francesco Guarnera, Sebastiano Battiato, Daniele Ravì,
Abstract summary: I2I approaches aim to synthesize MRI contrasts while preserving diagnostic quality.<n>In this paper, we present a comprehensive framework for generative acquisition models, diffusion models, and flow-based matching techniques.<n>Results suggest that flow-based models are prone to overfitting datasets and simpler tasks, and may require more data to match or surpass existing methods.
Score: 3.9672323132974525
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Magnetic Resonance Imaging (MRI) enables the acquisition of multiple image contrasts, such as T1-weighted (T1w) and T2-weighted (T2w) scans, each offering distinct diagnostic insights. However, acquiring all desired modalities increases scan time and cost, motivating research into computational methods for cross-modal synthesis. To address this, recent approaches aim to synthesize missing MRI contrasts from those already acquired, reducing acquisition time while preserving diagnostic quality. Image-to-image (I2I) translation provides a promising framework for this task. In this paper, we present a comprehensive benchmark of generative models$\unicode{x2013}$specifically, Generative Adversarial Networks (GANs), diffusion models, and flow matching (FM) techniques$\unicode{x2013}$for T1w-to-T2w 2D MRI I2I translation. All frameworks are implemented with comparable settings and evaluated on three publicly available MRI datasets of healthy adults. Our quantitative and qualitative analyses show that the GAN-based Pix2Pix model outperforms diffusion and FM-based methods in terms of structural fidelity, image quality, and computational efficiency. Consistent with existing literature, these results suggest that flow-based models are prone to overfitting on small datasets and simpler tasks, and may require more data to match or surpass GAN performance. These findings offer practical guidance for deploying I2I translation techniques in real-world MRI workflows and highlight promising directions for future research in cross-modal medical image synthesis. Code and models are publicly available at https://github.com/AndreaMoschetto/medical-I2I-benchmark.

Related papers

RadIR: A Scalable Framework for Multi-Grained Medical Image Retrieval via Radiology Report Mining [64.66825253356869]
We propose a novel methodology that leverages dense radiology reports to define image-wise similarity ordering at multiple granularities.<n>We construct two comprehensive medical imaging retrieval datasets: MIMIC-IR for Chest X-rays and CTRATE-IR for CT scans.<n>We develop two retrieval systems, RadIR-CXR and model-ChestCT, which demonstrate superior performance in traditional image-image and image-report retrieval tasks.
arXiv Detail & Related papers (2025-03-06T17:43:03Z)
Domain-Agnostic Stroke Lesion Segmentation Using Physics-Constrained Synthetic Data [0.15749416770494706]
We introduce two physics-constrained approaches to generate synthetic quantitative MRI (qMRI) images.<n>Our first method, $textttqATLAS$, trains a neural network to estimate qMRI maps from standard MPRAGE images.<n>The second method, $textttq Synth$, synthesises qMRI maps directly from tissue labels.
arXiv Detail & Related papers (2024-12-04T13:52:05Z)
SAM-I2I: Unleash the Power of Segment Anything Model for Medical Image Translation [0.9626666671366836]
We propose SAM-I2I, a novel image-to-image translation framework based on the Segment Anything Model 2 (SAM2). Our experiments on multi-contrast MRI datasets demonstrate that SAM-I2I outperforms state-of-the-art methods, offering more efficient and accurate medical image translation.
arXiv Detail & Related papers (2024-11-13T03:30:10Z)
A Unified Model for Compressed Sensing MRI Across Undersampling Patterns [69.19631302047569]
We propose a unified MRI reconstruction model robust to various measurement undersampling patterns and image resolutions.<n>Our model improves SSIM by 11% and PSNR by 4 dB over a state-of-the-art CNN (End-to-End VarNet) with 600$times$ faster inference than diffusion methods.
arXiv Detail & Related papers (2024-10-05T20:03:57Z)
NT-ViT: Neural Transcoding Vision Transformers for EEG-to-fMRI Synthesis [7.542742087154667]
This paper introduces the Neural Transcoding Vision Transformer (modelname) modelname is a generative model designed to estimate high-resolution functional Magnetic Resonance Imaging (fMRI) samples from simultaneous Electroencephalography (EEG) data.
arXiv Detail & Related papers (2024-09-18T09:38:08Z)
Cross-conditioned Diffusion Model for Medical Image to Image Translation [22.020931436223204]
We introduce a Cross-conditioned Diffusion Model (CDM) for medical image-to-image translation. First, we propose a Modality-specific Representation Model (MRM) to model the distribution of target modalities. Then, we design a Modality-decoupled Diffusion Network (MDN) to efficiently and effectively learn the distribution from MRM.
arXiv Detail & Related papers (2024-09-13T02:48:56Z)
On Sensitivity and Robustness of Normalization Schemes to Input Distribution Shifts in Automatic MR Image Diagnosis [58.634791552376235]
Deep Learning (DL) models have achieved state-of-the-art performance in diagnosing multiple diseases using reconstructed images as input. DL models are sensitive to varying artifacts as it leads to changes in the input data distribution between the training and testing phases. We propose to use other normalization techniques, such as Group Normalization and Layer Normalization, to inject robustness into model performance against varying image artifacts.
arXiv Detail & Related papers (2023-06-23T03:09:03Z)
An Attentive-based Generative Model for Medical Image Synthesis [18.94900480135376]
We propose an attention-based dual contrast generative model, called ADC-cycleGAN, which can synthesize medical images from unpaired data with multiple slices. The model integrates a dual contrast loss term with the CycleGAN loss to ensure that the synthesized images are distinguishable from the source domain. Experimental results demonstrate that the proposed ADC-cycleGAN model produces comparable samples to other state-of-the-art generative models.
arXiv Detail & Related papers (2023-06-02T14:17:37Z)
MedSegDiff-V2: Diffusion based Medical Image Segmentation with Transformer [53.575573940055335]
We propose a novel Transformer-based Diffusion framework, called MedSegDiff-V2. We verify its effectiveness on 20 medical image segmentation tasks with different image modalities.
arXiv Detail & Related papers (2023-01-19T03:42:36Z)
Model-Guided Multi-Contrast Deep Unfolding Network for MRI Super-resolution Reconstruction [68.80715727288514]
We show how to unfold an iterative MGDUN algorithm into a novel model-guided deep unfolding network by taking the MRI observation matrix. In this paper, we propose a novel Model-Guided interpretable Deep Unfolding Network (MGDUN) for medical image SR reconstruction.
arXiv Detail & Related papers (2022-09-15T03:58:30Z)
InDuDoNet+: A Model-Driven Interpretable Dual Domain Network for Metal Artifact Reduction in CT Images [53.4351366246531]
We construct a novel interpretable dual domain network, termed InDuDoNet+, into which CT imaging process is finely embedded. We analyze the CT values among different tissues, and merge the prior observations into a prior network for our InDuDoNet+, which significantly improve its generalization performance.
arXiv Detail & Related papers (2021-12-23T15:52:37Z)
Modality Completion via Gaussian Process Prior Variational Autoencoders for Multi-Modal Glioma Segmentation [75.58395328700821]
We propose a novel model, Multi-modal Gaussian Process Prior Variational Autoencoder (MGP-VAE), to impute one or more missing sub-modalities for a patient scan. MGP-VAE can leverage the Gaussian Process (GP) prior on the Variational Autoencoder (VAE) to utilize the subjects/patients and sub-modalities correlations. We show the applicability of MGP-VAE on brain tumor segmentation where either, two, or three of four sub-modalities may be missing.
arXiv Detail & Related papers (2021-07-07T19:06:34Z)

This list is automatically generated from the titles and abstracts of the papers in this site.