RealDeal: Enhancing Realism and Details in Brain Image Generation via Image-to-Image Diffusion Models
- URL: http://arxiv.org/abs/2507.18830v1
- Date: Thu, 24 Jul 2025 22:04:39 GMT
- Title: RealDeal: Enhancing Realism and Details in Brain Image Generation via Image-to-Image Diffusion Models
- Authors: Shen Zhu, Yinzhu Jin, Tyler Spears, Ifrah Zawar, P. Thomas Fletcher
- Abstract summary: This work formulates the realism-enhancing and detail-adding process as image-to-image diffusion models. We introduce new metrics to demonstrate the realism of images generated by RealDeal in terms of image noise distribution, sharpness, and texture.
- Score: 1.456352735394398
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We propose image-to-image diffusion models that are designed to enhance the realism and details of generated brain images by introducing sharp edges, fine textures, subtle anatomical features, and imaging noise. Generative models have been widely adopted in the biomedical domain, especially in image generation applications. Latent diffusion models achieve state-of-the-art results in generating brain MRIs. However, due to latent compression, generated images from these models are overly smooth, lacking the fine anatomical structures and scan acquisition noise typically seen in real images. This work formulates the realism-enhancing and detail-adding process as image-to-image diffusion models, which refine the quality of LDM-generated images. We employ commonly used metrics such as FID and LPIPS for image realism assessment. Furthermore, we introduce new metrics to demonstrate the realism of images generated by RealDeal in terms of image noise distribution, sharpness, and texture.
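The abstract describes the refinement process only at a high level. As a minimal sketch of the idea, the snippet below shows a DDPM-style ancestral sampler that conditions on the smooth LDM output by channel-wise concatenation; the noise schedule, the `denoiser` network, and the conditioning mechanism are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of image-to-image diffusion refinement (assumed DDPM-style
# schedule; "denoiser" is a hypothetical U-Net taking the noisy target
# concatenated with the smooth LDM image). Names are illustrative only.
import torch

T = 1000                                      # diffusion steps (assumed)
betas = torch.linspace(1e-4, 2e-2, T)         # linear noise schedule (assumed)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

@torch.no_grad()
def refine(denoiser, ldm_image):
    """Sample a detail-enhanced image conditioned on a smooth LDM output."""
    x = torch.randn_like(ldm_image)           # start from pure Gaussian noise
    for t in reversed(range(T)):
        # condition by channel-wise concatenation with the LDM image
        eps = denoiser(torch.cat([x, ldm_image], dim=1), torch.tensor([t]))
        a, ab = alphas[t], alpha_bars[t]
        mean = (x - (1 - a) / torch.sqrt(1 - ab) * eps) / torch.sqrt(a)
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    return x
```

In this reading, the LDM supplies the global anatomy while the image-space diffusion model re-introduces the high-frequency detail and acquisition noise that latent compression removes. For the evaluation side, FID and LPIPS are standard off-the-shelf metrics; a hedged example using the `torchmetrics` and `lpips` packages follows, where random tensors stand in for real and refined MRI slices and the normalization conventions are assumptions rather than the paper's exact protocol.

```python
# Hedged example of the standard realism metrics named in the abstract.
import torch
import lpips
from torchmetrics.image.fid import FrechetInceptionDistance

real = torch.rand(8, 3, 256, 256)             # placeholder "real" slices
fake = torch.rand(8, 3, 256, 256)             # placeholder refined samples

fid = FrechetInceptionDistance(feature=2048, normalize=True)  # [0, 1] inputs
fid.update(real, real=True)
fid.update(fake, real=False)
print("FID:", fid.compute().item())

lpips_fn = lpips.LPIPS(net="alex")            # expects inputs in [-1, 1]
print("LPIPS:", lpips_fn(real * 2 - 1, fake * 2 - 1).mean().item())
```

The paper's additional metrics for noise distribution, sharpness, and texture are bespoke and not specified in the abstract, so they are not sketched here.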
Related papers
- Multi-focal Conditioned Latent Diffusion for Person Image Synthesis [59.113899155476005]
The Latent Diffusion Model (LDM) has demonstrated strong capabilities in high-resolution image generation.
We propose a Multi-focal Conditioned Latent Diffusion (MCLD) method to address these limitations.
Our approach utilizes a multi-focal condition aggregation module, which effectively integrates facial identity and texture-specific information.
arXiv Detail & Related papers (2025-03-19T20:50:10Z) - FashionR2R: Texture-preserving Rendered-to-Real Image Translation with Diffusion Models [14.596090302381647]
This paper studies photorealism enhancement of rendered images, leveraging the generative power of diffusion models on the controlled basis of rendering.
We introduce a novel framework to translate rendered images into their realistic counterparts, which consists of two stages: Domain Knowledge Injection (DKI) and Realistic Image Generation (RIG).
arXiv Detail & Related papers (2024-10-18T12:48:22Z) - A Unified Model for Compressed Sensing MRI Across Undersampling Patterns [69.19631302047569]
We propose a unified MRI reconstruction model robust to various measurement undersampling patterns and image resolutions.
Our model improves SSIM by 11% and PSNR by 4 dB over a state-of-the-art CNN (End-to-End VarNet), with 600× faster inference than diffusion methods.
arXiv Detail & Related papers (2024-10-05T20:03:57Z) - XReal: Realistic Anatomy and Pathology-Aware X-ray Generation via Controllable Diffusion Model [0.7381551917607596]
Large-scale generative models have demonstrated impressive capabilities in producing visually compelling images.
However, they continue to grapple with hallucination challenges and the generation of anatomically inaccurate outputs.
We present XReal, a novel controllable diffusion model for generating realistic chest X-ray images.
arXiv Detail & Related papers (2024-03-14T10:03:58Z) - RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models [42.20230095700904]
RealCompo is a new training-free and transfer-friendly text-to-image generation framework.
An intuitive and novel balancer is proposed to balance the strengths of the two models in the denoising process.
Our RealCompo can be seamlessly extended with a wide range of spatial-aware image diffusion models and stylized diffusion models.
arXiv Detail & Related papers (2024-02-20T10:56:52Z) - Learned representation-guided diffusion models for large-image generation [58.192263311786824]
We introduce a novel approach that trains diffusion models conditioned on embeddings from self-supervised learning (SSL).
Our diffusion models successfully project these features back to high-quality histopathology and remote sensing images.
Augmenting real data by generating variations of real images improves downstream accuracy for patch-level and larger, image-scale classification tasks.
arXiv Detail & Related papers (2023-12-12T14:45:45Z) - Local Statistics for Generative Image Detection [1.565361244756411]
Diffusion models (DMs) are generative models that learn to synthesize images from Gaussian noise.
In this paper, we highlight the effectiveness of the Bayer pattern and local statistics in distinguishing digital camera images from DM-generated images.
arXiv Detail & Related papers (2023-10-25T14:47:32Z) - On quantifying and improving realism of images generated with diffusion [50.37578424163951]
We propose a metric, called Image Realism Score (IRS), computed from five statistical measures of a given image.
IRS is easily usable as a measure to classify a given image as real or fake.
We experimentally establish the model- and data-agnostic nature of the proposed IRS by successfully detecting fake images generated by Stable Diffusion Model (SDM), DALL-E 2, Midjourney, and BigGAN.
Our efforts have also led to the Gen-100 dataset, which provides 1,000 samples for 100 classes generated by four high-quality models.
arXiv Detail & Related papers (2023-09-26T08:32:55Z) - Generation of Structurally Realistic Retinal Fundus Images with Diffusion Models [1.9346186297861747]
We generate artery/vein masks to create the vascular structure, which we then condition to produce retinal fundus images.
The proposed method can generate high-quality images with more realistic vascular structures.
arXiv Detail & Related papers (2023-05-11T14:09:05Z) - Natural scene reconstruction from fMRI signals using generative latent diffusion [1.90365714903665]
We present a two-stage scene reconstruction framework called "Brain-Diffuser".
In the first stage, we reconstruct images that capture low-level properties and overall layout using a VDVAE (Very Deep Variational Autoencoder) model.
In the second stage, we use the image-to-image framework of a latent diffusion model conditioned on predicted multimodal (text and visual) features.
arXiv Detail & Related papers (2023-03-09T15:24:26Z) - NeRFInvertor: High Fidelity NeRF-GAN Inversion for Single-shot Real Image Animation [66.0838349951456]
NeRF-based generative models have shown impressive capacity in generating high-quality images with consistent 3D geometry.
We propose a universal method to surgically fine-tune these NeRF-GAN models in order to achieve high-fidelity animation of real subjects only by a single image.
arXiv Detail & Related papers (2022-11-30T18:36:45Z) - Person Image Synthesis via Denoising Diffusion Model [116.34633988927429]
We show how denoising diffusion models can be applied for high-fidelity person image synthesis.
Our results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios.
arXiv Detail & Related papers (2022-11-22T18:59:50Z)