A Novel Metric for Detecting Memorization in Generative Models for Brain MRI Synthesis
- URL: http://arxiv.org/abs/2509.16582v1
- Date: Sat, 20 Sep 2025 09:08:08 GMT
- Title: A Novel Metric for Detecting Memorization in Generative Models for Brain MRI Synthesis
- Authors: Antonio Scardace, Lemuel Puglisi, Francesco Guarnera, Sebastiano Battiato, Daniele Ravì
- Abstract summary: DeepSSIM is a novel metric for quantifying memorization in generative models. DeepSSIM achieves superior performance, improving F1 scores by an average of +52.03% over the best existing method.
- Score: 4.16184304316315
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep generative models have emerged as a transformative tool in medical imaging, offering substantial potential for synthetic data generation. However, recent empirical studies highlight a critical vulnerability: these models can memorize sensitive training data, posing significant risks of unauthorized patient information disclosure. Detecting memorization in generative models remains particularly challenging, necessitating scalable methods capable of identifying training data leakage across large sets of generated samples. In this work, we propose DeepSSIM, a novel self-supervised metric for quantifying memorization in generative models. DeepSSIM is trained to: i) project images into a learned embedding space and ii) force the cosine similarity between embeddings to match the ground-truth SSIM (Structural Similarity Index) scores computed in the image space. To capture domain-specific anatomical features, training incorporates structure-preserving augmentations, allowing DeepSSIM to estimate similarity reliably without requiring precise spatial alignment. We evaluate DeepSSIM in a case study involving synthetic brain MRI data generated by a Latent Diffusion Model (LDM) trained under memorization-prone conditions, using 2,195 MRI scans from two publicly available datasets (IXI and CoRR). Compared to state-of-the-art memorization metrics, DeepSSIM achieves superior performance, improving F1 scores by an average of +52.03% over the best existing method. Code and data of our approach are publicly available at the following link: https://github.com/brAIn-science/DeepSSIM.
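The training objective described in the abstract (make the cosine similarity between two embeddings match the SSIM of the corresponding images) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: `global_ssim` is a simplified single-window SSIM computed over the whole image, `embed` is a hypothetical linear encoder standing in for the learned network, and `similarity_matching_loss` assumes a squared-error penalty, which the abstract does not specify. The structure-preserving augmentations are omitted.

```python
import numpy as np


def global_ssim(x, y, data_range=1.0):
    """Simplified SSIM using one window covering the whole image (assumption)."""
    c1 = (0.01 * data_range) ** 2  # standard SSIM stabilizing constants
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)


def cosine_similarity(u, v, eps=1e-12):
    """Cosine similarity between two 1-D embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + eps))


def embed(image, weights):
    """Hypothetical encoder: a linear projection of the flattened image."""
    return weights @ image.ravel()


def similarity_matching_loss(images_a, images_b, weights):
    """Mean squared gap between embedding cosine similarity and image-space SSIM."""
    errors = []
    for a, b in zip(images_a, images_b):
        target = global_ssim(a, b)  # ground-truth similarity in image space
        predicted = cosine_similarity(embed(a, weights), embed(b, weights))
        errors.append((predicted - target) ** 2)
    return float(np.mean(errors))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    batch_a = rng.random((4, 16, 16))   # stand-in image pairs, values in [0, 1]
    batch_b = rng.random((4, 16, 16))
    w = rng.normal(size=(32, 16 * 16))  # untrained encoder weights
    print("loss:", similarity_matching_loss(batch_a, batch_b, w))
```

In a real setup the encoder would be a trained neural network and the loss would be minimized by gradient descent. Once trained, comparing a generated image against training images needs only embedding dot products rather than a pixel-level SSIM computation for every pair.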
Related papers
- From Healthy Scans to Annotated Tumors: A Tumor Fabrication Framework for 3D Brain MRI Synthesis [3.295857224165814]
Tumor Fabrication (TF) is a novel two-stage framework for unpaired 3D brain tumor synthesis. TF is fully automated and leverages only healthy image scans along with a limited amount of real annotated data. We demonstrate that our synthetic image-label pairs used as data enrichment can significantly improve performance on downstream tumor segmentation tasks in low-data regimes.
arXiv Detail & Related papers (2025-11-23T23:28:49Z) - Adapting HFMCA to Graph Data: Self-Supervised Learning for Generalizable fMRI Representations [57.054499278843856]
Functional magnetic resonance imaging (fMRI) analysis faces significant challenges due to limited dataset sizes and domain variability between studies. Traditional self-supervised learning methods inspired by computer vision often rely on positive and negative sample pairs. We propose adapting a recently developed Hierarchical Functional Maximal Correlation Algorithm (HFMCA) to graph-structured fMRI data.
arXiv Detail & Related papers (2025-10-05T12:35:01Z) - Private Training & Data Generation by Clustering Embeddings [74.00687214400021]
Differential privacy (DP) provides a robust framework for protecting individual data. We introduce a novel principled method for DP synthetic image embedding generation. Empirically, a simple two-layer neural network trained on synthetically generated embeddings achieves state-of-the-art (SOTA) classification accuracy.
arXiv Detail & Related papers (2025-06-20T00:17:14Z) - Enhancing Privacy: The Utility of Stand-Alone Synthetic CT and MRI for Tumor and Bone Segmentation [2.4345008922715756]
We employ head and neck cancer CT scans and brain glioma MRI scans from two large datasets. Synthetic data were generated using generative adversarial networks and diffusion models. We evaluate the quality of the synthetic data using MAE, MS-SSIM, Radiomics and a Visual Turing Test (VTT) performed by 5 radiologists.
arXiv Detail & Related papers (2025-06-13T08:17:48Z) - PhaseGen: A Diffusion-Based Approach for Complex-Valued MRI Data Generation [1.683019219727036]
Magnetic resonance imaging (MRI) raw data, or k-space data, is complex-valued, containing both magnitude and phase information. We introduce PhaseGen, a novel complex-valued diffusion model for generating synthetic MRI raw data conditioned on magnitude images. Our results show that training with synthetic phase data significantly improves generalization for skull-stripping on real-world data.
arXiv Detail & Related papers (2025-04-10T08:44:19Z) - ContextMRI: Enhancing Compressed Sensing MRI through Metadata Conditioning [51.26601171361753]
We propose ContextMRI, a text-conditioned diffusion model for MRI that integrates granular metadata into the reconstruction process. We show that increasing the fidelity of metadata, ranging from slice location and contrast to patient age, sex, and pathology, systematically boosts reconstruction performance.
arXiv Detail & Related papers (2025-01-08T05:15:43Z) - MRGen: Segmentation Data Engine for Underrepresented MRI Modalities [59.61465292965639]
Training medical image segmentation models for rare yet clinically important imaging modalities is challenging due to the scarcity of annotated data. This paper investigates leveraging generative models to synthesize data for training segmentation models for underrepresented modalities. We present MRGen, a data engine for controllable medical image synthesis conditioned on text prompts and segmentation masks.
arXiv Detail & Related papers (2024-12-04T16:34:22Z) - 3D MRI Synthesis with Slice-Based Latent Diffusion Models: Improving Tumor Segmentation Tasks in Data-Scarce Regimes [2.8498944632323755]
We propose a novel slice-based latent diffusion architecture to address the complexities of volumetric data generation.
This approach extends the joint distribution modeling of medical images and their associated masks, allowing a simultaneous generation of both under data-scarce regimes.
Our architecture can be conditioned by tumor characteristics, including size, shape, and relative position, thereby providing a diverse range of tumor variations.
arXiv Detail & Related papers (2024-06-08T09:53:45Z) - Brain Tumor Synthetic Data Generation with Adaptive StyleGANs [6.244557340851846]
We present a method to generate brain tumor MRI images using generative adversarial networks.
Results demonstrate that the proposed method can learn the distributions of brain tumors.
The approach addresses limited data availability by generating realistic-looking brain MRI with tumors.
arXiv Detail & Related papers (2022-12-04T09:01:33Z) - FAST-AID Brain: Fast and Accurate Segmentation Tool using Artificial Intelligence Developed for Brain [0.8376091455761259]
A novel deep learning method is proposed for fast and accurate segmentation of the human brain into 132 regions.
The proposed model uses an efficient U-Net-like network and benefits from the intersection points of different views and hierarchical relations.
The proposed method can be applied to brain MRI data including skull or any other artifacts without preprocessing the images or a drop in performance.
arXiv Detail & Related papers (2022-08-30T16:06:07Z) - Deep Representational Similarity Learning for analyzing neural signatures in task-based fMRI dataset [81.02949933048332]
This paper develops Deep Representational Similarity Learning (DRSL), a deep extension of Representational Similarity Analysis (RSA).
DRSL is appropriate for analyzing similarities between various cognitive tasks in fMRI datasets with a large number of subjects.
arXiv Detail & Related papers (2020-09-28T18:30:14Z) - Fed-Sim: Federated Simulation for Medical Imaging [131.56325440976207]
We introduce a physics-driven generative approach that consists of two learnable neural modules.
We show that our data synthesis framework improves the downstream segmentation performance on several datasets.
arXiv Detail & Related papers (2020-09-01T19:17:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.