Synthetic Poisoning Attacks: The Impact of Poisoned MRI Image on U-Net Brain Tumor Segmentation
- URL: http://arxiv.org/abs/2502.03825v1
- Date: Thu, 06 Feb 2025 07:21:19 GMT
- Title: Synthetic Poisoning Attacks: The Impact of Poisoned MRI Image on U-Net Brain Tumor Segmentation
- Authors: Tianhao Li, Tianyu Zeng, Yujia Zheng, Chulong Zhang, Jingyu Lu, Haotian Huang, Chuangxin Chu, Fang-Fang Yin, Zhenyu Yang
- Abstract summary: We investigate the impact of synthetic MRI data on the robustness and segmentation accuracy of U-Net models for brain tumor segmentation.
To quantify the effect of synthetic data contamination, we train U-Net models on progressively "poisoned" datasets.
- Score: 8.955776982854985
- Abstract: Deep learning-based medical image segmentation models, such as U-Net, rely on high-quality annotated datasets to achieve accurate predictions. However, the increasing use of generative models for synthetic data augmentation introduces potential risks, particularly in the absence of rigorous quality control. In this paper, we investigate the impact of synthetic MRI data on the robustness and segmentation accuracy of U-Net models for brain tumor segmentation. Specifically, we generate synthetic T1-contrast-enhanced (T1-Ce) MRI scans using a GAN-based model with a shared encoding-decoding framework and shortest-path regularization. To quantify the effect of synthetic data contamination, we train U-Net models on progressively "poisoned" datasets, where synthetic data proportions range from 16.67% to 83.33%. Experimental results on a real MRI validation set reveal a significant performance degradation as synthetic data increases, with Dice coefficients dropping from 0.8937 (33.33% synthetic) to 0.7474 (83.33% synthetic). Accuracy and sensitivity exhibit similar downward trends, demonstrating the detrimental effect of synthetic data on segmentation robustness. These findings underscore the importance of quality control in synthetic data integration and highlight the risks of unregulated synthetic augmentation in medical image analysis. Our study provides critical insights for the development of more reliable and trustworthy AI-driven medical imaging systems.
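The poisoning schedule and the Dice metric described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the synthetic proportions step in sixths (the abstract reports only 16.67%, 33.33%, and 83.33%), that the total training-set size stays fixed while the mix changes, and that masks are binary; the helper names `synthetic_fractions`, `poisoned_counts`, and `dice_coefficient` are hypothetical.

```python
from fractions import Fraction

import numpy as np


def synthetic_fractions(parts=6):
    """Return k/parts for k = 1..parts-1 (assumed schedule: 1/6 ... 5/6)."""
    return [Fraction(k, parts) for k in range(1, parts)]


def poisoned_counts(total, fraction):
    """Split a fixed training-set size into (real, synthetic) sample counts.

    Assumption: the total size is held constant while the mix changes.
    """
    n_synth = round(total * fraction)
    return total - n_synth, n_synth


def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks.

    eps avoids division by zero when both masks are empty.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


if __name__ == "__main__":
    # Print the assumed real/synthetic split for a hypothetical 600-sample set.
    for frac in synthetic_fractions():
        real, synth = poisoned_counts(600, frac)
        print(f"{float(frac) * 100:.2f}% synthetic: {real} real, {synth} synthetic")
```

In such a setup, each mix would be used to train a separate model, and every model would be scored with `dice_coefficient` against the same real validation masks so that the results are comparable across proportions.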
Related papers
- ScaleMAI: Accelerating the Development of Trusted Datasets and AI Models [46.80682547774335]
We propose ScaleMAI, an agent of AI-integrated data curation and annotation.
First, ScaleMAI creates a dataset of 25,362 CT scans, including per-voxel annotations for benign/malignant tumors and 24 anatomical structures.
Second, through progressive human-in-the-loop iterations, ScaleMAI provides a Flagship AI Model that can approach the proficiency of expert annotators in detecting pancreatic tumors.
arXiv Detail & Related papers (2025-01-06T22:12:00Z) - Embryo 2.0: Merging Synthetic and Real Data for Advanced AI Predictions [69.07284335967019]
We train two generative models using two datasets, one created and made publicly available, and one existing public dataset.
We generate synthetic embryo images at various cell stages, including 2-cell, 4-cell, 8-cell, morula, and blastocyst.
These were combined with real images to train classification models for embryo cell stage prediction.
arXiv Detail & Related papers (2024-12-02T08:24:49Z) - Cancer-Net SCa-Synth: An Open Access Synthetically Generated 2D Skin Lesion Dataset for Skin Cancer Classification [65.83291923029985]
In the United States, skin cancer ranks as the most commonly diagnosed cancer, presenting a significant public health issue.
Recent advancements in dataset curation and deep learning have shown promise in quick and accurate detection of skin cancer.
Cancer-Net SCa-Synth is an open access synthetically generated 2D skin lesion dataset for skin cancer classification.
arXiv Detail & Related papers (2024-11-08T02:04:21Z) - Guided Synthesis of Labeled Brain MRI Data Using Latent Diffusion Models for Segmentation of Enlarged Ventricles [0.4188114563181614]
Deep learning models in medical contexts face challenges like data scarcity, inhomogeneity, and privacy concerns.
This study focuses on improving ventricular segmentation in brain MRI images using synthetic data.
arXiv Detail & Related papers (2024-11-02T19:44:10Z) - Brain Tumor Classification on MRI in Light of Molecular Markers [61.77272414423481]
Co-deletion of the 1p/19q gene is associated with clinical outcomes in low-grade gliomas.
This study aims to use an MRI-based convolutional neural network designed specifically for brain cancer detection.
arXiv Detail & Related papers (2024-09-29T07:04:26Z) - Synthetically Enhanced: Unveiling Synthetic Data's Potential in Medical Imaging Research [4.475998415951477]
Generative AI offers a promising approach to generating synthetic images, enhancing dataset diversity.
This study investigates the impact of synthetic data supplementation on the performance and generalizability of medical imaging research.
arXiv Detail & Related papers (2023-11-15T21:58:01Z) - Synthetic Data as Validation [9.506660694536649]
We illustrate the effectiveness of synthetic data for early cancer detection in computed tomography (CT) volumes.
We establish a new continual learning framework that continuously trains AI models on a stream of out-domain data with synthetic tumors.
The AI model trained and validated on dynamically expanding synthetic data can consistently outperform models trained and validated exclusively on real-world data.
arXiv Detail & Related papers (2023-10-24T17:59:55Z) - Metadata-Conditioned Generative Models to Synthesize Anatomically-Plausible 3D Brain MRIs [12.492451825171408]
We propose a new generative model, BrainSynth, to synthesize metadata-conditioned (e.g., age- and sex-specific) MRIs.
Results indicate that more than half of the brain regions in our synthetic MRIs are anatomically accurate, with a small effect size between real and synthetic MRIs.
Our synthetic MRIs can significantly improve the training of a Convolutional Neural Network to identify accelerated aging effects.
arXiv Detail & Related papers (2023-10-07T00:05:47Z) - Synthetic data, real errors: how (not) to publish and use synthetic data [86.65594304109567]
We show how the generative process affects the downstream ML task.
We introduce Deep Generative Ensemble (DGE) to approximate the posterior distribution over the generative process model parameters.
arXiv Detail & Related papers (2023-05-16T07:30:29Z) - Functional Magnetic Resonance Imaging data augmentation through conditional ICA [44.483210864902304]
We introduce Conditional Independent Components Analysis (Conditional ICA): a fast functional Magnetic Resonance Imaging (fMRI) data augmentation technique.
We show that Conditional ICA is successful at synthesizing data indistinguishable from observations, and that it yields gains in classification accuracy in brain decoding problems.
arXiv Detail & Related papers (2021-07-11T22:36:14Z) - Overcoming Barriers to Data Sharing with Medical Image Generation: A Comprehensive Evaluation [17.983449515155414]
We utilize Generative Adversarial Networks (GANs) to create derived medical imaging datasets consisting entirely of synthetic patient data.
The synthetic images ideally have, in aggregate, similar statistical properties to those of a source dataset but do not contain sensitive personal information.
We measure the synthetic image quality by the performance difference of predictive models trained on either the synthetic or the real dataset.
arXiv Detail & Related papers (2020-11-29T15:41:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.