SinGAN-Seg: Synthetic Training Data Generation for Medical Image
Segmentation
- URL: http://arxiv.org/abs/2107.00471v1
- Date: Tue, 29 Jun 2021 19:34:34 GMT
- Title: SinGAN-Seg: Synthetic Training Data Generation for Medical Image
Segmentation
- Authors: Vajira Thambawita, Pegah Salehi, Sajad Amouei Sheshkal, Steven A.
Hicks, Hugo L. Hammer, Sravanthi Parasa, Thomas de Lange, Pål Halvorsen,
Michael A. Riegler
- Abstract summary: We present a novel synthetic data generation pipeline called SinGAN-Seg to produce synthetic medical data with the corresponding annotated ground truth masks.
We show that these synthetic data generation pipelines can be used as an alternative to bypass privacy concerns.
In addition, we show that synthetic data generated from the SinGAN-Seg pipeline improves the performance of segmentation algorithms when the training dataset is very small.
- Score: 0.7444812797273735
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Processing medical data to find abnormalities is a time-consuming and costly
task, requiring tremendous efforts from medical experts. Therefore, AI has
become a popular tool for the automatic processing of medical data, acting as a
supportive tool for doctors. AI tools highly depend on data for training the
models. However, there are several constraints on access to large amounts of
medical data to train machine learning algorithms in the medical domain, e.g.,
due to privacy concerns and the costly, time-consuming medical data annotation
process. To address this, in this paper we present a novel synthetic data
generation pipeline called SinGAN-Seg to produce synthetic medical data with
the corresponding annotated ground truth masks. We show that these synthetic
data generation pipelines can be used as an alternative to bypass privacy
concerns and as an alternative way to produce artificial segmentation datasets
with corresponding ground truth masks to avoid the tedious medical data
annotation process. As a proof of concept, we used an open polyp segmentation
dataset. By training UNet++ using both the real polyp segmentation dataset and
the corresponding synthetic dataset generated from the SinGAN-Seg pipeline, we
show that the synthetic data can achieve performance very close to that of the real
data when the real segmentation datasets are large enough. In addition, we show
that synthetic data generated from the SinGAN-Seg pipeline improves the
performance of segmentation algorithms when the training dataset is very small.
Since our SinGAN-Seg pipeline is applicable to any medical dataset, this
pipeline can be used with any other segmentation dataset.
Related papers
- EchoNet-Synthetic: Privacy-preserving Video Generation for Safe Medical Data Sharing [5.900946696794718]
We present a model designed to produce high-fidelity, long and accessible complete data samples with near-real-time efficiency.
We develop our generation method based on diffusion models and introduce a protocol for medical video dataset anonymization.
We present EchoNet-Synthetic, a fully synthetic, privacy-compliant echocardiogram dataset with paired ejection fraction labels.
arXiv Detail & Related papers (2024-06-02T17:18:06Z)
- CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers [66.15847237150909]
We introduce a self-supervised deep learning architecture to segment catheters in longitudinal ultrasound images.
The network architecture builds upon AiAReSeg, a segmentation transformer built with the Attention in Attention mechanism.
We validated our model on a test dataset, consisting of unseen synthetic data and images collected from silicon aorta phantoms.
arXiv Detail & Related papers (2024-03-21T15:13:36Z)
- Reimagining Synthetic Tabular Data Generation through Data-Centric AI: A
Comprehensive Benchmark [56.8042116967334]
Synthetic data serves as an alternative in training machine learning models.
Ensuring that synthetic data mirrors the complex nuances of real-world data is a challenging task.
This paper explores the potential of integrating data-centric AI techniques to guide the synthetic data generation process.
arXiv Detail & Related papers (2023-10-25T20:32:02Z)
- How Good Are Synthetic Medical Images? An Empirical Study with Lung
Ultrasound [0.3312417881789094]
Adding synthetic training data using generative models offers a low-cost method to deal with the data scarcity challenge.
We show that training with both synthetic and real data outperforms training with real data alone.
arXiv Detail & Related papers (2023-10-05T15:42:53Z)
- Generative Adversarial Networks for Data Augmentation [0.0]
GANs have been utilized in medical image analysis for various tasks, including data augmentation, image creation, and domain adaptation.
GANs can generate synthetic samples that can be used to increase the available dataset.
It is essential to note that the use of GANs in medical imaging is still an active area of research to ensure that the produced images are of high quality and suitable for use in clinical settings.
arXiv Detail & Related papers (2023-06-03T06:33:33Z)
- Synthetic data, real errors: how (not) to publish and use synthetic data [86.65594304109567]
We show how the generative process affects the downstream ML task.
We introduce Deep Generative Ensemble (DGE) to approximate the posterior distribution over the generative process model parameters.
arXiv Detail & Related papers (2023-05-16T07:30:29Z)
- FairGen: Fair Synthetic Data Generation [0.3149883354098941]
We propose a pipeline to generate fairer synthetic data independent of the GAN architecture.
We claim that, while generating synthetic data, most GANs amplify bias present in the training data, but by removing these bias-inducing samples, GANs essentially focus more on real informative samples.
arXiv Detail & Related papers (2022-10-24T08:13:47Z)
- TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual
Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z)
- Ensembles of GANs for synthetic training data generation [7.835101177261939]
Insufficient training data is a major bottleneck for most deep learning practices.
This work investigates the use of synthetic images, created by generative adversarial networks (GANs), as the only source of training data.
arXiv Detail & Related papers (2021-04-23T19:38:48Z)
- Co-Generation and Segmentation for Generalized Surgical Instrument
Segmentation on Unlabelled Data [49.419268399590045]
Surgical instrument segmentation for robot-assisted surgery is needed for accurate instrument tracking and augmented reality overlays.
Deep learning-based methods have shown state-of-the-art performance for surgical instrument segmentation, but their results depend on labelled data.
In this paper, we demonstrate the limited generalizability of these methods on different datasets, including human robot-assisted surgeries.
arXiv Detail & Related papers (2021-03-16T18:41:18Z)
- Learning to Segment Human Body Parts with Synthetically Trained Deep
Convolutional Networks [58.0240970093372]
This paper presents a new framework for human body part segmentation based on Deep Convolutional Neural Networks trained using only synthetic data.
The proposed approach achieves cutting-edge results without the need to train the models on real annotated data of human body parts.
arXiv Detail & Related papers (2021-02-02T12:26:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.