Related papers: CorGAN: Correlation-Capturing Convolutional Generative Adversarial Networks for Generating Synthetic Healthcare Records

CorGAN: Correlation-Capturing Convolutional Generative Adversarial Networks for Generating Synthetic Healthcare Records

URL: http://arxiv.org/abs/2001.09346v2
Date: Wed, 4 Mar 2020 19:22:37 GMT
Title: CorGAN: Correlation-Capturing Convolutional Generative Adversarial Networks for Generating Synthetic Healthcare Records
Authors: Amirsina Torfi, Edward A. Fox
Abstract summary: We propose a framework called correlation-capturing Generative Adversarial Network (CorGAN) to generate synthetic healthcare records. To demonstrate the model fidelity, we show that CorGAN generates synthetic data with performance similar to that of real data in various Machine Learning settings.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep learning models have demonstrated high-quality performance in areas such as image classification and speech processing. However, creating a deep learning model using electronic health record (EHR) data, requires addressing particular privacy challenges that are unique to researchers in this domain. This matter focuses attention on generating realistic synthetic data while ensuring privacy. In this paper, we propose a novel framework called correlation-capturing Generative Adversarial Network (CorGAN), to generate synthetic healthcare records. In CorGAN we utilize Convolutional Neural Networks to capture the correlations between adjacent medical features in the data representation space by combining Convolutional Generative Adversarial Networks and Convolutional Autoencoders. To demonstrate the model fidelity, we show that CorGAN generates synthetic data with performance similar to that of real data in various Machine Learning settings such as classification and prediction. We also give a privacy assessment and report on statistical analysis regarding realistic characteristics of the synthetic data. The software of this work is open-source and is available at: https://github.com/astorfi/cor-gan.

Related papers

Private Training & Data Generation by Clustering Embeddings [74.00687214400021]
Differential privacy (DP) provides a robust framework for protecting individual data.<n>We introduce a novel principled method for DP synthetic image embedding generation.<n> Empirically, a simple two-layer neural network trained on synthetically generated embeddings achieves state-of-the-art (SOTA) classification accuracy.
arXiv Detail & Related papers (2025-06-20T00:17:14Z)
Exploring the Impact of Synthetic Data on Human Gesture Recognition Tasks Using GANs [0.0]
This study is the first to explore the feasibility of synthesizing motion gestures for allergic rhinitis from wearable IoT device data using Generative Adversarial Networks (GANs) We also focus on these AI models' performance in terms of fidelity, diversity, and privacy.
arXiv Detail & Related papers (2024-12-09T11:15:47Z)
Cancer-Net SCa-Synth: An Open Access Synthetically Generated 2D Skin Lesion Dataset for Skin Cancer Classification [65.83291923029985]
In the United States, skin cancer ranks as the most commonly diagnosed cancer, presenting a significant public health issue. Recent advancements in dataset curation and deep learning have shown promise in quick and accurate detection of skin cancer. Cancer-Net SCa- Synth is an open access synthetically generated 2D skin lesion dataset for skin cancer classification.
arXiv Detail & Related papers (2024-11-08T02:04:21Z)
GaNDLF-Synth: A Framework to Democratize Generative AI for (Bio)Medical Imaging [0.36638033546156024]
Generative Artificial Intelligence (GenAI) is a field of AI that creates new data samples from existing ones. This paper explores the background and motivation for GenAI, and introduces the Generally Nuanced Deep Learning Framework for Synthesis (GaNDLF- Synth) GaNDLF- Synth describes a unified abstraction for various algorithms, including autoencoders, generative adversarial networks, and synthesis models.
arXiv Detail & Related papers (2024-09-30T19:25:01Z)
EchoNet-Synthetic: Privacy-preserving Video Generation for Safe Medical Data Sharing [5.900946696794718]
We present a model designed to produce high-fidelity, long and accessible complete data samples with near-real-time efficiency. We develop our generation method based on diffusion models and introduce a protocol for medical video dataset anonymization. We present EchoNet-Synthetic, a fully synthetic, privacy-compliant echocardiogram dataset with paired ejection fraction labels.
arXiv Detail & Related papers (2024-06-02T17:18:06Z)
PathLDM: Text conditioned Latent Diffusion Model for Histopathology [62.970593674481414]
We introduce PathLDM, the first text-conditioned Latent Diffusion Model tailored for generating high-quality histopathology images. Our approach fuses image and textual data to enhance the generation process. We achieved a SoTA FID score of 7.64 for text-to-image generation on the TCGA-BRCA dataset, significantly outperforming the closest text-conditioned competitor with FID 30.1.
arXiv Detail & Related papers (2023-09-01T22:08:32Z)
TSGM: A Flexible Framework for Generative Modeling of Synthetic Time Series [61.436361263605114]
Time series data are often scarce or highly sensitive, which precludes the sharing of data between researchers and industrial organizations. We introduce Time Series Generative Modeling (TSGM), an open-source framework for the generative modeling of synthetic time series.
arXiv Detail & Related papers (2023-05-19T10:11:21Z)
TTS-CGAN: A Transformer Time-Series Conditional GAN for Biosignal Data Augmentation [5.607676459156789]
We present TTS-CGAN, a conditional GAN model that can be trained on existing multi-class datasets and generate class-specific synthetic time-series sequences. Synthetic sequences generated by our model are indistinguishable from real ones, and can be used to complement or replace real signals of the same type.
arXiv Detail & Related papers (2022-06-28T01:01:34Z)
CARLA-GeAR: a Dataset Generator for a Systematic Evaluation of Adversarial Robustness of Vision Models [61.68061613161187]
This paper presents CARLA-GeAR, a tool for the automatic generation of synthetic datasets for evaluating the robustness of neural models against physical adversarial patches. The tool is built on the CARLA simulator, using its Python API, and allows the generation of datasets for several vision tasks in the context of autonomous driving. The paper presents an experimental study to evaluate the performance of some defense methods against such attacks, showing how the datasets generated with CARLA-GeAR might be used in future work as a benchmark for adversarial defense in the real world.
arXiv Detail & Related papers (2022-06-09T09:17:38Z)
Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs. By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers. We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
Differentially Private Synthetic Medical Data Generation using Convolutional GANs [7.2372051099165065]
We develop a differentially private framework for synthetic data generation using R'enyi differential privacy. Our approach builds on convolutional autoencoders and convolutional generative adversarial networks to preserve some of the critical characteristics of the generated synthetic data. We demonstrate that our model outperforms existing state-of-the-art models under the same privacy budget.
arXiv Detail & Related papers (2020-12-22T01:03:49Z)
STAN: Synthetic Network Traffic Generation with Generative Neural Models [10.54843182184416]
This paper presents STAN (Synthetic network Traffic generation with Autoregressive Neural models), a tool to generate realistic synthetic network traffic datasets. Our novel neural architecture captures both temporal dependencies and dependence between attributes at any given time. We evaluate the performance of STAN in terms of the quality of data generated, by training it on both a simulated dataset and a real network traffic data set.
arXiv Detail & Related papers (2020-09-27T04:20:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.