Synthetic Observational Health Data with GANs: from slow adoption to a
boom in medical research and ultimately digital twins?
- URL: http://arxiv.org/abs/2005.13510v3
- Date: Thu, 19 Nov 2020 05:27:34 GMT
- Title: Synthetic Observational Health Data with GANs: from slow adoption to a
boom in medical research and ultimately digital twins?
- Authors: Jeremy Georges-Filteau, Elisa Cirillo
- Abstract summary: Vast potential is unexploited because of the fiercely private nature of patient-related data and regulations to protect it.
Generative Adversarial Networks (GANs) have recently emerged as a groundbreaking way to learn generative models that produce realistic synthetic data.
GANs possess capabilities relevant to common problems in healthcare: lack of data, class imbalance, rare diseases, and preserving privacy.
- Score: 0.16244541005112745
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: After being collected for patient care, Observational Health Data (OHD) can
further benefit patient well-being by sustaining the development of health
informatics and medical research. Vast potential is unexploited because of the
fiercely private nature of patient-related data and regulations to protect it.
Generative Adversarial Networks (GANs) have recently emerged as a
groundbreaking way to learn generative models that produce realistic synthetic
data. They have revolutionized practices in multiple domains such as
self-driving cars, fraud detection, digital twin simulations in industrial
sectors, and medical imaging.
The digital twin concept could readily apply to modelling and quantifying
disease progression. In addition, GANs possess many capabilities relevant to
common problems in healthcare: lack of data, class imbalance, rare diseases,
and preserving privacy. Unlocking open access to privacy-preserving OHD could
be transformative for scientific research. In the midst of COVID-19, the
healthcare system is facing unprecedented challenges, many of which are
data-related for the reasons stated above.
Considering these facts, publications concerning GANs applied to OHD seemed to
be severely lacking. To uncover the reasons for this slow adoption, we broadly
reviewed the published literature on the subject. Our findings show that the
properties of OHD were initially challenging for the existing GAN algorithms
(unlike medical imaging, for which state-of-the-art models were directly
transferable) and the evaluation of synthetic data lacked clear metrics.
We find more publications on the subject than expected, starting slowly in
2017, and since then at an increasing rate. The difficulties of OHD remain, and
we discuss issues relating to evaluation, consistency, benchmarking, data
modelling, and reproducibility.
Related papers
- Synthetic data: How could it be used for infectious disease research? [0.16752458252726457]
Concerns have been raised about potential negative factors associated with the possibilities of artificial dataset generation.
These include the potential misuse of generative artificial intelligence in fields such as cybercrime.
Synthetic data offers significant benefits, particularly for data privacy, research, balancing datasets, and reducing bias in machine learning models.
arXiv Detail & Related papers (2024-07-03T17:13:04Z)
- A Survey of Artificial Intelligence in Gait-Based Neurodegenerative Disease Diagnosis [51.07114445705692]
Neurodegenerative diseases (NDs) traditionally require extensive healthcare resources and human effort for medical diagnosis and monitoring.
As a crucial disease-related motor symptom, human gait can be exploited to characterize different NDs.
Current advances in artificial intelligence (AI) models enable automatic gait analysis for ND identification and classification.
arXiv Detail & Related papers (2024-05-21T06:44:40Z)
- Time-aware Heterogeneous Graph Transformer with Adaptive Attention Merging for Health Event Prediction [6.578298085691462]
We introduce a novel heterogeneous graph learning model designed to assimilate disease domain knowledge and elucidate the intricate relationships between drugs and diseases.
When evaluated on two healthcare datasets, our approach demonstrated notable enhancements in both prediction accuracy and interpretability.
arXiv Detail & Related papers (2024-04-23T08:01:30Z)
- Generative AI-Driven Human Digital Twin in IoT-Healthcare: A Comprehensive Survey [53.691704671844406]
The Internet of things (IoT) can significantly enhance the quality of human life, specifically in healthcare.
The human digital twin (HDT) is proposed as an innovative paradigm that can comprehensively characterize the replication of the individual human body.
HDT is envisioned to empower IoT-healthcare beyond the application of healthcare monitoring by acting as a versatile and vivid human digital testbed.
Generative artificial intelligence (GAI) may be a promising solution because it can leverage advanced AI algorithms to automatically create, manipulate, and modify valuable yet diverse data.
arXiv Detail & Related papers (2024-01-22T03:17:41Z)
- MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
- Balancing Privacy and Progress in Artificial Intelligence: Anonymization in Histopathology for Biomedical Research and Education [1.8078387709049526]
Transferring medical data "as open as possible" poses a risk to patient privacy.
Existing regulations push towards keeping medical data "as closed as necessary" to avoid re-identification risks.
This paper explores the legal regulations and terminologies for medical data-sharing.
arXiv Detail & Related papers (2023-07-18T16:53:07Z)
- Synthetic Data in Healthcare [10.555189948915492]
We present the cases for physical and statistical simulations for creating data and the proposed applications in healthcare and medicine.
We discuss that while synthetics can promote privacy, equity, safety and continual and causal learning, they also run the risk of introducing flaws, blind spots and propagating or exaggerating biases.
arXiv Detail & Related papers (2023-04-06T17:23:39Z)
- FakeNews: GAN-based generation of realistic 3D volumetric data -- A systematic review and taxonomy [2.801317303396674]
Generative Adversarial Networks (GANs) are used to generate realistic synthetic data.
In this review, we provide a summary of works that generate realistic volumetric synthetic data using GANs.
arXiv Detail & Related papers (2022-07-04T13:14:37Z)
- When Accuracy Meets Privacy: Two-Stage Federated Transfer Learning Framework in Classification of Medical Images on Limited Data: A COVID-19 Case Study [77.34726150561087]
The COVID-19 pandemic has spread rapidly and caused a shortage of global medical resources.
CNNs have been widely used and verified in analyzing medical images.
arXiv Detail & Related papers (2022-03-24T02:09:41Z)
- Practical Challenges in Differentially-Private Federated Survival Analysis of Medical Data [57.19441629270029]
In this paper, we take advantage of the inherent properties of neural networks to federate the process of training of survival analysis models.
In the realistic setting of small medical datasets and only a few data centers, this noise makes it harder for the models to converge.
We propose DPFed-post which adds a post-processing stage to the private federated learning scheme.
arXiv Detail & Related papers (2022-02-08T10:03:24Z)
- BiteNet: Bidirectional Temporal Encoder Network to Predict Medical Outcomes [53.163089893876645]
We propose a novel self-attention mechanism that captures the contextual dependency and temporal relationships within a patient's healthcare journey.
An end-to-end bidirectional temporal encoder network (BiteNet) then learns representations of the patient's journeys.
We have evaluated the effectiveness of our methods on two supervised prediction and two unsupervised clustering tasks with a real-world EHR dataset.
arXiv Detail & Related papers (2020-09-24T00:42:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.