A review of Generative Adversarial Networks for Electronic Health
Records: applications, evaluation measures and data sources
- URL: http://arxiv.org/abs/2203.07018v1
- Date: Mon, 14 Mar 2022 11:56:47 GMT
- Title: A review of Generative Adversarial Networks for Electronic Health
Records: applications, evaluation measures and data sources
- Authors: Ghadeer Ghosheh, Jin Li and Tingting Zhu
- Abstract summary: Generative Adversarial Networks (GANs) show great promise in generating synthetic EHR data by learning underlying data distributions.
This work aims to review the major developments in various applications of GANs for EHRs and provides an overview of the proposed methodologies.
We conclude by discussing challenges in GANs for EHRs development and proposing recommended practices.
- Score: 8.319639237899155
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Electronic Health Records (EHRs) are a valuable asset to facilitate clinical
research and point of care applications; however, many challenges such as data
privacy concerns impede their optimal utilization. Generative Adversarial
Networks (GANs) show great promise in generating synthetic EHR data by learning
underlying data distributions while achieving excellent performance and
addressing these challenges. This work aims to review the major developments in
various applications of GANs for EHRs and provides an overview of the proposed
methodologies. For this purpose, we combine perspectives from healthcare
applications and machine learning techniques in terms of source datasets and
the fidelity and privacy evaluation of the generated synthetic datasets. We
also compile a list of the metrics and datasets used by the reviewed works,
which can be utilized as benchmarks for future research in the field. We
conclude by discussing challenges in GANs for EHRs development and proposing
recommended practices. We hope that this work motivates novel research
development directions in the intersection of healthcare and machine learning.
Related papers
- Data-Centric AI in the Age of Large Language Models [51.20451986068925]
This position paper proposes a data-centric viewpoint of AI research, focusing on large language models (LLMs).
We make the key observation that data is instrumental in the developmental (e.g., pretraining and fine-tuning) and inferential stages (e.g., in-context learning) of LLMs.
We identify four specific scenarios centered around data, covering data-centric benchmarks and data curation, data attribution, knowledge transfer, and inference contextualization.
arXiv Detail & Related papers (2024-06-20T16:34:07Z)
- Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models [69.06149482021071]
We propose a novel EHR data generation model called EHRPD.
It is a diffusion-based model designed to predict the next visit based on the current one while also incorporating time interval estimation.
We conduct experiments on two public datasets and evaluate EHRPD from fidelity, privacy, and utility perspectives.
arXiv Detail & Related papers (2024-06-20T02:20:23Z)
- Fairness-Optimized Synthetic EHR Generation for Arbitrary Downstream Predictive Tasks [2.089191490381739]
We present a new pipeline that generates synthetic EHR data consistent with (faithful to) the real EHR data.
We demonstrate the effectiveness of our proposed pipeline across various downstream tasks and two different EHR datasets.
Our proposed pipeline can add a widely applicable and complementary tool to the existing toolbox of methods to address fairness in health AI applications.
arXiv Detail & Related papers (2024-06-04T17:29:21Z)
- HealthGAT: Node Classifications in Electronic Health Records using Graph Attention Networks [2.2026317523029193]
HealthGAT is a graph attention network framework that generates embeddings from EHR.
Our model iteratively refines the embeddings for medical codes, resulting in improved EHR data analysis.
Our model shows outstanding performance in node classification and downstream tasks such as predicting readmissions and diagnosis classifications.
arXiv Detail & Related papers (2024-03-26T22:17:01Z)
- CEHR-GPT: Generating Electronic Health Records with Chronological Patient Timelines [14.386260536090628]
We focus on synthetic data generation and demonstrate the capability of training a GPT model using a particular patient representation.
This enables us to generate patient sequences that can be seamlessly converted to the Observational Medical Outcomes Partnership (OMOP) data format.
arXiv Detail & Related papers (2024-02-06T20:58:36Z)
- Recent Advances in Predictive Modeling with Electronic Health Records [71.19967863320647]
Utilizing EHR data for predictive modeling presents several challenges due to its unique characteristics.
Deep learning has demonstrated its superiority in various applications, including healthcare.
arXiv Detail & Related papers (2024-02-02T00:31:01Z)
- MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
- Machine Learning for Administrative Health Records: A Systematic Review of Techniques and Applications [5.353552655309808]
Administrative Health Records (AHR) are a subset of EHR collected for administrative purposes.
This paper systematically reviews AHR-based research, analysing 70 relevant studies and spanning multiple databases.
We find that while AHR-based studies are disconnected from each other, the use of AHRs in health informatics research is substantial and accelerating.
arXiv Detail & Related papers (2023-08-27T22:34:10Z)
- Does Synthetic Data Generation of LLMs Help Clinical Text Mining? [51.205078179427645]
We investigate the potential of OpenAI's ChatGPT to aid in clinical text mining.
We propose a new training paradigm that involves generating a vast quantity of high-quality synthetic data.
Our method has resulted in significant improvements in the performance of downstream tasks.
arXiv Detail & Related papers (2023-03-08T03:56:31Z)
- A Multifaceted Benchmarking of Synthetic Electronic Health Record Generation Models [15.165156674288623]
We introduce a generalizable benchmarking framework to appraise key characteristics of synthetic health data.
Results show that there is a utility-privacy tradeoff for sharing synthetic EHR data.
arXiv Detail & Related papers (2022-08-02T03:44:45Z)
- Opportunities and Challenges of Deep Learning Methods for Electrocardiogram Data: A Systematic Review [62.490310870300746]
The electrocardiogram (ECG) is one of the most commonly used diagnostic tools in medicine and healthcare.
Deep learning methods have achieved promising results on predictive healthcare tasks using ECG signals.
This paper presents a systematic review of deep learning methods for ECG data from both modeling and application perspectives.
arXiv Detail & Related papers (2019-12-28T02:44:29Z)