A Counterfactual Fair Model for Longitudinal Electronic Health Records
via Deconfounder
- URL: http://arxiv.org/abs/2308.11819v3
- Date: Mon, 2 Oct 2023 17:46:40 GMT
- Title: A Counterfactual Fair Model for Longitudinal Electronic Health Records
via Deconfounder
- Authors: Zheng Liu, Xiaohan Li and Philip Yu
- Abstract summary: We propose a novel model called Fair Longitudinal Medical Deconfounder (FLMD).
FLMD aims to achieve both fairness and accuracy in longitudinal Electronic Health Records (EHR) modeling.
We conducted comprehensive experiments on two real-world EHR datasets to demonstrate the effectiveness of FLMD.
- Score: 5.198621505969445
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The fairness issue of clinical data modeling, especially on Electronic Health
Records (EHRs), is of utmost importance due to EHR's complex latent structure
and potential selection bias. It is frequently necessary to mitigate health
disparity while keeping the model's overall accuracy in practice. However,
traditional methods often encounter the trade-off between accuracy and
fairness, as they fail to capture the underlying factors beyond observed data.
To tackle this challenge, we propose a novel model called Fair Longitudinal
Medical Deconfounder (FLMD) that aims to achieve both fairness and accuracy in
longitudinal Electronic Health Records (EHR) modeling. Drawing inspiration from
the deconfounder theory, FLMD employs a two-stage training process. In the
first stage, FLMD captures unobserved confounders for each encounter, which
effectively represent underlying medical factors beyond the observed EHR, such as
patient genotypes and lifestyle habits. These unobserved confounders are crucial
for addressing the accuracy/fairness dilemma. In the second stage, FLMD
combines the learned latent representation with other relevant features to make
predictions. By incorporating appropriate fairness criteria, such as
counterfactual fairness, FLMD ensures that it maintains high prediction
accuracy while simultaneously minimizing health disparities. We conducted
comprehensive experiments on two real-world EHR datasets to demonstrate the
effectiveness of FLMD. Apart from the comparison of baseline methods and FLMD
variants in terms of fairness and accuracy, we assessed the performance of all
models on disturbed/imbalanced and synthetic datasets to showcase the
superiority of FLMD across different settings and provide valuable insights
into its capabilities.
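The two-stage idea in the abstract can be illustrated with a small sketch. This is not the authors' FLMD implementation: PCA stands in for the learned latent confounder model, a plain logistic regression stands in for the predictor, and the data are synthetic. All function names (`fit_latent`, `encode`, `fit_logistic`, `counterfactual_gap`) are hypothetical. The gap computed here is a simplified proxy for counterfactual fairness: it flips a binary sensitive attribute while holding the latent representation fixed, and does not propagate the flip through any causal graph.

```python
import numpy as np


def fit_latent(X, k):
    """Stage 1 stand-in: PCA via SVD. Returns the feature mean and top-k components."""
    mu = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[:k]


def encode(X, mu, comps):
    """Project observed features onto the components and standardize each latent dimension."""
    Z = (X - mu) @ comps.T
    return Z / Z.std(axis=0)


def predict(F, w):
    """Predicted probability from features F and weights w (last weight is the bias)."""
    Fb = np.hstack([F, np.ones((F.shape[0], 1))])
    return 1.0 / (1.0 + np.exp(-Fb @ w))


def fit_logistic(F, y, lr=0.1, steps=500, l2=1e-3):
    """Stage 2 stand-in: logistic regression fitted by gradient descent."""
    Fb = np.hstack([F, np.ones((F.shape[0], 1))])
    w = np.zeros(Fb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Fb @ w))
        w -= lr * (Fb.T @ (p - y) / len(y) + l2 * w)
    return w


def counterfactual_gap(Z, a, w):
    """Mean absolute change in predicted risk when the binary attribute a is flipped."""
    factual = predict(np.column_stack([Z, a]), w)
    counter = predict(np.column_stack([Z, 1.0 - a]), w)
    return float(np.mean(np.abs(factual - counter)))


# Synthetic data (assumption): two hidden factors drive six observed features.
rng = np.random.default_rng(0)
n = 400
u = rng.normal(size=(n, 2))
a = rng.integers(0, 2, size=n).astype(float)  # binary sensitive attribute
X = u @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(n, 6))
y = (u[:, 0] + 0.8 * a + 0.5 * rng.normal(size=n) > 0.4).astype(float)

# Stage 1: recover a latent representation from the observed features.
mu, comps = fit_latent(X, k=2)
Z = encode(X, mu, comps)

# Stage 2: predict from the latent representation plus the sensitive attribute.
w = fit_logistic(np.column_stack([Z, a]), y)
gap = counterfactual_gap(Z, a, w)

# Zeroing the sensitive attribute's weight (index 2) removes its direct effect.
w_blind = w.copy()
w_blind[2] = 0.0
blind_gap = counterfactual_gap(Z, a, w_blind)
print("gap with attribute:", gap, "gap without:", blind_gap)
```

In this toy setting the gap drops to zero once the model ignores the sensitive attribute directly. A real counterfactual fairness analysis would also have to account for features that are causally downstream of that attribute.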
Related papers
- FedCVD: The First Real-World Federated Learning Benchmark on Cardiovascular Disease Data [52.55123685248105]
Cardiovascular diseases (CVDs) are currently the leading cause of death worldwide, highlighting the critical need for early diagnosis and treatment.
Machine learning (ML) methods can help diagnose CVDs early, but their performance relies on access to substantial data with high quality.
This paper presents the first real-world FL benchmark for cardiovascular disease detection, named FedCVD.
arXiv Detail & Related papers (2024-10-28T02:24:01Z)
- Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models [69.06149482021071]
We propose a novel EHR data generation model called EHRPD.
It is a diffusion-based model designed to predict the next visit based on the current one while also incorporating time interval estimation.
We conduct experiments on two public datasets and evaluate EHRPD from fidelity, privacy, and utility perspectives.
arXiv Detail & Related papers (2024-06-20T02:20:23Z)
- DrFuse: Learning Disentangled Representation for Clinical Multi-Modal Fusion with Missing Modality and Modal Inconsistency [18.291267748113142]
We propose DrFuse to achieve effective clinical multi-modal fusion.
We address the missing modality issue by disentangling the features shared across modalities and those unique within each modality.
We validate the proposed method using real-world large-scale datasets, MIMIC-IV and MIMIC-CXR.
arXiv Detail & Related papers (2024-03-10T12:41:34Z)
- FairEHR-CLP: Towards Fairness-Aware Clinical Predictions with Contrastive Learning in Multimodal Electronic Health Records [15.407593899656762]
We present FairEHR-CLP: a framework for fairness-aware Clinical Predictions with Contrastive Learning in EHRs.
FairEHR-CLP operates through a two-stage process, utilizing patient demographics, longitudinal data, and clinical notes.
We introduce a novel fairness metric to effectively measure error rate disparities across subgroups.
arXiv Detail & Related papers (2024-02-01T19:24:45Z)
- MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
- TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment.
In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials.
We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z)
- Fair Patient Model: Mitigating Bias in the Patient Representation Learned from the Electronic Health Records [7.467693938220289]
We applied the proposed model, called Fair Patient Model (FPM), to a sample of 34,739 patients from the MIMIC-III dataset.
FPM outperformed the baseline models in terms of three fairness metrics: demographic parity, equality of opportunity difference, and equalized odds ratio.
arXiv Detail & Related papers (2023-06-05T18:40:35Z)
- Mitigating Health Disparities in EHR via Deconfounder [5.511343163506091]
We propose a novel framework, Parity Medical Deconfounder (PriMeD), to deal with the disparity issue in healthcare datasets.
PriMeD adopts a Conditional Variational Autoencoder (CVAE) to learn latent factors (substitute confounders) for observational data.
arXiv Detail & Related papers (2022-10-28T05:16:50Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- EVA: Generating Longitudinal Electronic Health Records Using Conditional Variational Autoencoders [34.22731849545798]
We propose EHR Variational Autoencoder (EVA) for synthesizing sequences of discrete EHR encounters and encounter features.
We illustrate that EVA can produce realistic sequences, account for individual differences among patients, and can be conditioned on specific disease conditions.
We assess the utility of the methods on large real-world EHR repositories containing over 250,000 patients.
arXiv Detail & Related papers (2020-12-18T02:37:49Z)
- UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced Data [81.00385374948125]
We present UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection and up to 0.609 in PR-AUC for NASH detection, outperforming the best state-of-the-art baseline by up to 19%.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)