Related papers: Evaluating the Fairness of the MIMIC-IV Dataset and a Baseline Algorithm: Application to the ICU Length of Stay Prediction

Evaluating the Fairness of the MIMIC-IV Dataset and a Baseline Algorithm: Application to the ICU Length of Stay Prediction

URL: http://arxiv.org/abs/2401.00902v1
Date: Sun, 31 Dec 2023 16:01:48 GMT
Title: Evaluating the Fairness of the MIMIC-IV Dataset and a Baseline Algorithm: Application to the ICU Length of Stay Prediction
Authors: Alexandra Kakadiaris
Abstract summary: This paper uses the MIMIC-IV dataset to examine the fairness and bias in an XGBoost binary classification model predicting the ICU length of stay. The research reveals class imbalances in the dataset across demographic attributes and employs data preprocessing and feature extraction. The paper concludes with recommendations for fairness-aware machine learning techniques for mitigating biases and the need for collaborative efforts among healthcare professionals and data scientists.
Score: 65.268245109828
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper uses the MIMIC-IV dataset to examine the fairness and bias in an XGBoost binary classification model predicting the Intensive Care Unit (ICU) length of stay (LOS). Highlighting the critical role of the ICU in managing critically ill patients, the study addresses the growing strain on ICU capacity. It emphasizes the significance of LOS prediction for resource allocation. The research reveals class imbalances in the dataset across demographic attributes and employs data preprocessing and feature extraction. While the XGBoost model performs well overall, disparities across race and insurance attributes reflect the need for tailored assessments and continuous monitoring. The paper concludes with recommendations for fairness-aware machine learning techniques for mitigating biases and the need for collaborative efforts among healthcare professionals and data scientists.

Related papers

Detecting Dataset Bias in Medical AI: A Generalized and Modality-Agnostic Auditing Framework [8.520644988801243]
latent bias in machine learning datasets can be amplified during training and/or hidden during testing. We present a data modality-agnostic auditing framework for generating targeted hypotheses about sources of bias. We demonstrate the broad applicability and value of our method by analyzing large-scale medical datasets.
arXiv Detail & Related papers (2025-03-13T02:16:48Z)
Machine Learning-Based Prediction of ICU Readmissions in Intracerebral Hemorrhage Patients: Insights from the MIMIC Databases [0.0]
Intracerebral hemorrhage (ICH) is a life-risking condition characterized by bleeding within the brain parenchyma. This study utilized the Medical Information Mart for Intensive Care (MIMIC-III and MIMIC-IV) databases to develop predictive models for ICU readmission risk.
arXiv Detail & Related papers (2025-01-02T10:19:27Z)
Enhancing In-Hospital Mortality Prediction Using Multi-Representational Learning with LLM-Generated Expert Summaries [3.5508427067904864]
In-hospital mortality (IHM) prediction for ICU patients is critical for timely interventions and efficient resource allocation. This study integrates structured physiological data and clinical notes with Large Language Model (LLM)-generated expert summaries to improve IHM prediction accuracy.
arXiv Detail & Related papers (2024-11-25T16:36:38Z)
Multimodal Pretraining of Medical Time Series and Notes [45.89025874396911]
Deep learning models show promise in extracting meaningful patterns, but they require extensive labeled data. We propose a novel approach employing self-supervised pretraining, focusing on the alignment of clinical measurements and notes. In downstream tasks, including in-hospital mortality prediction and phenotyping, our model outperforms baselines in settings where only a fraction of the data is labeled.
arXiv Detail & Related papers (2023-12-11T21:53:40Z)
MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion. It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space. It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
A Survey of the Impact of Self-Supervised Pretraining for Diagnostic Tasks with Radiological Images [71.26717896083433]
Self-supervised pretraining has been observed to be effective at improving feature representations for transfer learning. This review summarizes recent research into its usage in X-ray, computed tomography, magnetic resonance, and ultrasound imaging.
arXiv Detail & Related papers (2023-09-05T19:45:09Z)
Auditing ICU Readmission Rates in an Clinical Database: An Analysis of Risk Factors and Clinical Outcomes [0.0]
This study presents a machine learning pipeline for clinical data classification in the context of a 30-day readmission problem. The fairness audit uncovers disparities in equal opportunity, predictive parity, false positive rate parity, and false negative rate parity criteria. The study suggests the need for collaborative efforts among researchers, policymakers, and practitioners to address bias and fairness in artificial intelligence (AI) systems.
arXiv Detail & Related papers (2023-04-12T17:09:38Z)
On the Importance of Clinical Notes in Multi-modal Learning for EHR Data [0.0]
Previous research has shown that jointly using clinical notes with electronic health record data improved predictive performance for patient monitoring. We first confirm that performance significantly improves over state-of-the-art EHR data models when combining EHR data and clinical notes. We then provide an analysis showing improvements arise almost exclusively from a subset of notes containing broader context on patient state rather than clinician notes.
arXiv Detail & Related papers (2022-12-06T15:18:57Z)
Predicting Patient Readmission Risk from Medical Text via Knowledge Graph Enhanced Multiview Graph Convolution [67.72545656557858]
We propose a new method that uses medical text of Electronic Health Records for prediction. We represent discharge summaries of patients with multiview graphs enhanced by an external knowledge graph. Experimental results prove the effectiveness of our method, yielding state-of-the-art performance.
arXiv Detail & Related papers (2021-12-19T01:45:57Z)
Estimating and Improving Fairness with Adversarial Learning [65.99330614802388]
We propose an adversarial multi-task training strategy to simultaneously mitigate and detect bias in the deep learning-based medical image analysis system. Specifically, we propose to add a discrimination module against bias and a critical module that predicts unfairness within the base classification model. We evaluate our framework on a large-scale public-available skin lesion dataset.
arXiv Detail & Related papers (2021-03-07T03:10:32Z)
MIMIC-IF: Interpretability and Fairness Evaluation of Deep Learning Models on MIMIC-IV Dataset [15.436560770086205]
We focus on MIMIC-IV (Medical Information Mart for Intensive Care, version IV), the largest publicly available healthcare dataset. We conduct comprehensive analyses of dataset representation bias as well as interpretability and prediction fairness of deep learning models for in-hospital mortality prediction.
arXiv Detail & Related papers (2021-02-12T20:28:06Z)
Hemogram Data as a Tool for Decision-making in COVID-19 Management: Applications to Resource Scarcity Scenarios [62.997667081978825]
COVID-19 pandemics has challenged emergency response systems worldwide, with widespread reports of essential services breakdown and collapse of health care structure. This work describes a machine learning model derived from hemogram exam data performed in symptomatic patients. Proposed models can predict COVID-19 qRT-PCR results in symptomatic individuals with high accuracy, sensitivity and specificity.
arXiv Detail & Related papers (2020-05-10T01:45:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.