Auditing Algorithmic Fairness in Machine Learning for Health with
Severity-Based LOGAN
- URL: http://arxiv.org/abs/2211.08742v1
- Date: Wed, 16 Nov 2022 08:04:12 GMT
- Title: Auditing Algorithmic Fairness in Machine Learning for Health with
Severity-Based LOGAN
- Authors: Anaelia Ovalle, Sunipa Dev, Jieyu Zhao, Majid Sarrafzadeh, Kai-Wei
Chang
- Abstract summary: We propose supplementing machine learning for health (ML4H) auditing frameworks with SLOGAN (patient Severity-based LOcal Group biAs detectioN), an automatic tool for capturing local biases in a clinical prediction task.
SLOGAN adapts an existing tool, LOGAN (LOcal Group biAs detectioN), by contextualizing group bias detection in patient illness severity and past medical history.
On average, SLOGAN identifies larger fairness disparities than LOGAN in over 75% of patient groups while maintaining clustering quality.
- Score: 70.76142503046782
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Auditing machine learning-based (ML) healthcare tools for bias is critical to
preventing patient harm, especially in communities that disproportionately face
health inequities. General frameworks are becoming increasingly available to
measure ML fairness gaps between groups. However, ML for health (ML4H) auditing
principles call for a contextual, patient-centered approach to model
assessment. Therefore, ML auditing tools must be (1) better aligned with ML4H
auditing principles and (2) able to illuminate and characterize communities
vulnerable to the most harm. To address this gap, we propose supplementing ML4H
auditing frameworks with SLOGAN (patient Severity-based LOcal Group biAs
detectioN), an automatic tool for capturing local biases in a clinical
prediction task. SLOGAN adapts an existing tool, LOGAN (LOcal Group biAs
detectioN), by contextualizing group bias detection in patient illness severity
and past medical history. We investigate and compare SLOGAN's bias detection
capabilities to LOGAN and other clustering techniques across patient subgroups
in the MIMIC-III dataset. On average, SLOGAN identifies larger fairness
disparities than LOGAN in over 75% of patient groups while maintaining
clustering quality. Furthermore, in a diabetes case study, health disparity
literature corroborates the characterizations of the most biased clusters
identified by SLOGAN. Our results contribute to the broader discussion of how
machine learning biases may perpetuate existing healthcare disparities.
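To make the auditing recipe concrete, here is a minimal, hypothetical sketch of severity-contextualized local group bias detection in the spirit of SLOGAN: patients are clustered on features augmented with an illness-severity score and history indicators, and each cluster is scored by how far the model's performance there falls below overall performance. The function name, the severity input, and the accuracy-gap bias score are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch (not the authors' implementation): cluster patients on
# severity-contextualized features, then flag clusters whose model performance
# deviates most from the overall population.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import accuracy_score

def local_group_bias(features, severity, y_true, y_pred, n_clusters=10, seed=0):
    """Return a per-cluster 'bias' score: overall accuracy minus cluster accuracy."""
    features, severity = np.asarray(features), np.asarray(severity)
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)

    # Contextualize clustering with illness severity alongside the other features.
    X = np.column_stack([features, severity.reshape(-1, 1)])
    labels = KMeans(n_clusters=n_clusters, random_state=seed, n_init=10).fit_predict(X)

    overall_acc = accuracy_score(y_true, y_pred)
    report = []
    for c in range(n_clusters):
        mask = labels == c
        cluster_acc = accuracy_score(y_true[mask], y_pred[mask])
        report.append({
            "cluster": c,
            "size": int(mask.sum()),
            "mean_severity": float(severity[mask].mean()),
            "bias": overall_acc - cluster_acc,  # larger => cluster is worse off
        })
    # Most disadvantaged clusters first.
    return sorted(report, key=lambda r: r["bias"], reverse=True)
```

LOGAN-style methods additionally trade bias detection off against clustering quality; the sketch only captures the high-level idea of ranking patient clusters by local performance gaps.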
Related papers
- Aligning (Medical) LLMs for (Counterfactual) Fairness [2.089191490381739]
Large Language Models (LLMs) have emerged as promising solutions for medical and clinical decision support applications.
LLMs are subject to different types of biases, which can lead to unfair treatment of individuals, worsening health disparities, and reducing trust in AI-augmented medical tools.
We present a new model alignment approach for aligning LLMs using a preference optimization method within a knowledge distillation framework.
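As a rough illustration of the preference-optimization objective such alignment methods build on (the paper's exact formulation, which embeds preference optimization in a knowledge distillation framework, may differ), a DPO-style loss can be computed from per-response log-probabilities under the policy and a frozen reference model; all names below are assumptions.

```python
# Hedged sketch of a DPO-style preference loss; not the paper's exact method.
import torch.nn.functional as F

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Preference-optimization loss from summed per-response log-probabilities."""
    # Log-ratio of policy vs. reference for the preferred and rejected responses.
    chosen_ratio = policy_logp_chosen - ref_logp_chosen
    rejected_ratio = policy_logp_rejected - ref_logp_rejected
    # Push the policy to prefer the chosen response more strongly than the reference does.
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()
```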
arXiv Detail & Related papers (2024-08-22T01:11:27Z)
- Cross-Care: Assessing the Healthcare Implications of Pre-training Data on Language Model Bias [3.455189439319919]
We introduce Cross-Care, the first benchmark framework dedicated to assessing biases and real-world knowledge in large language models (LLMs).
We evaluate how demographic biases embedded in pre-training corpora like ThePile influence the outputs of LLMs.
Our results highlight substantial misalignment between LLM representation of disease prevalence and real disease prevalence rates across demographic subgroups.
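A minimal sketch of how such misalignment could be quantified (the benchmark's actual protocol and metrics may differ): compare a model-derived prevalence estimate for each disease-subgroup pair against real-world prevalence, for example via a log-ratio. The dictionaries and numbers below are placeholders.

```python
# Illustrative comparison of model-implied vs. real disease prevalence by subgroup.
import math

def prevalence_gap(model_prev: dict, real_prev: dict) -> dict:
    """Log-ratio of model-implied to real prevalence per (disease, subgroup) key."""
    gaps = {}
    for key, real in real_prev.items():
        model = model_prev.get(key)
        if model and real:
            gaps[key] = math.log(model / real)  # 0 means perfect alignment
    return gaps

# Placeholder numbers purely for illustration.
real = {("diabetes", "group_a"): 0.12, ("diabetes", "group_b"): 0.09}
model = {("diabetes", "group_a"): 0.05, ("diabetes", "group_b"): 0.20}
print(prevalence_gap(model, real))
```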
arXiv Detail & Related papers (2024-05-09T02:33:14Z)
- How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance [64.1656365676171]
Group imbalance has been a known problem in empirical risk minimization.
This paper quantifies the impact of individual groups on the sample complexity, the convergence rate, and the average and group-level testing performance.
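For readers unfamiliar with the setting, the group-imbalance question can be phrased against the standard group-weighted empirical risk below; the paper then quantifies how the group fractions enter the sample complexity, convergence rate, and group-level test error for a one-hidden-layer network. The formula is generic background, not the paper's theorem.

```latex
% Generic group-imbalanced ERM setting (illustrative background, not the paper's result).
% Data come from K groups with empirical fractions p_g = n_g / n; the average risk
% can hide a large risk on a minority group with small p_g.
\widehat{R}(\theta) \;=\; \sum_{g=1}^{K} p_g\, \widehat{R}_g(\theta),
\qquad
\widehat{R}_g(\theta) \;=\; \frac{1}{n_g} \sum_{i \in \mathcal{I}_g} \ell\!\big(f_\theta(x_i),\, y_i\big),
\qquad
p_g = \frac{n_g}{n}.
```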
arXiv Detail & Related papers (2024-03-12T04:38:05Z)
- Evaluating the Fairness of the MIMIC-IV Dataset and a Baseline Algorithm: Application to the ICU Length of Stay Prediction [65.268245109828]
This paper uses the MIMIC-IV dataset to examine the fairness and bias in an XGBoost binary classification model predicting the ICU length of stay.
The research reveals class imbalances in the dataset across demographic attributes and employs data preprocessing and feature extraction.
The paper concludes with recommendations for fairness-aware machine learning techniques for mitigating biases and the need for collaborative efforts among healthcare professionals and data scientists.
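A hedged sketch of the kind of setup described (the study's actual preprocessing, features, and metrics are not reproduced here): an XGBoost binary classifier for a long-stay label with class-imbalance reweighting, evaluated separately per demographic group. Column names and the split are placeholder assumptions.

```python
# Illustrative XGBoost fairness check; column names and threshold are assumptions.
import pandas as pd
from xgboost import XGBClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def audit_long_stay_model(df: pd.DataFrame, feature_cols, label_col="long_stay",
                          group_col="ethnicity"):
    X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
        df[feature_cols], df[label_col], df[group_col], test_size=0.2, random_state=0)

    # Reweight the positive class to counter the class imbalance noted in the paper.
    pos_weight = (y_tr == 0).sum() / max((y_tr == 1).sum(), 1)
    model = XGBClassifier(n_estimators=300, max_depth=4,
                          scale_pos_weight=pos_weight, eval_metric="logloss")
    model.fit(X_tr, y_tr)

    # Report discrimination performance separately for each demographic group.
    scores = model.predict_proba(X_te)[:, 1]
    return {g: roc_auc_score(y_te[g_te == g], scores[g_te == g])
            for g in g_te.unique()}
```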
arXiv Detail & Related papers (2023-12-31T16:01:48Z)
- Auditing ICU Readmission Rates in an Clinical Database: An Analysis of Risk Factors and Clinical Outcomes [0.0]
This study presents a machine learning pipeline for clinical data classification in the context of a 30-day readmission problem.
The fairness audit uncovers disparities in equal opportunity, predictive parity, false positive rate parity, and false negative rate parity criteria.
The study suggests the need for collaborative efforts among researchers, policymakers, and practitioners to address bias and fairness in artificial intelligence (AI) systems.
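The four criteria named above can all be read off per-group confusion-matrix rates; a minimal sketch (arrays and group labels are placeholders) is shown below.

```python
# Per-group confusion-matrix rates behind the fairness criteria named above.
import numpy as np

def group_rates(y_true, y_pred, groups):
    """TPR (equal opportunity), PPV (predictive parity), FPR and FNR per group."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    out = {}
    for g in np.unique(groups):
        t, p = y_true[groups == g], y_pred[groups == g]
        tp = np.sum((t == 1) & (p == 1))
        fp = np.sum((t == 0) & (p == 1))
        fn = np.sum((t == 1) & (p == 0))
        tn = np.sum((t == 0) & (p == 0))
        out[g] = {
            "TPR": tp / max(tp + fn, 1),  # equal opportunity compares this across groups
            "PPV": tp / max(tp + fp, 1),  # predictive parity compares this
            "FPR": fp / max(fp + tn, 1),  # false positive rate parity
            "FNR": fn / max(fn + tp, 1),  # false negative rate parity
        }
    return out
```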
arXiv Detail & Related papers (2023-04-12T17:09:38Z)
- Detecting Shortcut Learning for Fair Medical AI using Shortcut Testing [62.9062883851246]
Machine learning holds great promise for improving healthcare, but it is critical to ensure that its use will not propagate or amplify health disparities.
One potential driver of algorithmic unfairness, shortcut learning, arises when ML models base predictions on improper correlations in the training data.
Using multi-task learning, we propose the first method to assess and mitigate shortcut learning as a part of the fairness assessment of clinical ML systems.
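As a loose illustration only (not the authors' shortcut-testing procedure), one simple way to look for shortcut signal is to probe whether a clinical model's learned representation encodes a sensitive attribute it should not rely on; a high cross-validated probe accuracy is a warning sign.

```python
# Loose illustration of probing for shortcut features; not the paper's method.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def shortcut_probe(representations: np.ndarray, sensitive_attr: np.ndarray) -> float:
    """Cross-validated accuracy of predicting a sensitive attribute from the
    model's internal representation; near-chance accuracy suggests little
    shortcut signal is encoded."""
    probe = LogisticRegression(max_iter=1000)
    return cross_val_score(probe, representations, sensitive_attr, cv=5).mean()
```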
arXiv Detail & Related papers (2022-07-21T09:35:38Z)
- Fair Machine Learning in Healthcare: A Review [90.22219142430146]
We analyze the intersection of fairness in machine learning and healthcare disparities.
We provide a critical review of the associated fairness metrics from a machine learning standpoint.
We propose several new research directions that hold promise for developing ethical and equitable ML applications in healthcare.
arXiv Detail & Related papers (2022-06-29T04:32:10Z)
- Assessing Social Determinants-Related Performance Bias of Machine Learning Models: A case of Hyperchloremia Prediction in ICU Population [6.8473641147443995]
We evaluated four classifiers built to predict Hyperchloremia, a condition that often results from aggressive fluids administration in the ICU population.
We observed that adding social determinants features in addition to the lab-based ones improved model performance on all patients.
We urge future researchers to design models that proactively adjust for potential biases and include subgroup reporting.
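A minimal sketch of the kind of comparison described (feature names, the model, and the split are placeholders, not the study's setup): train the same classifier with and without social-determinants features and report AUROC overall and per demographic subgroup.

```python
# Illustrative with/without social-determinants comparison plus subgroup reporting.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def compare_feature_sets(df: pd.DataFrame, lab_cols, sdoh_cols, label_col, group_col):
    results = {}
    for name, cols in {"labs_only": lab_cols,
                       "labs_plus_sdoh": lab_cols + sdoh_cols}.items():
        X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
            df[cols], df[label_col], df[group_col], test_size=0.2, random_state=0)
        clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
        scores = clf.predict_proba(X_te)[:, 1]
        results[name] = {
            "overall_auroc": roc_auc_score(y_te, scores),
            "subgroup_auroc": {g: roc_auc_score(y_te[g_te == g], scores[g_te == g])
                               for g in g_te.unique()},
        }
    return results
```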
arXiv Detail & Related papers (2021-11-18T03:58:50Z)
- Predictive Modeling of ICU Healthcare-Associated Infections from Imbalanced Data. Using Ensembles and a Clustering-Based Undersampling Approach [55.41644538483948]
This work is focused on both the identification of risk factors and the prediction of healthcare-associated infections in intensive-care units.
The aim is to support decision making addressed at reducing the incidence rate of infections.
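One common form of clustering-based undersampling (a generic sketch under assumed defaults, not necessarily this work's pipeline) clusters the majority class and keeps only representative points before fitting an ensemble on the rebalanced data.

```python
# Illustrative clustering-based undersampling of the majority class for an ensemble.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier

def cluster_undersample_fit(X, y, n_clusters=None, seed=0):
    """Undersample the majority class by keeping the point closest to each
    KMeans centroid, then fit an ensemble on the rebalanced data."""
    X, y = np.asarray(X), np.asarray(y)
    maj, mino = (0, 1) if (y == 0).sum() >= (y == 1).sum() else (1, 0)
    X_maj, X_min = X[y == maj], X[y == mino]

    k = n_clusters or len(X_min)          # one representative per minority sample
    km = KMeans(n_clusters=k, random_state=seed, n_init=10).fit(X_maj)
    dists = km.transform(X_maj)           # distance of each majority point to each centroid
    keep = np.unique(dists.argmin(axis=0))  # nearest majority point per centroid
    X_bal = np.vstack([X_maj[keep], X_min])
    y_bal = np.concatenate([np.full(len(keep), maj), np.full(len(X_min), mino)])

    return RandomForestClassifier(n_estimators=200, random_state=seed).fit(X_bal, y_bal)
```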
arXiv Detail & Related papers (2020-05-07T16:13:12Z)