Write It Like You See It: Detectable Differences in Clinical Notes By
Race Lead To Differential Model Recommendations
- URL: http://arxiv.org/abs/2205.03931v1
- Date: Sun, 8 May 2022 18:24:11 GMT
- Title: Write It Like You See It: Detectable Differences in Clinical Notes By
Race Lead To Differential Model Recommendations
- Authors: Hammaad Adam, Ming Ying Yang, Kenrick Cato, Ioana Baldini, Charles
Senteio, Leo Anthony Celi, Jiaming Zeng, Moninder Singh, Marzyeh Ghassemi
- Abstract summary: We investigate the level of implicit race information available to machine learning models and human experts.
We find that models can identify patient self-reported race from clinical notes even when the notes are stripped of explicit indicators of race.
We show that models trained on these race-redacted clinical notes can still perpetuate existing biases in clinical treatment decisions.
- Score: 15.535251319178379
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Clinical notes are becoming an increasingly important data source for machine
learning (ML) applications in healthcare. Prior research has shown that
deploying ML models can perpetuate existing biases against racial minorities,
as bias can be implicitly embedded in data. In this study, we investigate the
level of implicit race information available to ML models and human experts and
the implications of model-detectable differences in clinical notes. Our work
makes three key contributions. First, we find that models can identify patient
self-reported race from clinical notes even when the notes are stripped of
explicit indicators of race. Second, we determine that human experts are not
able to accurately predict patient race from the same redacted clinical notes.
Finally, we demonstrate the potential harm of this implicit information in a
simulation study, and show that models trained on these race-redacted clinical
notes can still perpetuate existing biases in clinical treatment decisions.
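As a concrete illustration of the first contribution, the sketch below trains a simple bag-of-words probe to predict self-reported race from note text after explicit race terms are masked. It is a minimal sketch only: the redaction rule, the synthetic notes, and the phrase lists are illustrative placeholders, not the paper's pipeline or findings.

```python
# Sketch of a race-prediction probe on redacted notes (toy data, not the
# paper's pipeline; the phrase lists are arbitrary stand-ins, not findings).
import random
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

EXPLICIT_RACE = re.compile(
    r"\b(black|white|african[- ]american|caucasian|asian|hispanic)\b", re.IGNORECASE
)

def redact(note: str) -> str:
    """Mask explicit race indicators; the rest of the note is left as written."""
    return EXPLICIT_RACE.sub("___", note)

# Synthetic toy corpus: wording differences stand in for whatever implicit
# signal real notes carry.
random.seed(0)
phrases = {
    1: ["pt noncompliant with meds", "agitated on arrival", "reports pain"],
    0: ["pt adherent to meds", "calm and cooperative", "reports pain"],
}
notes, labels = [], []
for _ in range(500):
    y = random.randint(0, 1)
    explicit = "black" if y else "white"
    notes.append(f"Pt is a 54 yo {explicit} male. " + "; ".join(random.sample(phrases[y], 2)))
    labels.append(y)

redacted = [redact(n) for n in notes]
X_tr, X_te, y_tr, y_te = train_test_split(redacted, labels, test_size=0.3, random_state=0)

probe = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression(max_iter=1000))
probe.fit(X_tr, y_tr)
print("probe AUROC on redacted notes:",
      round(roc_auc_score(y_te, probe.predict_proba(X_te)[:, 1]), 3))
```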
Related papers
- Fairness Evolution in Continual Learning for Medical Imaging [47.52603262576663]
We study the behavior of Continual Learning (CL) strategies in medical imaging regarding classification performance.
We evaluate the Replay, Learning without Forgetting (LwF), and Pseudo-Label strategies.
LwF and Pseudo-Label exhibit optimal classification performance, but when including fairness metrics in the evaluation, it is clear that Pseudo-Label is less biased.
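A minimal sketch of how a fairness metric could be reported alongside accuracy after a task in a continual-learning sequence; the metric choice (a group accuracy gap) and the toy data are assumptions, not the paper's protocol.

```python
# Sketch: report a group accuracy gap next to overall accuracy after a CL task.
# The metric and the simulated predictions are illustrative assumptions.
import numpy as np

def accuracy_gap(y_true, y_pred, group):
    """Absolute accuracy difference between two demographic groups (0/1 coded)."""
    acc0 = float((y_pred[group == 0] == y_true[group == 0]).mean())
    acc1 = float((y_pred[group == 1] == y_true[group == 1]).mean())
    return abs(acc0 - acc1)

# Toy example: predictions after training on one task of a CL sequence,
# simulating a model that is slightly worse on group 1.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
group = rng.integers(0, 2, size=1000)
flip = rng.random(1000) < np.where(group == 1, 0.30, 0.15)
y_pred = np.where(flip, 1 - y_true, y_true)

print("overall accuracy:", float((y_pred == y_true).mean()))
print("group accuracy gap:", accuracy_gap(y_true, y_pred, group))
```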
arXiv Detail & Related papers (2024-04-10T09:48:52Z)
- Robust and Interpretable Medical Image Classifiers via Concept Bottleneck Models [49.95603725998561]
We propose a new paradigm to build robust and interpretable medical image classifiers with natural language concepts.
Specifically, we first query clinical concepts from GPT-4, then transform latent image features into explicit concepts with a vision-language model.
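A minimal sketch of a concept-bottleneck pipeline under stated assumptions: a vision-language encoder (here a random-projection placeholder) scores images against a list of clinical concepts, and an interpretable linear model is fit on those scores. The concept list and the encoder stubs are illustrative, not the paper's.

```python
# Sketch of a concept-bottleneck classifier; encoders are placeholders for a
# real vision-language model, and the concept list is illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

CONCEPTS = ["irregular border", "asymmetry", "color variegation", "diameter > 6mm"]

rng = np.random.default_rng(0)

def embed_text(concepts):    # placeholder for a VLM text encoder
    return rng.normal(size=(len(concepts), 128))

def embed_images(n_images):  # placeholder for a VLM image encoder
    return rng.normal(size=(n_images, 128))

def concept_scores(img_emb, txt_emb):
    """Cosine similarity between each image and each concept description."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    return img @ txt.T       # shape: (n_images, n_concepts)

X = concept_scores(embed_images(200), embed_text(CONCEPTS))
y = rng.integers(0, 2, size=200)  # toy labels; real labels come from the dataset

clf = LogisticRegression().fit(X, y)
# Each coefficient ties the prediction directly to a named clinical concept.
for concept, weight in zip(CONCEPTS, clf.coef_[0]):
    print(f"{concept:>20s}: {weight:+.3f}")
```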
arXiv Detail & Related papers (2023-10-04T21:57:09Z)
- Making the Most Out of the Limited Context Length: Predictive Power Varies with Clinical Note Type and Note Section [70.37720062263176]
We propose a framework to analyze the sections with high predictive power.
Using MIMIC-III, we show that: 1) predictive power distribution is different between nursing notes and discharge notes and 2) combining different types of notes could improve performance when the context length is large.
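A minimal sketch of a per-section analysis in this spirit: split each note by header and score a separate model per section to see where the predictive signal lives. The headers, toy notes, and model choice are placeholders, not the paper's framework.

```python
# Sketch of per-section predictive-power comparison (toy notes and headers).
import re

import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline

SECTION_RE = re.compile(
    r"^(chief complaint|history of present illness|plan):", re.IGNORECASE | re.MULTILINE
)

def split_sections(note: str) -> dict:
    """Return {section name: text} using simple header matching."""
    parts = SECTION_RE.split(note)  # [preamble, header1, text1, header2, text2, ...]
    return {h.lower(): t.strip() for h, t in zip(parts[1::2], parts[2::2])}

# Toy corpus where the signal happens to live in the HPI section.
rng = np.random.default_rng(0)
notes, labels = [], []
for _ in range(300):
    y = int(rng.integers(0, 2))
    hpi = "worsening shortness of breath" if y else "mild intermittent cough"
    notes.append(f"Chief Complaint: dyspnea\nHistory of Present Illness: {hpi}\nPlan: follow up")
    labels.append(y)
labels = np.array(labels)

for section in ["chief complaint", "history of present illness", "plan"]:
    texts = [split_sections(n).get(section, "") for n in notes]
    model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
    scores = cross_val_predict(model, texts, labels, cv=3, method="predict_proba")[:, 1]
    print(f"{section:>28s}  AUROC = {roc_auc_score(labels, scores):.2f}")
```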
arXiv Detail & Related papers (2023-07-13T20:04:05Z)
- This Patient Looks Like That Patient: Prototypical Networks for Interpretable Diagnosis Prediction from Clinical Text [56.32427751440426]
In clinical practice, such models must not only be accurate but also provide doctors with interpretable and helpful results.
We introduce ProtoPatient, a novel method based on prototypical networks and label-wise attention.
We evaluate the model on two publicly available clinical datasets and show that it outperforms existing baselines.
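A minimal sketch of one plausible reading of the architecture: label-wise attention pools token embeddings into a per-label representation, and each label is scored by its distance to a learned prototype. This is an assumed simplification, not the released ProtoPatient code.

```python
# Sketch of prototypical prediction with label-wise attention (assumed design).
import torch
import torch.nn as nn

class ProtoLabelwise(nn.Module):
    def __init__(self, hidden: int, n_labels: int):
        super().__init__()
        self.attn = nn.Linear(hidden, n_labels)            # one attention head per label
        self.prototypes = nn.Parameter(torch.randn(n_labels, hidden))
        self.scale = nn.Parameter(torch.tensor(1.0))

    def forward(self, token_emb: torch.Tensor) -> torch.Tensor:
        # token_emb: (batch, seq_len, hidden), e.g. from a clinical text encoder
        weights = torch.softmax(self.attn(token_emb), dim=1)           # (B, T, L)
        per_label = torch.einsum("btl,bth->blh", weights, token_emb)   # (B, L, H)
        dist = torch.linalg.norm(per_label - self.prototypes, dim=-1)  # (B, L)
        return torch.sigmoid(-self.scale * dist)  # closer prototype -> higher probability

# Toy usage: random tensors stand in for encoder output.
model = ProtoLabelwise(hidden=64, n_labels=5)
probs = model(torch.randn(2, 128, 64))
print(probs.shape)  # torch.Size([2, 5])
```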
arXiv Detail & Related papers (2022-10-16T10:12:07Z)
- Avoiding Biased Clinical Machine Learning Model Performance Estimates in the Presence of Label Selection [3.3944964838781093]
We describe three classes of label selection and simulate five causally distinct scenarios to assess how particular selection mechanisms bias a suite of commonly reported binary machine learning model performance metrics.
We find that naive estimates of AUROC on the observed population undershoot actual performance by up to 20%.
Such a disparity could be large enough to lead to the wrongful termination of a successful clinical decision support tool.
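A minimal sketch of this kind of simulation, using a single assumed selection mechanism (labels observed more often for higher-scoring patients) rather than the paper's five scenarios: AUROC on the labeled subset is compared with AUROC on the full population.

```python
# Sketch: compare naive AUROC on the labeled subset with full-population AUROC
# under one assumed label-selection mechanism.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 50_000
risk = rng.normal(size=n)                        # latent risk
y = rng.binomial(1, 1 / (1 + np.exp(-risk)))     # true outcome
score = risk + rng.normal(scale=1.0, size=n)     # model's risk score

# Label selection: higher-scoring patients are more likely to receive the test
# that generates a label.
p_tested = 1 / (1 + np.exp(-(score - 1.0)))
tested = rng.random(n) < p_tested

print("AUROC, full population :", round(roc_auc_score(y, score), 3))
print("AUROC, labeled subset  :", round(roc_auc_score(y[tested], score[tested]), 3))
```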
arXiv Detail & Related papers (2022-09-15T22:30:14Z)
- Disparate Censorship & Undertesting: A Source of Label Bias in Clinical Machine Learning [14.133370438685969]
Disparate censorship in patients of equivalent risk leads to undertesting in certain groups, and in turn, more biased labels for such groups.
Our findings call attention to disparate censorship as a source of label bias in clinical ML models.
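A minimal sketch of how undertesting turns into label bias; the rates are toy assumptions, not the paper's numbers.

```python
# Sketch: equal true risk in both groups, but one group is tested less often.
# Untested patients receive a default negative label, so observed labels
# understate risk for the undertested group.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
group = rng.integers(0, 2, size=n)
true_condition = rng.random(n) < 0.20             # equal risk in both groups

test_rate = np.where(group == 0, 0.80, 0.40)      # group 1 is undertested
tested = rng.random(n) < test_rate
observed_label = tested & true_condition          # censored: untested -> negative

for g in (0, 1):
    mask = group == g
    print(f"group {g}: true rate = {true_condition[mask].mean():.3f}, "
          f"observed label rate = {observed_label[mask].mean():.3f}")
```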
arXiv Detail & Related papers (2022-08-01T20:15:31Z)
- What Do You See in this Patient? Behavioral Testing of Clinical NLP Models [69.09570726777817]
We introduce an extendable testing framework that evaluates the behavior of clinical outcome models with respect to changes in the input.
We show that model behavior varies drastically even when fine-tuned on the same data and that allegedly best-performing models have not always learned the most medically plausible patterns.
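A minimal sketch of one behavioral test in this spirit: apply a controlled edit to each note and measure how far the prediction moves. The edit rule and the toy model are placeholders, not the paper's framework.

```python
# Sketch of a single behavioral test (placeholder edit rule and toy model).
import re

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def perturb(note: str) -> str:
    """One illustrative edit; a real test suite would cover many controlled changes."""
    return re.sub(r"\bfemale\b", "male", note)

def behavioral_shift(model, notes):
    before = model.predict_proba(notes)[:, 1]
    after = model.predict_proba([perturb(n) for n in notes])[:, 1]
    return np.abs(after - before)

# Toy stand-in model so the sketch runs end to end.
train_notes = ["female patient with chest pain", "male patient routine visit",
               "female patient routine visit", "male patient with chest pain"]
model = make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(train_notes, [1, 0, 0, 1])

shifts = behavioral_shift(model, ["female patient with chest pain",
                                  "female patient routine visit"])
print("prediction shift per note:", np.round(shifts, 3))
```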
arXiv Detail & Related papers (2021-11-30T15:52:04Z)
- Algorithmic encoding of protected characteristics and its implications on disparities across subgroups [17.415882865534638]
Machine learning models may pick up undesirable correlations between a patient's racial identity and clinical outcome.
Very little is known about how these biases are encoded and how one may reduce or even remove disparate performance.
arXiv Detail & Related papers (2021-10-27T20:30:57Z)
- Reading Race: AI Recognises Patient's Racial Identity In Medical Images [9.287449389763413]
There is no known correlate of racial identity in medical imaging that would be obvious to the human expert interpreting the images.
Standard deep learning models can be trained to predict race from medical images with high performance across multiple imaging modalities.
arXiv Detail & Related papers (2021-07-21T21:10:16Z)
- Interpretable bias mitigation for textual data: Reducing gender bias in patient notes while maintaining classification performance [0.11545092788508224]
We identify and remove gendered language from two clinical-note datasets.
We show minimal degradation in health condition classification tasks for low to medium levels of bias removal via data augmentation.
This work outlines an interpretable approach for using data augmentation to identify and reduce the potential for bias in natural language processing pipelines.
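A minimal sketch of the general idea, assuming a small illustrative term list (not the paper's lexicon): neutralize gendered language and append the neutralized copies to the training set as augmentation.

```python
# Sketch of gendered-language neutralization as data augmentation
# (simplified, illustrative term list).
import re

GENDERED = {
    r"\bshe\b": "the patient", r"\bhe\b": "the patient",
    r"\bher\b": "their", r"\bhis\b": "their",
    r"\bwoman\b": "person", r"\bman\b": "person",
}

def neutralize(note: str) -> str:
    for pattern, replacement in GENDERED.items():
        note = re.sub(pattern, replacement, note, flags=re.IGNORECASE)
    return note

notes = ["She reports chest pain. Her ECG is unremarkable."]
labels = [1]

# Augment: keep the originals and add neutralized copies with the same labels.
augmented_notes = notes + [neutralize(n) for n in notes]
augmented_labels = labels + labels
print(augmented_notes[1])  # "the patient reports chest pain. their ECG is unremarkable."
```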
arXiv Detail & Related papers (2021-03-10T03:09:30Z)
- Estimating and Improving Fairness with Adversarial Learning [65.99330614802388]
We propose an adversarial multi-task training strategy to simultaneously mitigate and detect bias in the deep learning-based medical image analysis system.
Specifically, we propose to add a discrimination module against bias and a critical module that predicts unfairness within the base classification model.
We evaluate our framework on a large-scale publicly available skin lesion dataset.
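A minimal sketch of adversarial debiasing in this spirit, using a gradient-reversal layer as the adversarial mechanism; this is an assumed simplification rather than the paper's exact discrimination and critical modules.

```python
# Sketch: a shared encoder feeds both the task head and a bias head that tries
# to predict the protected attribute; a gradient-reversal layer pushes the
# encoder to discard that information. Simplified, assumed design.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lam * grad_out, None

class DebiasedClassifier(nn.Module):
    def __init__(self, d_in=32, d_hidden=64, lam=1.0):
        super().__init__()
        self.lam = lam
        self.encoder = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.task_head = nn.Linear(d_hidden, 1)   # e.g. lesion malignancy
        self.bias_head = nn.Linear(d_hidden, 1)   # protected attribute

    def forward(self, x):
        z = self.encoder(x)
        return self.task_head(z), self.bias_head(GradReverse.apply(z, self.lam))

# One illustrative training step on random tensors.
model = DebiasedClassifier()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(16, 32)
y_task = torch.randint(0, 2, (16, 1)).float()
y_attr = torch.randint(0, 2, (16, 1)).float()

task_logit, bias_logit = model(x)
loss = nn.functional.binary_cross_entropy_with_logits(task_logit, y_task) \
     + nn.functional.binary_cross_entropy_with_logits(bias_logit, y_attr)
opt.zero_grad()
loss.backward()  # gradient reversal makes the encoder oppose the bias head
opt.step()
print(float(loss))
```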
arXiv Detail & Related papers (2021-03-07T03:10:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.