Demographic Bias of Expert-Level Vision-Language Foundation Models in
Medical Imaging
- URL: http://arxiv.org/abs/2402.14815v1
- Date: Thu, 22 Feb 2024 18:59:53 GMT
- Title: Demographic Bias of Expert-Level Vision-Language Foundation Models in
Medical Imaging
- Authors: Yuzhe Yang, Yujia Liu, Xin Liu, Avanti Gulhane, Domenico Mastrodicasa,
Wei Wu, Edward J Wang, Dushyant W Sahani, Shwetak Patel
- Abstract summary: Self-supervised vision-language foundation models can detect a broad spectrum of pathologies without relying on explicit training annotations.
It is crucial to ensure that these AI models do not mirror or amplify human biases, thereby disadvantaging historically marginalized groups such as females or Black patients.
This study investigates the algorithmic fairness of state-of-the-art vision-language foundation models in chest X-ray diagnosis across five globally-sourced datasets.
- Score: 13.141767097232796
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Advances in artificial intelligence (AI) have achieved expert-level
performance in medical imaging applications. Notably, self-supervised
vision-language foundation models can detect a broad spectrum of pathologies
without relying on explicit training annotations. However, it is crucial to
ensure that these AI models do not mirror or amplify human biases, thereby
disadvantaging historically marginalized groups such as females or Black
patients. The manifestation of such biases could systematically delay essential
medical care for certain patient subgroups. In this study, we investigate the
algorithmic fairness of state-of-the-art vision-language foundation models in
chest X-ray diagnosis across five globally-sourced datasets. Our findings
reveal that compared to board-certified radiologists, these foundation models
consistently underdiagnose marginalized groups, with even higher rates seen in
intersectional subgroups, such as Black female patients. Such demographic
biases are present across a wide range of pathologies and demographic
attributes. Further analysis of the model embeddings uncovers significant
encoding of demographic information. Deploying AI systems with these biases in medical
imaging can intensify pre-existing care disparities, posing potential
challenges to equitable healthcare access and raising ethical questions about
their clinical application.
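
To make the embedding analysis concrete, below is a minimal sketch of a linear-probe experiment of the kind commonly used to test whether frozen image embeddings encode a demographic attribute. The split, probe, and metric are illustrative assumptions, not the paper's exact protocol; the embeddings are assumed to come from the frozen foundation model.

```python
# Minimal linear-probe sketch: can a demographic attribute be predicted
# from frozen foundation-model image embeddings? `embeddings` is an
# (N, D) array from the frozen encoder; `labels` is a binary demographic
# attribute (illustrative assumption, not the paper's pipeline).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def probe_demographics(embeddings: np.ndarray, labels: np.ndarray) -> float:
    """Fit a linear probe on frozen embeddings; return held-out AUC."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        embeddings, labels, test_size=0.2, stratify=labels, random_state=0
    )
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return roc_auc_score(y_te, probe.predict_proba(X_te)[:, 1])

# AUC near 0.5 -> little linearly decodable demographic signal;
# AUC approaching 1.0 -> the embeddings strongly encode the attribute.
```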
Related papers
- Potential of Multimodal Large Language Models for Data Mining of Medical Images and Free-text Reports [51.45762396192655]
Multimodal large language models (MLLMs) have recently transformed many domains, significantly affecting the medical field. Notably, Gemini-Vision-series (Gemini) and GPT-4-series (GPT-4) models have epitomized a paradigm shift in Artificial General Intelligence for computer vision.
This study exhaustively evaluated the performance of Gemini, GPT-4, and 4 popular large models across 14 medical imaging datasets.
arXiv Detail & Related papers (2024-07-08T09:08:42Z)
- The Limits of Fair Medical Imaging AI In The Wild [43.97266228706059]
We investigate the extent to which medical AI utilizes demographic encodings.
We confirm that medical imaging AI leverages demographic shortcuts in disease classification.
We find that models with less encoding of demographic attributes are often the most "globally optimal".
arXiv Detail & Related papers (2023-12-11T18:59:50Z)
- Robust and Interpretable Medical Image Classifiers via Concept Bottleneck Models [49.95603725998561]
We propose a new paradigm to build robust and interpretable medical image classifiers with natural language concepts.
Specifically, we first query clinical concepts from GPT-4, then transform latent image features into explicit concepts with a vision-language model (a minimal sketch of this idea follows the entry).
arXiv Detail & Related papers (2023-10-04T21:57:09Z)
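
As a rough illustration of that two-step pipeline, the sketch below scores an image against a few text concepts with a general-purpose CLIP model and feeds the scores to a linear head. The concept strings, image path, and use of CLIP are placeholders; the paper queries clinical concepts from GPT-4 and uses its own vision-language model.

```python
# Concept-bottleneck sketch with a CLIP-style vision-language model:
# image -> similarity to named concepts -> linear classifier over scores.
import torch
from PIL import Image
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Hard-coded examples of what GPT-4-queried clinical concepts might look like.
concepts = ["lung opacity", "enlarged cardiac silhouette", "pleural effusion"]
text_tokens = clip.tokenize(concepts).to(device)

image = preprocess(Image.open("chest_xray.png")).unsqueeze(0).to(device)  # placeholder path
with torch.no_grad():
    img_feat = model.encode_image(image)
    txt_feat = model.encode_text(text_tokens)
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
    concept_scores = img_feat @ txt_feat.T  # one interpretable score per concept

# The bottleneck classifier: a linear layer over concept scores
# (trained on labeled data in practice; untrained here).
classifier = torch.nn.Linear(len(concepts), 1)
logit = classifier(concept_scores.float())
```

Because the classifier sees only named concept scores, its weights can be read directly as per-concept evidence, which is where the interpretability comes from.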
- Generative models improve fairness of medical classifiers under distribution shifts [49.10233060774818]
We show that learning realistic augmentations automatically from data is possible in a label-efficient manner using generative models.
We demonstrate that these learned augmentations can surpass heuristic ones by making models more robust and statistically fair in- and out-of-distribution.
arXiv Detail & Related papers (2023-04-18T18:15:38Z)
- The EMory BrEast imaging Dataset (EMBED): A Racially Diverse, Granular Dataset of 3.5M Screening and Diagnostic Mammograms [2.243792799100692]
The EMory BrEast imaging dataset contains 3,650,000 2D screening and diagnostic mammograms for 116,000 women divided equally between White and African American patients.
Our goal is to share this dataset with research partners to aid in development and validation of breast AI models that will serve all patients fairly and help decrease bias in medical AI.
arXiv Detail & Related papers (2022-02-08T14:40:59Z)
- Generative Residual Attention Network for Disease Detection [51.60842580044539]
We present a novel approach for disease generation in X-rays using conditional generative adversarial learning.
We generate a corresponding radiology image in a target domain while preserving the identity of the patient.
We then use the generated X-ray image in the target domain to augment our training and improve detection performance (a toy sketch of the augmentation idea follows this entry).
arXiv Detail & Related papers (2021-10-25T14:15:57Z)
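
As a toy sketch of that augmentation idea (shapes and an untrained placeholder generator are assumptions; the paper's identity-preserving architecture and losses are not reproduced):

```python
# Toy conditional-generation sketch: translate an X-ray toward a target
# disease label while keeping the input image as identity context, then
# use the output as a training-set augmentation candidate.
import torch
import torch.nn as nn

class CondGenerator(nn.Module):
    def __init__(self, num_classes: int = 2, cond_dim: int = 8):
        super().__init__()
        self.embed = nn.Embedding(num_classes, cond_dim)
        self.net = nn.Sequential(  # toy encoder-decoder over 1x64x64 images
            nn.Conv2d(1 + cond_dim, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Tanh(),
        )

    def forward(self, xray: torch.Tensor, target_label: torch.Tensor):
        b, _, h, w = xray.shape
        # Broadcast the target-label embedding over the image grid.
        cond = self.embed(target_label).view(b, -1, 1, 1).expand(-1, -1, h, w)
        return self.net(torch.cat([xray, cond], dim=1))

gen = CondGenerator()
healthy = torch.randn(4, 1, 64, 64)       # stand-in batch of X-rays
target = torch.ones(4, dtype=torch.long)  # "disease present" condition
synthetic = gen(healthy, target)          # candidates to augment training
```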
- Reading Race: AI Recognises Patient's Racial Identity In Medical Images [9.287449389763413]
There are no known correlates of race in medical imaging that would be obvious to a human expert interpreting the images.
Standard deep learning models can be trained to predict race from medical images with high performance across multiple imaging modalities.
arXiv Detail & Related papers (2021-07-21T21:10:16Z)
- An Interpretable Multiple-Instance Approach for the Detection of referable Diabetic Retinopathy from Fundus Images [72.94446225783697]
We propose a machine learning system for the detection of referable Diabetic Retinopathy in fundus images.
By extracting local information from image patches and combining it efficiently through an attention mechanism, our system achieves high classification accuracy (see the attention-pooling sketch after this entry).
We evaluate our approach on publicly available retinal image datasets, in which it exhibits near state-of-the-art performance.
arXiv Detail & Related papers (2021-03-02T13:14:15Z)
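
A minimal sketch of that patch-attention idea, in the spirit of attention-based multiple-instance learning (Ilse et al., 2018); the feature dimensions are illustrative, and per-patch features are assumed to come from some CNN backbone:

```python
# Attention-based multiple-instance pooling: score each patch embedding,
# softmax the scores into weights, and classify the weighted average.
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    def __init__(self, feat_dim: int = 512, attn_dim: int = 128):
        super().__init__()
        self.attention = nn.Sequential(
            nn.Linear(feat_dim, attn_dim), nn.Tanh(), nn.Linear(attn_dim, 1)
        )
        self.classifier = nn.Linear(feat_dim, 1)  # image-level logit

    def forward(self, patch_feats: torch.Tensor):
        # patch_feats: (num_patches, feat_dim) for one fundus image
        weights = torch.softmax(self.attention(patch_feats), dim=0)
        image_feat = (weights * patch_feats).sum(dim=0)  # weighted average
        return self.classifier(image_feat), weights

model = AttentionMIL()
logit, weights = model(torch.randn(36, 512))  # e.g. a 6x6 grid of patches
```

The returned patch weights show which regions drove the prediction, which is what makes this kind of system interpretable.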
- Deep Co-Attention Network for Multi-View Subspace Learning [73.3450258002607]
We propose a deep co-attention network for multi-view subspace learning.
It aims to extract both the common information and the complementary information in an adversarial setting.
In particular, it uses a novel cross reconstruction loss and leverages the label information to guide the construction of the latent representation.
arXiv Detail & Related papers (2021-02-15T18:46:44Z)
- Risk of Training Diagnostic Algorithms on Data with Demographic Bias [0.5599792629509227]
We conduct a survey of the MICCAI 2018 proceedings to investigate common practices in medical image analysis applications.
Surprisingly, we found that papers focusing on diagnosis rarely describe the demographics of the datasets used.
We show that it is possible to learn unbiased features by explicitly using demographic variables in an adversarial training setup (sketched below).
arXiv Detail & Related papers (2020-05-20T13:51:01Z)
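
A minimal sketch of that adversarial setup, using a gradient reversal layer; the layer sizes and heads are illustrative assumptions, not the paper's architecture:

```python
# Adversarial debiasing sketch: an adversary tries to predict the
# demographic attribute from shared features, while reversed gradients
# push the encoder to remove that information.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None  # flip the gradient sign

encoder = nn.Sequential(nn.Linear(1024, 256), nn.ReLU())  # shared features
diagnosis_head = nn.Linear(256, 1)   # disease logit
adversary_head = nn.Linear(256, 1)   # demographic-attribute logit
bce = nn.BCEWithLogitsLoss()

def training_loss(x, y_disease, y_demo, lam: float = 1.0):
    # x: (B, 1024) inputs; y_*: (B, 1) float {0, 1} targets (assumed shapes)
    z = encoder(x)
    task_loss = bce(diagnosis_head(z), y_disease)
    adv_loss = bce(adversary_head(GradReverse.apply(z, lam)), y_demo)
    return task_loss + adv_loss  # minimized jointly by one optimizer
```

Minimizing this single loss trains the adversary to find demographic signal while the reversed gradient trains the encoder to suppress it.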
- Heterogeneity Loss to Handle Intersubject and Intrasubject Variability in Cancer [11.440201348567681]
Deep learning (DL) models have shown impressive results in the medical domain.
These AI methods can provide immense support to developing nations as affordable healthcare solutions.
This work focuses on one such application: blood cancer diagnosis.
arXiv Detail & Related papers (2020-03-06T16:16:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.