CheXclusion: Fairness gaps in deep chest X-ray classifiers
- URL: http://arxiv.org/abs/2003.00827v2
- Date: Fri, 16 Oct 2020 03:26:20 GMT
- Title: CheXclusion: Fairness gaps in deep chest X-ray classifiers
- Authors: Laleh Seyyed-Kalantari, Guanxiong Liu, Matthew McDermott, Irene Y. Chen, Marzyeh Ghassemi
- Abstract summary: We examine the extent to which state-of-the-art deep learning classifiers are biased with respect to protected attributes.
We train convolutional neural networks to predict 14 diagnostic labels in 3 prominent public chest X-ray datasets.
We find that TPR disparities are not significantly correlated with a subgroup's proportional disease burden.
- Score: 4.656202572362684
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning systems have received much attention recently for their
ability to achieve expert-level performance on clinical tasks, particularly in
medical imaging. Here, we examine the extent to which state-of-the-art deep
learning classifiers trained to yield diagnostic labels from X-ray images are
biased with respect to protected attributes. We train convolutional neural
networks to predict 14 diagnostic labels in 3 prominent public chest X-ray
datasets: MIMIC-CXR, Chest-Xray8, CheXpert, as well as a multi-site aggregation
of all those datasets. We evaluate the TPR disparity -- the difference in true
positive rates (TPR) -- among different protected attributes such as patient
sex, age, race, and insurance type as a proxy for socioeconomic status. We
demonstrate that TPR disparities exist in the state-of-the-art classifiers in
all datasets, for all clinical tasks, and all subgroups. A multi-source dataset
corresponds to the smallest disparities, suggesting one way to reduce bias. We
find that TPR disparities are not significantly correlated with a subgroup's
proportional disease burden. As clinical models move from papers to products,
we encourage clinical decision makers to carefully audit for algorithmic
disparities prior to deployment. Our code can be found at
https://github.com/LalehSeyyed/CheXclusion
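As a rough illustration of the metric described in the abstract, the sketch below computes a per-subgroup true positive rate for one binary label and reports each subgroup's gap from a reference value. The abstract only says that the disparity is a difference in TPRs; using the median TPR across subgroups as the reference is an assumption made here, and the function names and toy data are hypothetical rather than taken from the authors' repository.

```python
from statistics import median


def tpr(y_true, y_pred):
    """True positive rate TP / (TP + FN); None if the group has no positives."""
    preds_on_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not preds_on_positives:
        return None
    return sum(preds_on_positives) / len(preds_on_positives)


def tpr_disparity(y_true, y_pred, groups):
    """Per-subgroup TPR minus the median TPR across subgroups.

    The median reference is an assumption of this sketch, not a detail
    stated in the abstract.
    """
    by_group = {}
    for t, p, g in zip(y_true, y_pred, groups):
        truths, preds = by_group.setdefault(g, ([], []))
        truths.append(t)
        preds.append(p)
    rates = {g: tpr(ts, ps) for g, (ts, ps) in by_group.items()}
    # Drop subgroups with no positive cases, where TPR is undefined.
    rates = {g: r for g, r in rates.items() if r is not None}
    reference = median(rates.values())
    return {g: r - reference for g, r in rates.items()}


# Toy data for a single diagnostic label (1 = finding present / predicted).
y_true = [1, 1, 1, 1, 0, 1, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 0, 0]
sex = ["F", "F", "M", "M", "M", "F", "M", "F"]

gaps = tpr_disparity(y_true, y_pred, sex)
print(gaps)
```

With only two subgroups, the gaps are symmetric around zero, and their difference equals the direct TPR difference between the two groups. A negative gap marks a subgroup whose true cases are missed more often than the reference.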
Related papers
- Label-Efficient Chest X-ray Diagnosis via Partial CLIP Adaptation [0.0]
This paper proposes a label-efficient strategy for chest X-ray diagnosis.
Experiments use the NIH Chest X-ray14 dataset and a pre-trained CLIP ViT-B/32 model.
arXiv Detail & Related papers (2025-07-09T19:57:12Z)
- Domain Shift Analysis in Chest Radiographs Classification in a Veterans Healthcare Administration Population [3.4362586245712112]
We used a DenseNet121 model pretrained on the MIMIC-CXR dataset for deep learning-based multilabel classification.
We compared the performance of the 14 chest X-ray labels on the MIMIC-CXR and Veterans Healthcare Administration chest X-ray dataset (VA-CXR).
The VA-CXR dataset exhibited lower disagreement rates than the MIMIC-CXR datasets.
arXiv Detail & Related papers (2024-07-30T19:23:29Z)
- Long-Tailed Classification of Thorax Diseases on Chest X-Ray: A New Benchmark Study [75.05049024176584]
We present a benchmark study of the long-tailed learning problem in the specific domain of thorax diseases on chest X-rays.
We focus on learning from naturally distributed chest X-ray data, optimizing classification accuracy over not only the common "head" classes, but also the rare yet critical "tail" classes.
The benchmark consists of two chest X-ray datasets for 19- and 20-way thorax disease classification, containing classes with as many as 53,000 and as few as 7 labeled training images.
arXiv Detail & Related papers (2022-08-29T04:34:15Z)
- SCALP -- Supervised Contrastive Learning for Cardiopulmonary Disease Classification and Localization in Chest X-rays using Patient Metadata [10.269187107011934]
We introduce an end-to-end framework, SCALP, which extends the self-supervised contrastive approach to a supervised setting.
SCALP pulls together chest X-rays from the same patient (positive keys) and pushes apart chest X-rays from different patients (negative keys).
Our experiments demonstrate that SCALP outperforms existing baselines with significant margins in both classification and localization tasks.
arXiv Detail & Related papers (2021-10-27T21:38:12Z)
- Explaining COVID-19 and Thoracic Pathology Model Predictions by Identifying Informative Input Features [47.45835732009979]
Neural networks have demonstrated remarkable performance in classification and regression tasks on chest X-rays.
Feature attribution methods identify the importance of input features for the output prediction.
We evaluate our methods using both human-centric (ground-truth-based) interpretability metrics, and human-independent feature importance metrics on NIH Chest X-ray8 and BrixIA datasets.
arXiv Detail & Related papers (2021-04-01T11:42:39Z)
- Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z)
- Chest x-ray automated triage: a semiologic approach designed for clinical implementation, exploiting different types of labels through a combination of four Deep Learning architectures [83.48996461770017]
This work presents a Deep Learning method based on the late fusion of different convolutional architectures.
We built four training datasets combining images from public chest x-ray datasets and our institutional archive.
We trained four different Deep Learning architectures and combined their outputs with a late fusion strategy, obtaining a unified tool.
arXiv Detail & Related papers (2020-12-23T14:38:35Z)
- Learning Invariant Feature Representation to Improve Generalization across Chest X-ray Datasets [55.06983249986729]
We show that a deep learning model performing well when tested on the same dataset as training data starts to perform poorly when it is tested on a dataset from a different source.
By employing an adversarial training strategy, we show that a network can be forced to learn a source-invariant representation.
arXiv Detail & Related papers (2020-08-04T07:41:15Z)
- Deep Mining External Imperfect Data for Chest X-ray Disease Screening [57.40329813850719]
We argue that incorporating an external CXR dataset leads to imperfect training data, which raises challenges.
We formulate the multi-label disease classification problem as weighted independent binary tasks according to the categories.
Our framework simultaneously models and tackles the domain and label discrepancies, enabling superior knowledge mining ability.
arXiv Detail & Related papers (2020-06-06T06:48:40Z)
- Localization of Critical Findings in Chest X-Ray without Local Annotations Using Multi-Instance Learning [0.0]
Deep learning models commonly suffer from a lack of explainability.
Deep learning models require locally annotated training data in the form of pixel-level labels or bounding box coordinates.
In this work, we address these shortcomings with an interpretable DL algorithm based on multi-instance learning.
arXiv Detail & Related papers (2020-01-23T21:29:14Z)