Related papers: CheXclusion: Fairness gaps in deep chest X-ray classifiers

URL: http://arxiv.org/abs/2003.00827v2
Date: Fri, 16 Oct 2020 03:26:20 GMT
Title: CheXclusion: Fairness gaps in deep chest X-ray classifiers
Authors: Laleh Seyyed-Kalantari, Guanxiong Liu, Matthew McDermott, Irene Y. Chen, Marzyeh Ghassemi
Abstract summary: We examine the extent to which state-of-the-art deep learning classifiers are biased with respect to protected attributes. We train convolutional neural networks to predict 14 diagnostic labels in 3 prominent public chest X-ray datasets. We find that TPR disparities are not significantly correlated with a subgroup's proportional disease burden.
Score: 4.656202572362684
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Machine learning systems have received much attention recently for their ability to achieve expert-level performance on clinical tasks, particularly in medical imaging. Here, we examine the extent to which state-of-the-art deep learning classifiers trained to yield diagnostic labels from X-ray images are biased with respect to protected attributes. We train convolutional neural networks to predict 14 diagnostic labels in 3 prominent public chest X-ray datasets: MIMIC-CXR, Chest-Xray8, CheXpert, as well as a multi-site aggregation of all those datasets. We evaluate the TPR disparity -- the difference in true positive rates (TPR) -- among different protected attributes such as patient sex, age, race, and insurance type as a proxy for socioeconomic status. We demonstrate that TPR disparities exist in the state-of-the-art classifiers in all datasets, for all clinical tasks, and all subgroups. A multi-source dataset corresponds to the smallest disparities, suggesting one way to reduce bias. We find that TPR disparities are not significantly correlated with a subgroup's proportional disease burden. As clinical models move from papers to products, we encourage clinical decision makers to carefully audit for algorithmic disparities prior to deployment. Our code can be found at https://github.com/LalehSeyyed/CheXclusion
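
The TPR disparity described in the abstract can be illustrated with a minimal sketch. This is not the authors' code (see the repository linked above for that). The function names, the toy data, and the choice to measure each subgroup's gap against the median TPR across subgroups are assumptions made for illustration. With exactly two subgroups, the difference between the two gaps equals the plain difference in their TPRs.

```python
from statistics import median


def tpr_by_group(y_true, y_pred, groups):
    """Return the true positive rate of one binary label for each subgroup.

    Subgroups with no positive cases are omitted, since TPR is undefined there.
    """
    positives = {}
    true_positives = {}
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            positives[group] = positives.get(group, 0) + 1
            true_positives[group] = true_positives.get(group, 0) + int(pred == 1)
    return {g: true_positives[g] / positives[g] for g in positives}


def tpr_disparity(y_true, y_pred, groups):
    """Return each subgroup's TPR minus the median TPR over all subgroups.

    Using the median as the reference point is an assumption of this sketch.
    """
    rates = tpr_by_group(y_true, y_pred, groups)
    reference = median(rates.values())
    return {g: rate - reference for g, rate in rates.items()}


if __name__ == "__main__":
    # Hypothetical toy data: binary labels, binary predictions, patient sex.
    y_true = [1, 1, 1, 1, 0, 1, 1, 0]
    y_pred = [1, 1, 0, 1, 0, 0, 0, 1]
    sex = ["F", "F", "F", "F", "F", "M", "M", "M"]
    print(tpr_by_group(y_true, y_pred, sex))
    print(tpr_disparity(y_true, y_pred, sex))
```

A negative gap means the classifier misses true cases in that subgroup more often than in the reference, which is the kind of underdiagnosis a pre-deployment audit would look for.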

Related papers

Label-Efficient Chest X-ray Diagnosis via Partial CLIP Adaptation [0.0]
This paper proposes a label-efficient strategy for chest X-ray diagnosis. Experiments use the NIH Chest X-ray14 dataset and a pre-trained CLIP ViT-B/32 model.
arXiv Detail & Related papers (2025-07-09T19:57:12Z)
Domain Shift Analysis in Chest Radiographs Classification in a Veterans Healthcare Administration Population [3.4362586245712112]
We used a DenseNet121 model pretrained on the MIMIC-CXR dataset for deep learning-based multilabel classification. We compared the performance of the 14 chest X-ray labels on the MIMIC-CXR and Veterans Healthcare Administration chest X-ray dataset (VA-CXR). The VA-CXR dataset exhibited lower disagreement rates than the MIMIC-CXR datasets.
arXiv Detail & Related papers (2024-07-30T19:23:29Z)
Long-Tailed Classification of Thorax Diseases on Chest X-Ray: A New Benchmark Study [75.05049024176584]
We present a benchmark study of the long-tailed learning problem in the specific domain of thorax diseases on chest X-rays. We focus on learning from naturally distributed chest X-ray data, optimizing classification accuracy over not only the common "head" classes, but also the rare yet critical "tail" classes. The benchmark consists of two chest X-ray datasets for 19- and 20-way thorax disease classification, containing classes with as many as 53,000 and as few as 7 labeled training images.
arXiv Detail & Related papers (2022-08-29T04:34:15Z)
SCALP -- Supervised Contrastive Learning for Cardiopulmonary Disease Classification and Localization in Chest X-rays using Patient Metadata [10.269187107011934]
We introduce an end-to-end framework, SCALP, which extends the self-supervised contrastive approach to a supervised setting. SCALP pulls together chest X-rays from the same patient (positive keys) and pushes apart chest X-rays from different patients (negative keys). Our experiments demonstrate that SCALP outperforms existing baselines with significant margins in both classification and localization tasks.
arXiv Detail & Related papers (2021-10-27T21:38:12Z)
Explaining COVID-19 and Thoracic Pathology Model Predictions by Identifying Informative Input Features [47.45835732009979]
Neural networks have demonstrated remarkable performance in classification and regression tasks on chest X-rays. Feature attribution methods identify the importance of input features for the output prediction. We evaluate our methods using both human-centric (ground-truth-based) interpretability metrics and human-independent feature importance metrics on NIH Chest X-ray8 and BrixIA datasets.
arXiv Detail & Related papers (2021-04-01T11:42:39Z)
Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance. For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming. In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z)
Chest x-ray automated triage: a semiologic approach designed for clinical implementation, exploiting different types of labels through a combination of four Deep Learning architectures [83.48996461770017]
This work presents a Deep Learning method based on the late fusion of different convolutional architectures. We built four training datasets combining images from public chest x-ray datasets and our institutional archive. We trained four different Deep Learning architectures and combined their outputs with a late fusion strategy, obtaining a unified tool.
arXiv Detail & Related papers (2020-12-23T14:38:35Z)
Learning Invariant Feature Representation to Improve Generalization across Chest X-ray Datasets [55.06983249986729]
We show that a deep learning model performing well when tested on the same dataset as training data starts to perform poorly when it is tested on a dataset from a different source. By employing an adversarial training strategy, we show that a network can be forced to learn a source-invariant representation.
arXiv Detail & Related papers (2020-08-04T07:41:15Z)
Deep Mining External Imperfect Data for Chest X-ray Disease Screening [57.40329813850719]
We argue that incorporating an external CXR dataset leads to imperfect training data, which raises challenges. We formulate the multi-label disease classification problem as weighted independent binary tasks according to the categories. Our framework simultaneously models and tackles the domain and label discrepancies, enabling superior knowledge mining ability.
arXiv Detail & Related papers (2020-06-06T06:48:40Z)
Localization of Critical Findings in Chest X-Ray without Local Annotations Using Multi-Instance Learning [0.0]
Deep learning models commonly suffer from a lack of explainability. Deep learning models require locally annotated training data in the form of pixel-level labels or bounding box coordinates. In this work, we address these shortcomings with an interpretable DL algorithm based on multi-instance learning.
arXiv Detail & Related papers (2020-01-23T21:29:14Z)

This list is automatically generated from the titles and abstracts of the papers on this site.