Related papers: Deep Learning Applied to Chest X-Rays: Exploiting and Preventing Shortcuts

URL: http://arxiv.org/abs/2009.10132v1
Date: Mon, 21 Sep 2020 18:52:43 GMT
Title: Deep Learning Applied to Chest X-Rays: Exploiting and Preventing Shortcuts
Authors: Sarah Jabbour, David Fouhey, Ella Kazerooni, Michael W. Sjoding, Jenna Wiens
Abstract summary: This paper studies the case of spurious class skew, in which patients with a particular attribute are spuriously more likely to have the outcome of interest. Applied to chest X-rays, deep nets can accurately identify many patient attributes, including sex (AUROC = 0.96) and age (AUROC >= 0.90), and tend to exploit correlations between such attributes and the outcome label when learning to predict a diagnosis. A simple transfer learning approach is surprisingly effective at preventing the shortcut and promoting good generalization performance.
Score: 11.511323714777298
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: While deep learning has shown promise in improving the automated diagnosis of disease based on chest X-rays, deep networks may exhibit undesirable behavior related to shortcuts. This paper studies the case of spurious class skew in which patients with a particular attribute are spuriously more likely to have the outcome of interest. For instance, clinical protocols might lead to a dataset in which patients with pacemakers are disproportionately likely to have congestive heart failure. This skew can lead to models that take shortcuts by heavily relying on the biased attribute. We explore this problem across a number of attributes in the context of diagnosing the cause of acute hypoxemic respiratory failure. Applied to chest X-rays, we show that i) deep nets can accurately identify many patient attributes including sex (AUROC = 0.96) and age (AUROC >= 0.90), ii) they tend to exploit correlations between such attributes and the outcome label when learning to predict a diagnosis, leading to poor performance when such correlations do not hold in the test population (e.g., everyone in the test set is male), and iii) a simple transfer learning approach is surprisingly effective at preventing the shortcut and promoting good generalization performance. On the task of diagnosing congestive heart failure based on a set of chest X-rays skewed towards older patients (age >= 63), the proposed approach improves generalization over standard training from 0.66 (95% CI: 0.54-0.77) to 0.84 (95% CI: 0.73-0.92) AUROC. While simple, the proposed approach has the potential to improve the performance of models across populations by encouraging reliance on clinically relevant manifestations of disease, i.e., those that a clinician would use to make a diagnosis.
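
To make the notion of spurious class skew concrete, the following is a minimal sketch, not the paper's experiment. It scores each patient by a binary attribute alone (a stand-in for a model that relies entirely on the shortcut) and computes AUROC on three hypothetical cohorts: one where the attribute is skewed toward the positive class, one where it is balanced, and one where every patient has the attribute (analogous to "everyone in the test set is male"). The cohort sizes and the helper names `auroc`, `make_cohort`, and `shortcut_auroc` are assumptions chosen for illustration.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive is scored higher; tied pairs count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def make_cohort(pos_with, pos_without, neg_with, neg_without):
    """Build a list of (label, attribute) pairs from the four cell counts.
    All counts used below are hypothetical, not taken from the paper."""
    return ([(1, 1)] * pos_with + [(1, 0)] * pos_without
            + [(0, 1)] * neg_with + [(0, 0)] * neg_without)


def shortcut_auroc(cohort):
    """AUROC of a pure-shortcut 'model' whose score is the attribute itself."""
    labels = [label for label, _ in cohort]
    scores = [attribute for _, attribute in cohort]
    return auroc(labels, scores)


# Skewed cohort: 90% of positives, but only 10% of negatives, have the attribute.
skewed_auc = shortcut_auroc(make_cohort(90, 10, 10, 90))
# Balanced cohort: the attribute carries no information about the label.
balanced_auc = shortcut_auroc(make_cohort(50, 50, 50, 50))
# Single-attribute cohort: every patient has the attribute.
single_auc = shortcut_auroc(make_cohort(100, 0, 100, 0))

print(f"shortcut AUROC, skewed cohort:           {skewed_auc:.2f}")
print(f"shortcut AUROC, balanced cohort:         {balanced_auc:.2f}")
print(f"shortcut AUROC, single-attribute cohort: {single_auc:.2f}")
```

Under these assumed counts, the shortcut looks strong on the skewed cohort and drops to chance once the correlation no longer holds, which is the failure mode the abstract describes. A real model would mix shortcut and clinical signal, so its degradation would typically be less extreme than in this toy setting.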

Related papers

Artificial Intelligence-Driven Prognostic Classification of COVID-19 Using Chest X-rays: A Deep Learning Approach [0.0]
This study presents a high-accuracy deep learning model for classifying COVID-19 severity (Mild, Moderate, and Severe) using Chest X-ray images. Our model achieved an average accuracy of 97%, with specificity of 99%, sensitivity of 87%, and an F1-score of 93.11%. These results demonstrate the model's potential for real-world clinical applications.
arXiv Detail & Related papers (2025-03-17T15:27:21Z)
Fast-staged CNN Model for Accurate pulmonary diseases and Lung cancer detection [0.0]
This research evaluates a deep learning model designed to detect lung cancer, specifically pulmonary nodules, along with eight other lung pathologies, using chest radiographs. A two-stage classification system, utilizing ensemble methods and transfer learning, is employed to first triage images into Normal or Abnormal. The model achieves notable results in classification, with a top-performing accuracy of 77%, a sensitivity of 0.713, a specificity of 0.776 during external validation, and an AUC score of 0.888.
arXiv Detail & Related papers (2024-12-16T11:47:07Z)
Optimizing Mortality Prediction for ICU Heart Failure Patients: Leveraging XGBoost and Advanced Machine Learning with the MIMIC-III Database [1.5186937600119894]
Heart failure affects millions of people worldwide, significantly reducing quality of life and leading to high mortality rates. Despite extensive research, the relationship between heart failure and mortality rates among ICU patients is not fully understood. This study analyzed data from 1,177 patients over 18 years old from the MIMIC-III database, identified using ICD-9 codes.
arXiv Detail & Related papers (2024-09-03T07:57:08Z)
How Does Pruning Impact Long-Tailed Multi-Label Medical Image Classifiers? [49.35105290167996]
Pruning has emerged as a powerful technique for compressing deep neural networks, reducing memory usage and inference time without significantly affecting overall performance. This work represents a first step toward understanding the impact of pruning on model behavior in deep long-tailed, multi-label medical image classification.
arXiv Detail & Related papers (2023-08-17T20:40:30Z)
Automatic diagnosis of knee osteoarthritis severity using Swin transformer [55.01037422579516]
Knee osteoarthritis (KOA) is a widespread condition that can cause chronic pain and stiffness in the knee joint. We propose an automated approach that employs the Swin Transformer to predict the severity of KOA.
arXiv Detail & Related papers (2023-07-10T09:49:30Z)
Deep Reinforcement Learning Framework for Thoracic Diseases Classification via Prior Knowledge Guidance [49.87607548975686]
The scarcity of labeled data for related diseases poses a huge challenge to an accurate diagnosis. We propose a novel deep reinforcement learning framework, which introduces prior knowledge to direct the learning of diagnostic agents. Our approach's performance was demonstrated using the well-known NIH ChestX-ray14 and CheXpert datasets.
arXiv Detail & Related papers (2023-06-02T01:46:31Z)
Learning to diagnose cirrhosis from radiological and histological labels with joint self and weakly-supervised pretraining strategies [62.840338941861134]
We propose to leverage transfer learning from large datasets annotated by radiologists to predict the histological score available on a small annex dataset. We compare different pretraining methods, namely weakly-supervised and self-supervised ones, to improve the prediction of cirrhosis. This method outperforms the baseline classification of the METAVIR score, reaching an AUC of 0.84 and a balanced accuracy of 0.75.
arXiv Detail & Related papers (2023-02-16T17:06:23Z)
Self-supervised contrastive learning of echocardiogram videos enables label-efficient cardiac disease diagnosis [48.64462717254158]
We developed a self-supervised contrastive learning approach, EchoCLR, catered to echocardiogram videos. When fine-tuned on small portions of labeled data, EchoCLR pretraining significantly improved classification performance for left ventricular hypertrophy (LVH) and aortic stenosis (AS). EchoCLR is unique in its ability to learn representations of medical videos and demonstrates that SSL can enable label-efficient disease classification from small, labeled datasets.
arXiv Detail & Related papers (2022-07-23T19:17:26Z)
A Scalable Workflow to Build Machine Learning Classifiers with Clinician-in-the-Loop to Identify Patients in Specific Diseases [10.658425378457363]
Clinicians may rely on medical coding systems such as the International Classification of Diseases (ICD) to identify patients with diseases from Electronic Health Records (EHRs). Recent studies suggest that ICD codes often cannot characterise patients accurately for specific diseases in real clinical practice. This paper proposes a scalable workflow which leverages both structured data and unstructured textual notes from EHRs with techniques including NLP, AutoML and a Clinician-in-the-Loop mechanism.
arXiv Detail & Related papers (2022-05-18T12:24:07Z)
Similarity-based prediction of Ejection Fraction in Heart Failure Patients [0.0]
We propose a novel data-driven statistical machine learning approach, named Feature Imputation via Local Likelihood (FILL). We test our method using a particularly challenging problem: differentiating heart failure patients with reduced versus preserved ejection fraction (HFrEF and HFpEF, respectively). Despite these difficulties, our method is shown to be capable of inferring heart failure patients with HFpEF with a precision above 80% when considering multiple scenarios.
arXiv Detail & Related papers (2022-03-14T14:19:08Z)
Multi-Label Classification of Thoracic Diseases using Dense Convolutional Network on Chest Radiographs [0.0]
We propose a multi-label disease prediction model that allows the detection of more than one pathology at a given test time. Our proposed model achieved the highest AUC score of 0.896 for the condition Cardiomegaly.
arXiv Detail & Related papers (2022-02-08T00:43:57Z)
A Deep Learning Approach to Predicting Collateral Flow in Stroke Patients Using Radiomic Features from Perfusion Images [58.17507437526425]
Collateral circulation results from specialized anastomotic channels which provide oxygenated blood to regions with compromised blood flow. The actual grading is mostly done through manual inspection of the acquired images. We present a deep learning approach to predicting collateral flow grading in stroke patients based on radiomic features extracted from MR perfusion data.
arXiv Detail & Related papers (2021-10-24T18:58:40Z)
CheXbreak: Misclassification Identification for Deep Learning Models Interpreting Chest X-rays [5.263502842508203]
We first investigate whether there are patient subgroups that chest x-ray models are likely to misclassify. Patient age and the radiographic finding of lung lesion or pneumothorax are statistically relevant features for predicting misclassification for some chest x-ray models. We develop misclassification predictors on chest x-ray models using their outputs and clinical features.
arXiv Detail & Related papers (2021-03-18T00:30:19Z)
Robustness to Spurious Correlations via Human Annotations [100.63051542531171]
We present a framework for making models robust to spurious correlations by leveraging humans' common sense knowledge of causality. Specifically, we use human annotation to augment each training example with a potential unmeasured variable. We then introduce a new distributionally robust optimization objective over unmeasured variables (UV-DRO) to control the worst-case loss over possible test-time shifts.
arXiv Detail & Related papers (2020-07-13T20:05:19Z)

This list is automatically generated from the titles and abstracts of the papers in this site.