Deep Learning Applied to Chest X-Rays: Exploiting and Preventing
Shortcuts
- URL: http://arxiv.org/abs/2009.10132v1
- Date: Mon, 21 Sep 2020 18:52:43 GMT
- Authors: Sarah Jabbour, David Fouhey, Ella Kazerooni, Michael W. Sjoding, Jenna
Wiens
- Abstract summary: This paper studies the case of spurious class skew in which patients with a particular attribute are spuriously more likely to have the outcome of interest.
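We show that deep nets can accurately identify many patient attributes from chest X-rays, including sex (AUROC = 0.96) and age (AUROC >= 0.90), and that they tend to exploit correlations between such attributes and the outcome label when learning to predict a diagnosis.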
A simple transfer learning approach is surprisingly effective at preventing the shortcut and promoting good performance.
- Score: 11.511323714777298
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While deep learning has shown promise in improving the automated diagnosis of
disease based on chest X-rays, deep networks may exhibit undesirable behavior
related to shortcuts. This paper studies the case of spurious class skew in
which patients with a particular attribute are spuriously more likely to have
the outcome of interest. For instance, clinical protocols might lead to a
dataset in which patients with pacemakers are disproportionately likely to have
congestive heart failure. This skew can lead to models that take shortcuts by
heavily relying on the biased attribute. We explore this problem across a
number of attributes in the context of diagnosing the cause of acute hypoxemic
respiratory failure. Applied to chest X-rays, we show that i) deep nets can
accurately identify many patient attributes including sex (AUROC = 0.96) and
age (AUROC >= 0.90), ii) they tend to exploit correlations between such
attributes and the outcome label when learning to predict a diagnosis, leading
to poor performance when such correlations do not hold in the test population
(e.g., everyone in the test set is male), and iii) a simple transfer learning
approach is surprisingly effective at preventing the shortcut and promoting
good generalization performance. On the task of diagnosing congestive heart
failure based on a set of chest X-rays skewed towards older patients (age >=
63), the proposed approach improves generalization over standard training from
0.66 (95% CI: 0.54-0.77) to 0.84 (95% CI: 0.73-0.92) AUROC. While simple, the
proposed approach has the potential to improve the performance of models across
populations by encouraging reliance on clinically relevant manifestations of
disease, i.e., those that a clinician would use to make a diagnosis.
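The effect of spurious class skew described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's experimental setup: the population sampler, the skew probabilities (0.9 and 0.1), and the "shortcut" scorer that looks only at the attribute are all hypothetical choices made for illustration. The abstract also does not state how its confidence intervals were computed; a percentile bootstrap is assumed here.

```python
import numpy as np


def auroc(y_true, scores):
    """AUROC as P(score_pos > score_neg) + 0.5 * P(tie), over all pos/neg pairs."""
    y_true = np.asarray(y_true).astype(bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[y_true], scores[~y_true]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUROC needs at least one positive and one negative example")
    diff = pos[:, None] - neg[None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))


def bootstrap_ci(y_true, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC (an assumed method, see lead-in)."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        if y_true[idx].min() == y_true[idx].max():
            continue  # resample contains a single class; AUROC is undefined
        stats.append(auroc(y_true[idx], scores[idx]))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


def sample_population(n, p_outcome_with_attr, p_outcome_without_attr, rng):
    """Hypothetical sampler: binary attribute, outcome rate depends on the attribute."""
    attr = rng.integers(0, 2, size=n)
    p = np.where(attr == 1, p_outcome_with_attr, p_outcome_without_attr)
    outcome = (rng.random(n) < p).astype(int)
    return attr, outcome


rng = np.random.default_rng(42)
# Skewed population: patients with the attribute are far more likely to have the outcome.
attr_skew, y_skew = sample_population(1000, 0.9, 0.1, rng)
# Unskewed population: the attribute carries no information about the outcome.
attr_bal, y_bal = sample_population(1000, 0.5, 0.5, rng)

# A "shortcut" scorer that uses only the attribute as its prediction.
auroc_skewed = auroc(y_skew, attr_skew)
auroc_balanced = auroc(y_bal, attr_bal)
ci_skewed = bootstrap_ci(y_skew, attr_skew)
ci_balanced = bootstrap_ci(y_bal, attr_bal)

print(f"skewed population:   AUROC = {auroc_skewed:.2f}, 95% CI = {ci_skewed}")
print(f"unskewed population: AUROC = {auroc_balanced:.2f}, 95% CI = {ci_balanced}")
```

Under these assumptions, the attribute-only scorer looks strong where the skew holds and falls to roughly chance level where it does not, which is the failure mode the abstract attributes to models that rely on the biased attribute.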
Related papers
- Optimizing Mortality Prediction for ICU Heart Failure Patients: Leveraging XGBoost and Advanced Machine Learning with the MIMIC-III Database [1.5186937600119894]
Heart failure affects millions of people worldwide, significantly reducing quality of life and leading to high mortality rates.
Despite extensive research, the relationship between heart failure and mortality rates among ICU patients is not fully understood.
This study analyzed data from 1,177 patients over 18 years old from the MIMIC-III database, identified using ICD-9 codes.
(2024-09-03T07:57:08Z)
- How Does Pruning Impact Long-Tailed Multi-Label Medical Image Classifiers? [49.35105290167996]
Pruning has emerged as a powerful technique for compressing deep neural networks, reducing memory usage and inference time without significantly affecting overall performance.
This work represents a first step toward understanding the impact of pruning on model behavior in deep long-tailed, multi-label medical image classification.
(2023-08-17T20:40:30Z)
- Automatic diagnosis of knee osteoarthritis severity using Swin transformer [55.01037422579516]
Knee osteoarthritis (KOA) is a widespread condition that can cause chronic pain and stiffness in the knee joint.
We propose an automated approach that employs the Swin Transformer to predict the severity of KOA.
(2023-07-10T09:49:30Z)
- Deep Reinforcement Learning Framework for Thoracic Diseases Classification via Prior Knowledge Guidance [49.87607548975686]
The scarcity of labeled data for related diseases poses a huge challenge to an accurate diagnosis.
We propose a novel deep reinforcement learning framework, which introduces prior knowledge to direct the learning of diagnostic agents.
Our approach's performance was demonstrated using the well-known NIH ChestX-ray 14 and CheXpert datasets.
(2023-06-02T01:46:31Z)
- Learning to diagnose cirrhosis from radiological and histological labels with joint self and weakly-supervised pretraining strategies [62.840338941861134]
We propose to leverage transfer learning from large datasets annotated by radiologists, to predict the histological score available on a small annex dataset.
We compare different pretraining methods, namely weakly-supervised and self-supervised ones, to improve the prediction of cirrhosis.
This method outperforms the baseline classification of the METAVIR score, reaching an AUC of 0.84 and a balanced accuracy of 0.75.
(2023-02-16T17:06:23Z)
- A Scalable Workflow to Build Machine Learning Classifiers with Clinician-in-the-Loop to Identify Patients in Specific Diseases [10.658425378457363]
Clinicians may rely on medical coding systems such as the International Classification of Diseases (ICD) to identify patients with diseases from Electronic Health Records (EHRs).
Recent studies suggest the ICD codes often cannot characterise patients accurately for specific diseases in real clinical practice.
This paper proposes a scalable workflow which leverages both structured data and unstructured textual notes from EHRs with techniques including NLP, AutoML and Clinician-in-the-Loop mechanism.
(2022-05-18T12:24:07Z)
- Similarity-based prediction of Ejection Fraction in Heart Failure Patients [0.0]
We propose a novel data-driven statistical machine learning approach, named Feature Imputation via Local Likelihood (FILL).
We test our method using a particularly challenging problem: differentiating heart failure patients with reduced versus preserved ejection fraction (HFrEF and HFpEF respectively).
Despite these difficulties, our method is shown to be capable of inferring heart failure patients with HFpEF with a precision above 80% when considering multiple scenarios.
(2022-03-14T14:19:08Z)
- Multi-Label Classification of Thoracic Diseases using Dense Convolutional Network on Chest Radiographs [0.0]
We propose a multi-label disease prediction model that allows the detection of more than one pathology at a given test time.
Our proposed model achieved the highest AUC score of 0.896 for the condition Cardiomegaly.
(2022-02-08T00:43:57Z)
- A Deep Learning Approach to Predicting Collateral Flow in Stroke Patients Using Radiomic Features from Perfusion Images [58.17507437526425]
Collateral circulation results from specialized anastomotic channels which provide oxygenated blood to regions with compromised blood flow.
The actual grading is mostly done through manual inspection of the acquired images.
We present a deep learning approach to predicting collateral flow grading in stroke patients based on radiomic features extracted from MR perfusion data.
(2021-10-24T18:58:40Z)
- CheXbreak: Misclassification Identification for Deep Learning Models Interpreting Chest X-rays [5.263502842508203]
We first investigate whether there are patient subgroups that chest x-ray models are likely to misclassify.
Patient age and the radiographic finding of lung lesion or pneumothorax are statistically relevant features for predicting misclassification for some chest x-ray models.
We develop misclassification predictors on chest x-ray models using their outputs and clinical features.
(2021-03-18T00:30:19Z)
- Robustness to Spurious Correlations via Human Annotations [100.63051542531171]
We present a framework for making models robust to spurious correlations by leveraging humans' common sense knowledge of causality.
Specifically, we use human annotation to augment each training example with a potential unmeasured variable.
We then introduce a new distributionally robust optimization objective over unmeasured variables (UV-DRO) to control the worst-case loss over possible test-time shifts.
(2020-07-13T20:05:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.