Unlearning Spurious Correlations in Chest X-ray Classification
- URL: http://arxiv.org/abs/2308.01119v2
- Date: Thu, 3 Aug 2023 19:27:16 GMT
- Title: Unlearning Spurious Correlations in Chest X-ray Classification
- Authors: Misgina Tsighe Hagos, Kathleen M. Curran, Brian Mac Namee
- Abstract summary: We train a deep learning model using a Covid-19 chest X-ray dataset.
We show how this dataset can lead to spurious correlations due to unintended confounding regions.
XBL is a deep learning approach that goes beyond interpretability by utilizing model explanations to interactively unlearn spurious correlations.
- Score: 4.039245878626345
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Medical image classification models are frequently trained using training
datasets derived from multiple data sources. While leveraging multiple data
sources is crucial for achieving model generalization, it is important to
acknowledge that the diverse nature of these sources inherently introduces
unintended confounders and other challenges that can impact both model accuracy
and transparency. A notable confounding factor in medical image classification,
particularly in musculoskeletal image classification, is skeletal
maturation-induced bone growth observed during adolescence. We train a deep
learning model using a Covid-19 chest X-ray dataset and we showcase how this
dataset can lead to spurious correlations due to unintended confounding
regions. eXplanation Based Learning (XBL) is a deep learning approach that goes
beyond interpretability by utilizing model explanations to interactively
unlearn spurious correlations. This is achieved by integrating interactive user
feedback, specifically feature annotations. In our study, we employed two
non-demanding manual feedback mechanisms to implement an XBL-based approach for
effectively eliminating these spurious correlations. Our results underscore the
promising potential of XBL in constructing robust models even in the presence
of confounding factors.
Related papers
- Uncertainty-Error correlations in Evidential Deep Learning models for biomedical segmentation [0.0]
Evidential Deep Learning is applied in the context of biomedical image segmentation.
We found that Evidential Deep Learning models with U-Net backbones generally yielded superior correlations between prediction errors and uncertainties.
These superior features of EDL models render them well-suited for segmentation tasks that warrant a critical sensitivity in detecting large model errors.
arXiv Detail & Related papers (2024-10-24T06:16:04Z) - Granularity Matters in Long-Tail Learning [62.30734737735273]
We offer a novel perspective on long-tail learning, inspired by an observation: datasets with finer granularity tend to be less affected by data imbalance.
We introduce open-set auxiliary classes that are visually similar to existing ones, aiming to enhance representation learning for both head and tail classes.
To prevent the overwhelming presence of auxiliary classes from disrupting training, we introduce a neighbor-silencing loss.
arXiv Detail & Related papers (2024-10-21T13:06:21Z) - Stubborn Lexical Bias in Data and Models [50.79738900885665]
We use a new statistical method to examine whether spurious patterns in data appear in models trained on the data.
We apply an optimization approach to *reweight* the training data, reducing thousands of spurious correlations.
Surprisingly, though this method can successfully reduce lexical biases in the training data, we still find strong evidence of corresponding bias in the trained models.
arXiv Detail & Related papers (2023-06-03T20:12:27Z) - Surgical Aggregation: Federated Class-Heterogeneous Learning [4.468858802955592]
We propose surgical aggregation, a federated learning framework for aggregating knowledge from class-heterogeneous datasets.
We evaluate our method using simulated and real-world class-heterogeneous datasets across both independent and identically distributed (iid) and non-iid settings.
arXiv Detail & Related papers (2023-01-17T03:53:29Z) - De-Biasing Generative Models using Counterfactual Methods [0.0]
We propose a new decoder based framework named the Causal Counterfactual Generative Model (CCGM)
Our proposed method combines a causal latent space VAE model with specific modification to emphasize causal fidelity.
We explore how better disentanglement of causal learning and encoding/decoding generates higher causal intervention quality.
arXiv Detail & Related papers (2022-07-04T16:53:20Z) - CXR-FL: Deep Learning-based Chest X-ray Image Analysis Using Federated
Learning [0.0]
We present an evaluation of deep learning-based models for chest X-ray image analysis using the federated learning method.
We show that classification models perform worse if trained on a region of interest reduced to segmentation of the lung compared to the full image.
arXiv Detail & Related papers (2022-04-11T15:47:54Z) - Relational Subsets Knowledge Distillation for Long-tailed Retinal
Diseases Recognition [65.77962788209103]
We propose class subset learning by dividing the long-tailed data into multiple class subsets according to prior knowledge.
It enforces the model to focus on learning the subset-specific knowledge.
The proposed framework proved to be effective for the long-tailed retinal diseases recognition task.
arXiv Detail & Related papers (2021-04-22T13:39:33Z) - Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for
Thoracic Disease Identification [83.6017225363714]
deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z) - Deep Mining External Imperfect Data for Chest X-ray Disease Screening [57.40329813850719]
We argue that incorporating an external CXR dataset leads to imperfect training data, which raises the challenges.
We formulate the multi-label disease classification problem as weighted independent binary tasks according to the categories.
Our framework simultaneously models and tackles the domain and label discrepancies, enabling superior knowledge mining ability.
arXiv Detail & Related papers (2020-06-06T06:48:40Z) - Self-Training with Improved Regularization for Sample-Efficient Chest
X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios.
Our results show that using 85% lesser labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
arXiv Detail & Related papers (2020-05-03T02:36:00Z) - Debiasing Skin Lesion Datasets and Models? Not So Fast [17.668005682385175]
Models learned from data risk learning biases from that same data.
When models learn spurious correlations not found in real-world situations, their deployment for critical tasks, such as medical decisions, can be catastrophic.
We find out that, despite interesting results that point to promising future research, current debiasing methods are not ready to solve the bias issue for skin-lesion models.
arXiv Detail & Related papers (2020-04-23T21:07:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.