Related papers: Ensuring Medical AI Safety: Explainable AI-Driven Detection and Mitigation of Spurious Model Behavior and Associated Data

Ensuring Medical AI Safety: Explainable AI-Driven Detection and Mitigation of Spurious Model Behavior and Associated Data

URL: http://arxiv.org/abs/2501.13818v1
Date: Thu, 23 Jan 2025 16:39:09 GMT
Title: Ensuring Medical AI Safety: Explainable AI-Driven Detection and Mitigation of Spurious Model Behavior and Associated Data
Authors: Frederik Pahde, Thomas Wiegand, Sebastian Lapuschkin, Wojciech Samek,
Abstract summary: We introduce a semi-automated framework for the identification of spurious behavior from both data and model perspective.<n>This allows the retrieval of spurious data points and the detection of model circuits that encode the associated prediction rules.<n>We show the applicability of our framework using four medical datasets, featuring controlled and real-world spurious correlations.
Score: 14.991686165405959
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep neural networks are increasingly employed in high-stakes medical applications, despite their tendency for shortcut learning in the presence of spurious correlations, which can have potentially fatal consequences in practice. Detecting and mitigating shortcut behavior is a challenging task that often requires significant labeling efforts from domain experts. To alleviate this problem, we introduce a semi-automated framework for the identification of spurious behavior from both data and model perspective by leveraging insights from eXplainable Artificial Intelligence (XAI). This allows the retrieval of spurious data points and the detection of model circuits that encode the associated prediction rules. Moreover, we demonstrate how these shortcut encodings can be used for XAI-based sample- and pixel-level data annotation, providing valuable information for bias mitigation methods to unlearn the undesired shortcut behavior. We show the applicability of our framework using four medical datasets across two modalities, featuring controlled and real-world spurious correlations caused by data artifacts. We successfully identify and mitigate these biases in VGG16, ResNet50, and contemporary Vision Transformer models, ultimately increasing their robustness and applicability for real-world medical tasks.

Related papers

Detecting Dataset Bias in Medical AI: A Generalized and Modality-Agnostic Auditing Framework [8.520644988801243]
latent bias in machine learning datasets can be amplified during training and/or hidden during testing. We present a data modality-agnostic auditing framework for generating targeted hypotheses about sources of bias. We demonstrate the broad applicability and value of our method by analyzing large-scale medical datasets.
arXiv Detail & Related papers (2025-03-13T02:16:48Z)
Analyzing Adversarial Inputs in Deep Reinforcement Learning [53.3760591018817]
We present a comprehensive analysis of the characterization of adversarial inputs, through the lens of formal verification. We introduce a novel metric, the Adversarial Rate, to classify models based on their susceptibility to such perturbations. Our analysis empirically demonstrates how adversarial inputs can affect the safety of a given DRL system with respect to such perturbations.
arXiv Detail & Related papers (2024-02-07T21:58:40Z)
Detecting Spurious Correlations via Robust Visual Concepts in Real and AI-Generated Image Classification [12.992095539058022]
We introduce a general-purpose method that efficiently detects potential spurious correlations. The proposed method provides intuitive explanations while eliminating the need for pixel-level annotations. Our method is also suitable for detecting spurious correlations that may propagate to downstream applications originating from generative models.
arXiv Detail & Related papers (2023-11-03T01:12:35Z)
A Discrepancy Aware Framework for Robust Anomaly Detection [51.710249807397695]
We present a Discrepancy Aware Framework (DAF), which demonstrates robust performance consistently with simple and cheap strategies. Our method leverages an appearance-agnostic cue to guide the decoder in identifying defects, thereby alleviating its reliance on synthetic appearance. Under the simple synthesis strategies, it outperforms existing methods by a large margin. Furthermore, it also achieves the state-of-the-art localization performance.
arXiv Detail & Related papers (2023-10-11T15:21:40Z)
Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data can be rather from some biases in data acquisition. We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training. We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z)
Shortcut Detection with Variational Autoencoders [1.3174512123890016]
We present a novel approach to detect shortcuts in image and audio datasets by leveraging variational autoencoders (VAEs) The disentanglement of features in the latent space of VAEs allows us to discover feature-target correlations in datasets and semi-automatically evaluate them for ML shortcuts. We demonstrate the applicability of our method on several real-world datasets and identify shortcuts that have not been discovered before.
arXiv Detail & Related papers (2023-02-08T18:26:10Z)
Causal Discovery and Knowledge Injection for Contestable Neural Networks (with Appendices) [10.616061367794385]
We propose a two-way interaction whereby neural-network-empowered machines can expose the underpinning learnt causal graphs. We show that our method improves predictive performance up to 2.4x while producing parsimonious networks, up to 7x smaller in the input layer.
arXiv Detail & Related papers (2022-05-19T18:21:12Z)
Towards Unbiased Visual Emotion Recognition via Causal Intervention [63.74095927462]
We propose a novel Emotion Recognition Network (IERN) to alleviate the negative effects brought by the dataset bias. A series of designed tests validate the effectiveness of IERN, and experiments on three emotion benchmarks demonstrate that IERN outperforms other state-of-the-art approaches.
arXiv Detail & Related papers (2021-07-26T10:40:59Z)
Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples. We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
Proactive Pseudo-Intervention: Causally Informed Contrastive Learning For Interpretable Vision Models [103.64435911083432]
We present a novel contrastive learning strategy called it Proactive Pseudo-Intervention (PPI) PPI leverages proactive interventions to guard against image features with no causal relevance. We also devise a novel causally informed salience mapping module to identify key image pixels to intervene, and show it greatly facilitates model interpretability.
arXiv Detail & Related papers (2020-12-06T20:30:26Z)
Unsupervised Multi-Modal Representation Learning for Affective Computing with Multi-Corpus Wearable Data [16.457778420360537]
We propose an unsupervised framework to reduce the reliance on human supervision. The proposed framework utilizes two stacked convolutional autoencoders to learn latent representations from wearable electrocardiogram (ECG) and electrodermal activity (EDA) signals. Our method outperforms current state-of-the-art results that have performed arousal detection on the same datasets.
arXiv Detail & Related papers (2020-08-24T22:01:55Z)

This list is automatically generated from the titles and abstracts of the papers in this site.