An XAI-based Analysis of Shortcut Learning in Neural Networks
- URL: http://arxiv.org/abs/2504.15664v1
- Date: Tue, 22 Apr 2025 07:40:45 GMT
- Title: An XAI-based Analysis of Shortcut Learning in Neural Networks
- Authors: Phuong Quynh Le, Jörg Schlötterer, Christin Seifert
- Abstract summary: We introduce the neuron spurious score to quantify a neuron's dependence on spurious features. Our results show that spurious features are partially disentangled, but the degree of disentanglement varies across model architectures. Our results lay the groundwork for the development of novel methods to mitigate spurious correlations and make AI models safer to use in practice.
- Score: 2.592470112714595
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Machine learning models tend to learn spurious features - features that strongly correlate with target labels but are not causal. Existing approaches to mitigate models' dependence on spurious features work in some cases, but fail in others. In this paper, we systematically analyze how and where neural networks encode spurious correlations. We introduce the neuron spurious score, an XAI-based diagnostic measure to quantify a neuron's dependence on spurious features. We analyze both convolutional neural networks (CNNs) and vision transformers (ViTs) using architecture-specific methods. Our results show that spurious features are partially disentangled, but the degree of disentanglement varies across model architectures. Furthermore, we find that the assumptions behind existing mitigation methods are incomplete. Our results lay the groundwork for the development of novel methods to mitigate spurious correlations and make AI models safer to use in practice.
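The abstract does not reproduce the formal definition of the neuron spurious score; as an illustration only, here is a minimal sketch of one way a per-neuron dependence measure of this kind could be computed, assuming group annotations marking whether the spurious feature is present. The normalization and the exact scoring rule below are our assumptions, not the paper's definition.

```python
# Hypothetical sketch of a per-neuron "spurious score": measures how strongly
# a neuron's activation separates samples with vs. without the spurious feature.
import numpy as np

def neuron_spurious_scores(acts: np.ndarray, spurious: np.ndarray) -> np.ndarray:
    """acts: (n_samples, n_neurons) activations at one layer.
    spurious: (n_samples,) boolean mask, True where the spurious feature is present."""
    a_spur = acts[spurious]           # activations on spurious-feature samples
    a_core = acts[~spurious]          # activations on the remaining samples
    gap = a_spur.mean(axis=0) - a_core.mean(axis=0)
    pooled_std = np.sqrt(0.5 * (a_spur.var(axis=0) + a_core.var(axis=0))) + 1e-8
    return np.abs(gap) / pooled_std   # high score -> neuron tracks the spurious feature

# toy usage: neuron 3 is made to artificially track the spurious group
rng = np.random.default_rng(0)
acts = rng.normal(size=(200, 16))
mask = rng.random(200) < 0.5
acts[mask, 3] += 2.0
print(neuron_spurious_scores(acts, mask).round(2))
```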
Related papers
- Latent Variable Sequence Identification for Cognitive Models with Neural Network Estimators [7.7227297059345466]
We present an approach that extends neural Bayes estimation to learn a direct mapping between experimental data and the targeted latent variable space. Our work underscores that combining recurrent neural networks and simulation-based inference to identify latent variable sequences can enable researchers to access a wider class of cognitive models.
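A hedged sketch of the general recipe as described, with an entirely illustrative toy simulator (the paper's cognitive models and training setup differ): a recurrent network is trained on simulator output to map observed trial sequences directly to latent variable sequences.

```python
# Sketch: simulation-based training of a recurrent "neural Bayes" estimator.
import torch, torch.nn as nn

class LatentSequenceEstimator(nn.Module):
    def __init__(self, obs_dim=1, hidden=32, latent_dim=1):
        super().__init__()
        self.rnn = nn.GRU(obs_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, latent_dim)

    def forward(self, obs):                 # obs: (batch, T, obs_dim)
        h, _ = self.rnn(obs)
        return self.head(h)                 # latent estimate at every time step

def simulate(batch=64, T=50):
    """Toy stand-in for a cognitive-model simulator: the latent state drifts
    as a random walk and observations are noisy readouts of it."""
    z = torch.randn(batch, T, 1).cumsum(dim=1) * 0.1
    x = z + 0.3 * torch.randn_like(z)
    return x, z

model = LatentSequenceEstimator()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(200):                     # train purely on simulated pairs
    x, z = simulate()
    loss = nn.functional.mse_loss(model(x), z)
    opt.zero_grad(); loss.backward(); opt.step()
```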
arXiv Detail & Related papers (2024-06-20T21:13:39Z)
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
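One way to read "neural networks as computational graphs of parameters" is the construction below: neurons become nodes (node feature = bias) and weights become edge features, so networks of different widths share one graph format. This encoding is our assumption for illustration, not the paper's exact scheme.

```python
# Sketch: flatten an MLP into (node features, edge index, edge features)
# that a GNN could consume.
import torch, torch.nn as nn

def mlp_to_graph(mlp: nn.Sequential):
    node_feats, edges, edge_feats, offset = [], [], [], 0
    layers = [m for m in mlp if isinstance(m, nn.Linear)]
    node_feats.extend([0.0] * layers[0].in_features)   # input neurons carry no bias
    for lin in layers:
        in_off, out_off = offset, offset + lin.in_features
        node_feats.extend(lin.bias.tolist())           # one node per output neuron
        for j in range(lin.out_features):
            for i in range(lin.in_features):
                edges.append((in_off + i, out_off + j))
                edge_feats.append(lin.weight[j, i].item())
        offset = out_off
    return (torch.tensor(node_feats).unsqueeze(-1),
            torch.tensor(edges).T,                     # (2, num_edges) connectivity
            torch.tensor(edge_feats).unsqueeze(-1))

nodes, edge_index, edge_attr = mlp_to_graph(
    nn.Sequential(nn.Linear(3, 4), nn.ReLU(), nn.Linear(4, 2)))
print(nodes.shape, edge_index.shape, edge_attr.shape)
```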
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- Neural Dependencies Emerging from Learning Massive Categories [94.77992221690742]
This work presents two astonishing findings on neural networks learned for large-scale image classification.
1) Given a well-trained model, the logits predicted for some category can be directly obtained by linearly combining the predictions of a few other categories.
2) Neural dependencies exist not only within a single model, but even between two independently learned models.
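Finding (1) can be probed with a simple sparse regression: fit one category's logit as a linear combination of all other categories' logits and check how few coefficients suffice. The snippet below uses synthetic logits as a stand-in for a real classifier's outputs.

```python
# Sketch: detect a "neural dependency" for one category via Lasso regression.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
logits = rng.normal(size=(5000, 100))            # pretend: 5000 images, 100 classes
# plant a dependency: class 7's logit is nearly a combination of classes 3 and 42
logits[:, 7] = 0.6 * logits[:, 3] - 0.4 * logits[:, 42] + 0.05 * rng.normal(size=5000)

target = 7
X = np.delete(logits, target, axis=1)            # all other categories' logits
model = Lasso(alpha=0.01).fit(X, logits[:, target])
support = np.nonzero(model.coef_)[0]
print("category 7 is (approximately) a linear combination of", support.size, "others")
```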
arXiv Detail & Related papers (2022-11-21T09:42:15Z)
- Investigating Neuron Disturbing in Fusing Heterogeneous Neural Networks [6.389882065284252]
In this paper, we reveal the phenomenon of neuron disturbing, where neurons from heterogeneous local models interfere with each other.
We propose an experimental method, called AMS, that avoids neuron disturbing by fusing neural networks via adaptive selection of a local model to execute each prediction.
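An illustrative sketch of the idea as we read it: instead of fusing the weights of heterogeneous local models (which can make neurons interfere), route each input to a single adaptively selected local model. The selection rule here (highest softmax confidence) is our assumption, not necessarily AMS's actual criterion.

```python
# Sketch: per-sample adaptive model selection instead of parameter fusion.
import torch, torch.nn as nn

local_models = [nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
                for _ in range(3)]

def ams_predict(x: torch.Tensor) -> torch.Tensor:
    """For each sample, use the local model whose prediction is most confident."""
    with torch.no_grad():
        probs = torch.stack([m(x).softmax(dim=-1) for m in local_models])  # (M, B, C)
        conf = probs.max(dim=-1).values                                    # (M, B)
        chosen = conf.argmax(dim=0)                                        # (B,)
        return probs[chosen, torch.arange(x.shape[0])]                     # per-sample pick

print(ams_predict(torch.randn(5, 8)).shape)  # (5, 4): one distribution per sample
```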
arXiv Detail & Related papers (2022-10-24T06:47:48Z)
- Similarity of Neural Architectures using Adversarial Attack Transferability [47.66096554602005]
We design a quantitative and scalable similarity measure between neural architectures.
We conduct a large-scale analysis on 69 state-of-the-art ImageNet classifiers.
Our results provide insights into why developing diverse neural architectures with distinct components is necessary.
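A hedged sketch of the underlying mechanism: craft adversarial examples on one model (FGSM here, for brevity) and measure how often they also fool the other; higher transferability is read as higher architectural similarity. The paper's actual similarity score may be defined differently.

```python
# Sketch: symmetrized adversarial-transferability similarity between two models.
import torch, torch.nn as nn

def fgsm(model, x, y, eps=0.1):
    """One-step FGSM attack against `model` on inputs x with labels y."""
    x = x.clone().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).detach()

def transfer_rate(src, tgt, x, y):
    """Fraction of src-crafted adversarial examples that also fool tgt."""
    x_adv = fgsm(src, x, y)
    return (tgt(x_adv).argmax(dim=-1) != y).float().mean().item()

model_a = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 5))
model_b = nn.Sequential(nn.Linear(20, 32), nn.Tanh(), nn.Linear(32, 5))
x, y = torch.randn(64, 20), torch.randint(0, 5, (64,))
sim = 0.5 * (transfer_rate(model_a, model_b, x, y) +
             transfer_rate(model_b, model_a, x, y))
print(f"transferability-based similarity: {sim:.2f}")
```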
arXiv Detail & Related papers (2022-10-20T16:56:47Z)
- Gaussian Process Surrogate Models for Neural Networks [6.8304779077042515]
In science and engineering, modeling is a methodology used to understand complex systems whose internal processes are opaque.
We construct a class of surrogate models for neural networks using Gaussian processes.
We demonstrate our approach captures existing phenomena related to the spectral bias of neural networks, and then show that our surrogate models can be used to solve practical problems.
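A sketch under standard NNGP assumptions (not necessarily the paper's exact construction): the infinite-width limit of a one-hidden-layer ReLU network is a Gaussian process with an order-1 arc-cosine kernel, which can serve as a surrogate for the network's function.

```python
# Sketch: GP surrogate for a wide ReLU network via the arc-cosine kernel.
import numpy as np

def arccos_kernel(X, Z):
    """Order-1 arc-cosine kernel: GP covariance of an infinitely wide ReLU layer."""
    nx = np.linalg.norm(X, axis=1)[:, None]
    nz = np.linalg.norm(Z, axis=1)[None, :]
    cos = np.clip((X @ Z.T) / (nx * nz + 1e-12), -1.0, 1.0)
    theta = np.arccos(cos)
    return (nx * nz / np.pi) * (np.sin(theta) + (np.pi - theta) * cos)

# GP regression: posterior mean of the surrogate at test points
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(30, 1))
y = np.sin(2 * X[:, 0]) + 0.1 * rng.normal(size=30)
Xs = np.linspace(-2, 2, 100)[:, None]
K = arccos_kernel(X, X) + 1e-2 * np.eye(30)      # noise term for stability
mean = arccos_kernel(Xs, X) @ np.linalg.solve(K, y)
print(mean[:5].round(3))                          # surrogate's predictions
```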
arXiv Detail & Related papers (2022-08-11T20:17:02Z)
- Causal Discovery and Knowledge Injection for Contestable Neural Networks (with Appendices) [10.616061367794385]
We propose a two-way interaction whereby neural-network-empowered machines can expose the underpinning learnt causal graphs.
We show that our method improves predictive performance up to 2.4x while producing parsimonious networks, up to 7x smaller in the input layer.
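One plausible form of knowledge injection, sketched below as our assumption rather than the paper's exact mechanism: a user-edited causal graph becomes a binary mask on the first layer, pruning inputs deemed non-causal and thereby shrinking the effective input layer.

```python
# Sketch: inject causal knowledge by masking non-causal input connections.
import torch, torch.nn as nn

n_inputs, hidden = 6, 8
# suppose the expert's causal graph says only inputs 0, 2 and 5 cause the target
causal_inputs = torch.tensor([1., 0., 1., 0., 0., 1.])

layer1 = nn.Linear(n_inputs, hidden)
with torch.no_grad():
    layer1.weight *= causal_inputs        # zero out non-causal input connections

# register a hook so gradient updates cannot revive the pruned connections
layer1.weight.register_hook(lambda g: g * causal_inputs)

x = torch.randn(4, n_inputs)
print(layer1(x).shape)                    # network now ignores inputs 1, 3 and 4
```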
arXiv Detail & Related papers (2022-05-19T18:21:12Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
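One diagnostic that fits this setting (our choice, not necessarily the paper's): quantify how localized a neuron's incoming weights are with an inverse participation ratio; convolution-like receptive fields concentrate their mass on a few adjacent inputs.

```python
# Sketch: inverse participation ratio (IPR) as a localization measure
# for a neuron's receptive field; values near 1 mean a single-pixel focus.
import numpy as np

def ipr(w: np.ndarray) -> float:
    p = w**2 / (w**2).sum()
    return float((p**2).sum())

rng = np.random.default_rng(0)
gaussian_rf = rng.normal(size=64)                 # delocalized receptive field
local_rf = np.zeros(64); local_rf[30:33] = 1.0    # localized, convolution-like
print(f"IPR gaussian: {ipr(gaussian_rf):.3f}, IPR local: {ipr(local_rf):.3f}")
```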
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- The Causal Neural Connection: Expressiveness, Learnability, and Inference [125.57815987218756]
An object called structural causal model (SCM) represents a collection of mechanisms and sources of random variation of the system under investigation.
In this paper, we show that the causal hierarchy theorem (Thm. 1, Bareinboim et al., 2020) still holds for neural models.
We introduce a special type of SCM called a neural causal model (NCM), and formalize a new type of inductive bias to encode structural constraints necessary for performing causal inferences.
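A minimal sketch in the spirit of an NCM (the paper's formalization is far more general): each endogenous variable gets its own feedforward mechanism fed by its graph parents plus independent exogenous noise, and interventions replace a mechanism's output.

```python
# Sketch: a toy neural causal model for the graph X -> Y.
import torch, torch.nn as nn

class NCM(nn.Module):
    """Each mechanism is an MLP of its parents and an exogenous noise term."""
    def __init__(self, hidden=16):
        super().__init__()
        self.f_x = nn.Sequential(nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, 1))
        self.f_y = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, n, do_x=None):
        x = self.f_x(torch.randn(n, 1))                       # X = f_x(U_x)
        if do_x is not None:
            x = torch.full_like(x, do_x)                      # intervention do(X = x0)
        y = self.f_y(torch.cat([x, torch.randn(n, 1)], -1))   # Y = f_y(X, U_y)
        return x, y

ncm = NCM()
_, y_obs = ncm(1000)                 # observational samples
_, y_do = ncm(1000, do_x=1.0)        # interventional samples under do(X=1)
print(y_obs.mean().item(), y_do.mean().item())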
arXiv Detail & Related papers (2021-07-02T01:55:18Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
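This is not the actual BAIT algorithm, which optimizes a Fisher-information objective over batches; the heavily simplified stand-in below only illustrates the ingredient of last-layer gradient ("Fisher") embeddings, scoring unlabeled points by embedding norm.

```python
# Sketch: last-layer gradient embeddings as a crude Fisher-mass proxy
# for active-learning acquisition.
import torch, torch.nn as nn

backbone = nn.Sequential(nn.Linear(10, 32), nn.ReLU())
head = nn.Linear(32, 3)

def gradient_embedding(x):
    """Per-sample gradient of the log-likelihood of the predicted class
    with respect to the last layer, flattened into an embedding."""
    z = backbone(x)                              # (B, 32) penultimate features
    p = head(z).softmax(dim=-1)                  # (B, 3) predictive distribution
    onehot = nn.functional.one_hot(p.argmax(-1), 3).float()
    return ((onehot - p).unsqueeze(-1) * z.unsqueeze(1)).flatten(1)  # (B, 3*32)

pool = torch.randn(500, 10)                      # unlabeled pool
scores = gradient_embedding(pool).norm(dim=1)    # simplistic acquisition score
query = scores.topk(16).indices                  # points to label next
print(query.shape)
```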
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Persistent Homology Captures the Generalization of Neural Networks Without A Validation Set [0.0]
We suggest studying the training of neural networks with Algebraic Topology, specifically Persistent Homology.
Using simplicial complex representations of neural networks, we study how the PH diagram distance evolves during the neural network learning process.
Results show that the PH diagram distance between consecutive neural network states correlates with the validation accuracy.
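A sketch of one way to compare the topology of two training states, assuming the ripser and persim packages; the paper's simplicial-complex construction may differ. Neurons are treated as points whose pairwise "distance" shrinks with connection strength.

```python
# Sketch: bottleneck distance between persistence diagrams of two weight states.
import numpy as np
from ripser import ripser      # assumed dependency
from persim import bottleneck  # assumed dependency

def weight_distance_matrix(w: np.ndarray) -> np.ndarray:
    """Layer weights -> symmetric neuron-neuron 'distances':
    strongly connected neurons end up close together."""
    a = np.abs(w) / (np.abs(w).max() + 1e-12)
    n_out, n_in = w.shape
    d = np.ones((n_in + n_out, n_in + n_out))
    d[:n_in, n_in:] = 1 - a.T
    d[n_in:, :n_in] = 1 - a
    np.fill_diagonal(d, 0.0)
    return d

def h0_diagram(dist: np.ndarray) -> np.ndarray:
    dgm = ripser(dist, distance_matrix=True)['dgms'][0]
    return dgm[np.isfinite(dgm[:, 1])]   # drop the one infinite H0 bar

rng = np.random.default_rng(0)
w_before, w_after = rng.normal(size=(8, 5)), rng.normal(size=(8, 5))
d = bottleneck(h0_diagram(weight_distance_matrix(w_before)),
               h0_diagram(weight_distance_matrix(w_after)))
print("PH diagram distance between consecutive states:", d)
```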
arXiv Detail & Related papers (2021-05-31T09:17:31Z)
- Learning Variational Data Assimilation Models and Solvers [34.22350850350653]
We introduce end-to-end neural network architectures for data assimilation.
A key feature of the proposed end-to-end learning architecture is that we may train the NN models using both supervised and unsupervised strategies.
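A hedged sketch of the two training modes described (all names are ours): a neural solver maps sparse observations to a full state estimate, trained either against ground-truth states (supervised) or purely through an observation misfit plus a dynamical-prior residual (unsupervised).

```python
# Sketch: supervised vs. unsupervised training losses for a neural
# data-assimilation solver.
import torch, torch.nn as nn

solver = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 20))
obs_mask = (torch.rand(20) < 0.3).float()       # only ~30% of the state is observed

def supervised_loss(obs, true_state):
    return nn.functional.mse_loss(solver(obs), true_state)

def unsupervised_loss(obs, dynamics):
    x_hat = solver(obs)
    misfit = ((x_hat - obs)**2 * obs_mask).sum() / obs_mask.sum()  # fit observations
    prior = ((x_hat - dynamics(x_hat))**2).mean()                  # respect the dynamics
    return misfit + prior

dynamics = nn.Linear(20, 20)                    # stand-in for a known forecast model
true_state = torch.randn(20)
obs = true_state * obs_mask
print(supervised_loss(obs, true_state).item(),
      unsupervised_loss(obs, dynamics).item())
```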
arXiv Detail & Related papers (2020-07-25T14:28:48Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
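A sketch of the min-max formulation as we understand it (the notation and toy data are ours): estimate f in a conditional moment equation E[(Y - f(X)) g(Z)] = 0 by letting an adversary g search for violated moments, with both players as small NNs trained by simultaneous gradient descent/ascent.

```python
# Sketch: adversarial (min-max) estimation of a structural function on toy
# instrumental-variable data Z -> X -> Y with Y = 2X + noise.
import torch, torch.nn as nn

f = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))   # structural function
g = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))   # adversarial witness
opt_f = torch.optim.Adam(f.parameters(), lr=1e-3)
opt_g = torch.optim.Adam(g.parameters(), lr=1e-3)

z = torch.randn(512, 1)
x = z + 0.3 * torch.randn_like(z)
y = 2 * x + 0.3 * torch.randn_like(x)

for step in range(500):
    moment = ((y - f(x)) * g(z)).mean()        # moment the adversary tries to expose
    game = moment - 0.5 * (g(z)**2).mean()     # regularizer keeps the game bounded
    opt_f.zero_grad(); opt_g.zero_grad()
    game.backward()
    for p in g.parameters():
        p.grad = -p.grad                       # ascent step for the adversary
    opt_f.step(); opt_g.step()

print(f(torch.tensor([[1.0]])).item())         # should move toward 2.0 with training
```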
arXiv Detail & Related papers (2020-07-02T17:55:47Z)