Improving deep neural network generalization and robustness to
background bias via layer-wise relevance propagation optimization
- URL: http://arxiv.org/abs/2202.00232v7
- Date: Wed, 10 Jan 2024 20:20:39 GMT
- Title: Improving deep neural network generalization and robustness to
background bias via layer-wise relevance propagation optimization
- Authors: Pedro R. A. S. Bassi, Sergio S. J. Dertkigil and Andrea Cavalli
- Abstract summary: Features in images' backgrounds can spuriously correlate with the images' classes, representing background bias.
This bias can yield deep neural networks (DNNs) that perform well on standard evaluation datasets but generalize poorly to real-world data.
We show that optimizing Layer-wise Relevance Propagation (LRP) heatmaps can minimize the influence of background bias on deep classifiers.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Features in images' backgrounds can spuriously correlate with the images'
classes, representing background bias. They can influence the classifier's
decisions, causing shortcut learning (Clever Hans effect). The phenomenon
generates deep neural networks (DNNs) that perform well on standard evaluation
datasets but generalize poorly to real-world data. Layer-wise Relevance
Propagation (LRP) explains DNNs' decisions. Here, we show that the optimization
of LRP heatmaps can minimize the background bias influence on deep classifiers,
hindering shortcut learning. Because it adds no run-time computational cost, the
approach is light and fast. Furthermore, it applies to virtually any
classification architecture. After injecting synthetic bias in images'
backgrounds, we compared our approach (dubbed ISNet) to eight state-of-the-art
DNNs, quantitatively demonstrating its superior robustness to background bias.
Mixed datasets are common for COVID-19 and tuberculosis classification with
chest X-rays, fostering background bias. By focusing on the lungs, the ISNet
reduced shortcut learning. Thus, its generalization performance on external
(out-of-distribution) test databases significantly surpassed all implemented
benchmark models.
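The core idea lends itself to a compact sketch. Below is a minimal PyTorch illustration of a heatmap-penalty loss, assuming a standard classifier and per-image foreground masks; it uses input-times-gradient as a simplified stand-in for the LRP-epsilon heatmaps the ISNet actually optimizes, and `lambda_bg` is a hypothetical weighting hyperparameter, not the paper's API.

```python
import torch
import torch.nn.functional as F

def heatmap_penalty_loss(model, images, labels, fg_masks, lambda_bg=1.0):
    """Classification loss plus a penalty on relevance that lands outside
    the foreground mask. Relevance is approximated by input x gradient
    here; the ISNet optimizes true LRP-epsilon heatmaps instead."""
    images = images.clone().requires_grad_(True)
    logits = model(images)
    cls_loss = F.cross_entropy(logits, labels)

    # Gradient of each image's true-class logit w.r.t. its input pixels.
    selected = logits.gather(1, labels.unsqueeze(1)).sum()
    grads, = torch.autograd.grad(selected, images, create_graph=True)
    relevance = (images * grads).abs().sum(dim=1)          # (B, H, W)

    # Fraction of relevance falling on background pixels (mask == 0).
    bg_share = (relevance * (1.0 - fg_masks)).sum(dim=(1, 2)) \
        / (relevance.sum(dim=(1, 2)) + 1e-8)
    return cls_loss + lambda_bg * bg_share.mean()
```

Since the penalty is computed only during training, inference cost is unchanged, consistent with the abstract's claim of no added run-time cost.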
Related papers
- Bias of Stochastic Gradient Descent or the Architecture: Disentangling the Effects of Overparameterization of Neural Networks [37.02386277426315]
This paper aims to disentangle the factors that influence generalization by studying random and SGD-optimized networks that achieve zero training error.
We experimentally show, in the low-sample regime, that overparameterization in terms of increasing width is beneficial for generalization.
For increasing depth, overparameterization is detrimental for generalization, but random and SGD-optimized networks behave similarly, so this can be attributed to an architectural bias.
arXiv Detail & Related papers (2024-07-04T11:29:50Z)
- Graph Out-of-Distribution Generalization via Causal Intervention [69.70137479660113]
We introduce a conceptually simple yet principled approach for training robust graph neural networks (GNNs) under node-level distribution shifts.
Our method resorts to a new learning objective derived from causal inference that coordinates an environment estimator and a mixture-of-expert GNN predictor.
Our model can effectively enhance generalization under various types of distribution shifts and yields up to a 27.4% accuracy improvement over state-of-the-art methods on graph OOD generalization benchmarks.
arXiv Detail & Related papers (2024-02-18T07:49:22Z)
- Faster ISNet for Background Bias Mitigation on Deep Neural Networks [0.4915744683251149]
Bias or spurious correlations in image backgrounds can impact neural networks, causing shortcut learning and hampering generalization to real-world data.
We propose reformulated architectures whose training time is independent of the number of classes.
We challenge the proposed architectures using synthetic background bias and COVID-19 detection in chest X-rays, an application that commonly presents background bias.
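As a rough illustration of what injecting synthetic background bias can look like (the exact scheme used in the experiments may differ), the toy function below stamps a class-dependent marker onto background pixels, creating a spurious correlation a biased classifier can exploit:

```python
import torch

def inject_background_bias(images, labels, fg_masks, intensity=1.0):
    """Stamps a class-dependent 8x8 marker onto background pixels only,
    so the background spuriously predicts the label. Toy scheme."""
    biased = images.clone()
    _, _, h, w = images.shape
    for i in range(images.size(0)):
        corner = int(labels[i]) % 4                # per-class corner (toy rule)
        y0 = 0 if corner < 2 else h - 8
        x0 = 0 if corner % 2 == 0 else w - 8
        bg = 1.0 - fg_masks[i, y0:y0 + 8, x0:x0 + 8]   # background pixels only
        patch = biased[i, :, y0:y0 + 8, x0:x0 + 8]
        biased[i, :, y0:y0 + 8, x0:x0 + 8] = patch * (1 - bg) + intensity * bg
    return biased
```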
arXiv Detail & Related papers (2024-01-16T14:49:26Z)
- Chasing Fairness in Graphs: A GNN Architecture Perspective [73.43111851492593]
We propose Fair Message Passing (FMP), designed within a unified optimization framework for graph neural networks (GNNs).
In FMP, aggregation is first adopted to utilize neighbors' information, and then the bias-mitigation step explicitly pushes demographic-group node representation centers together.
Experiments on node classification tasks demonstrate that the proposed FMP outperforms several baselines in terms of fairness and accuracy on three real-world datasets.
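A simplified reading of that bias-mitigation step can be sketched as a penalty on the spread of demographic-group representation centers, computed after neighbor aggregation (the names and the `lam` weight below are illustrative, not the paper's API):

```python
import torch

def group_center_penalty(h, groups):
    """Mean squared distance of each demographic-group center from the
    global center; minimizing it pushes the centers together."""
    centers = torch.stack([h[groups == g].mean(dim=0)
                           for g in torch.unique(groups)])   # (G, D)
    return ((centers - centers.mean(dim=0)) ** 2).sum(dim=1).mean()

def fmp_like_layer(x, adj_norm, groups, lam=0.5):
    """Aggregation first, then the fairness penalty on the result."""
    h = adj_norm @ x                       # utilize neighbors' information
    return h, lam * group_center_penalty(h, groups)
```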
arXiv Detail & Related papers (2023-12-19T18:00:15Z)
- CLAD: A Contrastive Learning based Approach for Background Debiasing [43.0296255565593]
We introduce a contrastive learning-based approach to mitigate the background bias in CNNs.
We achieve state-of-the-art results on the Background Challenge dataset, outperforming the previous benchmark by a margin of 4.1%.
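One simplified way to set up such a contrastive objective is to treat an image and its foreground-only view as a positive pair, as in the hedged InfoNCE sketch below (the paper's actual positive/negative construction may differ):

```python
import torch
import torch.nn.functional as F

def background_debias_contrastive(encoder, images, fg_masks, tau=0.1):
    """InfoNCE over (image, foreground-only view) pairs: the encoder is
    rewarded for embeddings that do not depend on the background."""
    fg_only = images * fg_masks.unsqueeze(1)       # zero out background
    z1 = F.normalize(encoder(images), dim=1)       # (B, D)
    z2 = F.normalize(encoder(fg_only), dim=1)      # (B, D)
    logits = z1 @ z2.t() / tau                     # pairwise similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)
```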
arXiv Detail & Related papers (2022-10-06T08:33:23Z)
- Optimal Propagation for Graph Neural Networks [51.08426265813481]
We propose a bi-level optimization approach for learning the optimal graph structure.
We also explore a low-rank approximation model for further reducing the time complexity.
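The low-rank idea can be sketched as factoring the learned propagation matrix so it is never materialized (a sketch of the complexity argument only, not the paper's bi-level training scheme):

```python
import torch

class LowRankPropagation(torch.nn.Module):
    """Propagation matrix factored as U @ V^T: one step costs
    O(n * r * d) instead of the dense O(n^2 * d)."""
    def __init__(self, num_nodes, rank):
        super().__init__()
        scale = rank ** -0.5
        self.u = torch.nn.Parameter(scale * torch.randn(num_nodes, rank))
        self.v = torch.nn.Parameter(scale * torch.randn(num_nodes, rank))

    def forward(self, x):                  # x: (num_nodes, d) node features
        return self.u @ (self.v.t() @ x)   # never builds the n x n matrix
```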
arXiv Detail & Related papers (2022-05-06T03:37:00Z)
- Do We Really Need a Learnable Classifier at the End of Deep Neural Network? [118.18554882199676]
We study the potential of training a neural network for classification with the classifier randomly initialized as a simplex equiangular tight frame (ETF) and fixed during training.
Our experimental results show that this method achieves similar performance on image classification for balanced datasets.
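A simplex ETF head is easy to construct explicitly; the sketch below builds K unit-norm class vectors with pairwise cosine -1/(K-1) and freezes them (a standard construction, though details of the paper's setup may differ):

```python
import torch

def simplex_etf(num_classes, feat_dim):
    """K class vectors of unit norm whose pairwise inner products all
    equal -1/(K-1): a simplex equiangular tight frame."""
    assert feat_dim >= num_classes
    u, _ = torch.linalg.qr(torch.randn(feat_dim, num_classes))  # orthonormal cols
    k = num_classes
    m = torch.eye(k) - torch.ones(k, k) / k
    return (k / (k - 1)) ** 0.5 * (u @ m)          # (feat_dim, K)

# Frozen head: logits = features @ w; w is excluded from the optimizer.
w = simplex_etf(num_classes=10, feat_dim=128).requires_grad_(False)
```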
arXiv Detail & Related papers (2022-03-17T04:34:28Z)
- Contrasting random and learned features in deep Bayesian linear regression [12.234742322758418]
We study how the ability to learn affects the generalization performance of a simple class of models.
By comparing deep random feature models to deep networks in which all layers are trained, we provide a detailed characterization of the interplay between width, depth, data density, and prior mismatch.
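The two model families being contrasted differ only in whether the feature layers are trained, which the sketch below makes concrete (dimensions and hyperparameters are placeholders):

```python
import torch.nn as nn

def make_model(in_dim=784, width=512, depth=3, num_classes=10,
               train_features=True):
    """Deep ReLU network; with train_features=False only the linear
    readout learns, giving the deep random-feature model."""
    layers, d = [], in_dim
    for _ in range(depth):
        layers += [nn.Linear(d, width), nn.ReLU()]
        d = width
    features = nn.Sequential(*layers)
    if not train_features:
        for p in features.parameters():
            p.requires_grad_(False)        # frozen random features
    return nn.Sequential(features, nn.Linear(d, num_classes))
```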
arXiv Detail & Related papers (2022-03-01T15:51:29Z)
- Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z)
- Analytically Tractable Inference in Deep Neural Networks [0.0]
The Tractable Approximate Gaussian Inference (TAGI) algorithm was shown to be a viable and scalable alternative to backpropagation for shallow fully-connected neural networks.
We demonstrate that TAGI matches or exceeds the performance of backpropagation for training classic deep neural network architectures.
arXiv Detail & Related papers (2021-03-09T14:51:34Z)
- Learning from Failure: Training Debiased Classifier from Biased Classifier [76.52804102765931]
We show that neural networks learn to rely on spurious correlation only when it is "easier" to learn than the desired knowledge.
We propose a failure-based debiasing scheme by training a pair of neural networks simultaneously.
Our method significantly improves the training of the network against various types of biases in both synthetic and real-world datasets.
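The scheme is concrete enough to sketch: a "biased" network is trained with generalized cross-entropy so it latches onto the easy shortcut first, and its per-sample failures up-weight the debiased network's loss (a simplified rendering of the method):

```python
import torch
import torch.nn.functional as F

def gce_loss(logits, labels, q=0.7):
    """Generalized cross-entropy: emphasizes easy (bias-aligned) samples."""
    p = F.softmax(logits, dim=1).gather(1, labels.unsqueeze(1)).squeeze(1)
    return ((1 - p.clamp_min(1e-8) ** q) / q).mean()

def lff_like_step(biased_net, debiased_net, x, y):
    """Samples the biased network fails on are up-weighted for the
    debiased network via a relative-difficulty weight."""
    logits_b = biased_net(x)
    ce_b = F.cross_entropy(logits_b, y, reduction="none").detach()
    ce_d = F.cross_entropy(debiased_net(x), y, reduction="none")
    weight = ce_b / (ce_b + ce_d.detach() + 1e-8)
    return gce_loss(logits_b, y) + (weight * ce_d).mean()
```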
arXiv Detail & Related papers (2020-07-06T07:20:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.