Related papers: Be Persistent: Towards a Unified Solution for Mitigating Shortcuts in Deep Learning

URL: http://arxiv.org/abs/2402.11237v2
Date: Mon, 26 Aug 2024 10:39:22 GMT
Title: Be Persistent: Towards a Unified Solution for Mitigating Shortcuts in Deep Learning
Authors: Hadi M. Dolatabadi, Sarah M. Erfani, Christopher Leckie
Abstract summary: Shortcut learning is ubiquitous among many failure cases of neural networks. Finding a unified solution for shortcut learning in DNNs is not out of reach, and TDA can play a significant role in forming such a framework.
Score: 24.200516684111175
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep neural networks (DNNs) are vulnerable to shortcut learning: rather than learning the intended task, they tend to draw inconclusive relationships between their inputs and outputs. Shortcut learning is ubiquitous among many failure cases of neural networks, and traces of this phenomenon can be seen in their generalizability issues, domain shift, adversarial vulnerability, and even bias towards majority groups. In this paper, we argue that this commonality in the cause of various DNN issues creates a significant opportunity that should be leveraged to find a unified solution for shortcut learning. To this end, we outline the recent advances in topological data analysis (TDA), and persistent homology (PH) in particular, to sketch a unified roadmap for detecting shortcuts in deep learning. We demonstrate our arguments by investigating the topological features of computational graphs in DNNs using two cases of unlearnable examples and bias in decision-making as our test studies. Our analysis of these two failure cases of DNNs reveals that finding a unified solution for shortcut learning in DNNs is not out of reach, and TDA can play a significant role in forming such a framework.

Related papers

On Logical Extrapolation for Mazes with Recurrent and Implicit Networks [2.0037131645168396]
We show that the capacity for extrapolation is less robust than previously suggested. We show that while INNs are capable of generalizing to larger maze instances, they fail to generalize along axes of difficulty other than maze size.
arXiv Detail & Related papers (2024-10-03T22:07:51Z)
Causal inference through multi-stage learning and doubly robust deep neural networks [10.021381302215062]
Deep neural networks (DNNs) have demonstrated remarkable empirical performance in large-scale supervised learning problems. This study delves into the application of DNNs across a wide spectrum of intricate causal inference tasks.
arXiv Detail & Related papers (2024-07-11T14:47:44Z)
Supervised Gradual Machine Learning for Aspect Category Detection [0.9857683394266679]
Aspect Category Detection (ACD) aims to identify implicit and explicit aspects in a given review sentence. We propose a novel approach to tackle the ACD task by combining Deep Neural Networks (DNNs) with Gradual Machine Learning (GML) in a supervised setting.
arXiv Detail & Related papers (2024-04-08T07:21:46Z)
Adversarial Attacks to Latent Representations of Distributed Neural Networks in Split Computing [7.6340310234573465]
Distributed deep neural networks (DNNs) have been shown to reduce the computational burden of mobile devices and decrease the end-to-end inference latency in edge computing scenarios. This paper fills the existing research gap by rigorously analyzing the robustness of distributed DNNs against adversarial action.
arXiv Detail & Related papers (2023-09-29T17:01:29Z)
Verification-Aided Deep Ensemble Selection [4.290931412096984]
Deep neural networks (DNNs) have become the technology of choice for realizing a variety of complex tasks. Even an imperceptible perturbation to a correctly classified input can lead to misclassification by a DNN. This paper devises a methodology for identifying ensemble compositions that are less prone to simultaneous errors.
arXiv Detail & Related papers (2022-02-08T14:36:29Z)
Characterizing possible failure modes in physics-informed neural networks [55.83255669840384]
Recent work in scientific machine learning has developed so-called physics-informed neural network (PINN) models. We demonstrate that, while existing PINN methodologies can learn good models for relatively trivial problems, they can easily fail to learn relevant physical phenomena even for simple PDEs. We show that these possible failure modes are not due to the lack of expressivity in the NN architecture, but that the PINN's setup makes the loss landscape very hard to optimize.
arXiv Detail & Related papers (2021-09-02T16:06:45Z)
Reinforcement Learning with External Knowledge by using Logical Neural Networks [67.46162586940905]
A recent neuro-symbolic framework called the Logical Neural Networks (LNNs) can simultaneously provide key-properties of both neural networks and symbolic logic. We propose an integrated method that enables model-free reinforcement learning from external knowledge sources.
arXiv Detail & Related papers (2021-03-03T12:34:59Z)
Deep Neural Networks Are Congestion Games: From Loss Landscape to Wardrop Equilibrium and Beyond [12.622643370707328]
We argue that our work provides a very promising novel tool for analyzing deep neural networks (DNNs). We show how one can benefit from the classic readily available results from the latter when analyzing the former.
arXiv Detail & Related papers (2020-10-21T14:11:40Z)
A Survey on Assessing the Generalization Envelope of Deep Neural Networks: Predictive Uncertainty, Out-of-distribution and Adversarial Samples [77.99182201815763]
Deep Neural Networks (DNNs) achieve state-of-the-art performance on numerous applications. It is difficult to tell beforehand if a DNN receiving an input will deliver the correct output since their decision criteria are usually nontransparent. This survey connects the three fields within the larger framework of investigating the generalization performance of machine learning methods and in particular DNNs.
arXiv Detail & Related papers (2020-08-21T09:12:52Z)
Boosting Deep Neural Networks with Geometrical Prior Knowledge: A Survey [77.99182201815763]
Deep Neural Networks (DNNs) achieve state-of-the-art results in many different problem settings. DNNs are often treated as black box systems, which complicates their evaluation and validation. One promising field, inspired by the success of convolutional neural networks (CNNs) in computer vision tasks, is to incorporate knowledge about symmetric geometrical transformations.
arXiv Detail & Related papers (2020-06-30T14:56:05Z)
Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case [93.37576644429578]
Graph neural networks (GNNs) have made great progress recently on learning from graph-structured data in practice. We provide a theoretically-grounded generalizability analysis of GNNs with one hidden layer for both regression and binary classification problems.
arXiv Detail & Related papers (2020-06-25T00:45:52Z)
Adversarial Attacks and Defenses on Graphs: A Review, A Tool and Empirical Studies [73.39668293190019]
Deep neural networks can be easily fooled by small perturbations on the input, known as adversarial attacks. Graph Neural Networks (GNNs) have been demonstrated to inherit this vulnerability. In this survey, we categorize existing attacks and defenses, and review the corresponding state-of-the-art methods.
arXiv Detail & Related papers (2020-03-02T04:32:38Z)

This list is automatically generated from the titles and abstracts of the papers on this site.