Meet You Halfway: Explaining Deep Learning Mysteries
- URL: http://arxiv.org/abs/2206.04463v1
- Date: Thu, 9 Jun 2022 12:43:10 GMT
- Title: Meet You Halfway: Explaining Deep Learning Mysteries
- Authors: Oriel BenShmuel
- Abstract summary: We introduce a new conceptual framework attached with a formal description that aims to shed light on the network's behavior.
We clarify: Why do neural networks acquire generalization abilities?
We provide a comprehensive set of experiments that support this new framework, as well as its underlying theory.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks perform exceptionally well on various learning tasks
with state-of-the-art results. While these models are highly expressive and
achieve impressively accurate solutions with excellent generalization
abilities, they are susceptible to minor perturbations. Samples that suffer
such perturbations are known as "adversarial examples". Even though deep
learning is an extensively researched field, many questions about the nature of
deep learning models remain unanswered. In this paper, we introduce a new
conceptual framework, accompanied by a formal description, that aims to shed light
on the network's behavior and interpret what happens behind the scenes of the learning
process. Our framework addresses fundamental questions
concerning deep learning. In particular, we clarify: (1) Why do neural networks
acquire generalization abilities? (2) Why do adversarial examples transfer
between different models? We provide a comprehensive set of experiments that
support this new framework, as well as its underlying theory.
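The abstract's two questions are easiest to picture with a concrete experiment. The sketch below is a minimal illustration, not the paper's code: assuming PyTorch and two hypothetical small MLPs (here left untrained as stand-ins for independently trained models), it crafts a one-step FGSM adversarial example against model A and checks whether the same perturbed input also fools model B, i.e., whether the example transfers.

```python
# Minimal FGSM transferability sketch -- an illustration, not the paper's code.
# Models are untrained stand-ins; in practice both would be trained separately.
import torch
import torch.nn as nn

def fgsm(model, x, y, eps):
    """One-step fast gradient sign method attack on input x with label y."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x_adv), y)
    loss.backward()
    # Step in the direction that increases the loss, keep pixels in [0, 1].
    return (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()

def make_mlp():
    # Arbitrary small classifier for 28x28 grayscale inputs.
    return nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU(),
                         nn.Linear(128, 10))

model_a, model_b = make_mlp(), make_mlp()  # assume: two independently trained models
x = torch.rand(1, 1, 28, 28)               # stand-in for a real test image
y = torch.tensor([3])                      # stand-in for its true label

x_adv = fgsm(model_a, x, y, eps=0.1)       # perturbation crafted against model A only
print("model A:", model_a(x_adv).argmax(dim=1).item())
print("model B:", model_b(x_adv).argmax(dim=1).item())
```

If model B, which never saw the attack, also misclassifies x_adv, the example has transferred; the paper's framework aims to explain why this happens so often in practice.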
Related papers
- Theoretical Understanding of Learning from Adversarial Perturbations [30.759348459463467]
It is not fully understood why adversarial examples can deceive neural networks and transfer between different networks.
We provide a theoretical framework for understanding learning from perturbations using a one-hidden-layer network.
Our results highlight that various adversarial perturbations, even perturbations of a few pixels, contain sufficient class features for generalization.
arXiv Detail & Related papers (2024-02-16T06:22:44Z)
- On information captured by neural networks: connections with memorization and generalization [4.082286997378594]
We study information captured by neural networks during training.
We relate example informativeness to generalization by deriving nonvacuous generalization gap bounds.
Overall, our findings contribute to a deeper understanding of the mechanisms underlying neural network generalization.
arXiv Detail & Related papers (2023-06-28T04:46:59Z)
- Learning to Scaffold: Optimizing Model Explanations for Teaching [74.25464914078826]
We train models on three natural language processing and computer vision tasks.
We find that students trained with explanations extracted by our framework simulate the teacher significantly more effectively than students trained with explanations produced by previous methods.
arXiv Detail & Related papers (2022-04-22T16:43:39Z)
- Information Flow in Deep Neural Networks [0.6922389632860545]
There is no comprehensive theoretical understanding of how deep neural networks work or are structured.
Deep networks are often seen as black boxes with unclear interpretations and reliability.
This work aims to apply principles and techniques from information theory to deep learning models to increase our theoretical understanding and design better algorithms.
arXiv Detail & Related papers (2022-02-10T23:32:26Z)
- Reasoning-Modulated Representations [85.08205744191078]
We study a common setting where our task is not purely opaque.
Our approach paves the way for a new class of data-efficient representation learning methods.
arXiv Detail & Related papers (2021-07-19T13:57:13Z)
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
- Exploring Bayesian Deep Learning for Urgent Instructor Intervention Need in MOOC Forums [58.221459787471254]
Massive Open Online Courses (MOOCs) have become a popular choice for e-learning thanks to their great flexibility.
Due to large numbers of learners and their diverse backgrounds, it is taxing to offer real-time support.
With the large volume of posts and high workloads for MOOC instructors, it is unlikely that the instructors can identify all learners requiring intervention.
This paper explores, for the first time, Bayesian deep learning on learner-based text posts with two methods: Monte Carlo Dropout and Variational Inference (a minimal Monte Carlo Dropout sketch appears after this list).
arXiv Detail & Related papers (2021-04-26T15:12:13Z)
- Explaining Deep Neural Networks [12.100913944042972]
In various domains, such as healthcare, finance, or law, it is critical to know the reasons behind a decision made by an artificial intelligence system.
This thesis investigates two major directions for explaining deep neural networks.
arXiv Detail & Related papers (2020-10-04T07:23:13Z)
- Explainability in Deep Reinforcement Learning [68.8204255655161]
We review recent work toward attaining Explainable Reinforcement Learning (XRL).
In critical situations where it is essential to justify and explain the agent's behaviour, better explainability and interpretability of RL models could help gain scientific insight into the inner workings of what is still considered a black box.
arXiv Detail & Related papers (2020-08-15T10:11:42Z)
- Exploiting Contextual Information with Deep Neural Networks [5.787117733071416]
In this thesis, we show that contextual information can be exploited in 2 fundamentally different ways: implicitly and explicitly.
arXiv Detail & Related papers (2020-06-21T03:40:30Z)
- The large learning rate phase of deep learning: the catapult mechanism [50.23041928811575]
We present a class of neural networks with solvable training dynamics.
We find good agreement between our model's predictions and training dynamics in realistic deep learning settings.
We believe our results shed light on characteristics of models trained at different learning rates (a toy reconstruction of the catapult effect appears after this list).
arXiv Detail & Related papers (2020-03-04T17:52:48Z)
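As noted in the MOOC-forum entry above, here is a minimal Monte Carlo Dropout sketch. It illustrates the general technique, not the paper's implementation: dropout is kept active at inference time, several stochastic forward passes are averaged, and the spread across passes serves as an uncertainty signal for flagging posts. The classifier, input dimension, and pass count are hypothetical.

```python
# Monte Carlo Dropout sketch (illustrative; not the MOOC paper's implementation).
import torch
import torch.nn as nn

class PostClassifier(nn.Module):
    """Hypothetical classifier: does a forum post need urgent intervention?"""
    def __init__(self, dim=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(),
                                 nn.Dropout(p=0.5), nn.Linear(64, 2))

    def forward(self, x):
        return self.net(x)

@torch.no_grad()
def mc_dropout_predict(model, x, passes=30):
    model.train()  # the key trick: keep dropout active at inference time
    probs = torch.stack([torch.softmax(model(x), dim=-1) for _ in range(passes)])
    return probs.mean(dim=0), probs.std(dim=0)  # predictive mean and spread

model = PostClassifier()     # assume: trained on embedded learner posts
x = torch.randn(4, 256)      # stand-in embeddings for 4 forum posts
mean, spread = mc_dropout_predict(model, x)
print(mean)                  # averaged class probabilities per post
print(spread)                # high spread = uncertain; worth routing to an instructor
```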
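As referenced in the catapult entry, here is a toy reconstruction (not the authors' code) of the kind of solvable model that paper studies: a wide two-layer linear network fit to a single example by plain gradient descent. With the learning rate chosen inside the super-critical window between 2/lambda and 4/lambda, where lambda is the initial curvature (NTK eigenvalue), the loss should first grow, the curvature should then drop, and training should converge rather than diverge. The width, seed, and target are arbitrary choices.

```python
# Toy catapult-mechanism reconstruction (illustrative; not the authors' code).
# Wide two-layer linear net f = u.v / sqrt(m), fit to one example with plain GD.
import numpy as np

rng = np.random.default_rng(0)
m = 4096                        # hidden width; wide, so the solvable regime applies
u = rng.standard_normal(m)      # first-layer weights (input fixed at x = 1)
v = rng.standard_normal(m)      # second-layer weights
y = 1.0                         # scalar regression target
lam0 = (u @ u + v @ v) / m      # initial NTK eigenvalue (approx. 2)
eta = 3.0 / lam0                # inside the catapult window (2/lam0, 4/lam0)

for step in range(25):
    f = u @ v / np.sqrt(m)      # network output on the single example
    r = f - y                   # residual
    lam = (u @ u + v @ v) / m   # current curvature
    print(f"step {step:2d}  loss {0.5 * r * r:12.4f}  curvature {lam:.4f}")
    # simultaneous gradient descent update on L = 0.5 * (f - y)^2
    u, v = u - eta * r * v / np.sqrt(m), v - eta * r * u / np.sqrt(m)
```

Expected behavior under these assumptions: the loss grows for the first several steps, the curvature then falls below 2/eta, and the loss subsequently decays, i.e., training "catapults" into a flatter region instead of diverging.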
This list is automatically generated from the titles and abstracts of the papers on this site.