The Modern Mathematics of Deep Learning
- URL: http://arxiv.org/abs/2105.04026v1
- Date: Sun, 9 May 2021 21:30:42 GMT
- Title: The Modern Mathematics of Deep Learning
- Authors: Julius Berner, Philipp Grohs, Gitta Kutyniok, Philipp Petersen
- Abstract summary: We describe the new field of mathematical analysis of deep learning.
This field emerged around a list of research questions that were not answered within the classical framework of learning theory.
For selected approaches, we describe the main ideas in more detail.
- Score: 8.939008609565368
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We describe the new field of mathematical analysis of deep learning. This
field emerged around a list of research questions that were not answered within
the classical framework of learning theory. These questions concern: the
outstanding generalization power of overparametrized neural networks, the role
of depth in deep architectures, the apparent absence of the curse of
dimensionality, the surprisingly successful optimization performance despite
the non-convexity of the problem, understanding what features are learned, why
deep architectures perform exceptionally well in physical problems, and which
fine aspects of an architecture affect the behavior of a learning task in which
way. We present an overview of modern approaches that yield partial answers to
these questions. For selected approaches, we describe the main ideas in more
detail.
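To make the first of these questions concrete, here is an informal statement of a standard uniform-convergence bound from classical learning theory (a sketch for orientation, not taken from the paper): the generalization gap is controlled by a complexity measure of the hypothesis class, which renders the bound vacuous for heavily overparametrized networks.

```latex
% Informal uniform-convergence bound: with probability at least 1 - \delta
% over an i.i.d. sample of size n,
\sup_{f \in \mathcal{F}} \Bigl( R(f) - \widehat{R}_n(f) \Bigr)
  \;\le\; C \sqrt{\frac{\mathrm{VCdim}(\mathcal{F}) + \log(1/\delta)}{n}}
% where R is the risk, \widehat{R}_n the empirical risk over the sample,
% and C an absolute constant. For deep networks, \mathrm{VCdim}(\mathcal{F})
% grows with the number of weights, so with far more parameters than
% samples the bound is uninformative -- yet such networks generalize well.
```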
Related papers
- Coding for Intelligence from the Perspective of Category [66.14012258680992]
Coding targets compressing and reconstructing data, while intelligence is often regarded as centering on model learning and prediction.
Recent trends demonstrate the potential homogeneity of these two fields.
We propose a novel problem of Coding for Intelligence from the category theory view.
arXiv Detail & Related papers (2024-07-01T07:05:44Z)
- The Neural Race Reduction: Dynamics of Abstraction in Gated Networks [12.130628846129973]
We introduce the Gated Deep Linear Network framework that schematizes how pathways of information flow impact learning dynamics.
We derive an exact reduction and, for certain cases, exact solutions to the dynamics of learning.
Our work gives rise to general hypotheses relating neural architecture to learning and provides a mathematical approach towards understanding the design of more complex architectures.
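As a toy illustration of this style of analysis, the following minimal numpy sketch runs gradient descent on a plain two-layer linear network (not the paper's Gated Deep Linear Network framework; the diagonal target and the initialization scale are illustrative assumptions). From a small initialization, the end-to-end map picks up the target's singular modes in separate stages, strongest first:

```python
import numpy as np

# Gradient descent on a two-layer linear network f(x) = W2 @ W1 @ x,
# fit to a linear target with two singular values of different strength.
# Analyses of deep linear networks predict stage-like, sigmoidal learning:
# the stronger mode is learned first, the weaker one later.
rng = np.random.default_rng(0)
target = np.diag([3.0, 0.5])               # two modes of different strength
W1 = 0.01 * rng.standard_normal((2, 2))    # small random initialization
W2 = 0.01 * rng.standard_normal((2, 2))
lr = 0.05

for step in range(400):
    err = W2 @ W1 - target                 # gradient of 0.5 * ||W2 W1 - target||_F^2
    W1, W2 = W1 - lr * (W2.T @ err), W2 - lr * (err @ W1.T)
    if step % 100 == 0:
        sv = np.linalg.svd(W2 @ W1, compute_uv=False)
        print(f"step {step:3d}: singular values {np.round(sv, 3)}")
```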
arXiv Detail & Related papers (2022-07-21T12:01:03Z)
- Theoretical Perspectives on Deep Learning Methods in Inverse Problems [115.93934028666845]
We focus on generative priors, untrained neural network priors, and unfolding algorithms.
In addition to summarizing existing results in these topics, we highlight several ongoing challenges and open problems.
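A rough numpy sketch of the generative-prior idea (the random untrained two-layer generator, the Gaussian measurement matrix, and the step size below are illustrative assumptions, not constructions from the paper): rather than searching over the signal itself, one descends on the latent code of a fixed generator, so the reconstruction is confined to the generator's range even though the measurements alone are underdetermined.

```python
import numpy as np

# Reconstruct a signal from m < n linear measurements y = A @ x_true by
# minimizing 0.5 * ||A @ G(z) - y||^2 over the latent code z of a fixed,
# randomly weighted generator G (a stand-in for a trained generative model).
rng = np.random.default_rng(1)
n, m, k = 100, 30, 5                       # signal dim, measurements, latent dim
W1 = rng.standard_normal((50, k)) / np.sqrt(k)
W2 = rng.standard_normal((n, 50)) / np.sqrt(50)

def G(z):                                  # fixed two-layer tanh generator
    return W2 @ np.tanh(W1 @ z)

x_true = G(rng.standard_normal(k))         # ground truth lies in G's range
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x_true                             # underdetermined measurements

z = np.zeros(k)
for _ in range(5000):                      # plain gradient descent on z
    h = np.tanh(W1 @ z)
    r = A @ (W2 @ h) - y                   # residual A G(z) - y
    z -= 1e-3 * (W1.T @ ((W2.T @ (A.T @ r)) * (1.0 - h ** 2)))

rel = np.linalg.norm(G(z) - x_true) / np.linalg.norm(x_true)
print(f"relative reconstruction error: {rel:.3f}")  # typically small in this toy setting
```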
arXiv Detail & Related papers (2022-06-29T02:37:50Z)
- Heuristic Search Planning with Deep Neural Networks using Imitation, Attention and Curriculum Learning [1.0323063834827413]
This paper presents a network model that learns a heuristic capable of relating distant parts of the state space via optimal plan imitation.
To counter the method's limitations, we demonstrate the use of curriculum learning on problems of increasing difficulty, where newly solved problem instances are added to the training set.
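A minimal sketch of such a curriculum loop (`generate_instances`, `solve_with_heuristic`, and `train_heuristic` are hypothetical placeholders, not the paper's actual components):

```python
# Curriculum loop: repeatedly attempt harder problem instances and fold the
# newly solved ones, together with their plans, back into the training set
# of the learned heuristic before retraining.

def curriculum_training(train_heuristic, solve_with_heuristic,
                        generate_instances, rounds=10):
    training_set = []                      # (instance, plan) pairs solved so far
    heuristic = train_heuristic(training_set)
    for difficulty in range(1, rounds + 1):
        for instance in generate_instances(difficulty):
            plan = solve_with_heuristic(instance, heuristic)
            if plan is not None:           # solved within the search budget
                training_set.append((instance, plan))
        # retrain on the enlarged set so harder instances become solvable
        heuristic = train_heuristic(training_set)
    return heuristic
```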
arXiv Detail & Related papers (2021-12-03T14:01:16Z)
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
- Recent advances in deep learning theory [104.01582662336256]
This paper reviews and organizes the recent advances in deep learning theory.
The literature is categorized in six groups: (1) complexity and capacity-based approaches for analysing the generalizability of deep learning; (2) differential equations and their dynamic systems for modelling gradient descent and its variants; (3) the geometrical structures of the loss landscape that drive the trajectories of the dynamic systems; (4) the roles of over-parameterization of deep neural networks from both positive and negative perspectives; (5) theoretical foundations of several special structures in network architectures; and (6) the increasingly intensive concerns on ethics and security and their relationships with generalizability.
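As a toy illustration of group (2) (a simplified sketch, not the survey's treatment; the double-well loss, noise level, and time scaling are assumptions), small-step SGD is commonly modelled by an Itô SDE and simulated with the Euler-Maruyama scheme:

```python
import numpy as np

# Model SGD with step size eta as the SDE
#   d(theta) = -grad L(theta) dt + sqrt(eta) * sigma * dW_t,
# identifying one SGD step with time dt = eta, and simulate it with
# Euler-Maruyama on the double-well loss L(theta) = (theta^2 - 1)^2 / 4.
rng = np.random.default_rng(2)

def grad_L(theta):
    return theta * (theta ** 2 - 1.0)

eta, sigma = 0.01, 0.5                     # step size (= dt) and noise scale
theta = 2.0                                # initialized away from both minima
for _ in range(2000):
    dW = np.sqrt(eta) * rng.standard_normal()   # Brownian increment over dt = eta
    theta += -grad_L(theta) * eta + np.sqrt(eta) * sigma * dW

print(f"final theta (minima at +/-1): {theta:.3f}")
```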
arXiv Detail & Related papers (2020-12-20T14:16:41Z)
- Understanding Deep Architectures with Reasoning Layer [60.90906477693774]
We show that properties of the algorithm layers, such as convergence, stability, and sensitivity, are intimately related to the approximation and generalization abilities of the end-to-end model.
Our theory can provide useful guidelines for designing deep architectures with reasoning layers.
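One common instance of such an algorithm layer, sketched minimally (unrolled gradient descent on an inner quadratic objective; the objective and dimensions are illustrative assumptions, not the paper's construction), makes the connection concrete: the layer's output quality is governed directly by the inner algorithm's step size, iteration count, and conditioning.

```python
import numpy as np

# An "algorithm layer" whose forward pass runs k unrolled gradient-descent
# steps on the inner objective g(y) = 0.5 * ||Q y - x||^2; its convergence
# and stability properties carry over to the end-to-end model that uses it.

def algorithm_layer(x, Q, k=50, step=0.1):
    y = np.zeros_like(x)
    for _ in range(k):
        y -= step * (Q.T @ (Q @ y - x))    # one gradient step on g
    return y                               # approximates the solution of Q y = x

rng = np.random.default_rng(3)
Q = np.eye(4) + 0.1 * rng.standard_normal((4, 4))   # well-conditioned instance
x = rng.standard_normal(4)
y = algorithm_layer(x, Q)
print("inner residual ||Q y - x||:", np.linalg.norm(Q @ y - x))
```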
arXiv Detail & Related papers (2020-06-24T00:26:35Z)
- Structure preserving deep learning [1.2263454117570958]
Deep learning has risen to the foreground as a topic of massive interest.
There are multiple challenging mathematical problems involved in applying deep learning.
There is a growing effort to mathematically understand the structure in existing deep learning methods.
arXiv Detail & Related papers (2020-06-05T10:59:09Z)
- Generalization in Deep Learning [103.91623583928852]
This paper provides theoretical insights into why and how deep learning can generalize well, despite its large capacity, complexity, possible algorithmic instability, nonrobustness, and sharp minima.
We also discuss approaches to provide non-vacuous generalization guarantees for deep learning.
arXiv Detail & Related papers (2017-10-16T02:21:24Z)