Mathematical theory of deep learning
- URL: http://arxiv.org/abs/2407.18384v2
- Date: Fri, 11 Oct 2024 12:16:53 GMT
- Title: Mathematical theory of deep learning
- Authors: Philipp Petersen, Jakob Zech
- Abstract summary: It covers fundamental results in approximation theory, optimization theory, and statistical learning theory.
The book aims to equip readers with foundational knowledge on the topic.
- Score: 0.46040036610482665
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This book provides an introduction to the mathematical analysis of deep learning. It covers fundamental results in approximation theory, optimization theory, and statistical learning theory, which are the three main pillars of deep neural network theory. Serving as a guide for students and researchers in mathematics and related fields, the book aims to equip readers with foundational knowledge on the topic. It prioritizes simplicity over generality, and presents rigorous yet accessible results to help build an understanding of the essential mathematical concepts underpinning deep learning.
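The sense in which these three theories form "pillars" can be made precise by the standard excess-risk decomposition (textbook material, stated here in generic notation rather than quoted from the book):
\[
\underbrace{R(\hat f) - R^*}_{\text{excess risk}}
= \underbrace{R(f_{\mathcal F}) - R^*}_{\text{approximation}}
+ \underbrace{R(\hat f_n) - R(f_{\mathcal F})}_{\text{estimation}}
+ \underbrace{R(\hat f) - R(\hat f_n)}_{\text{optimization}},
\]
where $f_{\mathcal F}$ minimizes the risk $R$ over the network class, $\hat f_n$ minimizes the empirical risk, and $\hat f$ is what the optimizer actually returns; approximation theory, statistical learning theory, and optimization theory each control one term.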
Related papers
- Lecture Notes on Linear Neural Networks: A Tale of Optimization and Generalization in Deep Learning [14.909298522361306]
These notes are based on a lecture delivered by NC in March 2021, as part of an advanced course at Princeton University on the mathematical understanding of deep learning.
They present a theory (developed by NC, NR and collaborators) of linear neural networks -- a fundamental model in the study of optimization and generalization in deep learning.
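Concretely, a depth-$L$ linear network is the parameterized map (the standard definition, not specific to these notes)
\[
f_\theta(x) = W_L W_{L-1} \cdots W_1 x, \qquad \theta = (W_1, \ldots, W_L),
\]
which represents only linear functions, yet whose training objective is non-convex in $\theta$; this gap between a simple function class and a non-trivial parameterization is what makes the model a clean testbed.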
arXiv Detail & Related papers (2024-08-25T08:24:48Z) - Foundations and Frontiers of Graph Learning Theory [81.39078977407719]
Recent advancements in graph learning have revolutionized how data with complex structures is understood and analyzed.
Graph Neural Networks (GNNs), i.e. neural network architectures designed for learning graph representations, have become a popular paradigm.
This article provides a comprehensive summary of the theoretical foundations and breakthroughs concerning the approximation and learning behaviors intrinsic to prevalent graph learning models.
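The graph representations referred to here are typically built by message passing; a generic update shared by most GNN variants (schematic notation, not a formula from this survey) is
\[
h_v^{(k)} = \phi\Big(h_v^{(k-1)},\; \bigoplus_{u \in \mathcal N(v)} \psi\big(h_v^{(k-1)}, h_u^{(k-1)}\big)\Big),
\]
where $\bigoplus$ is a permutation-invariant aggregator (sum, mean, or max) and $\phi, \psi$ are learned functions; approximation and learning results for GNNs are usually phrased in terms of what iterating such updates can and cannot distinguish.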
arXiv Detail & Related papers (2024-07-03T14:07:41Z) - A Geometric Framework for Adversarial Vulnerability in Machine Learning [0.0]
This work sets out to use mathematics to understand the intriguing vulnerability observed by Szegedy et al. (2013) within artificial neural networks.
Along the way, we will develop some novel tools with applications far outside of just the adversarial domain.
arXiv Detail & Related papers (2024-07-03T11:01:15Z) - Towards a Holistic Understanding of Mathematical Questions with Contrastive Pre-training [65.10741459705739]
We propose a novel contrastive pre-training approach for mathematical question representations, namely QuesCo.
We first design two-level question augmentations, covering content-level and structure-level changes, which generate question pairs that differ in surface form but share the same purpose.
Then, to fully exploit the hierarchical information of knowledge concepts, we propose a knowledge hierarchy-aware ranking strategy.
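As a rough illustration of the contrastive objective underlying approaches of this kind, here is a minimal generic InfoNCE sketch in Python; it is not QuesCo's actual loss, and all names are illustrative:

import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    # Each anchor embedding should match its own positive (an augmented
    # view of the same question) against every other positive in the batch.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = (a @ p.T) / temperature              # pairwise cosine similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.diag(log_softmax).mean()           # matching pairs lie on the diagonal

A hierarchy-aware ranking strategy like the one described above would, roughly speaking, soften this binary positive/negative split into graded targets derived from the knowledge hierarchy.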
arXiv Detail & Related papers (2023-01-18T14:23:29Z) - Deep Learning and Computational Physics (Lecture Notes) [0.5156484100374059]
The notes should be accessible to a typical engineering graduate student with a strong background in applied mathematics.
They use concepts from computational physics to develop an understanding of deep learning algorithms.
Several novel deep learning algorithms can be used to solve challenging problems in computational physics.
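A canonical instance of this physics-deep learning link (standard material that such notes typically build on) is reading a residual block as one forward-Euler step of an ordinary differential equation:
\[
x_{k+1} = x_k + h\, f(x_k; \theta_k)
\quad\longleftrightarrow\quad
\frac{dx}{dt} = f\big(x(t); \theta(t)\big),
\]
so stability and discretization results from numerical analysis translate into statements about deep residual architectures.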
arXiv Detail & Related papers (2023-01-03T03:56:19Z) - A Survey of Deep Learning for Mathematical Reasoning [71.88150173381153]
We review the key tasks, datasets, and methods at the intersection of mathematical reasoning and deep learning over the past decade.
Recent advances in large-scale neural language models have opened up new benchmarks and opportunities to use deep learning for mathematical reasoning.
arXiv Detail & Related papers (2022-12-20T18:46:16Z) - Envisioning Future Deep Learning Theories: Some Basic Concepts and Characteristics [30.365274034429508]
We argue that a future deep learning theory should inherit three characteristics: a hierarchically structured network architecture, parameters iteratively optimized using gradient-based methods, and information from the data that evolves compressively.
We integrate these characteristics into a graphical model called neurashed, which effectively explains some common empirical patterns in deep learning.
arXiv Detail & Related papers (2021-12-17T19:51:26Z) - Formalising Concepts as Grounded Abstractions [68.24080871981869]
This report shows how representation learning can be used to induce concepts from raw data.
The main technical goal of this report is to show how techniques from representation learning can be married with a lattice-theoretic formulation of conceptual spaces.
arXiv Detail & Related papers (2021-01-13T15:22:01Z) - Recent advances in deep learning theory [104.01582662336256]
This paper reviews and organizes the recent advances in deep learning theory.
The literature is categorized in six groups: (1) complexity and capacity-based approaches for analysing the generalizability of deep learning; (2) stochastic differential equations and their dynamic systems for modelling stochastic gradient descent and its variants; (3) the geometrical structures of the loss landscape that drive the trajectories of the dynamic systems; (4) the roles of over-parameterization of deep neural networks from both positive and negative perspectives; (5) theoretical foundations of several special structures in network architectures; and (6) the increasingly intensive concerns on ethics and security and their relationships with generalizability.
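Group (2) refers to continuous-time models in which the trajectory of (stochastic) gradient descent is approximated by a flow; in generic notation (standard in this literature, not quoted from the survey),
\[
\frac{dW}{dt} = -\nabla L(W) \quad \text{(gradient flow)}, \qquad
dW_t = -\nabla L(W_t)\,dt + \sqrt{\eta}\,\Sigma(W_t)^{1/2}\,dB_t \quad \text{(SDE model of SGD)},
\]
where $\eta$ is the learning rate and $\Sigma$ the covariance of the minibatch gradient noise.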
arXiv Detail & Related papers (2020-12-20T14:16:41Z) - Deep Learning is Singular, and That's Good [31.985399645173022]
In singular models, the optimal set of parameters forms an analytic set with singularities and classical statistical inference cannot be applied.
This is significant for deep learning as neural networks are singular and thus "dividing" by the determinant of the Hessian or employing the Laplace approximation are not appropriate.
Despite its potential for addressing fundamental issues in deep learning, singular learning theory appears to have made few inroads into the developing canon of deep learning theory.
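To see why singularity matters, recall the Laplace approximation of the marginal likelihood (a standard computation, sketched in generic notation):
\[
Z_n = \int e^{-n L_n(w)}\,\varphi(w)\,dw
\;\approx\; e^{-n L_n(w^*)}\,\varphi(w^*) \left(\tfrac{2\pi}{n}\right)^{d/2} \det H(w^*)^{-1/2},
\qquad H(w^*) = \nabla^2 L_n(w^*).
\]
When $H(w^*)$ is degenerate, the determinant vanishes and the approximation breaks down; singular learning theory instead replaces the exponent $d/2$ by the real log canonical threshold $\lambda$, giving $\log Z_n = -n L_n(w^*) - \lambda \log n + O(\log \log n)$.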
arXiv Detail & Related papers (2020-10-22T09:33:59Z) - Optimism in the Face of Adversity: Understanding and Improving Deep Learning through Adversarial Robustness [63.627760598441796]
We provide an in-depth review of the field of adversarial robustness in deep learning.
We highlight the intuitive connection between adversarial examples and the geometry of deep neural networks.
We provide an overview of the main emerging applications of adversarial robustness beyond security.
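The geometric connection alluded to above is often illustrated with the fast gradient sign method (a classical construction, not a contribution of this review): an adversarial example is
\[
x_{\mathrm{adv}} = x + \varepsilon\, \operatorname{sign}\big(\nabla_x L(x, y)\big),
\]
a step of size $\varepsilon$ in the direction that locally increases the loss fastest under an $\ell_\infty$ constraint; that such tiny perturbations reliably flip predictions reflects the geometry of decision boundaries near the data.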
arXiv Detail & Related papers (2020-10-19T16:03:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.