Mathematical Introduction to Deep Learning: Methods, Implementations,
and Theory
- URL: http://arxiv.org/abs/2310.20360v1
- Date: Tue, 31 Oct 2023 11:01:23 GMT
- Title: Mathematical Introduction to Deep Learning: Methods, Implementations,
and Theory
- Authors: Arnulf Jentzen, Benno Kuckuck, Philippe von Wurstemberger
- Abstract summary: This book aims to provide an introduction to the topic of deep learning algorithms.
We review essential components of deep learning algorithms in full mathematical detail.
- Score: 4.066869900592636
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This book aims to provide an introduction to the topic of deep learning
algorithms. We review essential components of deep learning algorithms in full
mathematical detail including different artificial neural network (ANN)
architectures (such as fully-connected feedforward ANNs, convolutional ANNs,
recurrent ANNs, residual ANNs, and ANNs with batch normalization) and different
optimization algorithms (such as the basic stochastic gradient descent (SGD)
method, accelerated methods, and adaptive methods). We also cover several
theoretical aspects of deep learning algorithms such as approximation
capacities of ANNs (including a calculus for ANNs), optimization theory
(including Kurdyka-Łojasiewicz inequalities), and generalization errors. In
the last part of the book some deep learning approximation methods for PDEs are
reviewed including physics-informed neural networks (PINNs) and deep Galerkin
methods. We hope that this book will be useful for students and scientists who
do not yet have any background in deep learning at all and would like to gain a
solid foundation as well as for practitioners who would like to obtain a firmer
mathematical understanding of the objects and methods considered in deep
learning.
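To make the objects named in the abstract concrete, here is a minimal sketch, assuming a one-hidden-layer fully-connected feedforward ANN trained with the basic SGD method on a toy regression problem; the architecture, data, and hyperparameters are illustrative and not taken from the book.

```python
import numpy as np

rng = np.random.default_rng(0)

# A fully-connected feedforward ANN with one hidden ReLU layer.
d_in, d_hidden, d_out = 2, 16, 1
W1 = rng.normal(0.0, 0.5, (d_hidden, d_in)); b1 = np.zeros(d_hidden)
W2 = rng.normal(0.0, 0.5, (d_out, d_hidden)); b2 = np.zeros(d_out)

def forward(x):
    z1 = W1 @ x + b1
    a1 = np.maximum(z1, 0.0)              # ReLU activation
    return W2 @ a1 + b2, (z1, a1)

# Toy regression data (illustrative only).
X = rng.normal(size=(256, d_in))
y = X[:, :1] ** 2 - X[:, 1:]

eta = 0.05                                # learning rate
for step in range(2000):
    i = rng.integers(len(X))              # one random sample: basic SGD
    x, t = X[i], y[i]
    out, (z1, a1) = forward(x)
    # Backpropagation for the squared-error loss 0.5 * |out - t|^2.
    delta2 = out - t
    gW2, gb2 = np.outer(delta2, a1), delta2
    delta1 = (W2.T @ delta2) * (z1 > 0)   # ReLU derivative
    gW1, gb1 = np.outer(delta1, x), delta1
    # SGD update: theta <- theta - eta * gradient.
    W1 -= eta * gW1; b1 -= eta * gb1
    W2 -= eta * gW2; b2 -= eta * gb2
```

The accelerated and adaptive methods the book covers modify only the last four update lines, e.g. by adding momentum or per-coordinate step sizes.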
Related papers
- Lecture Notes on Linear Neural Networks: A Tale of Optimization and Generalization in Deep Learning [14.909298522361306]
Notes are based on a lecture delivered by NC in March 2021, as part of an advanced course at Princeton University on the mathematical understanding of deep learning.
They present a theory (developed by NC, NR and collaborators) of linear neural networks -- a fundamental model in the study of optimization and generalization in deep learning.
arXiv Detail & Related papers (2024-08-25T08:24:48Z)
- An Overview on Machine Learning Methods for Partial Differential Equations: from Physics Informed Neural Networks to Deep Operator Learning [5.75055574132362]
The approximation of solutions of partial differential equations with numerical algorithms is a central topic in applied mathematics.
One class of methods that has received a lot of attention in recent years is machine learning-based methods.
This article aims to provide an introduction to some of these methods and the mathematical theory on which they are based.
arXiv Detail & Related papers (2024-08-23T16:57:34Z)
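Since physics-informed neural networks recur in both the abstract above and this survey, here is a minimal sketch of how a PINN objective is assembled for the 1D Poisson problem u''(x) = f(x) on (0,1) with zero boundary values; the manufactured right-hand side and the finite-difference stand-in for automatic differentiation are our illustrative choices, not taken from either paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Tiny network u(x; theta): one hidden tanh layer, scalar input and output.
W1 = rng.normal(0.0, 1.0, (32, 1)); b1 = np.zeros(32)
W2 = rng.normal(0.0, 1.0, (1, 32)); b2 = np.zeros(1)

def u(x):
    # x has shape (n, 1); returns network values of shape (n, 1).
    return np.tanh(x @ W1.T + b1) @ W2.T + b2

def pinn_loss(n_interior=64, h=1e-3):
    # PDE residual at random collocation points; u'' is approximated by a
    # central finite difference (a full PINN would use automatic differentiation).
    x = rng.uniform(0.0, 1.0, (n_interior, 1))
    u_xx = (u(x + h) - 2.0 * u(x) + u(x - h)) / h**2
    f = -np.pi**2 * np.sin(np.pi * x)     # manufactured so that u* = sin(pi x)
    residual = np.mean((u_xx - f) ** 2)
    # Penalty enforcing the boundary conditions u(0) = u(1) = 0.
    xb = np.array([[0.0], [1.0]])
    boundary = np.mean(u(xb) ** 2)
    return float(residual + boundary)

print(pinn_loss())  # the quantity a PINN would drive toward zero during training
```

Training would then minimize this loss over the weights, e.g. with the SGD sketch shown earlier.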
- Artificial Neural Network and Deep Learning: Fundamentals and Theory [0.0]
This book lays a solid groundwork for understanding data and probability distributions.
The book delves into multilayer feed-forward neural networks, explaining their architecture, training processes, and the backpropagation algorithm.
The text covers various learning rate schedules and adaptive algorithms, providing strategies to optimize the training process.
arXiv Detail & Related papers (2024-08-12T21:06:59Z)
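Since the entry above highlights learning rate schedules and adaptive algorithms, here is a minimal sketch of three common schedules and one Adam-style update step; the function names and default constants are illustrative, not taken from the book.

```python
import numpy as np

# Learning-rate schedules: each maps a step count t to a step size.
def step_decay(t, eta0=0.1, drop=0.5, every=1000):
    return eta0 * drop ** (t // every)

def exponential_decay(t, eta0=0.1, gamma=1e-3):
    return eta0 * np.exp(-gamma * t)

def cosine_annealing(t, eta0=0.1, T=10000):
    return 0.5 * eta0 * (1.0 + np.cos(np.pi * min(t, T) / T))

# One step of an Adam-style adaptive update for a parameter vector theta.
def adam_step(theta, grad, m, v, t, eta=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad        # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad**2     # second-moment estimate
    m_hat = m / (1 - beta1**t)                # bias corrections, t >= 1
    v_hat = v / (1 - beta2**t)
    theta = theta - eta * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Illustrative usage: one Adam step on a 3-dimensional parameter.
theta, m, v = np.zeros(3), np.zeros(3), np.zeros(3)
theta, m, v = adam_step(theta, np.array([1.0, -2.0, 0.5]), m, v, t=1)
```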
- Deep Learning and Geometric Deep Learning: an introduction for mathematicians and physicists [0.0]
We discuss the inner workings of the new and successful algorithms of Deep Learning and Geometric Deep Learning.
We go over the key ingredients of these algorithms: the score and loss functions, and we explain the main steps in training a model.
We provide some appendices to complement our treatment, discussing the Kullback-Leibler divergence, regression, Multi-layer Perceptrons and the Universal Approximation Theorem.
arXiv Detail & Related papers (2023-05-09T16:50:36Z)
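The appendices listed above include the Kullback-Leibler divergence; as a quick reference, here is a minimal sketch of its discrete form D_KL(p || q) = sum_i p_i log(p_i / q_i), with an illustrative numerical check.

```python
import numpy as np

def kl_divergence(p, q):
    # Discrete KL divergence; terms with p_i = 0 contribute 0 by convention.
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

p, q = [0.5, 0.5], [0.9, 0.1]
print(kl_divergence(p, q))  # ~0.5108: nonnegative and asymmetric in (p, q)
print(kl_divergence(p, p))  # 0.0: vanishes exactly when the distributions agree
```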
- Algorithmically Designed Artificial Neural Networks (ADANNs): Higher order deep operator learning for parametric partial differential equations [5.052293146674794]
We propose a new deep learning approach to approximate operators related to parametric partial differential equations (PDEs).
In the proposed approach we combine efficient classical numerical approximation techniques with deep operator learning methodologies.
We numerically test the proposed ADANN methodology in the case of several parametric PDEs.
arXiv Detail & Related papers (2023-02-07T06:39:20Z)
- Bayesian Learning for Neural Networks: an algorithmic survey [95.42181254494287]
This self-contained survey engages and introduces readers to the principles and algorithms of Bayesian Learning for Neural Networks.
It provides an introduction to the topic from an accessible, practical-algorithmic perspective.
arXiv Detail & Related papers (2022-11-21T21:36:58Z)
- Neuro-Symbolic Learning of Answer Set Programs from Raw Data [54.56905063752427]
Neuro-Symbolic AI aims to combine interpretability of symbolic techniques with the ability of deep learning to learn from raw data.
We introduce Neuro-Symbolic Inductive Learner (NSIL), an approach that trains a general neural network to extract latent concepts from raw data.
NSIL learns expressive knowledge, solves computationally complex problems, and achieves state-of-the-art performance in terms of accuracy and data efficiency.
arXiv Detail & Related papers (2022-05-25T12:41:59Z)
- Neural Combinatorial Optimization: a New Player in the Field [69.23334811890919]
This paper presents a critical analysis on the incorporation of algorithms based on neural networks into the classical optimization framework.
A comprehensive study is carried out to analyse the fundamental aspects of such algorithms, including performance, transferability, computational cost, and generalization to larger-sized instances.
arXiv Detail & Related papers (2022-05-03T07:54:56Z)
- Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges [50.22269760171131]
The last decade has witnessed an experimental revolution in data science and machine learning, epitomised by deep learning methods.
This text is concerned with exposing pre-defined regularities through unified geometric principles.
It provides a common mathematical framework to study the most successful neural network architectures, such as CNNs, RNNs, GNNs, and Transformers.
arXiv Detail & Related papers (2021-04-27T21:09:51Z)
- Fusing the Old with the New: Learning Relative Camera Pose with Geometry-Guided Uncertainty [91.0564497403256]
We present a novel framework that involves probabilistic fusion between the two families of predictions during network training.
Our network features a self-attention graph neural network, which drives the learning by enforcing strong interactions between different correspondences.
We propose motion parameterizations suitable for learning and show that our method achieves state-of-the-art performance on the challenging DeMoN and ScanNet datasets.
arXiv Detail & Related papers (2021-04-16T17:59:06Z)
- Learning to Stop While Learning to Predict [85.7136203122784]
Many algorithm-inspired deep models are restricted to a "fixed depth" for all inputs.
Similar to algorithms, the optimal depth of a deep architecture may be different for different input instances.
In this paper, we tackle this varying depth problem using a steerable architecture.
We show that the learned deep model along with the stopping policy improves the performances on a diverse set of tasks.
arXiv Detail & Related papers (2020-06-09T07:22:01Z)