Related papers: An induction proof of the backpropagation algorithm in matrix notation

An induction proof of the backpropagation algorithm in matrix notation

URL: http://arxiv.org/abs/2107.09384v1
Date: Tue, 20 Jul 2021 10:02:17 GMT
Title: An induction proof of the backpropagation algorithm in matrix notation
Authors: Dirk Ostwald and Franziska Us\'ee
Abstract summary: Backpropagation (BP) is an algorithm that exploits the computational architecture of neural networks. This work provides a full induction proof of the BP algorithm in matrix notation.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Backpropagation (BP) is a core component of the contemporary deep learning incarnation of neural networks. Briefly, BP is an algorithm that exploits the computational architecture of neural networks to efficiently evaluate the gradient of a cost function during neural network parameter optimization. The validity of BP rests on the application of a multivariate chain rule to the computational architecture of neural networks and their associated objective functions. Introductions to deep learning theory commonly present the computational architecture of neural networks in matrix form, but eschew a parallel formulation and justification of BP in the framework of matrix differential calculus. This entails several drawbacks for the theory and didactics of deep learning. In this work, we overcome these limitations by providing a full induction proof of the BP algorithm in matrix notation. Specifically, we situate the BP algorithm in the framework of matrix differential calculus, encompass affine-linear potential functions, prove the validity of the BP algorithm in inductive form, and exemplify the implementation of the matrix form BP algorithm in computer code.

Related papers

Beyond Belief Propagation: Cluster-Corrected Tensor Network Contraction with Exponential Convergence [0.0]
We develop a rigorous theoretical framework for BP in tensor networks, leveraging insights from statistical mechanics.<n>We prove that the cluster expansion converges exponentially fast if an object called the emphloop contribution decays sufficiently fast with the loop size.<n>Our work opens the door to a systematic theory of BP for tensor networks and its applications in decoding classical and quantum error-correcting codes and simulating quantum systems.
arXiv Detail & Related papers (2025-10-02T17:58:30Z)
Front-propagation Algorithm: Explainable AI Technique for Extracting Linear Function Approximations from Neural Networks [0.0]
This paper introduces the front-propagation algorithm, a novel AI technique designed to elucidate the decision-making logic of deep neural networks. Unlike other popular explainability algorithms such as Integrated Gradients or Shapley Values, the proposed algorithm is able to extract an accurate and consistent linear function explanation of the network. We demonstrate its efficacy in providing accurate linear functions with three different neural network architectures trained on publicly available benchmark datasets.
arXiv Detail & Related papers (2024-05-25T14:50:23Z)
Deep Unrolling for Nonconvex Robust Principal Component Analysis [75.32013242448151]
We design algorithms for Robust Component Analysis (A) It consists in decomposing a matrix into the sum of a low Principaled matrix and a sparse Principaled matrix.
arXiv Detail & Related papers (2023-07-12T03:48:26Z)
When Deep Learning Meets Polyhedral Theory: A Survey [6.899761345257773]
In the past decade, deep became the prevalent methodology for predictive modeling thanks to the remarkable accuracy of deep neural learning. Meanwhile, the structure of neural networks converged back to simplerwise and linear functions.
arXiv Detail & Related papers (2023-04-29T11:46:53Z)
The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
We propose a new learning framework for neural networks, namely Cascaded Forward (CaFo) algorithm, which does not rely on BP optimization as that in FF. Unlike FF, our framework directly outputs label distributions at each cascaded block, which does not require generation of additional negative samples. In our framework each block can be trained independently, so it can be easily deployed into parallel acceleration systems.
arXiv Detail & Related papers (2023-03-17T02:01:11Z)
A Recursively Recurrent Neural Network (R2N2) Architecture for Learning Iterative Algorithms [64.3064050603721]
We generalize Runge-Kutta neural network to a recurrent neural network (R2N2) superstructure for the design of customized iterative algorithms. We demonstrate that regular training of the weight parameters inside the proposed superstructure on input/output data of various computational problem classes yields similar iterations to Krylov solvers for linear equation systems, Newton-Krylov solvers for nonlinear equation systems, and Runge-Kutta solvers for ordinary differential equations.
arXiv Detail & Related papers (2022-11-22T16:30:33Z)
Scalable computation of prediction intervals for neural networks via matrix sketching [79.44177623781043]
Existing algorithms for uncertainty estimation require modifying the model architecture and training procedure. This work proposes a new algorithm that can be applied to a given trained neural network and produces approximate prediction intervals.
arXiv Detail & Related papers (2022-05-06T13:18:31Z)
Deep learning via message passing algorithms based on belief propagation [2.931240348160871]
We present a family of BP-based message-passing algorithms with a reinforcement field that biases towards locally entropic distributions. These algorithms are capable of training multi-layer neural networks with discrete weights and activations with performance comparable to SGD-inspired solutions.
arXiv Detail & Related papers (2021-10-27T16:52:26Z)
Predictive Coding Can Do Exact Backpropagation on Convolutional and Recurrent Neural Networks [40.51949948934705]
Predictive coding networks (PCNs) are an influential model for information processing in the brain. BP is commonly regarded to be the most successful learning method in modern machine learning. We show that a biologically plausible algorithm is able to exactly replicate the accuracy of BP on complex architectures.
arXiv Detail & Related papers (2021-03-05T14:57:01Z)
Connecting Weighted Automata, Tensor Networks and Recurrent Neural Networks through Spectral Learning [58.14930566993063]
We present connections between three models used in different research fields: weighted finite automata(WFA) from formal languages and linguistics, recurrent neural networks used in machine learning, and tensor networks. We introduce the first provable learning algorithm for linear 2-RNN defined over sequences of continuous vectors input.
arXiv Detail & Related papers (2020-10-19T15:28:00Z)
Predictive Coding Approximates Backprop along Arbitrary Computation Graphs [68.8204255655161]
We develop a strategy to translate core machine learning architectures into their predictive coding equivalents. Our models perform equivalently to backprop on challenging machine learning benchmarks. Our method raises the potential that standard machine learning algorithms could in principle be directly implemented in neural circuitry.
arXiv Detail & Related papers (2020-06-07T15:35:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.