Combinatory Adjoints and Differentiation
- URL: http://arxiv.org/abs/2207.00847v1
- Date: Sat, 2 Jul 2022 14:34:54 GMT
- Title: Combinatory Adjoints and Differentiation
- Authors: Martin Elsman (University of Copenhagen), Fritz Henglein (University
of Copenhagen), Robin Kaarsgaard (University of Edinburgh), Mikkel Kragh
Mathiesen (University of Copenhagen), Robert Schenck (University of
Copenhagen)
- Abstract summary: We develop a compositional approach for automatic and symbolic differentiation based on categorical constructions in functional analysis.
We show that both symbolic and automatic differentiation can be performed using a differential calculus for generating linear functions.
We also provide a calculus for symbolically computing the adjoint of a derivative without using matrices.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We develop a compositional approach for automatic and symbolic
differentiation based on categorical constructions in functional analysis where
derivatives are linear functions on abstract vectors rather than being limited
to scalars, vectors, matrices or tensors represented as multi-dimensional
arrays. We show that both symbolic and automatic differentiation can be
performed using a differential calculus for generating linear functions
representing Fréchet derivatives based on rules for primitive, constant,
linear and bilinear functions as well as their sequential and parallel
composition. Linear functions are represented in a combinatory domain-specific
language. Finally, we provide a calculus for symbolically computing the adjoint
of a derivative without using matrices, which are too inefficient to use on
high-dimensional spaces. The resulting symbolic representation of a derivative
retains the data-parallel operations from the input program. The combination of
combinatory differentiation and computing formal adjoints turns out to be
behaviorally equivalent to reverse-mode automatic differentiation. In
particular, it provides opportunities for optimizations where matrices are too
inefficient to represent linear functions.
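The abstract describes a calculus in which derivatives are linear maps built from combinators and adjoints are computed symbolically by rewriting rather than by transposing matrices. As a rough illustration only, the following is a minimal Haskell sketch of that idea; the types and names (LinMap, Diff, adjoint, composeD) are invented for this example and are not the paper's actual DSL, which also covers constant, bilinear and parallel-composition rules.
```haskell
-- A tiny combinator language for linear maps, interpreted over vectors
-- represented as lists of Doubles.
data LinMap
  = Id                   -- identity map
  | Scale Double         -- pointwise scaling by a constant
  | Comp LinMap LinMap   -- sequential composition: apply the right map first
  deriving Show

-- Interpret a linear map as a function on vectors (forward mode).
apply :: LinMap -> [Double] -> [Double]
apply Id         v = v
apply (Scale c)  v = map (c *) v
apply (Comp g f) v = apply g (apply f v)

-- Symbolic adjoint, computed by rules rather than by forming a matrix.
-- Composition reverses order, which is the essence of reverse mode.
adjoint :: LinMap -> LinMap
adjoint Id         = Id
adjoint (Scale c)  = Scale c
adjoint (Comp g f) = Comp (adjoint f) (adjoint g)

-- A differentiable function paired with its Frechet derivative at a point;
-- the derivative is a linear map, not a matrix.
data Diff = Diff
  { eval  :: [Double] -> [Double]
  , deriv :: [Double] -> LinMap
  }

-- Chain rule: the derivative of a sequential composition is the sequential
-- composition of the derivatives, taken at the appropriate points.
composeD :: Diff -> Diff -> Diff
composeD g f = Diff
  { eval  = eval g . eval f
  , deriv = \x -> Comp (deriv g (eval f x)) (deriv f x)
  }

-- Example primitive: pointwise scaling is linear, so it is its own derivative.
scalePrim :: Double -> Diff
scalePrim c = Diff { eval = map (c *), deriv = \_ -> Scale c }

main :: IO ()
main = do
  let h  = composeD (scalePrim 3) (scalePrim 2)  -- h(x) = 6 * x, pointwise
      dh = deriv h [1, 2, 3]                     -- derivative at a point
  print (apply dh [1, 0, 0])                     -- forward mode:  [6.0,0.0,0.0]
  print (apply (adjoint dh) [1, 0, 0])           -- reverse mode:  [6.0,0.0,0.0]
```
The only substantive rule here is that the adjoint of a composition reverses the order of its factors; composing formal adjoints therefore traverses the program backwards, which is why the paper can identify this construction with reverse-mode automatic differentiation.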
Related papers
- Two-Point Deterministic Equivalence for Stochastic Gradient Dynamics in Linear Models [76.52307406752556] (arXiv, 2025-02-07)
  We derive a novel deterministic equivalence for the two-point function of a random resolvent.
  We give a unified derivation of the performance of a wide variety of high-dimensional trained linear models with gradient descent.
- Matrix Calculus (for Machine Learning and Beyond) [0.2647285178819813] (arXiv, 2025-01-07)
  This course introduces the extension of differential calculus to functions on more general vector spaces.
  It emphasizes practical computational applications, such as large-scale optimization and machine learning.
- Entrywise application of non-linear functions on orthogonally invariant matrices [44.99833362998488] (arXiv, 2024-12-09)
  We investigate how the entrywise application of a non-linear function to symmetric invariant random matrix ensembles alters the spectral distribution.
  We find that in all those cases a Gaussian equivalence principle holds: the effect of the non-linear function is the same as taking a linear combination of the involved matrices and an additional independent GOE.
- Understanding Matrix Function Normalizations in Covariance Pooling through the Lens of Riemannian Geometry [63.694184882697435] (arXiv, 2024-07-15)
  Global Covariance Pooling (GCP) has been demonstrated to improve the performance of Deep Neural Networks (DNNs) by exploiting second-order statistics of high-level representations.
  This paper provides a comprehensive and unified understanding of the matrix logarithm and power from a Riemannian geometry perspective.
- The Backpropagation algorithm for a math student [0.0] (arXiv, 2023-01-22)
  A Deep Neural Network (DNN) is a composite function of vector-valued functions.
  The loss function of a DNN is a composition of several nonlinear functions, each with numerous parameters.
  The objective of this paper is to express the gradient of the loss function in terms of a matrix multiplication using the Jacobian operator (a worked form of this chain-rule expression is sketched after this list).
- Decoupling multivariate functions using a nonparametric filtered tensor decomposition [0.29360071145551075] (arXiv, 2022-05-23)
  Decoupling techniques aim at providing an alternative representation of the nonlinearity.
  The so-called decoupled form is often a more efficient parameterisation of the relationship while being highly structured, favouring interpretability.
  In this work, two new algorithms based on filtered tensor decompositions of first-order derivative information are introduced.
- Automatic differentiation for Riemannian optimization on low-rank matrix and tensor-train manifolds [71.94111815357064] (arXiv, 2021-03-27)
  In scientific computing and machine learning applications, matrices and more general multidimensional arrays (tensors) can often be approximated with the help of low-rank decompositions.
  One of the popular tools for finding such low-rank approximations is Riemannian optimization.
- A graphical calculus for integration over random diagonal unitary matrices [1.5229257192293197] (arXiv, 2020-07-22)
  We provide a graphical calculus for computing averages of tensor network diagrams.
  We exploit the order structure of the partially ordered set of uniform block permutations.
  A similar calculus is developed for random vectors consisting of independent uniform signs.
- Automatic Differentiation in ROOT [62.997667081978825] (arXiv, 2020-04-09)
  In mathematics and computer algebra, automatic differentiation (AD) is a set of techniques to evaluate the derivative of a function specified by a computer program.
  This paper presents AD techniques available in ROOT, supported by Cling, to produce derivatives of arbitrary C/C++ functions.
- Invariant Feature Coding using Tensor Product Representation [75.62232699377877] (arXiv, 2019-06-05)
  We prove that the group-invariant feature vector contains sufficient discriminative information when learning a linear classifier.
  A novel feature model that explicitly considers group action is proposed for principal component analysis and k-means clustering.
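As an illustration of the Jacobian chain-rule view mentioned in "The Backpropagation algorithm for a math student" above, here is a worked form of that expression; the notation is assumed for this sketch and is not taken from that paper.
```latex
% For a loss L = \ell \circ f_k \circ \cdots \circ f_1 with intermediate
% values y_0 = x and y_i = f_i(y_{i-1}), the chain rule gives
\[
  \nabla L(x)^{\top}
    \;=\; D\ell(y_k)\, J_{f_k}(y_{k-1}) \cdots J_{f_1}(y_0),
  \qquad y_0 = x,\quad y_i = f_i(y_{i-1}).
\]
% Backpropagation (reverse-mode AD) evaluates this product from the left as
% vector-matrix products, i.e. it applies the adjoints J_{f_i}^{*} in reverse
% order -- the same structure the combinatory adjoint calculus captures
% without materialising Jacobian matrices.
```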
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.