Combinatory Adjoints and Differentiation
- URL: http://arxiv.org/abs/2207.00847v1
- Date: Sat, 2 Jul 2022 14:34:54 GMT
- Title: Combinatory Adjoints and Differentiation
- Authors: Martin Elsman (University of Copenhagen), Fritz Henglein (University
of Copenhagen), Robin Kaarsgaard (University of Edinburgh), Mikkel Kragh
Mathiesen (University of Copenhagen), Robert Schenck (University of
Copenhagen)
- Abstract summary: We develop a compositional approach for automatic and symbolic differentiation based on categorical constructions in functional analysis.
We show that both symbolic and automatic differentiation can be performed using a differential calculus for generating linear functions.
We also provide a calculus for symbolically computing the adjoint of a derivative without using matrices.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We develop a compositional approach for automatic and symbolic
differentiation based on categorical constructions in functional analysis where
derivatives are linear functions on abstract vectors rather than being limited
to scalars, vectors, matrices or tensors represented as multi-dimensional
arrays. We show that both symbolic and automatic differentiation can be
performed using a differential calculus for generating linear functions
representing Fréchet derivatives based on rules for primitive, constant,
linear and bilinear functions as well as their sequential and parallel
composition. Linear functions are represented in a combinatory domain-specific
language. Finally, we provide a calculus for symbolically computing the adjoint
of a derivative without using matrices, which are too inefficient to use on
high-dimensional spaces. The resulting symbolic representation of a derivative
retains the data-parallel operations from the input program. The combination of
combinatory differentiation and computing formal adjoints turns out to be
behaviorally equivalent to reverse-mode automatic differentiation. In
particular, it provides opportunities for optimizations where matrices are too
inefficient to represent linear functions.
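As a concrete illustration of the abstract's main idea, the following is a minimal sketch (in Haskell; it is not the paper's actual DSL, and the constructor names and toy vector type are illustrative assumptions) of a combinatory representation of linear maps: a derivative is built from rules for linear and bilinear primitives together with sequential and parallel composition, and its adjoint is computed by rewriting the combinator term rather than by forming a matrix.

```haskell
{-# LANGUAGE LambdaCase #-}
module CombinatoryAdjoints where

-- Combinatory representation of linear maps (illustrative, not the paper's DSL).
data Lin
  = Id              -- identity map
  | Zero            -- zero map (derivative of a constant); collapsed to scalar 0 here
  | Scale Double    -- multiplication by a scalar
  | Comp Lin Lin    -- sequential composition  (g . f)
  | Par  Lin Lin    -- parallel composition on pairs: (x, y) |-> (f x, g y)
  | Dup             -- x |-> (x, x)
  | Add             -- (x, y) |-> x + y
  deriving Show

-- Toy vector space: scalars and nested pairs.
data Vec = S Double | P Vec Vec deriving Show

apply :: Lin -> Vec -> Vec
apply Id         v       = v
apply Zero       _       = S 0
apply (Scale c)  (S x)   = S (c * x)
apply (Scale c)  (P u v) = P (apply (Scale c) u) (apply (Scale c) v)
apply (Comp g f) v       = apply g (apply f v)
apply (Par f g)  (P u v) = P (apply f u) (apply g v)
apply Dup        v       = P v v
apply Add        (P u v) = vadd u v
apply _          v       = error ("ill-typed application: " ++ show v)

vadd :: Vec -> Vec -> Vec
vadd (S x)   (S y)   = S (x + y)
vadd (P a b) (P c d) = P (vadd a c) (vadd b d)
vadd _       _       = error "shape mismatch"

-- Symbolic adjoint by term rewriting, never building a matrix:
-- (g . f)* = f* . g*,  (f x g)* = f* x g*,  Dup* = Add,  Add* = Dup.
adjoint :: Lin -> Lin
adjoint = \case
  Id       -> Id
  Zero     -> Zero
  Scale c  -> Scale c
  Comp g f -> Comp (adjoint f) (adjoint g)
  Par f g  -> Par (adjoint f) (adjoint g)
  Dup      -> Add
  Add      -> Dup

-- Bilinear rule: the derivative of f(x, y) = x * y at (x0, y0) is the
-- linear map (dx, dy) |-> y0*dx + x0*dy.
dMul :: Double -> Double -> Lin
dMul x0 y0 = Comp Add (Par (Scale y0) (Scale x0))

-- Forward mode applies the derivative; reverse mode applies its adjoint.
example :: (Vec, Vec)
example = ( apply (dMul 2 3) (P (S 1) (S 0))    -- directional derivative along dx
          , apply (adjoint (dMul 2 3)) (S 1) )  -- gradient: (3, 2)
```

Applying adjoint (dMul 2 3) to an output sensitivity reproduces the gradient of x*y at (2, 3), which is the reverse-mode behaviour the abstract refers to, without ever materializing a Jacobian.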
Related papers
- Understanding Matrix Function Normalizations in Covariance Pooling through the Lens of Riemannian Geometry
Global Covariance Pooling (GCP) has been demonstrated to improve the performance of Deep Neural Networks (DNNs) by exploiting second-order statistics of high-level representations.
arXiv Detail & Related papers (2024-07-15T07:11:44Z)
- Optimal Matrix-Mimetic Tensor Algebras via Variable Projection
Matrix mimeticity arises from interpreting tensors as operators that can be multiplied, factorized, and analyzed analogously to matrices.
We learn optimal linear mappings and corresponding tensor representations without relying on prior knowledge of the data.
We provide an original theory of uniqueness of the transformation and a convergence analysis of our variable-projection-based algorithm.
arXiv Detail & Related papers (2024-06-11T04:52:23Z)
- CoLA: Exploiting Compositional Structure for Automatic and Efficient Numerical Linear Algebra
We propose a simple but general framework for large-scale linear algebra problems in machine learning, named CoLA.
By combining a linear operator abstraction with compositional dispatch rules, CoLA automatically constructs memory and runtime efficient numerical algorithms.
We showcase its efficacy across a broad range of applications, including partial differential equations, Gaussian processes, equivariant model construction, and unsupervised learning.
arXiv Detail & Related papers (2023-09-06T14:59:38Z)
- The Backpropagation algorithm for a math student
A Deep Neural Network (DNN) is a composite function of vector-valued functions.
The loss function of a DNN is a composition of several nonlinear functions, each with numerous parameters.
The objective of this paper is to express the gradient of the loss function in terms of a matrix multiplication using the Jacobian operator.
arXiv Detail & Related papers (2023-01-22T08:45:30Z)
- Decoupling multivariate functions using a nonparametric filtered tensor decomposition
Decoupling techniques aim at providing an alternative representation of the nonlinearity.
The so-called decoupled form is often a more efficient parameterisation of the relationship while being highly structured, favouring interpretability.
In this work, two new algorithms, based on filtered tensor decompositions of first-order derivative information, are introduced.
arXiv Detail & Related papers (2022-05-23T09:34:17Z)
- Learning Linearized Assignment Flows for Image Labeling
We introduce a novel algorithm for estimating optimal parameters of linearized assignment flows for image labeling.
We show how to efficiently evaluate this formula using a Krylov subspace and a low-rank approximation.
arXiv Detail & Related papers (2021-08-02T13:38:09Z)
- Automatic differentiation for Riemannian optimization on low-rank matrix and tensor-train manifolds
In scientific computing and machine learning applications, matrices and more general multidimensional arrays (tensors) can often be approximated with the help of low-rank decompositions.
One popular tool for finding low-rank approximations is Riemannian optimization.
arXiv Detail & Related papers (2021-03-27T19:56:00Z)
- A graphical calculus for integration over random diagonal unitary matrices
We provide a graphical calculus for computing averages of tensor network diagrams.
We exploit the order structure of the partially ordered set of uniform block permutations.
A similar calculus is developed for random vectors consisting of independent uniform signs.
arXiv Detail & Related papers (2020-07-22T06:06:51Z)
- Automatic Differentiation in ROOT
In mathematics and computer algebra, automatic differentiation (AD) is a set of techniques to evaluate the derivative of a function specified by a computer program.
This paper presents AD techniques available in ROOT, supported by Cling, to produce derivatives of arbitrary C/C++ functions.
arXiv Detail & Related papers (2020-04-09T09:18:50Z)
- Invariant Feature Coding using Tensor Product Representation
We prove that the group-invariant feature vector contains sufficient discriminative information when learning a linear classifier.
A novel feature model that explicitly considers group actions is proposed for principal component analysis and k-means clustering.
arXiv Detail & Related papers (2019-06-05T07:15:17Z)