Combinatory Adjoints and Differentiation
        - URL: http://arxiv.org/abs/2207.00847v1
- Date: Sat, 2 Jul 2022 14:34:54 GMT
- Title: Combinatory Adjoints and Differentiation
- Authors: Martin Elsman (University of Copenhagen), Fritz Henglein (University
  of Copenhagen), Robin Kaarsgaard (University of Edinburgh), Mikkel Kragh
  Mathiesen (University of Copenhagen), Robert Schenck (University of
  Copenhagen)
- Abstract summary: We develop a compositional approach for automatic and symbolic differentiation based on categorical constructions in functional analysis.
We show that both symbolic and automatic differentiation can be performed using a differential calculus for generating linear functions.
We also provide a calculus for symbolically computing the adjoint of a derivative without using matrices.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract:   We develop a compositional approach for automatic and symbolic
differentiation based on categorical constructions in functional analysis where
derivatives are linear functions on abstract vectors rather than being limited
to scalars, vectors, matrices or tensors represented as multi-dimensional
arrays. We show that both symbolic and automatic differentiation can be
performed using a differential calculus for generating linear functions
representing Fr\'echet derivatives based on rules for primitive, constant,
linear and bilinear functions as well as their sequential and parallel
composition. Linear functions are represented in a combinatory domain-specific
language. Finally, we provide a calculus for symbolically computing the adjoint
of a derivative without using matrices, which are too inefficient to use on
high-dimensional spaces. The resulting symbolic representation of a derivative
retains the data-parallel operations from the input program. The combination of
combinatory differentiation and computing formal adjoints turns out to be
behaviorally equivalent to reverse-mode automatic differentiation. In
particular, it provides opportunities for optimizations where matrices are too
inefficient to represent linear functions.
 
      
        Related papers
        - Discrete Differential Principle for Continuous Smooth Function   Representation [5.897186764108586]
 Taylor's formula suffers from the curse of dimensionality and error propagation during derivative computation in discrete situations.<n>We propose a new discrete differential operator to estimate derivatives and to represent continuous smooth function locally.<n>Our technique offers broad applicability across domains such as vision representation, feature extraction, fluid mechanics, and cross-media imaging.
 arXiv  Detail & Related papers  (2025-07-13T03:43:23Z)
- Two-Point Deterministic Equivalence for Stochastic Gradient Dynamics in   Linear Models [76.52307406752556]
 We derive a novel deterministic equivalence for the two-point function of a random resolvent.
We give a unified derivation of the performance of a wide variety of high-dimensional trained linear models with gradient descent.
 arXiv  Detail & Related papers  (2025-02-07T16:45:40Z)
- Matrix Calculus (for Machine Learning and Beyond) [0.2647285178819813]
 This course introduces the extension of differential calculus to functions on more general vector spaces.
It emphasizes practical computational applications, such as large-scale optimization and machine learning.
 arXiv  Detail & Related papers  (2025-01-07T18:38:35Z)
- Entrywise application of non-linear functions on orthogonally invariant   matrices [44.99833362998488]
 We investigate how the entrywise application of a non-linear function to symmetric invariant random matrix ensembles alters the spectral distribution.
We find that in all those cases a Gaussian equivalence principle holds, that is, the effect of the non-linear function is the same as taking a linear combination of the involved matrices and an additional independent GOE.
 arXiv  Detail & Related papers  (2024-12-09T19:41:09Z)
- Understanding Matrix Function Normalizations in Covariance Pooling   through the Lens of Riemannian Geometry [63.694184882697435]
 Global Covariance Pooling (GCP) has been demonstrated to improve the performance of Deep Neural Networks (DNNs) by exploiting second-order statistics of high-level representations.
 arXiv  Detail & Related papers  (2024-07-15T07:11:44Z)
- Optimal Matrix-Mimetic Tensor Algebras via Variable Projection [0.0]
 Matrix mimeticity arises from interpreting tensors as operators that can be multiplied, factorized, and analyzed analogous to matrices.
We learn optimal linear mappings and corresponding tensor representations without relying on prior knowledge of the data.
We provide original theory of uniqueness of the transformation and convergence analysis of our variable-projection-based algorithm.
 arXiv  Detail & Related papers  (2024-06-11T04:52:23Z)
- CoLA: Exploiting Compositional Structure for Automatic and Efficient
  Numerical Linear Algebra [62.37017125812101]
 We propose a simple but general framework for large-scale linear algebra problems in machine learning, named CoLA.
By combining a linear operator abstraction with compositional dispatch rules, CoLA automatically constructs memory and runtime efficient numerical algorithms.
We showcase its efficacy across a broad range of applications, including partial differential equations, Gaussian processes, equivariant model construction, and unsupervised learning.
 arXiv  Detail & Related papers  (2023-09-06T14:59:38Z)
- The Backpropagation algorithm for a math student [0.0]
 A Deep Neural Network (DNN) is a composite function of vector-valued functions.
The gradient of the loss function of a DNN is a composition of several nonlinear functions, each with numerous parameters.
The objective of this paper is to express the gradient of the loss function in terms of a matrix multiplication using the Jacobian operator.
 arXiv  Detail & Related papers  (2023-01-22T08:45:30Z)
- Decoupling multivariate functions using a nonparametric filtered tensor
  decomposition [0.29360071145551075]
 Decoupling techniques aim at providing an alternative representation of the nonlinearity.
The so-called decoupled form is often a more efficient parameterisation of the relationship while being highly structured, favouring interpretability.
In this work two new algorithms, based on filtered tensor decompositions of first order derivative information are introduced.
 arXiv  Detail & Related papers  (2022-05-23T09:34:17Z)
- Learning Linearized Assignment Flows for Image Labeling [70.540936204654]
 We introduce a novel algorithm for estimating optimal parameters of linearized assignment flows for image labeling.
We show how to efficiently evaluate this formula using a Krylov subspace and a low-rank approximation.
 arXiv  Detail & Related papers  (2021-08-02T13:38:09Z)
- Automatic differentiation for Riemannian optimization on low-rank matrix
  and tensor-train manifolds [71.94111815357064]
 In scientific computing and machine learning applications, matrices and more general multidimensional arrays (tensors) can often be approximated with the help of low-rank decompositions.
One of the popular tools for finding the low-rank approximations is to use the Riemannian optimization.
 arXiv  Detail & Related papers  (2021-03-27T19:56:00Z)
- A graphical calculus for integration over random diagonal unitary
  matrices [1.5229257192293197]
 We provide a graphical calculus for computing averages of tensor network diagrams.
We exploit the order structure of the partially ordered set of uniform block permutations.
A similar calculus is developed for random vectors consisting of independent uniform signs.
 arXiv  Detail & Related papers  (2020-07-22T06:06:51Z)
- Automatic Differentiation in ROOT [62.997667081978825]
 In mathematics and computer algebra, automatic differentiation (AD) is a set of techniques to evaluate the derivative of a function specified by a computer program.
This paper presents AD techniques available in ROOT, supported by Cling, to produce derivatives of arbitrary C/C++ functions.
 arXiv  Detail & Related papers  (2020-04-09T09:18:50Z)
- Invariant Feature Coding using Tensor Product Representation [75.62232699377877]
 We prove that the group-invariant feature vector contains sufficient discriminative information when learning a linear classifier.
A novel feature model that explicitly consider group action is proposed for principal component analysis and k-means clustering.
 arXiv  Detail & Related papers  (2019-06-05T07:15:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
       
     
           This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.