Matrix Calculus (for Machine Learning and Beyond)
- URL: http://arxiv.org/abs/2501.14787v1
- Date: Tue, 07 Jan 2025 18:38:35 GMT
- Title: Matrix Calculus (for Machine Learning and Beyond)
- Authors: Paige Bright, Alan Edelman, Steven G. Johnson
- Abstract summary: This course introduces the extension of differential calculus to functions on more general vector spaces.
It emphasizes practical computational applications, such as large-scale optimization and machine learning.
- Score: 0.2647285178819813
- Abstract: This course, intended for undergraduates familiar with elementary calculus and linear algebra, introduces the extension of differential calculus to functions on more general vector spaces, such as functions that take as input a matrix and return a matrix inverse or factorization, derivatives of ODE solutions, and even stochastic derivatives of random functions. It emphasizes practical computational applications, such as large-scale optimization and machine learning, where derivatives must be re-imagined in order to be propagated through complicated calculations. The class also discusses efficiency concerns leading to "adjoint" or "reverse-mode" differentiation (a.k.a. "backpropagation"), and gives a gentle introduction to modern automatic differentiation (AD) techniques.
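As a concrete taste of the matrix derivatives the course covers (an illustrative sketch, not taken from the course materials): the identity d(X^{-1}) = -X^{-1} dX X^{-1} gives, for f(X) = tr(X^{-1}), the gradient G = -(X^{-2})^T, which the NumPy snippet below checks against a finite difference.

```python
import numpy as np

# Illustrative sketch (not from the course materials): for f(X) = tr(X^{-1}),
# d(X^{-1}) = -X^{-1} dX X^{-1} gives df = tr(G^T dX) with G = -(X^{-2})^T.

def f(X):
    return np.trace(np.linalg.inv(X))

def grad_f(X):
    Xinv = np.linalg.inv(X)
    return -(Xinv @ Xinv).T

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4)) + 4 * np.eye(4)  # keep X well-conditioned
dX = rng.standard_normal((4, 4))
eps = 1e-6

fd = (f(X + eps * dX) - f(X - eps * dX)) / (2 * eps)  # central difference
analytic = np.sum(grad_f(X) * dX)                     # equals tr(G^T dX)
print(fd, analytic)                                   # agree to ~1e-8
```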
Related papers
- Spectral-factorized Positive-definite Curvature Learning for NN Training [39.296923519945814]
Training methods such as Adam(W) and Shampoo learn a positive-definite curvature matrix and apply its inverse root to precondition the gradient.
We propose a Riemannian optimization approach that dynamically adapts spectral-factorized positive-definite curvature estimates.
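A minimal sketch of the generic pattern this entry describes, assuming a Shampoo-style accumulation of gradient outer products; it is not the paper's Riemannian method. The eigendecomposition is the spectral factorization that makes arbitrary matrix roots cheap to apply.

```python
import numpy as np

# Illustrative sketch: maintain a positive-definite curvature estimate C and
# precondition the gradient with its inverse root, as Adam(W)/Shampoo-style
# methods do. Not the paper's Riemannian approach.

def inverse_root(C, p=2, eps=1e-8):
    """C^{-1/p} for symmetric positive-definite C, via eigendecomposition."""
    w, Q = np.linalg.eigh(C)
    return Q @ np.diag((w + eps) ** (-1.0 / p)) @ Q.T

rng = np.random.default_rng(0)
d = 5
g = rng.standard_normal(d)               # current gradient
C = np.eye(d)                            # running curvature estimate
for _ in range(10):                      # accumulate gg^T, Shampoo-style
    gi = rng.standard_normal(d)
    C = 0.9 * C + 0.1 * np.outer(gi, gi)

step = inverse_root(C, p=2) @ g          # preconditioned update direction
```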
arXiv Detail & Related papers (2025-02-10T09:07:04Z)
- Learning Linear Attention in Polynomial Time [115.68795790532289]
We provide the first results on learnability of single-layer Transformers with linear attention.
We show that linear attention may be viewed as a linear predictor in a suitably defined RKHS.
We show how to efficiently identify training datasets for which every empirical risk minimizer is equivalent to the linear Transformer.
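A sketch of the single-layer linear-attention model being studied (the weight names are generic placeholders): with the softmax removed, the output is linear in a fixed parameter tensor applied to degree-3 features of the input, which is what makes an RKHS view possible.

```python
import numpy as np

# Sketch of single-layer linear attention (softmax removed); Wq, Wk, Wv are
# placeholder weight matrices.

def linear_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    return (Q @ K.T) @ V       # attention scores enter linearly

rng = np.random.default_rng(0)
T, d = 6, 4                    # sequence length, model width
X = rng.standard_normal((T, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
Y = linear_attention(X, Wq, Wk, Wv)

# Y = X Wq Wk^T X^T X Wv: linear in the tensor (Wq Wk^T) (x) Wv applied to
# degree-3 monomials of X, i.e. a linear predictor over a fixed feature map.
```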
arXiv Detail & Related papers (2024-10-14T02:41:01Z)
- A Physics-Informed Machine Learning Approach for Solving Distributed Order Fractional Differential Equations [0.0]
This paper introduces a novel methodology for solving distributed-order fractional differential equations using a physics-informed machine learning framework.
By embedding the distributed-order functional equation into the support vector regression (SVR) framework, we incorporate physical laws directly into the learning process.
The effectiveness of the proposed approach is validated through a series of numerical experiments on Caputo-based distributed-order fractional differential equations.
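A toy sketch of the physics-informed idea under heavy simplification: a plain first-order ODE stands in for the distributed-order fractional operator, and linear least squares stands in for SVR. The physical law enters as residual rows of the regression system.

```python
import numpy as np

# Toy physics-informed fit (illustrative only): solve u' + u = 0, u(0) = 1
# by fitting u(t) = sum_k w_k t^k so the ODE residual vanishes at
# collocation points. The exact solution is exp(-t).

deg, lam = 8, 100.0
t = np.linspace(0.0, 2.0, 40)                    # collocation points
Phi  = np.vander(t, deg + 1, increasing=True)    # basis t^k
dPhi = np.hstack([np.zeros((t.size, 1)),
                  Phi[:, :-1] * np.arange(1, deg + 1)])  # d/dt of basis

A = np.vstack([dPhi + Phi,                       # ODE residual rows
               lam * Phi[:1]])                   # initial-condition row
b = np.concatenate([np.zeros(t.size), [lam]])    # residual = 0, u(0) = 1
w = np.linalg.lstsq(A, b, rcond=None)[0]

print(np.max(np.abs(Phi @ w - np.exp(-t))))      # small approximation error
```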
arXiv Detail & Related papers (2024-09-05T13:20:10Z)
- CoLA: Exploiting Compositional Structure for Automatic and Efficient Numerical Linear Algebra [62.37017125812101]
We propose a simple but general framework for large-scale linear algebra problems in machine learning, named CoLA.
By combining a linear operator abstraction with compositional dispatch rules, CoLA automatically constructs memory- and runtime-efficient numerical algorithms.
We showcase its efficacy across a broad range of applications, including partial differential equations, Gaussian processes, equivariant model construction, and unsupervised learning.
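A generic sketch of the compositional idea using hypothetical classes (not CoLA's actual API): operators expose matrix-vector products, composition builds new operators lazily, and structure-aware dispatch picks cheap algorithms.

```python
import numpy as np

# Hypothetical operator classes illustrating compositional dispatch: a
# diagonal operator admits an O(n) solve, and a Sum operator composes
# matvecs without ever forming a dense matrix.

class Diagonal:
    def __init__(self, d): self.d = d
    def matvec(self, x): return self.d * x
    def solve(self, b):  return b / self.d          # O(n) dispatch rule

class Sum:
    def __init__(self, A, B): self.A, self.B = A, B
    def matvec(self, x): return self.A.matvec(x) + self.B.matvec(x)

rng = np.random.default_rng(0)
n = 1000
D = Diagonal(1.0 + rng.random(n))
x = rng.standard_normal(n)
print(np.allclose(D.matvec(D.solve(x)), x))         # True, no O(n^3) work

S = Sum(D, Diagonal(np.ones(n)))                    # lazy composition
print(np.allclose(S.matvec(x), D.matvec(x) + x))    # True
```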
arXiv Detail & Related papers (2023-09-06T14:59:38Z)
- Efficient and Sound Differentiable Programming in a Functional Array-Processing Language [4.1779847272994495]
Automatic differentiation (AD) is a technique for computing the derivative of a function represented by a program.
We present an AD system for a higher-order functional array-processing language.
In combination, these optimizations allow gradient computation with forward-mode AD to be as efficient as reverse mode.
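A minimal forward-mode AD sketch using dual numbers, a generic illustration rather than the paper's array-language system: every value carries its derivative, so a single forward pass produces f(x) and f'(x) together.

```python
# Forward-mode AD via dual numbers: each value carries its derivative
# ("dot"), propagated by the sum and product rules.

class Dual:
    def __init__(self, val, dot):
        self.val, self.dot = val, dot
    def __add__(self, o):
        return Dual(self.val + o.val, self.dot + o.dot)
    def __mul__(self, o):
        return Dual(self.val * o.val,
                    self.val * o.dot + self.dot * o.val)  # product rule

def f(x):                    # f(x) = x^3 + x, so f'(x) = 3x^2 + 1
    return x * x * x + x

y = f(Dual(2.0, 1.0))        # seed dx/dx = 1
print(y.val, y.dot)          # 10.0 13.0
```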
arXiv Detail & Related papers (2022-12-20T14:54:47Z)
- Combinatory Adjoints and Differentiation [0.0]
We develop a compositional approach for automatic and symbolic differentiation based on categorical constructions in functional analysis.
We show that both symbolic and automatic differentiation can be performed using a differential calculus for generating linear functions.
We also provide a calculus for symbolically computing the adjoint of a derivative without using matrices.
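A small illustration of the matrix-free idea (generic, not the paper's categorical machinery): the derivative is kept as a linear function v -> Jv and its adjoint as another function w -> J^T w, with the pairing <Jv, w> = <v, J^T w> checked numerically.

```python
import numpy as np

# Derivative and adjoint as linear *functions*; no Jacobian matrix is built.
# f = cumulative sum is linear, so its derivative is f itself and the
# adjoint is a reversed cumulative sum.

def f(x):
    return np.cumsum(x)

def deriv(x):
    """Derivative of f at x, as a linear map v -> Jv."""
    return lambda v: np.cumsum(v)

def adjoint_deriv(x):
    """Adjoint of the derivative, as a linear map w -> J^T w."""
    return lambda w: np.cumsum(w[::-1])[::-1]

rng = np.random.default_rng(0)
x, v, w = (rng.standard_normal(5) for _ in range(3))
lhs = np.dot(deriv(x)(v), w)            # <Jv, w>
rhs = np.dot(v, adjoint_deriv(x)(w))    # <v, J^T w>
print(np.isclose(lhs, rhs))             # True: adjoint pairing holds
```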
arXiv Detail & Related papers (2022-07-02T14:34:54Z)
- Efficient and Modular Implicit Differentiation [68.74748174316989]
We propose a unified, efficient and modular approach for implicit differentiation of optimization problems.
We show that seemingly simple principles allow us to recover many recently proposed implicit differentiation methods and to create new ones easily.
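The textbook identity behind implicit differentiation, in a one-variable sketch (illustrative, not the paper's library): if x*(theta) satisfies F(x*, theta) = 0, then dx*/dtheta = -(dF/dx)^{-1} dF/dtheta, so the solver itself never has to be differentiated through.

```python
import numpy as np

# Implicit differentiation of x*(theta) defined by F(x, theta) = 0.

def F(x, theta):             # root condition: x^2 - theta = 0
    return x**2 - theta

theta = 4.0
x_star = np.sqrt(theta)      # any black-box solver could produce this

dF_dx     = 2 * x_star       # partials of F at the solution
dF_dtheta = -1.0
dx_dtheta = -dF_dtheta / dF_dx

print(dx_dtheta, 1 / (2 * np.sqrt(theta)))  # both 0.25
```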
arXiv Detail & Related papers (2021-05-31T17:45:58Z)
- Automatic differentiation for Riemannian optimization on low-rank matrix and tensor-train manifolds [71.94111815357064]
In scientific computing and machine learning applications, matrices and more general multidimensional arrays (tensors) can often be approximated with the help of low-rank decompositions.
One popular tool for finding such low-rank approximations is Riemannian optimization.
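A sketch of one standard Riemannian gradient step on the manifold of rank-r matrices (tangent-space projection followed by an SVD retraction); this illustrates the optimization setting, not the paper's AD machinery.

```python
import numpy as np

# One Riemannian step on the rank-r manifold: for X = U S V^T, project the
# Euclidean gradient G to the tangent space,
#   P(G) = U U^T G + G V V^T - U U^T G V V^T,
# then retract via truncated SVD.

def riemannian_step(U, s, Vt, G, lr, r):
    PU, PV = U @ U.T, Vt.T @ Vt
    PG = PU @ G + G @ PV - PU @ G @ PV          # tangent projection
    Y = U @ np.diag(s) @ Vt - lr * PG
    U2, s2, Vt2 = np.linalg.svd(Y, full_matrices=False)
    return U2[:, :r], s2[:r], Vt2[:r]           # retraction: truncate rank

rng = np.random.default_rng(0)
m, n, r = 8, 6, 2
target = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
U, s, Vt = np.linalg.svd(rng.standard_normal((m, n)), full_matrices=False)
U, s, Vt = U[:, :r], s[:r], Vt[:r]

for _ in range(200):                            # minimize ||X - target||^2
    X = U @ np.diag(s) @ Vt
    U, s, Vt = riemannian_step(U, s, Vt, 2 * (X - target), lr=0.1, r=r)
print(np.linalg.norm(U @ np.diag(s) @ Vt - target))  # near zero
```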
arXiv Detail & Related papers (2021-03-27T19:56:00Z)
- Efficient Learning of Generative Models via Finite-Difference Score Matching [111.55998083406134]
We present a generic strategy to efficiently approximate any-order directional derivative with finite difference.
Our approximation only involves function evaluations, which can be executed in parallel, and no gradient computations.
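A sketch of the central-difference approximation this entry refers to: first- and second-order directional derivatives from function evaluations alone, which can be batched in parallel.

```python
import numpy as np

# Directional derivatives of any order from function evaluations only; no
# gradient computations are needed, and the evaluations can run in parallel.

def f(x):
    return np.sum(np.sin(x) ** 2)

def dir_deriv(f, x, v, eps=1e-5):
    """First directional derivative  grad f(x) . v  (central difference)."""
    return (f(x + eps * v) - f(x - eps * v)) / (2 * eps)

def dir_deriv2(f, x, v, eps=1e-4):
    """Second directional derivative  v^T H(x) v."""
    return (f(x + eps * v) - 2 * f(x) + f(x - eps * v)) / eps**2

rng = np.random.default_rng(0)
x, v = rng.standard_normal(4), rng.standard_normal(4)
exact = np.dot(2 * np.sin(x) * np.cos(x), v)     # analytic grad f . v
print(dir_deriv(f, x, v), exact)                 # agree to ~1e-9
```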
arXiv Detail & Related papers (2020-07-07T10:05:01Z)
- Automatic Differentiation in ROOT [62.997667081978825]
In mathematics and computer algebra, automatic differentiation (AD) is a set of techniques to evaluate the derivative of a function specified by a computer program.
This paper presents AD techniques available in ROOT, supported by Cling, to produce derivatives of arbitrary C/C++ functions.
arXiv Detail & Related papers (2020-04-09T09:18:50Z)