Categorical Foundations of Gradient-Based Learning
- URL: http://arxiv.org/abs/2103.01931v1
- Date: Tue, 2 Mar 2021 18:43:10 GMT
- Title: Categorical Foundations of Gradient-Based Learning
- Authors: G.S.H. Cruttwell, Bruno Gavranović, Neil Ghani, Paul Wilson, Fabio Zanasi
- Abstract summary: We propose a categorical foundation of gradient-based machine learning algorithms in terms of lenses, parametrised maps, and reverse derivative categories.
It provides a powerful explanatory and unifying framework, shedding new light on their similarities and differences.
We also develop a novel implementation of gradient-based learning in Python, informed by the principles introduced by our framework.
- Score: 0.31498833540989407
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a categorical foundation of gradient-based machine learning
algorithms in terms of lenses, parametrised maps, and reverse derivative
categories. This foundation provides a powerful explanatory and unifying
framework: it encompasses a variety of gradient descent algorithms such as
ADAM, AdaGrad, and Nesterov momentum, as well as a variety of loss functions
such as MSE and Softmax cross-entropy, shedding new light on their
similarities and differences. Our approach also generalises beyond neural
networks (modelled in categories of smooth maps), accounting for other
structures relevant to gradient-based learning such as boolean circuits.
Finally, we also develop a novel implementation of gradient-based learning in
Python, informed by the principles introduced by our framework.
Related papers
- Revisiting Nearest Neighbor for Tabular Data: A Deep Tabular Baseline Two Decades Later [76.66498833720411]
We introduce a differentiable version of $K$-nearest neighbors (KNN), Neighbourhood Components Analysis (NCA), originally designed to learn a linear projection that captures semantic similarities between instances.
Surprisingly, our implementation of NCA using SGD and without dimensionality reduction already achieves decent performance on tabular data.
We conclude our paper by analyzing the factors behind these improvements, including loss functions, prediction strategies, and deep architectures.
arXiv Detail & Related papers (2024-07-03T16:38:57Z) - Deep Learning with Parametric Lenses [0.3645042846301408]
We propose a categorical semantics for machine learning algorithms in terms of lenses, parametric maps, and reverse derivative categories.
This foundation provides a powerful explanatory and unifying framework.
We demonstrate the practical significance of our framework with an implementation in Python.
arXiv Detail & Related papers (2024-03-30T16:34:28Z) - Fundamental Components of Deep Learning: A category-theoretic approach [0.0]
This thesis develops a novel mathematical foundation for deep learning based on the language of category theory.
We also systematise existing approaches, placing many previously separate constructions and concepts under the same umbrella.
arXiv Detail & Related papers (2024-03-13T01:29:40Z) - Training morphological neural networks with gradient descent: some theoretical insights [0.40792653193642503]
We investigate the potential and limitations of differentiation-based approaches and back-propagation applied to morphological networks.
We provide insights and first theoretical guidelines, in particular regarding learning rates.
arXiv Detail & Related papers (2024-02-05T12:11:15Z) - Efficient and Flexible Neural Network Training through Layer-wise Feedback Propagation [49.44309457870649]
Layer-wise Feedback Propagation (LFP) is a novel training principle for neural-network-like predictors. LFP decomposes a reward to individual neurons based on their respective contributions. Our method then implements a greedy approach, reinforcing helpful parts of the network and weakening harmful ones.
arXiv Detail & Related papers (2023-08-23T10:48:28Z) - The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
We propose a new learning framework for neural networks, namely the Cascaded Forward (CaFo) algorithm, which, like the Forward-Forward (FF) algorithm, does not rely on backpropagation (BP) for optimization.
Unlike FF, our framework directly outputs label distributions at each cascaded block, which does not require generation of additional negative samples.
In our framework each block can be trained independently, so it can be easily deployed into parallel acceleration systems.
arXiv Detail & Related papers (2023-03-17T02:01:11Z) - Equivariant Architectures for Learning in Deep Weight Spaces [54.61765488960555]
We present a novel network architecture for learning in deep weight spaces.
It takes as input a concatenation of the weights and biases of a pre-trained MLP.
We show how these layers can be implemented using three basic operations.
arXiv Detail & Related papers (2023-01-30T10:50:33Z) - Scaling Forward Gradient With Local Losses [117.22685584919756]
Forward learning is a biologically plausible alternative to backprop for learning deep neural networks.
We show that it is possible to substantially reduce the variance of the forward gradient by applying perturbations to activations rather than weights.
Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet.
arXiv Detail & Related papers (2022-10-07T03:52:27Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Reverse Derivative Ascent: A Categorical Approach to Learning Boolean Circuits [0.0]
We introduce Reverse Derivative Ascent: a categorical analogue of gradient based methods for machine learning.
Our motivating example is boolean circuits: we show how our algorithm can be applied to such circuits by using the theory of reverse differential categories (a minimal sketch of this idea appears after this list).
We demonstrate its empirical value by giving experimental results on benchmark machine learning datasets.
arXiv Detail & Related papers (2021-01-26T00:07:20Z) - Functorial Manifold Learning [1.14219428942199]
We first characterize manifold learning algorithms as functors that map pseudometric spaces to optimization objectives.
We then use this characterization to prove refinement bounds on manifold learning loss functions and construct a hierarchy of manifold learning algorithms.
We express several popular manifold learning algorithms as functors at different levels of this hierarchy, including Metric Multidimensional Scaling, IsoMap, and UMAP.
arXiv Detail & Related papers (2020-11-15T02:30:23Z) - Reinforcement Learning as Iterative and Amortised Inference [62.997667081978825]
We use the control as inference framework to outline a novel classification scheme based on amortised and iterative inference.
We show that taking this perspective allows us to identify parts of the algorithmic design space which have been relatively unexplored.
arXiv Detail & Related papers (2020-06-13T16:10:03Z) - FLAT: Few-Shot Learning via Autoencoding Transformation Regularizers [67.46036826589467]
We present a novel regularization mechanism by learning the change of feature representations induced by a distribution of transformations without using the labels of data examples.
It could minimize the risk of overfitting to base categories by inspecting the transformation-augmented variations at the encoded feature level.
Experimental results show superior performance compared to current state-of-the-art methods in the literature.
arXiv Detail & Related papers (2019-12-29T15:26:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.