Reverse Derivative Ascent: A Categorical Approach to Learning Boolean
Circuits
- URL: http://arxiv.org/abs/2101.10488v1
- Date: Tue, 26 Jan 2021 00:07:20 GMT
- Title: Reverse Derivative Ascent: A Categorical Approach to Learning Boolean
Circuits
- Authors: Paul Wilson (University of Southampton), Fabio Zanasi (University
College London)
- Abstract summary: We introduce Reverse Derivative Ascent: a categorical analogue of gradient based methods for machine learning.
Our motivating example is boolean circuits: we show how our algorithm can be applied to such circuits by using the theory of reverse differential categories.
We demonstrate its empirical value by giving experimental results on benchmark machine learning datasets.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce Reverse Derivative Ascent: a categorical analogue of gradient
based methods for machine learning. Our algorithm is defined at the level of
so-called reverse differential categories. It can be used to learn the
parameters of models which are expressed as morphisms of such categories. Our
motivating example is boolean circuits: we show how our algorithm can be
applied to such circuits by using the theory of reverse differential
categories. Note our methodology allows us to learn the parameters of boolean
circuits directly, in contrast to existing binarised neural network approaches.
Moreover, we demonstrate its empirical value by giving experimental results on
benchmark machine learning datasets.
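To make the idea concrete, the sketch below is a minimal, illustrative Python rendering of the reverse-derivative idea over Z2 (bits with XOR and AND). It is not the authors' implementation: the reverse derivative is computed here by brute-force boolean differences f(x) XOR f(x with bit i flipped) rather than compositionally from the circuit structure, and the model `masked_xor`, the XOR update rule, and all function names are assumptions made only for illustration.
```python
import random

def reverse_derivative(f, point, dy):
    """Reverse derivative of a boolean map f: {0,1}^n -> {0,1}^m at `point`,
    applied to an output change `dy` (a length-m bit list). Over Z2 the
    partial derivative w.r.t. bit i is f(x) XOR f(x with bit i flipped);
    the reverse derivative pulls dy back through the transposed Jacobian."""
    y = f(point)
    dx = []
    for i in range(len(point)):
        flipped = list(point)
        flipped[i] ^= 1
        column = [a ^ b for a, b in zip(y, f(flipped))]  # i-th Jacobian column over Z2
        acc = 0
        for dyj, jji in zip(dy, column):
            acc ^= dyj & jji
        dx.append(acc)
    return dx

def rda_step(model, params, x, target):
    """One reverse-derivative-ascent-style update for a parametrised boolean
    model(params, x) -> bits: XOR the parameter part of the reverse
    derivative of the error signal into the parameters."""
    f = lambda pa: model(pa[:len(params)], pa[len(params):])
    dy = [yi ^ ti for yi, ti in zip(model(params, x), target)]  # error over Z2
    dp = reverse_derivative(f, list(params) + list(x), dy)[:len(params)]
    return [pi ^ di for pi, di in zip(params, dp)]

# Toy model (an assumption for illustration): apply a learnable bit mask by XOR.
def masked_xor(p, a):
    return [pi ^ ai for pi, ai in zip(p, a)]

params = [0, 0]
for _ in range(5):
    a = [random.randint(0, 1), random.randint(0, 1)]
    target = [a[0] ^ 1, a[1]]          # data generated by the hidden mask [1, 0]
    params = rda_step(masked_xor, params, a, target)
print(params)                          # converges to the hidden mask [1, 0]
```
Because the toy model is linear over Z2, a single update already recovers the hidden mask; the sketch only illustrates the shape of the procedure (error signal, reverse derivative, XOR update), not its performance.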
Related papers
- Modelling Arbitrary Computations in the Symbolic Model using an Equational Theory for Bounded Binary Circuits [0.0]
We propose a class of equational theories for bounded binary circuits with the finite variant property. These theories could serve as a building block to specify cryptographic primitive implementations.
arXiv Detail & Related papers (2025-07-29T12:09:50Z) - Learning on Transformers is Provable Low-Rank and Sparse: A One-layer Analysis [63.66763657191476]
We show that efficient numerical training and inference algorithms, such as low-rank computation, achieve impressive performance for learning Transformer-based adaptation.
We analyze how magnitude-based pruning affects generalization while improving adaptation.
We conclude that proper magnitude-based pruning has only a slight effect on the testing performance.
arXiv Detail & Related papers (2024-06-24T23:00:58Z) - Deep Learning with Parametric Lenses [0.3645042846301408]
We propose a categorical semantics for machine learning algorithms in terms of lenses, parametric maps, and reverse derivative categories.
This foundation provides a powerful explanatory and unifying framework.
We demonstrate the practical significance of our framework with an implementation in Python.
arXiv Detail & Related papers (2024-03-30T16:34:28Z) - Uncovering Intermediate Variables in Transformers using Circuit Probing [32.382094867951224]
We propose a new analysis technique -- circuit probing -- that automatically uncovers low-level circuits that compute hypothesized intermediate variables.
We apply this method to models trained on simple arithmetic tasks, demonstrating its effectiveness at (1) deciphering the algorithms that models have learned, (2) revealing modular structure within a model, and (3) tracking the development of circuits over training.
arXiv Detail & Related papers (2023-11-07T21:27:17Z) - Unsupervised Learning of Invariance Transformations [105.54048699217668]
We develop an algorithmic framework for finding approximate graph automorphisms.
We discuss how this framework can be used to find approximate automorphisms in weighted graphs in general.
arXiv Detail & Related papers (2023-07-24T17:03:28Z) - What learning algorithm is in-context learning? Investigations with
linear models [87.91612418166464]
We investigate the hypothesis that transformer-based in-context learners implement standard learning algorithms implicitly.
We show that trained in-context learners closely match the predictors computed by gradient descent, ridge regression, and exact least-squares regression.
We present preliminary evidence that in-context learners share algorithmic features with these predictors.
arXiv Detail & Related papers (2022-11-28T18:59:51Z) - Equivariance with Learned Canonicalization Functions [77.32483958400282]
We show that learning a small neural network to perform canonicalization is better than using predefined canonicalization functions.
Our experiments show that learning the canonicalization function is competitive with existing techniques for learning equivariant functions across many tasks.
arXiv Detail & Related papers (2022-11-11T21:58:15Z) - Categories of Differentiable Polynomial Circuits for Machine Learning [0.76146285961466]
We study presentations by generators and equations of classes of RDCs.
We propose polynomial circuits as a suitable machine learning model.
arXiv Detail & Related papers (2022-03-12T13:03:30Z) - Fair Interpretable Representation Learning with Correction Vectors [60.0806628713968]
We propose a new framework for fair representation learning that is centered around the learning of "correction vectors".
We show experimentally that several fair representation learning models constrained in such a way do not exhibit losses in ranking or classification performance.
arXiv Detail & Related papers (2022-02-07T11:19:23Z) - Training Neural Networks Using the Property of Negative Feedback to
Inverse a Function [0.0]
This paper describes how the ability of a negative feedback system to compute the inverse of a function can be used for training neural networks.
We have applied this method to the MNIST dataset and obtained results that show the method is viable for neural network training.
arXiv Detail & Related papers (2021-03-25T20:13:53Z) - Categorical Foundations of Gradient-Based Learning [0.31498833540989407]
We propose a categorical foundation of gradient-based machine learning algorithms in terms of lenses, parametrised maps, and reverse derivative categories.
This provides a powerful explanatory and unifying framework, shedding new light on the similarities and differences of such algorithms; a minimal sketch of the lens idea appears after this list.
We also develop a novel implementation of gradient-based learning in Python, informed by the principles introduced by our framework.
arXiv Detail & Related papers (2021-03-02T18:43:10Z) - Interpolation Technique to Speed Up Gradients Propagation in Neural ODEs [71.26657499537366]
We propose a simple interpolation-based method for the efficient approximation of gradients in neural ODE models.
We compare it with the reverse dynamic method to train neural ODEs on classification, density estimation, and inference approximation tasks.
arXiv Detail & Related papers (2020-03-11T13:15:57Z)
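The lens-based entries above ("Deep Learning with Parametric Lenses" and "Categorical Foundations of Gradient-Based Learning") model a learner as a parametric lens: a forward map paired with a reverse map that sends an output change back to changes of the parameters and of the input, with such lenses composing sequentially. The sketch below is a minimal illustration of that shape, not the papers' implementation; the class name ParametricLens, the then method, and the affine example are assumptions made for illustration.
```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ParametricLens:
    """A parametric lens: forward(params, x) -> y together with
    reverse(params, x, dy) -> (dparams, dx), which propagates an output
    change back to the parameters and to the input."""
    forward: Callable
    reverse: Callable

    def then(self, other: "ParametricLens") -> "ParametricLens":
        """Sequential composition; the composite's parameters are a pair."""
        def fwd(p, x):
            p1, p2 = p
            return other.forward(p2, self.forward(p1, x))
        def rev(p, x, dy):
            p1, p2 = p
            mid = self.forward(p1, x)
            dp2, dmid = other.reverse(p2, mid, dy)
            dp1, dx = self.reverse(p1, x, dmid)
            return (dp1, dp2), dx
        return ParametricLens(fwd, rev)

# Example: a scalar affine map y = w*x + b as a parametric lens over the reals,
# whose reverse map is the usual gradient back-propagation rule.
affine = ParametricLens(
    forward=lambda p, x: p[0] * x + p[1],
    reverse=lambda p, x, dy: ((dy * x, dy), dy * p[0]),  # ((dw, db), dx)
)

two_layer = affine.then(affine)
print(two_layer.forward(((2.0, 1.0), (3.0, 0.5)), 1.0))       # 3*(2*1+1)+0.5 = 9.5
print(two_layer.reverse(((2.0, 1.0), (3.0, 0.5)), 1.0, 1.0))  # (((3.0, 3.0), (3.0, 1.0)), 6.0)
```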
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality or accuracy of the listed information and is not responsible for any consequences of its use.