Deep Learning with Parametric Lenses
- URL: http://arxiv.org/abs/2404.00408v1
- Date: Sat, 30 Mar 2024 16:34:28 GMT
- Title: Deep Learning with Parametric Lenses
- Authors: Geoffrey S. H. Cruttwell, Bruno Gavranovic, Neil Ghani, Paul Wilson, Fabio Zanasi
- Abstract summary: We propose a categorical semantics for machine learning algorithms in terms of lenses, parametric maps, and reverse derivative categories.
This foundation provides a powerful explanatory and unifying framework.
We demonstrate the practical significance of our framework with an implementation in Python.
- Score: 0.3645042846301408
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a categorical semantics for machine learning algorithms in terms of lenses, parametric maps, and reverse derivative categories. This foundation provides a powerful explanatory and unifying framework: it encompasses a variety of gradient descent algorithms such as ADAM, AdaGrad, and Nesterov momentum, as well as a variety of loss functions such as MSE and Softmax cross-entropy, and different architectures, shedding new light on their similarities and differences. Furthermore, our approach to learning has examples generalising beyond the familiar continuous domains (modelled in categories of smooth maps) and can be realised in the discrete setting of Boolean and polynomial circuits. We demonstrate the practical significance of our framework with an implementation in Python.
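The abstract's central idea is that a learner is a *parametric lens*: a forward map paired with a reverse map that propagates an output change back to input and parameter changes, with gradient descent itself expressible as a lens-style update. The following is a minimal sketch of that idea, not the authors' Python implementation; the class name `ParaLens` and the composition operator are illustrative choices.

```python
# A minimal sketch of a parametric lens: forward(p, x) -> y together
# with reverse(p, x, dy) -> (dp, dx), the reverse derivative.
import numpy as np

class ParaLens:
    def __init__(self, forward, reverse):
        self.forward = forward
        self.reverse = reverse

    def __rshift__(self, other):
        # Sequential composition: forward passes chain left-to-right,
        # reverse passes chain right-to-left (the lens chain rule).
        def fwd(params, x):
            p1, p2 = params
            return other.forward(p2, self.forward(p1, x))
        def rev(params, x, dy):
            p1, p2 = params
            y1 = self.forward(p1, x)
            dp2, dy1 = other.reverse(p2, y1, dy)
            dp1, dx = self.reverse(p1, x, dy1)
            return (dp1, dp2), dx
        return ParaLens(fwd, rev)

# A linear layer as a parametric lens (the weight matrix is the parameter).
linear = ParaLens(
    forward=lambda w, x: w @ x,
    reverse=lambda w, x, dy: (np.outer(dy, x), w.T @ dy),
)

# MSE loss against a fixed target, also a lens (no parameter of its own).
def mse_lens(target):
    return ParaLens(
        forward=lambda _, y: 0.5 * np.sum((y - target) ** 2),
        reverse=lambda _, y, dl: (None, dl * (y - target)),
    )

# Gradient descent as the update p -> p - lr * dp on the parameter wire.
def sgd_step(w, x, target, lr=0.1):
    model = linear >> mse_lens(target)
    (dw, _), _ = model.reverse((w, None), x, 1.0)
    return w - lr * dw
```

Composing `linear >> mse_lens(target)` runs the forward passes in order and threads the error signal back through both reverse maps, which is exactly how the framework recovers backpropagation from lens composition.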
Related papers
- Model-Based Deep Learning: On the Intersection of Deep Learning and Optimization [101.32332941117271]
Decision making algorithms are used in a multitude of different applications.
Deep learning approaches that use highly parametric architectures tuned from data without relying on mathematical models are becoming increasingly popular.
Model-based optimization and data-centric deep learning are often considered to be distinct disciplines.
arXiv Detail & Related papers (2022-05-05T13:40:08Z)
- Differentiable Spline Approximations [48.10988598845873]
Differentiable programming has significantly enhanced the scope of machine learning.
Standard differentiable programming methods (such as autodiff) typically require that the machine learning models be differentiable.
We show that leveraging a redesigned Jacobian in the form of a differentiable "layer" in predictive models leads to improved performance in diverse applications.
arXiv Detail & Related papers (2021-10-04T16:04:46Z)
- Deep Relational Metric Learning [84.95793654872399]
This paper presents a deep relational metric learning framework for image clustering and retrieval.
We learn an ensemble of features that characterizes an image from different aspects to model both interclass and intraclass distributions.
Experiments on the widely-used CUB-200-2011, Cars196, and Stanford Online Products datasets demonstrate that our framework improves existing deep metric learning methods and achieves very competitive results.
arXiv Detail & Related papers (2021-08-23T09:31:18Z)
- How Fine-Tuning Allows for Effective Meta-Learning [50.17896588738377]
We present a theoretical framework for analyzing representations derived from a MAML-like algorithm.
We provide risk bounds on the best predictor found by fine-tuning via gradient descent, demonstrating that the algorithm can provably leverage the shared structure.
This result underscores the benefit of fine-tuning-based methods, such as MAML, over methods with "frozen representation" objectives in few-shot learning.
arXiv Detail & Related papers (2021-05-05T17:56:00Z)
- Categorical Foundations of Gradient-Based Learning [0.31498833540989407]
We propose a categorical foundation of gradient-based machine learning algorithms in terms of lenses, parametrised maps, and reverse derivative categories.
This foundation provides a powerful explanatory and unifying framework, shedding new light on the similarities and differences between these algorithms.
We also develop a novel implementation of gradient-based learning in Python, informed by the principles introduced by our framework.
arXiv Detail & Related papers (2021-03-02T18:43:10Z)
- Reverse Derivative Ascent: A Categorical Approach to Learning Boolean Circuits [0.0]
We introduce Reverse Derivative Ascent: a categorical analogue of gradient based methods for machine learning.
Our motivating example is Boolean circuits: we show how our algorithm can be applied to such circuits by using the theory of reverse differential categories.
We demonstrate its empirical value by giving experimental results on benchmark machine learning datasets.
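In the Boolean setting, values live in Z2, addition is XOR, and the reverse derivative replaces the gradient. The toy below is a hedged illustration of that idea, not the paper's algorithm or code: it uses the product-rule-mod-2 reverse derivative of AND and updates a single bit-valued parameter by XOR-ing in the computed change.

```python
# Toy reverse-derivative ascent over Z2: all values are bits,
# "addition" is XOR, and the reverse derivative of AND at (a, b)
# sends an output change dy to (b & dy, a & dy).

def and_reverse(a, b, dy):
    # Reverse derivative of AND over Z2 (the product rule mod 2).
    return (b & dy, a & dy)

def rda_step(p, x, target):
    """One update for the one-bit model y = p AND x."""
    y = p & x
    err = y ^ target             # Z2 'error signal'
    dp, _ = and_reverse(p, x, err)
    return p ^ dp                # XOR the parameter change into p

def train(x_target_pairs, p=0, epochs=4):
    for _ in range(epochs):
        for x, t in x_target_pairs:
            p = rda_step(p, x, t)
    return p
```

Once the model output matches the target, the error signal is 0 and the parameter stops changing, mirroring how a zero gradient fixes a parameter in the continuous setting.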
arXiv Detail & Related papers (2021-01-26T00:07:20Z)
- Neural Mechanics: Symmetry and Broken Conservation Laws in Deep Learning Dynamics [26.485269202381932]
Understanding the dynamics of neural network parameters during training is one of the key challenges in building a theoretical foundation for deep learning.
We show that any differentiable symmetry of the network architecture imposes stringent geometric constraints on gradients and Hessians, leading to an associated conservation law.
We apply tools from finite difference methods to derive modified gradient flow, a differential equation that better approximates the numerical trajectory taken by SGD at finite learning rates.
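The modified-equation idea can be seen on a one-dimensional quadratic loss, where all three trajectories have closed forms. The sketch below is an illustrative hedge, not the paper's derivation: for L(t) = 0.5*a*t**2, plain gradient flow gives exp(-a*k*lr) after k steps of "time" lr, while a first-order learning-rate correction to the flow tracks the exact discrete SGD iterate (1 - lr*a)**k more closely.

```python
# Compare discrete SGD on L(t) = 0.5*a*t**2 against plain gradient
# flow and a learning-rate-corrected ("modified") gradient flow.
import math

a, lr, t0, steps = 1.0, 0.1, 1.0, 20

# Exact discrete SGD endpoint: t_{k+1} = (1 - lr*a) * t_k.
sgd = t0 * (1.0 - lr * a) ** steps

# Continuous gradient flow evaluated at total time steps * lr.
flow = t0 * math.exp(-a * steps * lr)

# Modified flow d(theta)/dt = -a*theta - (lr/2)*a**2*theta, whose
# O(lr) correction matches the second-order term of log(1 - lr*a).
modified = t0 * math.exp(-a * (1.0 + lr * a / 2.0) * steps * lr)

# The corrected flow approximates the discrete trajectory better.
assert abs(modified - sgd) < abs(flow - sgd)
```

The correction term here comes from expanding log(1 - lr*a) to second order; the coefficient and form of the correction in the general (non-quadratic) case follow the paper's finite-difference analysis, which this toy does not reproduce.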
arXiv Detail & Related papers (2020-12-08T20:33:30Z)
- Functorial Manifold Learning [1.14219428942199]
We first characterize manifold learning algorithms as functors that map pseudometric spaces to optimization objectives.
We then use this characterization to prove refinement bounds on manifold learning loss functions and construct a hierarchy of manifold learning algorithms.
We express several popular manifold learning algorithms as functors at different levels of this hierarchy, including Metric Multidimensional Scaling, IsoMap, and UMAP.
arXiv Detail & Related papers (2020-11-15T02:30:23Z)
- Reinforcement Learning as Iterative and Amortised Inference [62.997667081978825]
We use the control as inference framework to outline a novel classification scheme based on amortised and iterative inference.
We show that taking this perspective allows us to identify parts of the algorithmic design space which have been relatively unexplored.
arXiv Detail & Related papers (2020-06-13T16:10:03Z)
- Stochastic Flows and Geometric Optimization on the Orthogonal Group [52.50121190744979]
We present a new class of geometrically-driven optimization algorithms on the orthogonal group $O(d)$.
We show that our methods can be applied in various fields of machine learning including deep, convolutional and recurrent neural networks, reinforcement learning, flows and metric learning.
arXiv Detail & Related papers (2020-03-30T15:37:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.