Related papers: Towards a Categorical Foundation of Deep Learning: A Survey

Towards a Categorical Foundation of Deep Learning: A Survey

URL: http://arxiv.org/abs/2410.05353v2
Date: Mon, 14 Oct 2024 18:35:08 GMT
Title: Towards a Categorical Foundation of Deep Learning: A Survey
Authors: Francesco Riccardo Crescenzi,
Abstract summary: This thesis is a survey that covers some recent work attempting to study machine learning categorically. acting as a lingua franca of mathematics and science, category theory might be able to give a unifying structure to the field of machine learning.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The unprecedented pace of machine learning research has lead to incredible advances, but also poses hard challenges. At present, the field lacks strong theoretical underpinnings, and many important achievements stem from ad hoc design choices which are hard to justify in principle and whose effectiveness often goes unexplained. Research debt is increasing and many papers are found not to be reproducible. This thesis is a survey that covers some recent work attempting to study machine learning categorically. Category theory is a branch of abstract mathematics that has found successful applications in many fields, both inside and outside mathematics. Acting as a lingua franca of mathematics and science, category theory might be able to give a unifying structure to the field of machine learning. This could solve some of the aforementioned problems. In this work, we mainly focus on the application of category theory to deep learning. Namely, we discuss the use of categorical optics to model gradient-based learning, the use of categorical algebras and integral transforms to link classical computer science to neural networks, the use of functors to link different layers of abstraction and preserve structure, and, finally, the use of string diagrams to provide detailed representations of neural network architectures.

Related papers

Machine Learning: a Lecture Note [51.31735291774885]
This lecture note is intended to prepare early-year master's and PhD students in data science or a related discipline with foundational ideas in machine learning.<n>It starts with basic ideas in modern machine learning with classification as a main target task.<n>Based on these basic ideas, the lecture note explores in depth the probablistic approach to unsupervised learning.
arXiv Detail & Related papers (2025-05-06T16:03:41Z)
Sheaf theory: from deep geometry to deep learning [0.3749861135832073]
This paper provides an overview of the applications of sheaf theory in deep learning, data science, and computer science. We describe intuitions and motivations underlying sheaf theory shared by both theoretical researchers and practitioners. We present a new algorithm to compute sheaf cohomology on arbitrary finite posets in response.
arXiv Detail & Related papers (2025-02-21T14:00:25Z)
Ontology Embedding: A Survey of Methods, Applications and Resources [54.3453925775069]
Ontologies are widely used for representing domain knowledge and meta data. One straightforward solution is to integrate statistical analysis and machine learning. Numerous papers have been published on embedding, but a lack of systematic reviews hinders researchers from gaining a comprehensive understanding of this field.
arXiv Detail & Related papers (2024-06-16T14:49:19Z)
Fundamental Components of Deep Learning: A category-theoretic approach [0.0]
This thesis develops a novel mathematical foundation for deep learning based on the language of category theory. We also systematise many existing approaches, placing many existing constructions and concepts under the same umbrella.
arXiv Detail & Related papers (2024-03-13T01:29:40Z)
A Review of Neuroscience-Inspired Machine Learning [58.72729525961739]
Bio-plausible credit assignment is compatible with practically any learning condition and is energy-efficient. In this paper, we survey several vital algorithms that model bio-plausible rules of credit assignment in artificial neural networks. We conclude by discussing the future challenges that will need to be addressed in order to make such algorithms more useful in practical applications.
arXiv Detail & Related papers (2024-02-16T18:05:09Z)
Breaking the Curse of Dimensionality in Deep Neural Networks by Learning Invariant Representations [1.9580473532948401]
This thesis explores the theoretical foundations of deep learning by studying the relationship between the architecture of these models and the inherent structures found within the data they process. We ask What drives the efficacy of deep learning algorithms and allows them to beat the so-called curse of dimensionality. Our methodology takes an empirical approach to deep learning, combining experimental studies with physics-inspired toy models.
arXiv Detail & Related papers (2023-10-24T19:50:41Z)
How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding [56.222097640468306]
We provide mechanistic understanding of how transformers learn "semantic structure" We show, through a combination of mathematical analysis and experiments on Wikipedia data, that the embedding layer and the self-attention layer encode the topical structure.
arXiv Detail & Related papers (2023-03-07T21:42:17Z)
Computing with Categories in Machine Learning [1.7679374058425343]
We introduce DisCoPyro as a categorical structure learning framework. DisCoPyro combines categorical structures with amortized variational inference. We speculate that DisCoPyro could ultimately contribute to the development of artificial general intelligence.
arXiv Detail & Related papers (2023-03-07T17:26:18Z)
A Survey of Deep Learning for Mathematical Reasoning [71.88150173381153]
We review the key tasks, datasets, and methods at the intersection of mathematical reasoning and deep learning over the past decade. Recent advances in large-scale neural language models have opened up new benchmarks and opportunities to use deep learning for mathematical reasoning.
arXiv Detail & Related papers (2022-12-20T18:46:16Z)
Foundations and Recent Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions [68.6358773622615]
This paper provides an overview of the computational and theoretical foundations of multimodal machine learning. We propose a taxonomy of 6 core technical challenges: representation, alignment, reasoning, generation, transference, and quantification. Recent technical achievements will be presented through the lens of this taxonomy, allowing researchers to understand the similarities and differences across new approaches.
arXiv Detail & Related papers (2022-09-07T19:21:19Z)
Category Theory in Machine Learning [1.6758573326215689]
We document the motivations, goals and common themes across applications of category theory in machine learning. We touch on gradient-based learning, probability, and equivariant learning.
arXiv Detail & Related papers (2021-06-13T15:58:13Z)
Ten Quick Tips for Deep Learning in Biology [116.78436313026478]
Machine learning is concerned with the development and applications of algorithms that can recognize patterns in data and use them for predictive modeling. Deep learning has become its own subfield of machine learning. In the context of biological research, deep learning has been increasingly used to derive novel insights from high-dimensional biological data.
arXiv Detail & Related papers (2021-05-29T21:02:44Z)
Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges [50.22269760171131]
The last decade has witnessed an experimental revolution in data science and machine learning, epitomised by deep learning methods. This text is concerned with exposing pre-defined regularities through unified geometric principles. It provides a common mathematical framework to study the most successful neural network architectures, such as CNNs, RNNs, GNNs, and Transformers.
arXiv Detail & Related papers (2021-04-27T21:09:51Z)
Gradient-based Competitive Learning: Theory [1.6752712949948443]
This paper introduces a novel perspective in this area by combining gradient-based and competitive learning. The theory is based on the intuition that neural networks are able to learn topological structures by working directly on the transpose of the input matrix. The proposed approach has a great potential as it can be generalized to a vast selection of topological learning tasks.
arXiv Detail & Related papers (2020-09-06T19:00:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.