Topos and Stacks of Deep Neural Networks
- URL: http://arxiv.org/abs/2106.14587v1
- Date: Mon, 28 Jun 2021 11:50:06 GMT
- Title: Topos and Stacks of Deep Neural Networks
- Authors: Jean-Claude Belfiore and Daniel Bennequin
- Abstract summary: Every known artificial deep neural network (DNN) corresponds to an object in a canonical Grothendieck's topos.
Invariance structures in the layers (like CNNs or LSTMs) correspond to Giraud's stacks.
Semantic functioning of a network is its ability to express theories in such a language for answering questions in output about input data.
- Score: 12.300163392308807
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Every known artificial deep neural network (DNN) corresponds to an object in
a canonical Grothendieck's topos; its learning dynamic corresponds to a flow of
morphisms in this topos. Invariance structures in the layers (like CNNs or
LSTMs) correspond to Giraud's stacks. This invariance is supposed to be
responsible for the generalization property, that is, extrapolation from learning
data under constraints. The fibers represent pre-semantic categories (Culioli,
Thom), over which artificial languages are defined, with internal logics,
intuitionist, classical or linear (Girard). Semantic functioning of a network
is its ability to express theories in such a language for answering questions
in output about input data. Quantities and spaces of semantic information are
defined by analogy with the homological interpretation of Shannon's entropy
(P. Baudot and D. Bennequin, 2015). They generalize the measures found by Carnap and
Bar-Hillel (1952). Amazingly, the above semantical structures are classified by
geometric fibrant objects in a closed model category of Quillen, then they give
rise to homotopical invariants of DNNs and of their semantic functioning.
Intensional type theories (Martin-Löf) organize these objects and fibrations
between them. Information contents and exchanges are analyzed by Grothendieck's
derivators.
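The homological interpretation of Shannon entropy invoked above rests on reading the chain rule as a cocycle equation. A minimal sketch of this statement (the notation here is illustrative; see the Baudot–Bennequin information cohomology framework for the precise setting):

```latex
% Chain rule for Shannon entropy, read as a 1-cocycle condition:
% conditioning by X acts on the cochain H, and the coboundary vanishes.
H(X, Y) = H(X) + H_X(Y),
\qquad
H_X(Y) := \sum_{x} p(x)\, H(Y \mid X = x).
```

In that framework, Shannon entropy is characterized (up to a multiplicative constant) as the unique nontrivial 1-cocycle, which is what licenses the "homological" reading of information quantities.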
Related papers
- Probing Internal Representations of Multi-Word Verbs in Large Language Models [0.0]
This study investigates the internal representations of verb-particle combinations, called multi-word verbs, within large language models (LLMs).
We analyze the layer-wise representations for two different verb-particle constructions: phrasal verbs like 'give up' and prepositional verbs like 'look at'.
arXiv Detail & Related papers (2025-02-07T09:49:13Z)
- Explainable Moral Values: a neuro-symbolic approach to value classification [1.4186974630564675]
This work explores the integration of ontology-based reasoning and Machine Learning techniques for explainable value classification.
By relying on an ontological formalization of moral values as in Moral Foundations Theory, the sandra neuro-symbolic reasoner is used to infer values that are satisfied by a certain sentence.
We show that relying only on the reasoner's inferences yields explainable classification comparable to other, more complex approaches.
arXiv Detail & Related papers (2024-10-16T14:53:13Z)
- Semantic Loss Functions for Neuro-Symbolic Structured Prediction [74.18322585177832]
We discuss the semantic loss, which injects knowledge about such structure, defined symbolically, into training.
It is agnostic to the arrangement of the symbols, and depends only on the semantics expressed thereby.
It can be combined with both discriminative and generative neural models.
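As a concrete illustration of how a semantic loss injects a symbolic constraint into training, here is a minimal sketch for the "exactly one output variable is true" constraint; the function name and probabilities are illustrative, not taken from the paper:

```python
import math

def semantic_loss_exactly_one(probs):
    """Semantic loss for the constraint 'exactly one variable is true'.

    Given independent Bernoulli probabilities p_i, the probability that the
    constraint is satisfied is sum_i p_i * prod_{j != i} (1 - p_j); the
    semantic loss is the negative log of that probability, so predictions
    consistent with the constraint are penalized less.
    """
    satisfied = 0.0
    for i, p_i in enumerate(probs):
        term = p_i
        for j, p_j in enumerate(probs):
            if j != i:
                term *= 1.0 - p_j
        satisfied += term
    return -math.log(satisfied)
```

A confident one-hot prediction such as [0.9, 0.05, 0.05] incurs a lower loss than a uniform [1/3, 1/3, 1/3], independently of which symbol carries the mass, matching the "agnostic to the arrangement of the symbols" property above.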
arXiv Detail & Related papers (2024-05-12T22:18:25Z)
- A rank decomposition for the topological classification of neural representations [0.0]
In this work, we leverage the fact that neural networks are equivalent to continuous piecewise-affine maps.
We study the homology groups of the quotient of a manifold $\mathcal{M}$ by a subset $A$, assuming some minimal properties on these spaces.
We show that in randomly narrow networks, there will be regions in which the (co)homology groups of a data manifold can change.
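The piecewise-affine equivalence mentioned above can be checked directly on a toy example: within any region where the ReLU activation pattern is constant, a network computes an affine map. A minimal sketch with illustrative weights (not from the paper):

```python
def relu(z):
    return max(0.0, z)

# A tiny one-hidden-layer ReLU network on scalar inputs; the weights below
# are arbitrary illustrative values.
A = [1.0, -2.0, 0.5]   # hidden weights
B = [0.0, 1.0, -0.25]  # hidden biases
C = [1.5, -1.0, 2.0]   # output weights
D = 0.3                # output bias

def net(x):
    return sum(c * relu(a * x + b) for a, b, c in zip(A, B, C)) + D

def pattern(x):
    """Activation pattern: which hidden units are 'on' at input x."""
    return tuple(a * x + b > 0 for a, b in zip(A, B))

# Inside a region of constant activation pattern the network is affine,
# so it commutes with midpoints: f((x1+x2)/2) == (f(x1)+f(x2))/2.
x1, x2 = 0.1, 0.2
assert pattern(x1) == pattern(x2)
assert abs(net((x1 + x2) / 2) - (net(x1) + net(x2)) / 2) < 1e-12
```

Crossing into a region with a different activation pattern changes the affine piece, which is what allows the (co)homology of a data manifold's image to change as the paper describes.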
arXiv Detail & Related papers (2024-04-30T17:01:20Z)
- Agentività e telicità in GilBERTo: implicazioni cognitive (Agentivity and telicity in GilBERTo: cognitive implications) [77.71680953280436]
The goal of this study is to investigate whether a Transformer-based neural language model infers lexical semantics.
The semantic properties considered are telicity (also combined with definiteness) and agentivity.
arXiv Detail & Related papers (2023-07-06T10:52:22Z)
- Lattice-preserving $\mathcal{ALC}$ ontology embeddings with saturation [50.05281461410368]
An order-preserving embedding method is proposed to generate embeddings of OWL representations.
We show that our method outperforms state-of-the-art embedding methods in several knowledge base completion tasks.
arXiv Detail & Related papers (2023-05-11T22:27:51Z)
- How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding [56.222097640468306]
We provide a mechanistic understanding of how transformers learn "semantic structure".
We show, through a combination of mathematical analysis and experiments on Wikipedia data, that the embedding layer and the self-attention layer encode the topical structure.
arXiv Detail & Related papers (2023-03-07T21:42:17Z)
- Towards Rigorous Understanding of Neural Networks via Semantics-preserving Transformations [0.0]
We present an approach to the precise and global verification and explanation of Rectifier Neural Networks.
Key to our approach is the symbolic execution of these networks that allows the construction of semantically equivalent Typed Affine Decision Structures.
arXiv Detail & Related papers (2023-01-19T11:35:07Z)
- Equivariant Transduction through Invariant Alignment [71.45263447328374]
We introduce a novel group-equivariant architecture that incorporates a group-invariant hard alignment mechanism.
We find that our network's structure allows it to develop stronger equivariant properties than existing group-equivariant approaches.
We additionally find that it outperforms previous group-equivariant networks empirically on the SCAN task.
arXiv Detail & Related papers (2022-09-22T11:19:45Z)
- Neural network layers as parametric spans [0.0]
We present a general definition of linear layer arising from a categorical framework based on the notions of integration theory and parametric spans.
This definition generalizes and encompasses classical layers (e.g., dense, convolutional) while guaranteeing existence and computability of the layer's derivatives for backpropagation.
arXiv Detail & Related papers (2022-08-01T12:41:22Z)
- Identity-Based Patterns in Deep Convolutional Networks: Generative Adversarial Phonology and Reduplication [0.0]
We use the ciwGAN architecture (Beguš), in which learning of meaningful representations in speech emerges from a requirement that the CNNs generate informative data.
We propose a technique to wug-test CNNs trained on speech and, based on four generative tests, argue that the network learns to represent an identity-based pattern in its latent space.
arXiv Detail & Related papers (2020-09-13T23:12:49Z)
- Algebraic Neural Networks: Stability to Deformations [179.55535781816343]
We study algebraic neural networks (AlgNNs) with commutative algebras.
AlgNNs unify diverse architectures such as Euclidean convolutional neural networks, graph neural networks, and group neural networks.
arXiv Detail & Related papers (2020-09-03T03:41:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.