Directional Non-Commutative Monoidal Structures for Compositional Embeddings in Machine Learning
- URL: http://arxiv.org/abs/2505.15507v1
- Date: Wed, 21 May 2025 13:27:14 GMT
- Title: Directional Non-Commutative Monoidal Structures for Compositional Embeddings in Machine Learning
- Authors: Mahesh Godavarti
- Abstract summary: We introduce a new structure for compositional embeddings built on directional non-commutative monoidal operators. Our construction defines a distinct composition operator ∘_i for each axis i, ensuring associative combination along each axis without imposing global commutativity. All axis-specific operators commute with one another, enforcing a global interchange law that enables consistent cross-axis compositions.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We introduce a new algebraic structure for multi-dimensional compositional embeddings, built on directional non-commutative monoidal operators. The core contribution of this work is this novel framework, which exhibits appealing theoretical properties (associativity along each dimension and an interchange law ensuring global consistency) while remaining compatible with modern machine learning architectures. Our construction defines a distinct composition operator ∘_i for each axis i, ensuring associative combination along each axis without imposing global commutativity. Importantly, all axis-specific operators commute with one another, enforcing a global interchange law that enables consistent cross-axis compositions. This is, to our knowledge, the first approach that provides a common foundation generalizing classical sequence-modeling paradigms (e.g., structured state-space models (SSMs) and transformer self-attention) to a unified multi-dimensional framework. For example, specific one-dimensional instances of our framework can recover the familiar affine transformation algebra, vanilla self-attention, and SSM-style recurrence. The higher-dimensional generalizations naturally support recursive, structure-aware operations in embedding spaces. We outline several potential applications unlocked by this structure, including structured positional encodings in Transformers, directional image embeddings, and symbolic modeling of sequences or grids, indicating that it could inform future deep learning model designs. We formally establish the algebraic properties of our framework and discuss efficient implementations. Finally, as our focus is theoretical, we include no experiments here and defer empirical validation to future work.
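To make the stated laws concrete, the sketch below is a toy instance of our own choosing, not the paper's construction; it assumes only NumPy. It uses block concatenation along two axes as a pair of associative, non-commutative composition operators that satisfy the interchange law, and affine-map composition as the one-dimensional affine-algebra instance mentioned in the abstract.

```python
# Toy illustration only: the operator choices below (block concatenation for
# the two axes, affine-map composition for the 1-D case) are illustrative
# instances, not the construction defined in the paper.
import numpy as np

rng = np.random.default_rng(0)

# Two directional composition operators on matrix-valued embeddings.
def comp_axis1(x, y):
    """Compose along axis 1 (horizontal concatenation): associative, non-commutative."""
    return np.hstack([x, y])

def comp_axis2(x, y):
    """Compose along axis 2 (vertical concatenation): associative, non-commutative."""
    return np.vstack([x, y])

a, b, c, d = (rng.standard_normal((2, 2)) for _ in range(4))

# Associativity along each axis.
assert np.allclose(comp_axis1(comp_axis1(a, b), c), comp_axis1(a, comp_axis1(b, c)))
assert np.allclose(comp_axis2(comp_axis2(a, b), c), comp_axis2(a, comp_axis2(b, c)))

# No global commutativity along either axis.
assert not np.allclose(comp_axis1(a, b), comp_axis1(b, a))

# Interchange law: (a o1 b) o2 (c o1 d) == (a o2 c) o1 (b o2 d).
lhs = comp_axis2(comp_axis1(a, b), comp_axis1(c, d))
rhs = comp_axis1(comp_axis2(a, c), comp_axis2(b, d))
assert np.allclose(lhs, rhs)

# A one-dimensional instance: the affine transformation algebra.
# An affine map x -> A @ x + t is the pair (A, t); composing "apply f first,
# then g" is associative but not commutative.
def comp_affine(f, g):
    A1, t1 = f
    A2, t2 = g
    return (A2 @ A1, A2 @ t1 + t2)

f, g, h = ((rng.standard_normal((2, 2)), rng.standard_normal(2)) for _ in range(3))
left = comp_affine(comp_affine(f, g), h)
right = comp_affine(f, comp_affine(g, h))
assert all(np.allclose(u, v) for u, v in zip(left, right))
print("per-axis associativity, non-commutativity, and the interchange law all hold")
```

The assertions mirror the three properties claimed in the abstract: per-axis associativity, absence of global commutativity, and the cross-axis interchange law.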
Related papers
- Cross-Model Semantics in Representation Learning [1.2064681974642195]
We show that structural regularities induce representational geometry that is more stable under architectural variation. This suggests that certain forms of inductive bias not only support generalization within a model, but also improve the interoperability of learned features across models.
arXiv Detail & Related papers (2025-08-05T16:57:24Z) - Hierarchical Modeling and Architecture Optimization: Review and Unified Framework [0.6291443816903801]
This paper reviews literature on structured input spaces and proposes a unified framework that generalizes existing approaches. A variable is described as meta if its value governs the presence of other decreed variables, enabling the modeling of conditional and hierarchical structures.
arXiv Detail & Related papers (2025-06-27T20:38:57Z) - Why Neural Network Can Discover Symbolic Structures with Gradient-based Training: An Algebraic and Geometric Foundation for Neurosymbolic Reasoning [73.18052192964349]
We develop a theoretical framework that explains how discrete symbolic structures can emerge naturally from continuous neural network training dynamics. By lifting neural parameters to a measure space and modeling training as Wasserstein gradient flow, we show that under geometric constraints, the parameter measure $\mu_t$ undergoes two concurrent phenomena.
arXiv Detail & Related papers (2025-06-26T22:40:30Z) - Directional Non-Commutative Monoidal Structures with Interchange Law via Commutative Generators [0.0]
We introduce a class of algebraic structures that generalize one-dimensional monoidal systems into higher dimensions. We show that the framework unifies several well-known linear transforms in signal processing and data analysis.
arXiv Detail & Related papers (2025-05-30T12:40:01Z) - The Coverage Principle: A Framework for Understanding Compositional Generalization [31.762330857169914]
We show that models relying primarily on pattern matching for compositional tasks cannot reliably generalize beyond substituting fragments that yield identical results when used in the same contexts. We demonstrate that this framework has strong predictive power for the generalization capabilities of Transformers.
arXiv Detail & Related papers (2025-05-26T17:55:15Z) - MIND: Microstructure INverse Design with Generative Hybrid Neural Representation [25.55691106041371]
Inverse design of microstructures plays a pivotal role in optimizing metamaterials with specific, targeted physical properties. We present a novel generative model that integrates latent diffusion with Holoplane, an advanced hybrid neural representation that simultaneously encodes both geometric and physical properties. Our approach generalizes across multiple microstructure classes, enabling the generation of diverse, tileable microstructures with significantly improved property accuracy and enhanced control over geometric validity.
arXiv Detail & Related papers (2025-02-01T20:25:47Z) - Self-Attention as a Parametric Endofunctor: A Categorical Framework for Transformer Architectures [0.0]
We develop a category-theoretic framework focusing on the linear components of self-attention. We show that the query, key, and value maps naturally define a parametric 1-morphism in the 2-category $\mathbf{Para}(\mathbf{Vect})$. Stacking multiple self-attention layers corresponds to constructing the free monad on this endofunctor.
arXiv Detail & Related papers (2025-01-06T11:14:18Z) - Searching for Efficient Linear Layers over a Continuous Space of Structured Matrices [88.33936714942996]
We present a unifying framework that enables searching among all linear operators expressible via an Einstein summation.
We show that differences in the compute-optimal scaling laws are mostly governed by a small number of variables.
We find that a Mixture-of-Experts (MoE) can be learned in every single linear layer of the model, including the projections in the attention blocks.
arXiv Detail & Related papers (2024-10-03T00:44:50Z) - Symmetry-Based Structured Matrices for Efficient Approximately Equivariant Networks [5.187307904567701]
Group Matrices (GMs) are a forgotten precursor to the modern notion of regular representations of finite groups. We show GMs can generalize classical LDR theory to general discrete groups. Our framework performs competitively with approximately equivariant NNs and other structured matrix-based methods.
arXiv Detail & Related papers (2024-09-18T07:52:33Z) - Linear Spaces of Meanings: Compositional Structures in Vision-Language Models [110.00434385712786]
We investigate compositional structures in data embeddings from pre-trained vision-language models (VLMs).
We first present a framework for understanding compositional structures from a geometric perspective.
We then explain what these structures entail probabilistically in the case of VLM embeddings, providing intuitions for why they arise in practice.
arXiv Detail & Related papers (2023-02-28T08:11:56Z) - DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion [66.21290235237808]
We introduce an energy constrained diffusion model which encodes a batch of instances from a dataset into evolutionary states.
We provide rigorous theory that implies closed-form optimal estimates for the pairwise diffusion strength among arbitrary instance pairs.
Experiments highlight the wide applicability of our model as a general-purpose encoder backbone with superior performance in various tasks.
arXiv Detail & Related papers (2023-01-23T15:18:54Z) - Compositional Generalisation with Structured Reordering and Fertility Layers [121.37328648951993]
Seq2seq models have been shown to struggle with compositional generalisation.
We present a flexible end-to-end differentiable neural model that composes two structural operations.
arXiv Detail & Related papers (2022-10-06T19:51:31Z) - Frame Averaging for Equivariant Shape Space Learning [85.42901997467754]
A natural way to incorporate symmetries in shape space learning is to ask that the mapping to the shape space (encoder) and mapping from the shape space (decoder) are equivariant to the relevant symmetries.
We present a framework for incorporating equivariance in encoders and decoders by introducing two contributions.
arXiv Detail & Related papers (2021-12-03T06:41:19Z) - Remixing Functionally Graded Structures: Data-Driven Topology Optimization with Multiclass Shape Blending [15.558093285161775]
We propose a data-driven framework for multiclass functionally graded structures.
The key is a new multiclass shape blending scheme that generates smoothly graded microstructures.
It transforms the microscale problem into an efficient, low-dimensional one without confining the design to predefined shapes.
arXiv Detail & Related papers (2021-12-01T16:54:56Z) - Kähler Geometry of Quiver Varieties and Machine Learning [0.0]
We develop an algebro-geometric formulation for neural networks in machine learning using the moduli space of framed quiver representations.
We prove the universal approximation theorem for the multi-variable activation function constructed from the complex projective space.
arXiv Detail & Related papers (2021-01-27T15:32:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.