Perspectives on Large Language Models: Polysemy, Stochasticity, Exponential Expressibility, and Unitary Attention
- URL: http://arxiv.org/abs/2504.13824v4
- Date: Tue, 30 Sep 2025 15:26:10 GMT
- Title: Perspectives on Large Language Models: Polysemy, Stochasticity, Exponential Expressibility, and Unitary Attention
- Authors: Karl Svozil
- Abstract summary: This paper explores foundational aspects of Large Language Models (LLMs). We analyze how the expressibility of semantic features scales exponentially with embedding space dimensions using quasi-orthogonal vectors. We propose quantum attention as a unitary extension of classical mechanisms, reframing LLM processing as reversible, quantum-like evolutions in Hilbert space.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper explores foundational aspects of Large Language Models (LLMs). We analyze how the expressibility of semantic features scales exponentially with embedding space dimensions using quasi-orthogonal vectors. We contrast the dynamic, context-dependent embeddings of Transformer architectures, which resolve polysemy, with a static vector approach based on quantum contextuality. Stochasticity is framed as an essential feature for enabling creative output through probabilistic sampling. Finally, we propose quantum attention as a unitary extension of classical mechanisms, reframing LLM processing as reversible, quantum-like evolutions in Hilbert space.
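The exponential-expressibility claim is easy to probe numerically: independent random unit vectors in high dimensions are almost orthogonal with overwhelming probability, so the number of semantic features that fit at a fixed interference level grows exponentially with the embedding dimension. A minimal sketch (the dimensions and vector counts below are illustrative choices, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def max_overlap(dim: int, n_vectors: int = 2000) -> float:
    """Largest |cosine similarity| among random unit vectors in R^dim."""
    v = rng.normal(size=(n_vectors, dim))
    v /= np.linalg.norm(v, axis=1, keepdims=True)  # normalize to the unit sphere
    g = np.abs(v @ v.T)                            # pairwise |cosine| matrix
    np.fill_diagonal(g, 0.0)                       # ignore self-overlaps
    return float(g.max())

# Thousands of vectors become nearly orthogonal as the dimension grows:
for dim in (16, 64, 256, 1024):
    print(dim, round(max_overlap(dim), 3))
```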
Related papers
- Discrete Semantic States and Hamiltonian Dynamics in LLM Embedding Spaces [0.0]
We investigate the structure of Large Language Model embedding spaces using mathematical concepts, particularly linear algebra and the Hamiltonian formalism. Motivated by the observation that LLM embeddings exhibit distinct states, we explore the application of these mathematical tools to analyze semantic relationships.
arXiv Detail & Related papers (2025-12-29T15:01:43Z)
- Quantum LLMs Using Quantum Computing to Analyze and Process Semantic Information [0.0]
We present a quantum computing approach to analyzing Large Language Model embeddings. We leverage complex-valued representations and model semantic relationships using quantum mechanical principles.
arXiv Detail & Related papers (2025-12-02T10:28:05Z)
- A Free Probabilistic Framework for Analyzing the Transformer-based Language Models [19.78896931593813]
We present a formal operator-theoretic framework for analyzing Transformer-based language models using free probability theory. This work offers a principled, though theoretical, perspective on structural dynamics in large language models.
arXiv Detail & Related papers (2025-06-19T19:13:02Z)
- The Origins of Representation Manifolds in Large Language Models [52.68554895844062]
We show that cosine similarity in representation space may encode the intrinsic geometry of a feature through shortest, on-manifold paths. The critical assumptions and predictions of the theory are validated on text embeddings and token activations of large language models.
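A minimal illustration of the distinction involved, assuming unit-normalized representations: the arccosine of cosine similarity is exactly the great-circle (shortest on-manifold) distance on the unit sphere, while a feature curve that leaves the geodesic accumulates a strictly longer path length. All specifics below are illustrative:

```python
import numpy as np

# A curve of unit vectors along a circle of latitude: points on the sphere
# that do NOT follow a geodesic (great circle) from start to end.
phi = np.deg2rad(60)                         # polar angle of the latitude circle
theta = np.linspace(0, np.pi, 200)
pts = np.stack([np.sin(phi) * np.cos(theta),
                np.sin(phi) * np.sin(theta),
                np.full_like(theta, np.cos(phi))], axis=1)

def angular_distance(u, v):
    """arccos(cosine similarity): great-circle distance for unit vectors."""
    return np.arccos(np.clip(np.dot(u, v), -1.0, 1.0))

direct = angular_distance(pts[0], pts[-1])   # shortest on-manifold path
along_curve = sum(angular_distance(pts[i], pts[i + 1])
                  for i in range(len(pts) - 1))

print(direct)       # ~2.09 rad: geodesic length implied by cosine similarity
print(along_curve)  # ~2.72 rad: the off-geodesic curve is strictly longer
```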
arXiv Detail & Related papers (2025-05-23T13:31:22Z)
- Domain Embeddings for Generating Complex Descriptions of Concepts in Italian Language [65.268245109828]
We propose a Distributional Semantic resource enriched with linguistic and lexical information extracted from electronic dictionaries.
The resource comprises 21 domain-specific matrices, one comprehensive matrix, and a Graphical User Interface.
Our model facilitates the generation of reasoned semantic descriptions of concepts by selecting matrices directly associated with concrete conceptual knowledge.
arXiv Detail & Related papers (2024-02-26T15:04:35Z)
- The Quantum Monadology [0.0]
The modern theory of functional programming languages uses monads for encoding computational side-effects and side-contexts.
We analyze the (co)monads on categories of parameterized module spectra induced by Grothendieck's "motivic yoga of operations".
We indicate a domain-specific quantum programming language (QS) expressing these monadic quantum effects in transparent do-notation.
arXiv Detail & Related papers (2023-10-24T11:19:24Z)
- Multi-Relational Hyperbolic Word Embeddings from Natural Language Definitions [5.763375492057694]
This paper presents a multi-relational model that explicitly leverages the structure of definitions to derive word embeddings.
An empirical analysis demonstrates that the framework can help impose the desired structural constraints.
Experiments reveal the superiority of the hyperbolic word embeddings over their Euclidean counterparts.
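For reference, hyperbolic word embeddings of this kind typically live in the Poincaré ball, whose distance blows up near the boundary and therefore fits tree-like definitional hierarchies better than Euclidean space. A sketch of the standard closed-form distance (the example points are arbitrary):

```python
import numpy as np

def poincare_distance(u: np.ndarray, v: np.ndarray) -> float:
    """Geodesic distance in the open unit (Poincare) ball."""
    duv = np.dot(u - v, u - v)
    denom = (1.0 - np.dot(u, u)) * (1.0 - np.dot(v, v))
    # Closed form: d(u, v) = arcosh(1 + 2|u-v|^2 / ((1-|u|^2)(1-|v|^2)))
    return float(np.arccosh(1.0 + 2.0 * duv / denom))

root = np.array([0.0, 0.0])      # a generic term sits near the origin
leaf_a = np.array([0.90, 0.01])  # specific terms sit near the boundary
leaf_b = np.array([0.01, 0.90])

print(poincare_distance(root, leaf_a))    # moderate: root-to-leaf
print(poincare_distance(leaf_a, leaf_b))  # large: leaf-to-leaf, as in a tree
```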
arXiv Detail & Related papers (2023-05-12T08:16:06Z)
- Linear Spaces of Meanings: Compositional Structures in Vision-Language Models [110.00434385712786]
We investigate compositional structures in data embeddings from pre-trained vision-language models (VLMs).
We first present a framework for understanding compositional structures from a geometric perspective.
We then explain what these structures entail probabilistically in the case of VLM embeddings, providing intuitions for why they arise in practice.
arXiv Detail & Related papers (2023-02-28T08:11:56Z)
- SensePOLAR: Word sense aware interpretability for pre-trained contextual word embeddings [4.479834103607384]
Adding interpretability to word embeddings represents an area of active research in text representation.
We present SensePOLAR, an extension of the original POLAR framework that enables word-sense aware interpretability for pre-trained contextual word embeddings.
arXiv Detail & Related papers (2023-01-11T20:25:53Z)
- The Quantum Path Kernel: a Generalized Quantum Neural Tangent Kernel for Deep Quantum Machine Learning [52.77024349608834]
Building a quantum analog of classical deep neural networks represents a fundamental challenge in quantum computing.
A key issue is how to address the inherent non-linearity of classical deep learning.
We introduce the Quantum Path Kernel, a formulation of quantum machine learning capable of replicating such non-linear aspects of deep machine learning.
arXiv Detail & Related papers (2022-12-22T16:06:24Z)
- Lost in Context? On the Sense-wise Variance of Contextualized Word Embeddings [11.475144702935568]
We quantify how much the contextualized embeddings of each word sense vary across contexts in typical pre-trained models.
We find that word representations are position-biased: the first words in different contexts tend to be more similar to one another.
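The variance being quantified can be summarized per word as the average pairwise cosine similarity of its contextualized vectors; rerunning such a statistic per token position is one way to surface the position bias noted above. A small numpy sketch with random stand-ins for real model activations:

```python
import numpy as np

def self_similarity(vecs: np.ndarray) -> float:
    """Mean pairwise cosine similarity of one word's contextual embeddings.

    Values near 1 mean the representation barely moves across contexts;
    low values mean high sense-wise variance.
    """
    v = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
    sims = v @ v.T
    n = len(v)
    return float((sims.sum() - n) / (n * (n - 1)))  # off-diagonal mean

rng = np.random.default_rng(2)
# Stand-ins for one word's embeddings collected from 100 sentences:
stable = rng.normal(size=768) + 0.1 * rng.normal(size=(100, 768))
variable = rng.normal(size=(100, 768))

print(self_similarity(stable))    # close to 1
print(self_similarity(variable))  # close to 0
```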
arXiv Detail & Related papers (2022-08-20T12:27:25Z)
- PAC Reinforcement Learning for Predictive State Representations [60.00237613646686]
We study online Reinforcement Learning (RL) in partially observable dynamical systems.
We focus on the Predictive State Representations (PSRs) model, which is an expressive model that captures other well-known models.
We develop a novel model-based algorithm for PSRs that can learn a near-optimal policy with sample complexity scaling polynomially in the relevant system parameters.
arXiv Detail & Related papers (2022-07-12T17:57:17Z)
- Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence Learning [92.07643510310766]
Temporal grounding in videos aims to localize one target video segment that semantically corresponds to a given query sentence.
We introduce a new Compositional Temporal Grounding task and construct two new dataset splits.
We empirically find that state-of-the-art methods fail to generalize to queries with novel combinations of seen words.
We propose a variational cross-graph reasoning framework that explicitly decomposes video and language into multiple structured hierarchies.
arXiv Detail & Related papers (2022-03-24T12:55:23Z)
- Adaptive Discrete Communication Bottlenecks with Dynamic Vector Quantization [76.68866368409216]
We propose learning to dynamically select discretization tightness conditioned on inputs.
We show that dynamically varying tightness in communication bottlenecks can improve model performance on visual reasoning and reinforcement learning tasks.
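The discretization step underlying such bottlenecks is nearest-neighbor lookup into a codebook, and "tightness" maps onto how few codes are available. A hedged sketch of the quantization step alone (codebook sizes are arbitrary, and the straight-through gradient trick used during training is omitted):

```python
import numpy as np

rng = np.random.default_rng(3)

def vector_quantize(z: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Snap each row of z to its nearest codebook vector (L2 distance)."""
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)  # (n, K)
    return codebook[d.argmin(axis=1)]

tight = rng.normal(size=(8, 16))    # tight bottleneck: only 8 codes
loose = rng.normal(size=(512, 16))  # loose bottleneck: 512 codes

z = rng.normal(size=(32, 16))       # continuous activations to discretize
print(np.abs(z - vector_quantize(z, tight)).mean())  # larger distortion
print(np.abs(z - vector_quantize(z, loose)).mean())  # smaller distortion
```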
arXiv Detail & Related papers (2022-02-02T23:54:26Z)
- On the Quantum-like Contextuality of Ambiguous Phrases [2.6381163133447836]
We show that meaning combinations in ambiguous phrases can be modelled in the sheaf-theoretic framework for quantum contextuality.
Using the framework of Contextuality-by-Default (CbD), we explore the probabilistic variants of these models and show that CbD-contextuality is also possible.
arXiv Detail & Related papers (2021-07-19T13:23:42Z)
- SemGloVe: Semantic Co-occurrences for GloVe from BERT [55.420035541274444]
GloVe learns word embeddings by leveraging statistical information from word co-occurrence matrices.
We propose SemGloVe, which distills semantic co-occurrences from BERT into static GloVe word embeddings.
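The raw statistic GloVe factorizes is a window-based, distance-weighted co-occurrence matrix; SemGloVe's contribution is replacing these counts with BERT-derived semantic co-occurrences. A toy sketch of the classical counting step only:

```python
import numpy as np

def cooccurrence(tokens, vocab, window=2):
    """Distance-weighted co-occurrence counts, the statistic GloVe factorizes."""
    index = {w: i for i, w in enumerate(vocab)}
    m = np.zeros((len(vocab), len(vocab)))
    for i, w in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                m[index[w], index[tokens[j]]] += 1.0 / abs(i - j)  # 1/d weighting
    return m

tokens = "the bank approved the loan near the river bank".split()
vocab = sorted(set(tokens))
print(vocab)
print(cooccurrence(tokens, vocab).round(2))
```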
arXiv Detail & Related papers (2020-12-30T15:38:26Z)
- Topology of Word Embeddings: Singularities Reflect Polysemy [68.8204255655161]
We introduce a topological measure of polysemy based on persistent homology that correlates well with the actual number of meanings of a word.
We propose a simple, topologically motivated solution to the SemEval-2010 task on Word Sense Induction & Disambiguation.
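The 0-dimensional core of this idea can be approximated without a topology library: as a distance threshold grows, clusters of a word's contextual vectors merge, and a polysemous word keeps several connected components alive over a long range of thresholds. A crude proxy (real persistent homology would use a dedicated package such as ripser; all data below are synthetic stand-ins):

```python
import numpy as np
from scipy.sparse.csgraph import connected_components
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(4)

# Synthetic contextual vectors: a two-sense word vs. a one-sense word.
sense_a = rng.normal(size=(50, 32)) + 4.0
sense_b = rng.normal(size=(50, 32)) - 4.0
bank = np.vstack([sense_a, sense_b])  # "polysemous": two clusters
tree = rng.normal(size=(100, 32))     # "monosemous": one cluster

def components_at(x: np.ndarray, eps: float) -> int:
    """Connected components of the eps-neighborhood graph (an H0 proxy)."""
    adjacency = squareform(pdist(x)) < eps
    return connected_components(adjacency, directed=False)[0]

# Two components persist over a wide threshold range only for "bank":
for eps in (9.0, 25.0, 50.0):
    print(eps, components_at(bank, eps), components_at(tree, eps))
```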
arXiv Detail & Related papers (2020-11-18T17:21:51Z)
- Dynamic Contextualized Word Embeddings [20.81930455526026]
We introduce dynamic contextualized word embeddings that represent words as a function of both linguistic and extralinguistic context.
Based on a pretrained language model (PLM), dynamic contextualized word embeddings model time and social space jointly.
We highlight potential application scenarios by means of qualitative and quantitative analyses on four English datasets.
arXiv Detail & Related papers (2020-10-23T22:02:40Z)
- Unsupervised Distillation of Syntactic Information from Contextualized Word Representations [62.230491683411536]
We tackle the task of unsupervised disentanglement between semantics and structure in neural language representations.
To this end, we automatically generate groups of sentences which are structurally similar but semantically different.
We demonstrate that our transformation clusters vectors in space by structural properties, rather than by lexical semantics.
arXiv Detail & Related papers (2020-10-11T15:13:18Z)
- Context-theoretic Semantics for Natural Language: an Algebraic Framework [0.0]
We present a framework for natural language semantics in which words, phrases and sentences are all represented as vectors.
We show that the vector representations of words can be considered as elements of an algebra over a field.
arXiv Detail & Related papers (2020-09-22T13:31:37Z)
- Autoregressive Transformer Neural Network for Simulating Open Quantum Systems via a Probabilistic Formulation [5.668795025564699]
We present an approach for tackling open quantum system dynamics.
We compactly represent quantum states with autoregressive transformer neural networks.
Efficient algorithms have been developed to simulate the dynamics of the Liouvillian superoperator.
arXiv Detail & Related papers (2020-09-11T18:00:00Z)
- Nonlinear ISA with Auxiliary Variables for Learning Speech Representations [51.9516685516144]
We introduce a theoretical framework for nonlinear Independent Subspace Analysis (ISA) in the presence of auxiliary variables.
We propose an algorithm that learns unsupervised speech representations whose subspaces are independent.
arXiv Detail & Related papers (2020-07-25T14:53:09Z)
- Word Rotator's Distance [50.67809662270474]
A key principle in assessing textual similarity is measuring the degree of semantic overlap between two texts by considering the word alignment.
We show that the norm of word vectors is a good proxy for word importance, and their angle is a good proxy for word similarity.
We propose a method that first decouples word vectors into their norm and direction, and then computes alignment-based similarity.
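The summarized recipe maps directly onto optimal transport: vector norms supply the transport mass (word importance) and cosine distance between directions supplies the cost. A hedged sketch assuming the third-party POT package (import ot) and random stand-in embeddings:

```python
import numpy as np
import ot  # POT: Python Optimal Transport (pip install pot)

def word_rotators_distance(x: np.ndarray, y: np.ndarray) -> float:
    """Norms become transport mass, angles become transport cost."""
    nx = np.linalg.norm(x, axis=1)
    ny = np.linalg.norm(y, axis=1)
    mass_x, mass_y = nx / nx.sum(), ny / ny.sum()         # word importance
    cost = 1.0 - (x / nx[:, None]) @ (y / ny[:, None]).T  # cosine distance
    return float(ot.emd2(mass_x, mass_y, cost))           # earth mover's distance

rng = np.random.default_rng(5)
s1 = rng.normal(size=(5, 300))              # a 5-word sentence (stand-ins)
s2 = s1 + 0.05 * rng.normal(size=(5, 300))  # a near-paraphrase of s1
s3 = rng.normal(size=(7, 300))              # an unrelated 7-word sentence

print(word_rotators_distance(s1, s2))  # small
print(word_rotators_distance(s1, s3))  # larger
```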
arXiv Detail & Related papers (2020-04-30T17:48:42Z)
- Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
When paired with a strong auto-regressive decoder, VAEs tend to ignore their latent variables.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.