Related papers: Meaning-infused grammar: Gradient Acceptability Shapes the Geometric Representations of Constructions in LLMs

Meaning-infused grammar: Gradient Acceptability Shapes the Geometric Representations of Constructions in LLMs

URL: http://arxiv.org/abs/2507.22286v2
Date: Mon, 08 Sep 2025 18:33:19 GMT
Title: Meaning-infused grammar: Gradient Acceptability Shapes the Geometric Representations of Constructions in LLMs
Authors: Supantho Rakshit, Adele Goldberg,
Abstract summary: This study investigates whether the internal representations in Large Language Models (LLMs) reflect the proposed function-infused gradience.<n>We analyze representations of the English Double Object (DO) and Prepositional Object (PO) constructions in Pythia-$1.4$B.<n> Geometric analyses show that the separability between the two constructions' representations is systematically modulated by gradient preference strength.
Score: 0.8594140167290097
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The usage-based constructionist (UCx) approach to language posits that language comprises a network of learned form-meaning pairings (constructions) whose use is largely determined by their meanings or functions, requiring them to be graded and probabilistic. This study investigates whether the internal representations in Large Language Models (LLMs) reflect the proposed function-infused gradience. We analyze representations of the English Double Object (DO) and Prepositional Object (PO) constructions in Pythia-$1.4$B, using a dataset of $5000$ sentence pairs systematically varied by human-rated preference strength for DO or PO. Geometric analyses show that the separability between the two constructions' representations, as measured by energy distance or Jensen-Shannon divergence, is systematically modulated by gradient preference strength, which depends on lexical and functional properties of sentences. That is, more prototypical exemplars of each construction occupy more distinct regions in activation space, compared to sentences that could have equally well have occured in either construction. These results provide evidence that LLMs learn rich, meaning-infused, graded representations of constructions and offer support for geometric measures for representations in LLMs.

Related papers

Emergent Structured Representations Support Flexible In-Context Inference in Large Language Models [77.98801218316505]
Large language models (LLMs) exhibit emergent behaviors suggestive of human-like reasoning.<n>We investigate the internal processing of LLMs during in-context concept inference.
arXiv Detail & Related papers (2026-02-08T03:14:39Z)
Understanding Syntactic Generalization in Structure-inducing Language Models [15.419603273515786]
Structure-inducing Language Models (SiLM) are trained on a self-supervised language modeling task.<n>SiLMs induce a hierarchical sentence representation as a byproduct when processing an input.<n>We study three different SiLM architectures using both natural language (English) corpora and synthetic bracketing expressions.
arXiv Detail & Related papers (2025-08-11T13:29:41Z)
Vector Ontologies as an LLM world view extraction method [0.0]
Large Language Models (LLMs) possess intricate internal representations of the world, yet these structures are notoriously difficult to interpret or repurpose beyond the original prediction task.<n>A vector ontology defines a domain-specific vector space spanned by ontologically meaningful dimensions, allowing geometric analysis of concepts and relationships within a domain.<n>Using GPT-4o-mini, we extract genre representations through multiple natural language prompts and analyze the consistency of these projections across linguistic variations and their alignment with ground-truth data.
arXiv Detail & Related papers (2025-06-16T08:49:21Z)
Unveiling Instruction-Specific Neurons & Experts: An Analytical Framework for LLM's Instruction-Following Capabilities [12.065600268467556]
Finetuning of Large Language Models (LLMs) has significantly advanced their instruction-following capabilities.<n>This study examines how fine-tuning reconfigures LLM computations by isolating and analyzing instruction-specific sparse components.
arXiv Detail & Related papers (2025-05-27T13:40:28Z)
When can isotropy help adapt LLMs' next word prediction to numerical domains? [53.98633183204453]
It is shown that the isotropic property of LLM embeddings in contextual embedding space preserves the underlying structure of representations.<n> Experiments show that different characteristics of numerical data and model architectures have different impacts on isotropy.
arXiv Detail & Related papers (2025-05-22T05:10:34Z)
Causal Interventions Reveal Shared Structure Across English Filler-Gap Constructions [35.29123099176241]
We argue that causal interpretability methods, applied to Language Models, can greatly enhance the value of such evidence.<n>Our empirical focus is a set of English filler-gap dependency constructions.<n>We show that LMs converge on similar abstract analyses of these constructions.
arXiv Detail & Related papers (2025-05-21T20:37:57Z)
MATHGLANCE: Multimodal Large Language Models Do Not Know Where to Look in Mathematical Diagrams [65.02628814094639]
Diagrams serve as a fundamental form of visual language, representing complex concepts and their inter-relationships through structured symbols, shapes, and spatial arrangements.<n>Current benchmarks conflate perceptual and reasoning tasks, making it difficult to assess whether Multimodal Large Language Models genuinely understand mathematical diagrams beyond superficial pattern recognition.<n>We introduce MATHGLANCE, a benchmark specifically designed to isolate and evaluate mathematical perception in MLLMs.<n>We construct GeoPeP, a perception-oriented dataset of 200K structured geometry image-text annotated with geometric primitives and precise spatial relationships.
arXiv Detail & Related papers (2025-03-26T17:30:41Z)
Do Large Language Models Truly Understand Geometric Structures? [15.915781154075615]
We introduce the GeomRel dataset to evaluate large language models' understanding of geometric structures.<n>We propose the Geometry Chain-of-Thought (GeoCoT) method, which enhances LLMs' ability to identify geometric relationships.
arXiv Detail & Related papers (2025-01-23T15:52:34Z)
Exploring the Role of Reasoning Structures for Constructing Proofs in Multi-Step Natural Language Reasoning with Large Language Models [30.09120709652445]
This paper is centred around a focused study: whether the current state-of-the-art generalist LLMs can leverage the structures in a few examples to better construct the proof structures with textitin-context learning.
arXiv Detail & Related papers (2024-10-11T00:45:50Z)
Reasoning in Large Language Models: A Geometric Perspective [4.2909314120969855]
We explore the reasoning abilities of large language models (LLMs) through their geometrical understanding. We establish a connection between the expressive power of LLMs and the density of their self-attention graphs.
arXiv Detail & Related papers (2024-07-02T21:39:53Z)
Large Language Models are Interpretable Learners [53.56735770834617]
In this paper, we show a combination of Large Language Models (LLMs) and symbolic programs can bridge the gap between expressiveness and interpretability. The pretrained LLM with natural language prompts provides a massive set of interpretable modules that can transform raw input into natural language concepts. As the knowledge learned by LSP is a combination of natural language descriptions and symbolic rules, it is easily transferable to humans (interpretable) and other LLMs.
arXiv Detail & Related papers (2024-06-25T02:18:15Z)
Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension [63.330262740414646]
We study how to characterize and predict the truthfulness of texts generated from large language models (LLMs) We suggest investigating internal activations and quantifying LLM's truthfulness using the local intrinsic dimension (LID) of model activations.
arXiv Detail & Related papers (2024-02-28T04:56:21Z)
Explanation-aware Soft Ensemble Empowers Large Language Model In-context Learning [50.00090601424348]
Large language models (LLMs) have shown remarkable capabilities in various natural language understanding tasks. We propose EASE, an Explanation-Aware Soft Ensemble framework to empower in-context learning with LLMs.
arXiv Detail & Related papers (2023-11-13T06:13:38Z)
"You Are An Expert Linguistic Annotator": Limits of LLMs as Analyzers of Abstract Meaning Representation [60.863629647985526]
We examine the successes and limitations of the GPT-3, ChatGPT, and GPT-4 models in analysis of sentence meaning structure. We find that models can reliably reproduce the basic format of AMR, and can often capture core event, argument, and modifier structure. Overall, our findings indicate that these models out-of-the-box can capture aspects of semantic structure, but there remain key limitations in their ability to support fully accurate semantic analyses or parses.
arXiv Detail & Related papers (2023-10-26T21:47:59Z)
Linear Spaces of Meanings: Compositional Structures in Vision-Language Models [110.00434385712786]
We investigate compositional structures in data embeddings from pre-trained vision-language models (VLMs) We first present a framework for understanding compositional structures from a geometric perspective. We then explain what these structures entail probabilistically in the case of VLM embeddings, providing intuitions for why they arise in practice.
arXiv Detail & Related papers (2023-02-28T08:11:56Z)
Are Representations Built from the Ground Up? An Empirical Examination of Local Composition in Language Models [91.3755431537592]
Representing compositional and non-compositional phrases is critical for language understanding. We first formulate a problem of predicting the LM-internal representations of longer phrases given those of their constituents. While we would expect the predictive accuracy to correlate with human judgments of semantic compositionality, we find this is largely not the case.
arXiv Detail & Related papers (2022-10-07T14:21:30Z)
The Geometry of Self-supervised Learning Models and its Impact on Transfer Learning [62.601681746034956]
Self-supervised learning (SSL) has emerged as a desirable paradigm in computer vision. We propose a data-driven geometric strategy to analyze different SSL models using local neighborhoods in the feature space induced by each.
arXiv Detail & Related papers (2022-09-18T18:15:38Z)
A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.<n>We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
Syntactic Perturbations Reveal Representational Correlates of Hierarchical Phrase Structure in Pretrained Language Models [22.43510769150502]
It is not entirely clear what aspects of sentence-level syntax are captured by vector-based language representations. We show that Transformers build sensitivity to larger parts of the sentence along their layers, and that hierarchical phrase structure plays a role in this process.
arXiv Detail & Related papers (2021-04-15T16:30:31Z)
Learning Universal Representations from Word to Sentence [89.82415322763475]
This work introduces and explores the universal representation learning, i.e., embeddings of different levels of linguistic unit in a uniform vector space. We present our approach of constructing analogy datasets in terms of words, phrases and sentences. We empirically verify that well pre-trained Transformer models incorporated with appropriate training settings may effectively yield universal representation.
arXiv Detail & Related papers (2020-09-10T03:53:18Z)

This list is automatically generated from the titles and abstracts of the papers in this site.