A quantum semantic framework for natural language processing
- URL: http://arxiv.org/abs/2506.10077v2
- Date: Tue, 15 Jul 2025 00:08:11 GMT
- Title: A quantum semantic framework for natural language processing
- Authors: Christopher J. Agostino, Quan Le Thien, Molly Apsel, Denizhan Pak, Elina Lesyk, Ashabari Majumdar
- Abstract summary: We argue that semantic degeneracy imposes fundamental limitations on modern NLP systems, because they operate within natural language itself. We show that as an expression's complexity grows, the amount of contextual information required to reliably resolve its ambiguity explodes combinatorially. We argue that meaning is dynamically actualized through an observer-dependent interpretive act, a process whose non-deterministic nature is most appropriately described by a non-classical, quantum-like logic.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semantic degeneracy represents a fundamental property of natural language that extends beyond simple polysemy to encompass the combinatorial explosion of potential interpretations that emerges as semantic expressions increase in complexity. In this work, we argue this property imposes fundamental limitations on Large Language Models (LLMs) and other modern NLP systems, precisely because they operate within natural language itself. Using Kolmogorov complexity, we demonstrate that as an expression's complexity grows, the amount of contextual information required to reliably resolve its ambiguity explodes combinatorially. The computational intractability of recovering a single intended meaning for complex or ambiguous text therefore suggests that the classical view that linguistic forms possess intrinsic meaning in and of themselves is conceptually inadequate. We argue instead that meaning is dynamically actualized through an observer-dependent interpretive act, a process whose non-deterministic nature is most appropriately described by a non-classical, quantum-like logic. To test this hypothesis, we conducted a semantic Bell inequality test using diverse LLM agents. Our experiments yielded average CHSH expectation values from 1.2 to 2.8, with several runs producing values (e.g., 2.3-2.4) in significant violation of the classical boundary ($|S|\leq2$), demonstrating that linguistic interpretation under ambiguity can exhibit non-classical contextuality, consistent with results from human cognition experiments. These results inherently imply that classical frequentist-based analytical approaches for natural language are necessarily lossy. Instead, we propose that Bayesian-style repeated sampling approaches can provide more practically useful and appropriate characterizations of linguistic meaning in context.
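As context for the reported statistic: in a CHSH test each of two "measurement" choices has two settings (a or a', b or b'), and $S = E(a,b) + E(a,b') + E(a',b) - E(a',b')$ combines the four pairwise correlators, with $|S|\leq2$ for any classical, non-contextual model. The sketch below illustrates how such correlators could be estimated by Bayesian-style repeated sampling of binary interpretation judgments; the coin-flip `query_agent`, the context settings, and the example sentence are assumptions for illustration, not the authors' protocol.

```python
import random  # stand-in randomness for the illustrative agent below

# Hypothetical stand-in for an LLM agent: given an ambiguous expression
# and a context setting (a yes/no probe), return an interpretation
# judgment encoded as +1 or -1. In an actual experiment this would be
# an LLM call; here it is an unbiased coin, i.e. a non-contextual model.
def query_agent(expression: str, setting: str) -> int:
    return random.choice([+1, -1])

def estimate_correlator(expression: str, x: str, y: str, n: int = 500) -> float:
    """Estimate E(x, y) by repeated sampling: the mean product of two
    +/-1 judgments obtained under settings x and y."""
    return sum(query_agent(expression, x) * query_agent(expression, y)
               for _ in range(n)) / n

def chsh(expression: str, a: str, a2: str, b: str, b2: str) -> float:
    """CHSH statistic; any classical (non-contextual) model satisfies |S| <= 2."""
    def E(x, y):
        return estimate_correlator(expression, x, y)
    return E(a, b) + E(a, b2) + E(a2, b) - E(a2, b2)

# Hypothetical context settings for one ambiguous sentence.
S = chsh("He saw her duck.",
         a="Is the reading literal?", a2="Is 'duck' a noun?",
         b="Is someone crouching?", b2="Is an animal present?")
print(f"S = {S:+.2f} (classical bound: |S| <= 2)")
```

With the independent coin-flip agent, S concentrates near 0, well inside the classical bound; values beyond 2, such as the paper's reported 2.3-2.4 runs, can only arise when the paired judgments are contextually coupled.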
Related papers
- Ensemble Kalman filter for uncertainty in human language comprehension [39.781091151259766]
We propose a Bayesian framework for sentence comprehension, applying an extension of the ensemble Kalman filter (EnKF) for Bayesian inference to quantify uncertainty. By framing language comprehension as a Bayesian inverse problem, this approach enhances the SG model's ability to reflect human sentence processing with respect to the representation of uncertainty (a generic EnKF analysis step is sketched below).
arXiv Detail & Related papers (2025-05-05T11:56:12Z)
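For readers unfamiliar with the method, the sketch below shows one generic, stochastic (perturbed-observation) EnKF analysis step, not the paper's extension: an ensemble of latent "comprehension state" samples is updated toward an observation, and the post-update spread quantifies residual uncertainty. All names, dimensions, and the toy data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def enkf_update(ensemble, obs, H, obs_cov):
    """One generic (stochastic, perturbed-observation) EnKF analysis step.

    ensemble: (N, d) array of N samples of a d-dim latent state
    obs:      (m,) observation vector
    H:        (m, d) linear observation operator
    obs_cov:  (m, m) observation-noise covariance
    """
    N, m = ensemble.shape[0], obs.shape[0]
    X = ensemble - ensemble.mean(axis=0)        # state anomalies
    P = X.T @ X / (N - 1)                       # sample covariance of the prior
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + obs_cov)  # Kalman gain
    # Each member assimilates an independently perturbed observation so
    # that the analysis ensemble carries the right posterior spread.
    perturbed = obs + rng.multivariate_normal(np.zeros(m), obs_cov, size=N)
    return ensemble + (perturbed - ensemble @ H.T) @ K.T

# Toy usage: 100 samples of a 5-dim "comprehension state" updated by a
# 2-dim observation (e.g., features extracted from an incoming word).
prior = rng.normal(size=(100, 5))
H = rng.normal(size=(2, 5))
posterior = enkf_update(prior, np.array([0.5, -1.0]), H, 0.1 * np.eye(2))
print(posterior.std(axis=0))  # per-dimension residual uncertainty
```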
- Large Language Models as Quasi-crystals: Coherence Without Repetition in Generative Text
This essay proposes an analogy between large language models (LLMs) and quasicrystals, systems that exhibit global coherence without periodic repetition, generated through local constraints. Drawing on the history of quasicrystals, it highlights an alternative mode of coherence in generative language: constraint-based organization without repetition or symbolic intent. The essay aims to reframe the current discussion around large language models, not by rejecting existing methods, but by suggesting an additional axis of interpretation grounded in structure rather than semantics.
arXiv Detail & Related papers (2025-04-16T11:27:47Z)
- Investigating Idiomaticity in Word Representations
We focus on noun compounds of varying levels of idiomaticity in two languages (English and Portuguese).
We present a dataset of minimal pairs containing human idiomaticity judgments for each noun compound at both type and token levels.
We define a set of fine-grained metrics of Affinity and Scaled Similarity to determine how sensitive the models are to perturbations that may lead to changes in idiomaticity.
arXiv Detail & Related papers (2024-11-04T21:05:01Z)
- Scope Ambiguities in Large Language Models
We investigate how different versions of certain autoregressive language models treat scope-ambiguous sentences.
We introduce novel datasets that contain a joint total of almost 1,000 unique scope-ambiguous sentences.
Using these datasets, we find evidence that several models are sensitive to the meaning ambiguity in these sentences.
arXiv Detail & Related papers (2024-04-05T18:01:02Z)
- "You Are An Expert Linguistic Annotator": Limits of LLMs as Analyzers of Abstract Meaning Representation
We examine the successes and limitations of the GPT-3, ChatGPT, and GPT-4 models in analysis of sentence meaning structure.
We find that models can reliably reproduce the basic format of AMR, and can often capture core event, argument, and modifier structure.
Overall, our findings indicate that these models can capture aspects of semantic structure out of the box, but key limitations remain in their ability to support fully accurate semantic analyses or parses.
arXiv Detail & Related papers (2023-10-26T21:47:59Z)
- Large Language Models Are Partially Primed in Pronoun Interpretation
We investigate whether large language models (LLMs) display human-like referential biases using stimuli and procedures from real psycholinguistic experiments.
Recent psycholinguistic studies suggest that humans adapt their referential biases with recent exposure to referential patterns.
We find that InstructGPT adapts its pronominal interpretations in response to the frequency of referential patterns in the local discourse.
arXiv Detail & Related papers (2023-05-26T13:30:48Z)
- Simple Linguistic Inferences of Large Language Models (LLMs): Blind Spots and Blinds
We evaluate language understanding capacities on simple inference tasks that most humans find trivial.
We target (i) grammatically-specified entailments, (ii) premises with evidential adverbs of uncertainty, and (iii) monotonicity entailments.
The models exhibit moderate to low performance on these evaluation sets.
arXiv Detail & Related papers (2023-05-24T06:41:09Z)
- Natural Language Decompositions of Implicit Content Enable Better Text Representations
We introduce a method for the analysis of text that takes implicitly communicated content explicitly into account. We use a large language model to produce sets of propositions that are inferentially related to the text that has been observed (a minimal version of this decomposition step is sketched below). Our results suggest that modeling the meanings behind observed language, rather than the literal text alone, is a valuable direction for NLP.
arXiv Detail & Related papers (2023-05-23T23:45:20Z)
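The decomposition step of the entry above can be pictured as follows. This is a minimal sketch under an assumed prompt wording and a canned `llm` stand-in; it is not the authors' implementation.

```python
# Hypothetical LLM client: returns a canned completion so the demo runs;
# in practice this would call an actual model API.
def llm(prompt: str) -> str:
    return ("- The meeting was scheduled before.\n"
            "- The meeting has been rescheduled more than once.\n"
            "- The speaker is annoyed by the rescheduling.")

# Hypothetical decomposition prompt, not the paper's exact wording.
DECOMPOSE = ("List, one per line, short standalone propositions a reader "
             "would infer from the following text, including implicit "
             "content:\n\n{text}")

def decompose(text: str) -> list[str]:
    """Return propositions inferentially related to `text`, which can then
    be represented or analyzed in place of the literal surface string."""
    lines = llm(DECOMPOSE.format(text=text)).splitlines()
    return [line.lstrip("- ").strip() for line in lines if line.strip()]

print(decompose("Ugh, the meeting got moved again."))
```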
- Evaluating statistical language models as pragmatic reasoners
We evaluate the capacity of large language models to infer meanings of pragmatic utterances.
We find that LLMs can derive context-grounded, human-like distributions over the interpretations of several complex pragmatic utterances.
These results shed light on the inferential capacity of statistical language models and inform their use in pragmatic and semantic parsing applications.
arXiv Detail & Related papers (2023-05-01T18:22:10Z)
- We're Afraid Language Models Aren't Modeling Ambiguity
Managing ambiguity is a key part of human language understanding.
We characterize ambiguity in a sentence by its effect on entailment relations with another sentence.
We show that a multilabel NLI model can flag political claims in the wild that are misleading due to ambiguity (the flagging logic is sketched below).
arXiv Detail & Related papers (2023-04-27T17:57:58Z)
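The flagging idea of the entry above can be sketched as follows; `nli` is a hypothetical stand-in for a multilabel NLI model (returning an independent probability per entailment relation), and the claim/reading pair is invented for the demo.

```python
# Hypothetical stand-in for a multilabel NLI model: returns an independent
# probability per relation label (multilabel, so they need not sum to 1).
# Canned scores keep the demo runnable; in practice this would be a
# fine-tuned transformer.
def nli(premise: str, hypothesis: str) -> dict:
    return {"entailment": 0.81, "neutral": 0.12, "contradiction": 0.74}

def is_ambiguous(premise: str, hypothesis: str, threshold: float = 0.5) -> bool:
    """Flag `premise` as ambiguous w.r.t. `hypothesis` when the model puts
    high probability on conflicting relations at once, i.e. different
    readings of the premise would license different relations."""
    scores = nli(premise, hypothesis)
    supported = {label for label, p in scores.items() if p >= threshold}
    return len(supported) > 1

claim = "The senator opposed the ban on new drilling permits."
reading = "The senator supports issuing new drilling permits."
print(is_ambiguous(claim, reading))  # True with the canned scores above
```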
- Can Transformers Reason in Fragments of Natural Language?
State-of-the-art deep-learning-based approaches to Natural Language Processing (NLP) are credited with various capabilities that involve reasoning with natural language texts.
We study the detection of formally valid inferences in controlled fragments of natural language for which the satisfiability problem becomes increasingly complex.
arXiv Detail & Related papers (2022-11-10T08:46:53Z)
- Transparency Helps Reveal When Language Models Learn Meaning
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not well-represent natural language semantics.
arXiv Detail & Related papers (2022-10-14T02:35:19Z)
- Admissibility in Strength-based Argumentation: Complexity and Algorithms (Extended Version with Proofs)
We study the adaptation of admissibility-based semantics to Strength-based Argumentation Frameworks (StrAFs). In particular, we show that the strong admissibility defined in the literature does not satisfy a desirable property, namely Dung's fundamental lemma.
We propose a translation into pseudo-Boolean constraints for computing (strong and weak) extensions (a strength-free baseline of such an encoding is sketched below).
arXiv Detail & Related papers (2022-07-05T18:42:04Z)
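For intuition, the constraints below show what a pseudo-Boolean encoding of admissibility looks like in a plain, strength-free argumentation framework, with one 0/1 variable $x_a$ per argument. This is the textbook baseline that the paper's StrAF translation refines; its strength-handling constraints are not reproduced here.

```latex
% Conflict-freeness: two arguments linked by an attack are never both accepted
\forall (a,b) \in \mathit{Att}:\quad x_a + x_b \le 1
% Defense: every attacker of an accepted argument is counter-attacked
% by some accepted argument
\forall a \in A,\ \forall (b,a) \in \mathit{Att}:\quad
  x_a \le \sum_{(c,b) \in \mathit{Att}} x_c
```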
- A Latent-Variable Model for Intrinsic Probing
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
- The King is Naked: on the Notion of Robustness for Natural Language Processing
We argue for semantic robustness, which is better aligned with the human concept of linguistic fidelity.
We study semantic robustness of a range of vanilla and robustly trained architectures using a template-based generative test bed.
arXiv Detail & Related papers (2021-12-13T16:19:48Z)
- Infusing Finetuning with Semantic Dependencies
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.