On the Notion that Language Models Reason
- URL: http://arxiv.org/abs/2511.11810v1
- Date: Fri, 14 Nov 2025 19:04:24 GMT
- Title: On the Notion that Language Models Reason
- Authors: Bertram Højer
- Abstract summary: Language models (LMs) are said to exhibit reasoning, but what does this entail? We argue that the definitions provided are not consistent with how LMs are trained, process information, and generate new tokens. This view is illustrative of the claim that LMs are "statistical pattern matchers".
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Language models (LMs) are said to exhibit reasoning, but what does this entail? We assess definitions of reasoning and how key papers in the field of natural language processing (NLP) use the notion, and we argue that the definitions provided are not consistent with how LMs are trained, process information, and generate new tokens. To illustrate this incommensurability we adopt the view that transformer-based LMs implement an *implicit* finite-order Markov kernel mapping contexts to conditional token distributions. In this view, reasoning-like outputs correspond to statistical regularities and approximate statistical invariances in the learned kernel rather than the implementation of explicit logical mechanisms. This view is illustrative of the claim that LMs are "statistical pattern matchers" and not genuine reasoners, and it provides a perspective that clarifies why reasoning-like outputs arise in LMs without any guarantees of logical consistency. This distinction is fundamental to how epistemic uncertainty is evaluated in LMs. We invite a discussion on the importance of how the computational processes of the systems we build and analyze in NLP research are described.
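To make the Markov-kernel framing concrete, here is a minimal sketch, assuming a toy whitespace-tokenized corpus and an explicit order-k kernel (the corpus, order, and function names are illustrative choices, not the authors' code). It estimates p(x_{t+1} | x_{t-k+1}, ..., x_t) from raw co-occurrence counts; on the paper's view, a transformer realizes a far richer version of the same context-to-distribution mapping implicitly.

```python
# Illustrative sketch of the "implicit finite-order Markov kernel" framing.
# Corpus, order, and names are invented for the example, not the authors' code.
from collections import Counter, defaultdict

def fit_markov_kernel(tokens, order=2):
    """Estimate an order-k kernel: context tuple -> Counter of next tokens."""
    kernel = defaultdict(Counter)
    for i in range(len(tokens) - order):
        context = tuple(tokens[i:i + order])
        kernel[context][tokens[i + order]] += 1
    return kernel

def next_token_distribution(kernel, context):
    """Normalize counts into a conditional distribution p(x_next | context)."""
    counts = kernel.get(tuple(context), Counter())
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()} if total else {}

corpus = "all men are mortal socrates is a man so socrates is mortal".split()
kernel = fit_markov_kernel(corpus, order=2)

# A "reasoning-like" completion is just a statistical regularity in the kernel:
print(next_token_distribution(kernel, ["socrates", "is"]))
# prints {'a': 0.5, 'mortal': 0.5}
```

The printed distribution makes the point of the framing explicit: the completion after "socrates is" is driven entirely by conditional frequencies in the data, with no logical mechanism guaranteeing that "mortal" is preferred over "a".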
Related papers
- Beyond Correctness: Exposing LLM-generated Logical Flaws in Reasoning via Multi-step Automated Theorem Proving
Large Language Models (LLMs) have demonstrated impressive reasoning capabilities, leading to their adoption in high-stakes domains such as healthcare, law, and scientific research. However, their outputs often contain subtle logical errors masked by fluent language, posing significant risks for critical applications. We present MATP, an evaluation framework for systematically verifying LLM reasoning via Multi-step Automatic Theorem Proving.
arXiv Detail & Related papers (2025-12-29T14:48:15Z)
- Are Language Models Efficient Reasoners? A Perspective from Logic Programming
Modern language models (LMs) exhibit strong deductive reasoning capabilities, yet standard evaluations emphasize correctness while overlooking a key aspect of human-like reasoning: efficiency. We propose a framework for assessing LM reasoning efficiency through the lens of logic programming.
arXiv Detail & Related papers (2025-10-29T15:30:31Z)
- Framework for Machine Evaluation of Reasoning Completeness in Large Language Models For Classification Tasks
This paper introduces RACE (Reasoning Alignment for Completeness of Explanations). We analyze four widely used text classification datasets: WIKI ONTOLOGY, AG NEWS, IMDB, and GOEMOTIONS. We show that correct predictions exhibit higher coverage of supporting features, while incorrect predictions are associated with elevated coverage of contradicting features.
arXiv Detail & Related papers (2025-10-23T20:22:22Z)
- Implicit Reasoning in Large Language Models: A Comprehensive Survey
Large Language Models (LLMs) have demonstrated strong generalization across a wide range of tasks. Recent studies have shifted attention from explicit chain-of-thought prompting toward implicit reasoning. This survey introduces a taxonomy centered on execution paradigms, shifting the focus from representational forms to computational strategies.
arXiv Detail & Related papers (2025-09-02T14:16:02Z)
- Are We Merely Justifying Results ex Post Facto? Quantifying Explanatory Inversion in Post-Hoc Model Explanations
Post-hoc explanation methods provide interpretation by attributing predictions to input features. Do these explanations unintentionally reverse the natural relationship between inputs and outputs? We propose Inversion Quantification (IQ), a framework that quantifies the degree to which explanations rely on outputs and deviate from faithful input-output relationships.
arXiv Detail & Related papers (2025-04-11T19:00:12Z)
- Computation Mechanism Behind LLM Position Generalization
Large language models (LLMs) exhibit flexibility in handling textual positions: they can understand texts with position perturbations and generalize to longer texts. This work connects this linguistic phenomenon with LLMs' computational mechanisms.
arXiv Detail & Related papers (2025-03-17T15:47:37Z)
- Reasoning-as-Logic-Units: Scaling Test-Time Reasoning in Large Language Models Through Logic Unit Alignment
Chain-of-Thought (CoT) prompting has shown promise in enhancing the reasoning capabilities of large language models (LLMs). We propose Reasoning-as-Logic-Units (RaLU), which constructs a more reliable reasoning path by aligning logical units between the generated program and their corresponding NL descriptions.
arXiv Detail & Related papers (2025-02-05T08:23:18Z)
- On the Representational Capacity of Neural Language Models with Chain-of-Thought Reasoning
Chain-of-thought (CoT) reasoning has improved the performance of modern language models (LMs). We show that LMs can represent the same family of distributions over strings as probabilistic Turing machines (a toy illustration of the underlying marginalization over latent chains appears after this list).
arXiv Detail & Related papers (2024-06-20T10:59:02Z)
- Evaluating Step-by-Step Reasoning through Symbolic Verification
Pre-trained language models (LMs) have shown remarkable reasoning performance for in-context learning. LMLP enjoys more than 25% higher accuracy than chain-of-thought (CoT) prompting on length generalization benchmarks, even with smaller model sizes.
arXiv Detail & Related papers (2022-12-16T19:30:01Z)
- Logical Satisfiability of Counterfactuals for Faithful Explanations in NLI
We introduce the methodology of Faithfulness-through-Counterfactuals. It generates a counterfactual hypothesis based on the logical predicates expressed in the explanation, then evaluates whether the model's prediction on the counterfactual is consistent with that expressed logic.
arXiv Detail & Related papers (2022-05-25T03:40:59Z)
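For readers unfamiliar with the formal setup behind the representational-capacity result above: one standard way to define the distribution a CoT model assigns to answers is to marginalize over latent reasoning chains, p(y | x) = Σ_c p(c | x) · p(y | x, c). The sketch below illustrates that marginalization only; the two-chain setup, probabilities, and names are invented for the example and are not from the cited paper.

```python
# Toy illustration of CoT as marginalization over latent chains.
# All values below are hypothetical, chosen only to make the sum concrete.
from itertools import product

p_chain = {"c1": 0.7, "c2": 0.3}                    # p(c | x)
p_answer = {("c1", "yes"): 0.9, ("c1", "no"): 0.1,  # p(y | x, c)
            ("c2", "yes"): 0.2, ("c2", "no"): 0.8}

def marginal_answer_distribution():
    """Marginalize out the latent chain to get p(y | x)."""
    p_y = {}
    for c, y in product(p_chain, ["yes", "no"]):
        p_y[y] = p_y.get(y, 0.0) + p_chain[c] * p_answer[(c, y)]
    return p_y

print(marginal_answer_distribution())
# prints approximately {'yes': 0.69, 'no': 0.31}
```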
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.