Answer-Centric or Reasoning-Driven? Uncovering the Latent Memory Anchor in LLMs
- URL: http://arxiv.org/abs/2506.17630v1
- Date: Sat, 21 Jun 2025 08:15:45 GMT
- Title: Answer-Centric or Reasoning-Driven? Uncovering the Latent Memory Anchor in LLMs
- Authors: Yang Wu, Yifan Zhang, Yiwei Wang, Yujun Cai, Yurong Wu, Yuran Wang, Ning Xu, Jian Cheng
- Abstract summary: Large Language Models (LLMs) demonstrate impressive reasoning capabilities. Evidence suggests much of their success stems from memorized answer-reasoning patterns rather than genuine inference. We propose a five-level answer-visibility prompt framework that systematically manipulates answer cues and probes model behavior through indirect, behavioral analysis.
- Score: 28.556628696390767
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While Large Language Models (LLMs) demonstrate impressive reasoning capabilities, growing evidence suggests much of their success stems from memorized answer-reasoning patterns rather than genuine inference. In this work, we investigate a central question: are LLMs primarily anchored to final answers or to the textual pattern of reasoning chains? We propose a five-level answer-visibility prompt framework that systematically manipulates answer cues and probes model behavior through indirect, behavioral analysis. Experiments across state-of-the-art LLMs reveal a strong and consistent reliance on explicit answers. Performance drops by 26.90% when answer cues are masked, even with complete reasoning chains. These findings suggest that much of the reasoning exhibited by LLMs may reflect post-hoc rationalization rather than true inference, calling into question their inferential depth. Our study uncovers the answer-anchoring phenomenon with rigorous empirical validation and underscores the need for a more nuanced understanding of what constitutes reasoning in LLMs.
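The abstract describes the probing setup only at a high level. As a rough illustration of what "systematically manipulating answer cues" across five visibility levels could look like in practice, the sketch below defines five hypothetical prompt templates ranging from a fully visible answer down to a question-only prompt. The level names, templates, and masking scheme are assumptions made for illustration and are not the paper's actual framework.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Example:
    question: str
    reasoning: str  # full chain-of-thought text for the question
    answer: str     # final answer string (the "answer cue" being manipulated)


def mask_answer(text: str, answer: str, token: str = "[MASKED]") -> str:
    """Replace every occurrence of the answer cue with a mask token."""
    return text.replace(answer, token)


# Five hypothetical visibility levels, ordered from fully exposed answer cues
# down to no answer cue at all. The paper's actual prompt designs may differ.
LEVELS: Dict[str, Callable[[Example], str]] = {
    "L1_answer_and_reasoning_visible": lambda ex: (
        f"Q: {ex.question}\nReasoning: {ex.reasoning}\nAnswer: {ex.answer}\n"
        "Is this reasoning valid? Explain briefly."
    ),
    "L2_answer_only": lambda ex: (
        f"Q: {ex.question}\nAnswer: {ex.answer}\n"
        "Give the reasoning that leads to this answer."
    ),
    "L3_reasoning_with_answer_masked": lambda ex: (
        f"Q: {ex.question}\n"
        f"Reasoning: {mask_answer(ex.reasoning, ex.answer)}\n"
        "State the final answer."
    ),
    "L4_partial_reasoning_no_answer": lambda ex: (
        f"Q: {ex.question}\n"
        f"Partial reasoning: {ex.reasoning[: len(ex.reasoning) // 2]}\n"
        "Complete the reasoning and state the final answer."
    ),
    "L5_question_only": lambda ex: (
        f"Q: {ex.question}\nThink step by step and state the final answer."
    ),
}


def build_probes(ex: Example) -> Dict[str, str]:
    """Render one prompt per visibility level for a single example."""
    return {name: render(ex) for name, render in LEVELS.items()}
```

Under this kind of setup, comparing accuracy across the rendered probes (for instance, between a level where the answer is visible and one where it is masked) would indicate how much a model leans on explicit answer cues rather than on the reasoning chain itself.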
Related papers
- Self-Reflective Planning with Knowledge Graphs: Enhancing LLM Reasoning Reliability for Question Answering [9.601307470705732]
We propose Self-Reflective Planning (SRP), a framework that synergizes large language models with knowledge graphs. In the planning process, SRP first searches for references to guide planning and reflection. After retrieving knowledge from KGs through a reasoning path, it implements iterative reflection by judging the retrieval result and editing the reasoning path until the answer is correctly retrieved.
arXiv Detail & Related papers (2025-05-26T01:59:00Z)
- Reasoning Capabilities and Invariability of Large Language Models [49.23570751696334]
We aim to provide a comprehensive analysis of Large Language Models' reasoning competence. We introduce a new benchmark dataset with a series of simple reasoning questions demanding shallow logical reasoning. An empirical analysis involving zero-shot and few-shot prompting across 24 LLMs of different sizes reveals that, while LLMs with over 70 billion parameters perform better in the zero-shot setting, there is still large room for improvement.
arXiv Detail & Related papers (2025-05-01T18:12:30Z)
- No Need for Explanations: LLMs can implicitly learn from mistakes in-context [14.508050809497847]
We study why Large Language Models learn from mistakes more effectively without explicit corrective rationales. We find evidence that, while incorrect answers are more beneficial for LLM learning, explicit corrective rationales over-constrain the model.
arXiv Detail & Related papers (2025-02-12T16:31:21Z)
- Inductive or Deductive? Rethinking the Fundamental Reasoning Abilities of LLMs [99.76347807139615]
Reasoning encompasses two typical types: deductive reasoning and inductive reasoning.
Despite extensive research into the reasoning capabilities of Large Language Models (LLMs), most studies have failed to rigorously differentiate between inductive and deductive reasoning.
This raises an essential question: In LLM reasoning, which poses a greater challenge - deductive or inductive reasoning?
arXiv Detail & Related papers (2024-07-31T18:47:11Z)
- Understanding the Relationship between Prompts and Response Uncertainty in Large Language Models [55.332004960574004]
Large language models (LLMs) are widely used in decision-making, but their reliability, especially in critical tasks like healthcare, is not well-established. This paper investigates how the uncertainty of responses generated by LLMs relates to the information provided in the input prompt. We propose a prompt-response concept model that explains how LLMs generate responses and helps understand the relationship between prompts and response uncertainty.
arXiv Detail & Related papers (2024-07-20T11:19:58Z)
- Meaningful Learning: Enhancing Abstract Reasoning in Large Language Models via Generic Fact Guidance [38.49506722997423]
Large language models (LLMs) have achieved impressive performance and strong explainability across various reasoning scenarios.
However, LLMs often struggle to abstract and apply generic facts to provide consistent and precise answers.
This has sparked a vigorous debate about whether LLMs are genuinely reasoning or merely memorizing.
arXiv Detail & Related papers (2024-03-14T04:06:13Z)
- Direct Evaluation of Chain-of-Thought in Multi-hop Reasoning with Knowledge Graphs [52.42505579545893]
Large language models (LLMs) demonstrate strong reasoning abilities when prompted to generate chain-of-thought explanations alongside answers.
We propose a novel discriminative and generative CoT evaluation paradigm to assess LLMs' knowledge of reasoning and the accuracy of the generated CoT.
arXiv Detail & Related papers (2024-02-17T05:22:56Z)
- Deceptive Semantic Shortcuts on Reasoning Chains: How Far Can Models Go without Hallucination? [73.454943870226]
This work studies a specific type of hallucination induced by semantic associations.
To quantify this phenomenon, we propose a novel probing method and benchmark called EureQA.
arXiv Detail & Related papers (2023-11-16T09:27:36Z)
- Language Models with Rationality [57.37201135072838]
Large language models (LLMs) are proficient at question-answering (QA).
However, it is not always clear how (or even if) an answer follows from their latent "beliefs".
arXiv Detail & Related papers (2023-05-23T17:04:25Z)