Why Thinking Hurts? Diagnosing and Rectifying the Reasoning Shift in Foundation Recommender Models
- URL: http://arxiv.org/abs/2602.16587v1
- Date: Wed, 18 Feb 2026 16:38:21 GMT
- Authors: Luankang Zhang, Yonghao Huang, Hang Lv, Mingjia Yin, Liangyue Li, Zulong Chen, Hao Wang, Enhong Chen
- Abstract summary: We propose a training-free Inference-Time Subspace Alignment framework. By compressing reasoning chains and applying bias-subtracted contrastive decoding, our approach mitigates ungrounded textual drift. Experiments show this effectively calibrates inference, allowing foundation models to leverage reasoning without sacrificing ID-grounded accuracy.
- Score: 44.74420486421283
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Integrating Chain-of-Thought (CoT) reasoning into Semantic ID-based recommendation foundation models (such as OpenOneRec) often paradoxically degrades recommendation performance. We identify the root cause as textual inertia from the General Subspace, where verbose reasoning dominates inference and causes the model to neglect critical Semantic IDs. To address this, we propose a training-free Inference-Time Subspace Alignment framework. By compressing reasoning chains and applying bias-subtracted contrastive decoding, our approach mitigates ungrounded textual drift. Experiments show this effectively calibrates inference, allowing foundation models to leverage reasoning without sacrificing ID-grounded accuracy.
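The abstract names "bias-subtracted contrastive decoding" but this listing does not give its formulation. As a rough illustration of the general idea only, the sketch below scores each candidate item by its log-probability under a grounded pass minus a weighted log-probability under a bias-only pass. The function names, the `alpha` weight, and the toy logits are all assumptions made for illustration; they are not taken from the paper.

```python
import math


def log_softmax(logits):
    """Convert raw logits to log-probabilities in a numerically stable way."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def contrastive_scores(grounded_logits, bias_logits, alpha=1.0):
    """Generic contrastive score: log p_grounded - alpha * log p_bias.

    `grounded_logits` stands for the pass conditioned on the user's history,
    `bias_logits` for a pass that only reflects the textual prior.
    Both inputs and `alpha` are hypothetical, not the paper's definitions.
    """
    if len(grounded_logits) != len(bias_logits):
        raise ValueError("both passes must score the same candidates")
    lg = log_softmax(grounded_logits)
    lb = log_softmax(bias_logits)
    return [g - alpha * b for g, b in zip(lg, lb)]


def pick(scores):
    """Return the index of the highest-scoring candidate."""
    return max(range(len(scores)), key=scores.__getitem__)


# Toy example: candidate 0 leads mostly because the bias pass favours it.
grounded = [2.0, 1.8, 0.1]
bias = [3.0, 0.0, 0.0]
plain_choice = pick(contrastive_scores(grounded, bias, alpha=0.0))
debiased_choice = pick(contrastive_scores(grounded, bias, alpha=1.0))
```

With `alpha=0.0` the score reduces to ordinary greedy selection over the grounded pass; raising `alpha` penalises candidates that the bias-only pass already prefers, which is the intuition behind subtracting a bias term.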
Related papers
- ReGuLaR: Variational Latent Reasoning Guided by Rendered Chain-of-Thought [49.203970812338916]
Explicit reasoning chains introduce substantial computational redundancy. Recent latent reasoning methods attempt to mitigate this by compressing reasoning processes into latent space. We propose Rendered CoT-Guided variational Latent Reasoning (ReGuLaR).
arXiv Detail & Related papers (2026-01-30T17:08:06Z)
- Latent Chain-of-Thought as Planning: Decoupling Reasoning from Verbalization [9.193078163792427]
Chain-of-Thought (CoT) empowers Large Language Models (LLMs) to tackle complex problems. Recent latent reasoning approaches attempt to optimize efficiency by performing reasoning within continuous hidden states. We introduce PLaT, a framework that reformulates latent reasoning as planning by fundamentally decoupling reasoning from verbalization.
arXiv Detail & Related papers (2026-01-29T07:38:18Z)
- Structured Reasoning for Large Language Models [59.215789462977206]
We propose Structured Reasoning (SCR), a framework that decouples reasoning trajectories into explicit, evaluable, and trainable components. SCR substantially improves reasoning efficiency and self-verification. Compared with existing reasoning paradigms, it reduces output token length by up to 50%.
arXiv Detail & Related papers (2026-01-12T04:04:01Z)
- How Does Prefix Matter in Reasoning Model Tuning? [57.69882799751655]
We fine-tune three R1 series models across three core model capabilities: reasoning (mathematics), coding, safety, and factuality. Results show that prefix-conditioned SFT improves both safety and reasoning performance, yielding up to +6% higher Safe@1 accuracy.
arXiv Detail & Related papers (2026-01-04T18:04:23Z)
- Hallucination Detection via Internal States and Structured Reasoning Consistency in Large Language Models [7.18947815679122]
Internal State Probing and Chain-of-Thought Verification are used to detect hallucinations in large language models. We develop a unified framework that bridges the gap between the two methods. Our framework consistently and significantly outperforms strong baselines.
arXiv Detail & Related papers (2025-10-13T15:31:21Z)
- Aligning Deep Implicit Preferences by Learning to Reason Defensively [22.548051297731416]
We propose Critique-Driven Reasoning Alignment (CDRA) to bridge the preference inference gap. CDRA reframes alignment from a scalar reward-matching task into a structured reasoning process. Experiments demonstrate that CDRA excels at discovering and aligning with users' true preferences while executing robust reasoning.
arXiv Detail & Related papers (2025-10-13T09:26:47Z)
- AdvChain: Adversarial Chain-of-Thought Tuning for Robust Safety Alignment of Large Reasoning Models [62.70575022567081]
We propose AdvChain, an alignment paradigm that teaches models dynamic self-correction through adversarial CoT tuning. Our work establishes a new direction for building more robust and reliable reasoning models.
arXiv Detail & Related papers (2025-09-29T04:27:23Z)
- Abductive Commonsense Reasoning Exploiting Mutually Exclusive Explanations [118.0818807474809]
Abductive reasoning aims to find plausible explanations for an event.
Existing approaches for abductive reasoning in natural language processing often rely on manually generated annotations for supervision.
This work proposes an approach for abductive commonsense reasoning that exploits the fact that only a subset of explanations is correct for a given context.
arXiv Detail & Related papers (2023-05-24T01:35:10Z)