Is my model "mind blurting"? Interpreting the dynamics of reasoning tokens with Recurrence Quantification Analysis (RQA)
- URL: http://arxiv.org/abs/2602.06266v1
- Date: Thu, 05 Feb 2026 23:48:23 GMT
- Title: Is my model "mind blurting"? Interpreting the dynamics of reasoning tokens with Recurrence Quantification Analysis (RQA)
- Authors: Quoc Tuan Pham, Mehdi Jafari, Flora Salim
- Abstract summary: We propose Recurrence Quantification Analysis (RQA) as a non-textual alternative for analysing a model's reasoning chains at test time. RQA not only captures signals not reflected by response length but also substantially improves prediction of task complexity by 8%.
- Score: 1.593065406609169
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Test-time compute is central to large reasoning models, yet analysing their reasoning behaviour through generated text is increasingly impractical and unreliable. Response length is often used as a brute proxy for reasoning effort, but this metric fails to capture the dynamics and effectiveness of the Chain of Thought (CoT) or the generated tokens. We propose Recurrence Quantification Analysis (RQA) as a non-textual alternative for analysing a model's reasoning chains at test time. By treating token generation as a dynamical system, we extract hidden embeddings at each generation step and apply RQA to the resulting trajectories. RQA metrics, including Determinism and Laminarity, quantify patterns of repetition and stalling in the model's latent representations. Analysing 3,600 generation traces from DeepSeek-R1-Distill, we show that RQA not only captures signals not reflected by response length but also substantially improves prediction of task complexity by 8%. These results help establish RQA as a principled tool for studying the latent token generation dynamics of test-time scaling in reasoning models.
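To make the pipeline concrete, here is a minimal sketch (not the authors' code) of how Determinism and Laminarity can be computed from a trajectory of per-token hidden states. The recurrence threshold `eps` and the minimum line lengths are illustrative hyperparameters, and extracting the embeddings from the model is assumed to happen separately.

```python
import numpy as np

def rqa_metrics(embeddings, eps=0.5, l_min=2, v_min=2):
    """Recurrence Quantification Analysis over a trajectory of hidden states.

    embeddings: (T, d) array, one hidden-state vector per generated token.
    eps:        recurrence threshold on Euclidean distance (illustrative value).
    l_min:      minimum diagonal line length for Determinism (DET).
    v_min:      minimum vertical line length for Laminarity (LAM).
    """
    X = np.asarray(embeddings, dtype=float)
    T = X.shape[0]
    # Binary recurrence matrix: R[i, j] = 1 iff states i and j are close.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    R = (dists < eps).astype(int)

    total_rec = R.sum()
    rr = total_rec / (T * T)  # Recurrence Rate

    def line_histogram(lines):
        """Count lengths of consecutive runs of 1s in each 1-D slice."""
        hist = {}
        for line in lines:
            run = 0
            for v in list(line) + [0]:  # sentinel flushes the final run
                if v:
                    run += 1
                elif run:
                    hist[run] = hist.get(run, 0) + 1
                    run = 0
        return hist

    # Diagonal lines (main diagonal excluded) -> Determinism: repetition of
    # whole sub-trajectories in latent space.
    diags = [np.diagonal(R, offset=k) for k in range(-(T - 1), T) if k != 0]
    d_hist = line_histogram(diags)
    det_points = sum(l * c for l, c in d_hist.items() if l >= l_min)
    off_diag_rec = total_rec - np.trace(R)
    det = det_points / off_diag_rec if off_diag_rec else 0.0

    # Vertical lines -> Laminarity: the trajectory stalling in one region.
    v_hist = line_histogram(R.T)  # columns of R
    lam_points = sum(l * c for l, c in v_hist.items() if l >= v_min)
    lam = lam_points / total_rec if total_rec else 0.0

    return {"RR": rr, "DET": det, "LAM": lam}
```

Feeding this a `(num_tokens, hidden_dim)` array of last-layer hidden states collected during decoding yields RR, DET, and LAM for one trace; in the paper's framing, high Laminarity would indicate the latent trajectory stalling in one region of state space, and high Determinism would indicate repeated sub-trajectories.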
Related papers
- APR: Penalizing Structural Redundancy in Large Reasoning Models via Anchor-based Process Rewards [61.52322047892064]
Test-Time Scaling (TTS) has significantly enhanced the capabilities of Large Reasoning Models (LRMs). We observe that LRMs frequently conduct repetitive self-verification without revision even after obtaining the final answer during the reasoning process. We propose Anchor-based Process Reward (APR), a structure-aware reward shaping method that localizes the reasoning anchor and penalizes exclusively the post-anchor AST.
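The abstract above only names the mechanism; a hypothetical sketch of the general anchor-based-penalty idea (locate where the final answer first appears in the trace, then penalize reasoning continued beyond it) is shown below. This is an illustration, not the APR method: the string-matching heuristic in `find_anchor` and the penalty weight are assumptions.

```python
def find_anchor(reasoning_tokens, final_answer_tokens):
    """Index just past the first occurrence of the final answer inside the
    reasoning trace (the 'anchor'), or None if it never appears.
    Hypothetical heuristic; APR's actual anchor localisation may differ."""
    n, m = len(reasoning_tokens), len(final_answer_tokens)
    for i in range(n - m + 1):
        if reasoning_tokens[i:i + m] == final_answer_tokens:
            return i + m
    return None

def post_anchor_penalty(reasoning_tokens, final_answer_tokens, weight=0.001):
    """Penalty proportional to how much reasoning continues after the anchor,
    i.e. redundant self-verification once the answer has been reached."""
    anchor = find_anchor(reasoning_tokens, final_answer_tokens)
    if anchor is None:
        return 0.0  # answer never surfaced mid-trace: nothing to penalize
    return weight * (len(reasoning_tokens) - anchor)
```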
arXiv Detail & Related papers (2026-01-31T14:53:20Z)
- Making Mathematical Reasoning Adaptive [61.45161826629692]
We propose the AdaR framework to enable adaptive reasoning in large language models (LLMs). AdaR synthesizes logically equivalent queries by varying variable values and trains models with RLVR on these data to penalize spurious logic. Experimental results demonstrate that AdaR improves robustness and generalization, achieving substantial improvement in mathematical reasoning.
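As a rough illustration of "varying variable values" (not the AdaR data pipeline itself), one could resample the numbers in a templated math query and recompute the verifiable answer, yielding logically equivalent variants for RLVR-style training. The template and `answer_fn` below are made-up examples.

```python
import random

def synthesize_variants(template, answer_fn, n=3, lo=2, hi=20):
    """Generate logically equivalent variants of a templated math query by
    resampling its variable values and recomputing the ground-truth answer.

    template:  question text with {a} and {b} placeholders (illustrative).
    answer_fn: function computing the correct answer from the sampled values.
    """
    variants = []
    for _ in range(n):
        a, b = random.randint(lo, hi), random.randint(lo, hi)
        variants.append({
            "question": template.format(a=a, b=b),
            "answer": answer_fn(a, b),  # verifiable reward target for RLVR
        })
    return variants

# Example: same underlying logic, different surface values.
variants = synthesize_variants(
    "A box holds {a} apples. After adding {b} more, how many apples are there?",
    answer_fn=lambda a, b: a + b,
)
```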
arXiv Detail & Related papers (2025-10-06T09:30:05Z)
- Rethinking Reasoning Quality in Large Language Models through Enhanced Chain-of-Thought via RL [19.659532349434418]
Reinforcement learning (RL) has recently become the dominant paradigm for strengthening the reasoning abilities of large language models. Yet the rule-based reward functions commonly used on mathematical or programming benchmarks assess only answer format and correctness. We propose Dynamic Reasoning Efficiency Reward (DRER), a plug-and-play RL reward framework that reshapes both reward and advantage signals.
arXiv Detail & Related papers (2025-09-07T11:52:18Z)
- Inducing Faithfulness in Structured Reasoning via Counterfactual Sensitivity [6.908972852063454]
Large language models often generate a correct answer while relying on a flawed or irrelevant reasoning trace. This paper introduces Counterfactual Sensitivity Regularization (CSR), a novel training objective. CSR improves faithfulness over standard fine-tuning and process supervision by up to 70 percentage points.
arXiv Detail & Related papers (2025-09-01T15:18:46Z)
- SPRINT: Enabling Interleaved Planning and Parallelized Execution in Reasoning Models [2.7645012830234]
Large reasoning models excel at complex reasoning tasks but typically generate lengthy sequential chains-of-thought. SPRINT is a novel post-training and inference-time framework designed to enable LRMs to dynamically identify and exploit opportunities for parallelization. We show that models fine-tuned with the SPRINT framework match the performance of reasoning models on complex domains such as mathematics.
arXiv Detail & Related papers (2025-06-06T05:10:31Z)
- Accelerated Test-Time Scaling with Model-Free Speculative Sampling [58.69141724095398]
We introduce STAND (STochastic Adaptive N-gram Drafting), a novel model-free speculative decoding approach. We show that STAND reduces inference latency by 60-65% compared to standard autoregressive decoding. As a model-free approach, STAND can be applied to any existing language model without additional training.
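The abstract gives no implementation details, but model-free n-gram drafting in general can be sketched as follows. This is a simplified, hypothetical drafter (greedy rather than stochastic or adaptive), not the STAND algorithm, and the verification step against the target model is omitted.

```python
from collections import defaultdict, Counter

class NGramDrafter:
    """Minimal sketch of model-free n-gram drafting (not STAND itself).

    Counts (n-1)-gram -> next-token statistics over tokens seen so far and
    proposes the most frequent continuations as draft tokens, which a target
    model would then verify as in standard speculative decoding.
    """

    def __init__(self, n=3):
        self.n = n
        self.table = defaultdict(Counter)

    def update(self, tokens):
        """Record n-gram statistics from a (growing) token sequence."""
        for i in range(len(tokens) - self.n + 1):
            ctx = tuple(tokens[i : i + self.n - 1])
            self.table[ctx][tokens[i + self.n - 1]] += 1

    def draft(self, tokens, k=4):
        """Greedily propose up to k draft tokens from the n-gram table."""
        out = []
        ctx = tuple(tokens[-(self.n - 1):])
        for _ in range(k):
            if ctx not in self.table:
                break
            nxt = self.table[ctx].most_common(1)[0][0]
            out.append(nxt)
            ctx = ctx[1:] + (nxt,)
        return out
```

Because the drafter only keeps token-count statistics, it adds no trained parameters and can sit in front of any autoregressive model, which is the sense in which such approaches are "model-free".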
arXiv Detail & Related papers (2025-06-05T07:31:18Z)
- S-GRPO: Early Exit via Reinforcement Learning in Reasoning Models [2.9925837108958864]
Test-Time Scaling has emerged as an active research focus in the large language model community. Recent studies reveal that reasoning models (even Qwen3) consistently exhibit excessive thought redundancy. This paper introduces Serial-Group Decaying-Reward Policy Optimization (S-GRPO), a novel reinforcement learning paradigm.
arXiv Detail & Related papers (2025-05-12T15:50:44Z)
- Spatial Reasoning with Denoising Models [49.83744014336816]
We introduce a framework to perform reasoning over sets of continuous variables via denoising generative models. We demonstrate, for the first time, that the order of generation can successfully be predicted by the denoising network itself. Using these findings, we can increase the accuracy of specific reasoning tasks from 1% to >50%.
arXiv Detail & Related papers (2025-02-28T14:08:30Z)
- Chain-of-Retrieval Augmented Generation [91.02950964802454]
This paper introduces an approach for training o1-like RAG models that retrieve and reason over relevant information step by step before generating the final answer. Our proposed method, CoRAG, allows the model to dynamically reformulate the query based on the evolving state.
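A bare-bones version of such a step-wise retrieve-and-reason loop might look like the sketch below. It is a generic illustration rather than the CoRAG training recipe; `generate` and `retrieve` are assumed user-supplied callables (an LLM call and a search backend, respectively).

```python
def chain_of_retrieval(question, generate, retrieve, max_steps=4):
    """Illustrative retrieve-then-reason loop in the spirit of step-wise RAG.

    generate: callable(str) -> str, e.g. a wrapped LLM completion call.
    retrieve: callable(str) -> str, e.g. a search backend returning passages.
    """
    state = f"Question: {question}\n"
    for _ in range(max_steps):
        # Reformulate a sub-query from the evolving state.
        sub_query = generate(state + "Next sub-query to retrieve evidence for:")
        docs = retrieve(sub_query)
        state += f"Sub-query: {sub_query}\nEvidence: {docs}\n"
        # Stop early if the model judges it can already answer.
        verdict = generate(state + "Can the question be answered now? (yes/no):")
        if verdict.strip().lower().startswith("yes"):
            break
    return generate(state + "Final answer:")
```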
arXiv Detail & Related papers (2025-01-24T09:12:52Z)
- Deep Generative Symbolic Regression [83.04219479605801]
Symbolic regression aims to discover concise closed-form mathematical equations from data.
Existing methods, ranging from search to reinforcement learning, fail to scale with the number of input variables.
We propose an instantiation of our framework, Deep Generative Symbolic Regression.
arXiv Detail & Related papers (2023-12-30T17:05:31Z)
- Automatically Generating Counterfactuals for Relation Extraction [18.740447044960796]
Relation extraction (RE) is a fundamental task in natural language processing.
Current deep neural models have achieved high accuracy but are easily affected by spurious correlations.
We develop a novel approach to derive contextual counterfactuals for entities.
arXiv Detail & Related papers (2022-02-22T04:46:10Z)