Figure It Out: Improve the Frontier of Reasoning with Executable Visual States
- URL: http://arxiv.org/abs/2512.24297v2
- Date: Tue, 06 Jan 2026 13:22:08 GMT
- Title: Figure It Out: Improve the Frontier of Reasoning with Executable Visual States
- Authors: Meiqi Chen, Fandong Meng, Jie Zhou,
- Abstract summary: Complex reasoning problems often involve implicit spatial and geometric relationships that are not explicitly encoded in text.<n>We introduce FIGR, which integrates executable visual construction into multi-turn reasoning via end-to-end reinforcement learning.<n>Experiments on eight challenging mathematical benchmarks demonstrate that FIGR outperforms strong text-only chain-of-thought baselines.
- Score: 53.77871196174248
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Complex reasoning problems often involve implicit spatial and geometric relationships that are not explicitly encoded in text. While recent reasoning models perform well across many domains, purely text-based reasoning struggles to capture structural constraints in complex settings. In this paper, we introduce FIGR, which integrates executable visual construction into multi-turn reasoning via end-to-end reinforcement learning. Rather than relying solely on textual chains of thought, FIGR externalizes intermediate hypotheses by generating executable code that constructs diagrams within the reasoning loop. An adaptive reward mechanism selectively regulates when visual construction is invoked, enabling more consistent reasoning over latent global properties that are difficult to infer from text alone. Experiments on eight challenging mathematical benchmarks demonstrate that FIGR outperforms strong text-only chain-of-thought baselines, improving the base model by 13.12% on AIME 2025 and 11.00% on BeyondAIME. These results highlight the effectiveness of precise, controllable figure construction of FIGR in enhancing complex reasoning ability.
Related papers
- How RL Unlocks the Aha Moment in Geometric Interleaved Reasoning [17.18771466838129]
Solving complex geometric problems inherently requires interleaved reasoning.<n>We argue thatSupervised Fine-Tuning (SFT) on interleaved plot-solution data leads to a substantial degradation in reasoning performance.<n>We propose Faire, a reinforcement learning framework that enforces three casual constraints to move beyond superficial imitation.
arXiv Detail & Related papers (2026-03-01T12:18:12Z) - CoT-Seg: Rethinking Segmentation with Chain-of-Thought Reasoning and Self-Correction [50.67483317563736]
This paper aims to explore a system that can think step-by-step, look up information if needed, generate results, self-evaluate its own results, and refine the results.<n>We introduce CoT-Seg, a training-free framework that rethinks reasoning segmentation by combining chain-of-thought reasoning with self-correction.
arXiv Detail & Related papers (2026-01-24T11:41:54Z) - Implicit Reasoning in Large Language Models: A Comprehensive Survey [67.53966514728383]
Large Language Models (LLMs) have demonstrated strong generalization across a wide range of tasks.<n>Recent studies have shifted attention from explicit chain-of-thought prompting toward implicit reasoning.<n>This survey introduces a taxonomy centered on execution paradigms, shifting the focus from representational forms to computational strategies.
arXiv Detail & Related papers (2025-09-02T14:16:02Z) - A Survey on Latent Reasoning [100.54120559169735]
Large Language Models (LLMs) have demonstrated impressive reasoning capabilities.<n>CoT reasoning that verbalizes intermediate steps limits the model's expressive bandwidth.<n>Latent reasoning tackles this bottleneck by performing multi-step inference entirely in the model's continuous hidden state.
arXiv Detail & Related papers (2025-07-08T17:29:07Z) - PixelThink: Towards Efficient Chain-of-Pixel Reasoning [70.32510083790069]
PixelThink is a simple yet effective scheme that integrates externally estimated task difficulty and internally measured model uncertainty.<n>It learns to compress reasoning length in accordance with scene complexity and predictive confidence.<n> Experimental results demonstrate that the proposed approach improves both reasoning efficiency and overall segmentation performance.
arXiv Detail & Related papers (2025-05-29T17:55:49Z) - Visualizing Thought: Conceptual Diagrams Enable Robust Planning in LMMs [59.66595230543127]
Conceptual diagrams externalize mental models, abstracting irrelevant details to efficiently capture how entities interact.<n>Large Language Models (LLMs) and Large MultiModal Models (LMMs) predominantly reason through text.<n>We propose Visual Thinking, a generalizable framework that enables LMMs to reason through multiple chains of self-generated conceptual diagrams.
arXiv Detail & Related papers (2025-03-14T18:27:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.