Recursive Concept Evolution for Compositional Reasoning in Large Language Models
- URL: http://arxiv.org/abs/2602.15725v1
- Date: Tue, 17 Feb 2026 17:01:42 GMT
- Title: Recursive Concept Evolution for Compositional Reasoning in Large Language Models
- Authors: Sarim Chaudhry
- Abstract summary: Large language models achieve strong performance on many complex reasoning tasks, yet their accuracy degrades sharply on benchmarks that require compositional reasoning. We propose Recursive Concept Evolution (RCE), a framework that enables pretrained language models to modify their internal representation geometry during inference. RCE yields 12-18 point gains on ARC-AGI-2, 8-14 point improvements on GPQA and BBH, and consistent reductions in depth-induced error on MATH and HLE.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models achieve strong performance on many complex reasoning tasks, yet their accuracy degrades sharply on benchmarks that require compositional reasoning, including ARC-AGI-2, GPQA, MATH, BBH, and HLE. Existing methods improve reasoning by expanding token-level search through chain-of-thought prompting, self-consistency, or reinforcement learning, but they leave the model's latent representation space fixed. When the required abstraction is not already encoded in this space, performance collapses. We propose Recursive Concept Evolution (RCE), a framework that enables pretrained language models to modify their internal representation geometry during inference. RCE introduces dynamically generated low-rank concept subspaces that are spawned when representational inadequacy is detected, selected through a minimum description length criterion, merged when synergistic, and consolidated via constrained optimization to preserve stability. This process allows the model to construct new abstractions rather than recombining existing ones. We integrate RCE with Mistral-7B and evaluate it across compositional reasoning benchmarks. RCE yields 12-18 point gains on ARC-AGI-2, 8-14 point improvements on GPQA and BBH, and consistent reductions in depth-induced error on MATH and HLE.
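The abstract describes four operations over dynamically generated low-rank concept subspaces: spawning when representational inadequacy is detected, selection by a minimum description length (MDL) criterion, merging of synergistic subspaces, and constrained consolidation. The sketch below illustrates only the spawn-and-select step; it is not the authors' implementation, and the residual-based inadequacy signal, the spawn threshold, and the two-part MDL surrogate are assumptions introduced for illustration.

```python
# Minimal sketch of an RCE-style spawn-and-select step (illustrative only;
# not the paper's implementation). The inadequacy proxy, threshold, and
# MDL surrogate below are assumptions.
import numpy as np

rng = np.random.default_rng(0)
D, R = 64, 4  # hidden size; rank of each spawned concept subspace

def inadequacy(h, basis):
    """Fraction of the hidden state's norm outside the current concept
    subspace, used here as a proxy for representational inadequacy."""
    proj = basis @ (basis.T @ h)
    return np.linalg.norm(h - proj) / np.linalg.norm(h)

def spawn_candidates(h, basis, n=8):
    """Propose orthonormal rank-R candidate subspaces, each seeded with
    the residual direction the current basis fails to capture."""
    resid = h - basis @ (basis.T @ h)
    cands = []
    for _ in range(n):
        U = rng.normal(size=(D, R))
        U[:, 0] = resid / np.linalg.norm(resid)  # seed with the residual
        Q, _ = np.linalg.qr(U)                   # orthonormalize
        cands.append(Q)
    return cands

def mdl_score(h, basis, cand, bits_per_param=0.1):
    """Two-part MDL surrogate: reconstruction error (data cost) plus a
    penalty proportional to the parameters added (model cost)."""
    ext = np.linalg.qr(np.hstack([basis, cand]))[0]  # extended basis
    err = np.linalg.norm(h - ext @ (ext.T @ h)) ** 2
    return err + bits_per_param * cand.size

# --- one inference step ---
basis = np.linalg.qr(rng.normal(size=(D, 16)))[0]  # existing concept basis
h = rng.normal(size=D)                             # current hidden state

if inadequacy(h, basis) > 0.5:  # spawn trigger (assumed threshold)
    best = min(spawn_candidates(h, basis), key=lambda c: mdl_score(h, basis, c))
    # consolidate: extend and re-orthonormalize the concept basis
    basis = np.linalg.qr(np.hstack([basis, best]))[0]
    print("spawned a concept subspace; basis rank is now", basis.shape[1])
```

In the paper these operations act on the model's latent states during inference, and the full method also merges synergistic subspaces and uses constrained optimization during consolidation to preserve stability, which this sketch omits.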
Related papers
- Arbor: A Framework for Reliable Navigation of Critical Conversation Flows [0.19573380763700712]
We present Arbor, a framework that decomposes decision tree navigation into specialized, node-level tasks. Arbor improves mean turn accuracy by 29.4 percentage points, reduces per-turn latency by 57.1%, and achieves an average 14.4x reduction in per-turn cost.
arXiv Detail & Related papers (2026-02-16T11:09:02Z)
- Do Reasoning Models Enhance Embedding Models? [48.43242995118735]
State-of-the-art embedding models are increasingly derived from decoder-only Large Language Model backbones adapted via contrastive learning. We show that embedding models from RLVR-tuned backbones yield no consistent performance advantage over their base counterparts when subjected to identical training recipes.
arXiv Detail & Related papers (2026-01-29T02:48:34Z)
- PROMISE: Process Reward Models Unlock Test-Time Scaling Laws in Generative Recommendations [52.67948063133533]
Generative Recommendation has emerged as a promising paradigm, reformulating recommendation as a sequence-to-sequence generation task over hierarchical Semantic IDs. Existing methods suffer from a critical issue we term Semantic Drift, where errors in early, high-level tokens irreversibly divert the generation trajectory into irrelevant semantic subspaces. We propose Promise, a novel framework that integrates dense, step-by-step verification into generative models.
arXiv Detail & Related papers (2026-01-08T07:38:46Z)
- Reasoning Pattern Alignment Merging for Adaptive Reasoning [48.347817456299104]
We introduce Reasoning Pattern Alignment Merging (RPAM), a layer-wise model merging framework based on feature alignment that facilitates query-adaptive reasoning. Experiments on seven widely used reasoning benchmarks show that RPAM substantially reduces inference cost while maintaining strong performance.
arXiv Detail & Related papers (2026-01-07T01:36:39Z)
- Encode, Think, Decode: Scaling test-time reasoning with recursive latent thoughts [19.518525241726916]
Encode-Think-Decode (ETD) is a method that enhances the reasoning capabilities of a base model by training it to iterate over a small subset of reasoning-relevant layers during the mid-training stage. ETD models yield substantial gains on 17 reasoning benchmarks, including a +28.4% relative accuracy improvement on GSM8K and +36% on MATH with the OLMo-2 1B Base model.
arXiv Detail & Related papers (2025-10-08T15:58:35Z)
- Enhancing Test-Time Scaling of Large Language Models with Hierarchical Retrieval-Augmented MCTS [19.394761422323853]
We introduce R2-LLMs, a novel and versatile hierarchical retrieval-augmented reasoning framework. R2-LLMs enhances inference-time generalization by integrating dual-level retrieval-based in-context learning. Empirical evaluations on the MATH500, GSM8K, and OlympiadBench-TO datasets show substantial relative improvements.
arXiv Detail & Related papers (2025-07-08T00:41:12Z)
- Retrieval is Not Enough: Enhancing RAG Reasoning through Test-Time Critique and Optimization [58.390885294401066]
Retrieval-augmented generation (RAG) has become a widely adopted paradigm for enabling knowledge-grounded large language models (LLMs). RAG pipelines often fail to ensure that model reasoning remains consistent with the evidence retrieved, leading to factual inconsistencies or unsupported conclusions. We propose AlignRAG, a novel iterative framework grounded in Critique-Driven Alignment (CDA). We also introduce AlignRAG-auto, an autonomous variant that dynamically terminates refinement, removing the need to pre-specify the number of critique iterations.
arXiv Detail & Related papers (2025-04-21T04:56:47Z)
- COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement [80.18490952057125]
Iterative refinement has emerged as an effective paradigm for enhancing the capabilities of large language models (LLMs) on complex tasks.
We propose Context-Wise Order-Agnostic Language Modeling (COrAL) to overcome these challenges.
Our approach models multiple token dependencies within manageable context windows, enabling the model to perform iterative refinement internally.
arXiv Detail & Related papers (2024-10-12T23:56:19Z)
- Semantics-Aware Dynamic Localization and Refinement for Referring Image Segmentation [102.25240608024063]
Referring image segmentation segments an image according to a natural language expression. We develop an algorithm that shifts from being localization-centric to segmentation-centric. Compared to its counterparts, our method is more versatile while remaining effective.
arXiv Detail & Related papers (2023-03-11T08:42:40Z)