AR$^2$: Adversarial Reinforcement Learning for Abstract Reasoning in Large Language Models
- URL: http://arxiv.org/abs/2509.03537v1
- Date: Wed, 27 Aug 2025 17:26:44 GMT
- Title: AR$^2$: Adversarial Reinforcement Learning for Abstract Reasoning in Large Language Models
- Authors: Cheng-Kai Yeh, Hsing-Wang Lee, Chung-Hung Kuo, Hen-Hsen Huang
- Abstract summary: We propose AR$^2$ (Adversarial Reinforcement Learning for Abstract Reasoning), a novel framework explicitly designed to enhance the abstraction abilities of large language models (LLMs). AR$^2$ employs a teacher model to transform kernel problems into narrative-rich, challenging descriptions without changing their fundamental logic. A student coding model is trained to solve these complex narrative problems by extracting their underlying computational kernels.
- Score: 12.484537674896908
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Abstraction--the ability to recognize and distill essential computational patterns from complex problem statements--is a foundational skill in computer science, critical both for human problem-solvers and coding-oriented large language models (LLMs). Despite recent advances in training LLMs for code generation using reinforcement learning (RL), most existing approaches focus primarily on superficial pattern recognition, overlooking explicit training for abstraction. In this study, we propose AR$^2$ (Adversarial Reinforcement Learning for Abstract Reasoning), a novel framework explicitly designed to enhance the abstraction abilities of LLMs. AR$^2$ employs a teacher model to transform kernel problems into narrative-rich, challenging descriptions without changing their fundamental logic. Simultaneously, a student coding model is trained to solve these complex narrative problems by extracting their underlying computational kernels. Experimental results demonstrate that AR$^2$ substantially improves the student model's accuracy on previously unseen, challenging programming tasks, underscoring abstraction as a key skill for enhancing LLM generalization.
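The loop described in the abstract can be pictured as a two-player game. Below is a minimal sketch, not the authors' implementation: the `Teacher`/`Student`/`Judge` signatures, the zero-sum reward split, and the unit-test-based judge are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable

# Placeholder roles: in the paper's setting these would be LLM policies
# updated with RL; here they are plain callables so the loop is runnable.
Teacher = Callable[[str], str]       # kernel problem -> narrative problem
Student = Callable[[str], str]       # narrative problem -> candidate code
Judge = Callable[[str, str], float]  # (candidate code, kernel) -> pass rate

@dataclass
class Episode:
    kernel: str
    narrative: str
    solution: str
    student_reward: float
    teacher_reward: float

def adversarial_round(kernel: str, teacher: Teacher,
                      student: Student, judge: Judge) -> Episode:
    """One teacher/student round: the teacher wraps the kernel in a story
    without changing its logic; the student must recover and solve it.
    The zero-sum reward split below is an illustration, not the paper's
    exact reward design."""
    narrative = teacher(kernel)
    solution = student(narrative)
    pass_rate = judge(solution, kernel)  # e.g., fraction of unit tests passed
    return Episode(kernel, narrative, solution,
                   student_reward=pass_rate,
                   teacher_reward=1.0 - pass_rate)

# Trivial stand-ins so the sketch runs end to end.
teacher = lambda k: f"A merchant tallies coins all day... (story hiding: {k})"
student = lambda n: "def solve(xs): return sum(xs)"
judge = lambda code, kernel: 1.0 if "sum" in code else 0.0
ep = adversarial_round("sum a list of integers", teacher, student, judge)
print(ep.student_reward, ep.teacher_reward)  # 1.0 0.0
```

In a real AR$^2$ setup both callables would be LLM policies trained against these rewards; the stubs here only make the round executable.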
Related papers
- Learning Abstractions for Hierarchical Planning in Program-Synthesis Agents [54.73952501784257]
Humans learn abstractions and use them to plan efficiently and generalize quickly across tasks. We introduce TheoryCoder-2, a new large language model (LLM) agent that actively learns reusable abstractions. We conduct experiments on diverse environments, including BabyAI, MiniHack, and VGDL games such as Sokoban.
arXiv Detail & Related papers (2026-01-31T23:01:51Z)
- Search-R3: Unifying Reasoning and Embedding Generation in Large Language Models [11.39711340224126]
Search-R3 is a novel framework that adapts Large Language Models to generate search embeddings as a direct output of their reasoning process. Our approach exploits LLMs' chain-of-thought capabilities, allowing them to produce more effective embeddings by reasoning step by step through complex semantic analyses (a toy reading of this idea is sketched after this entry).
arXiv Detail & Related papers (2025-10-08T14:16:20Z)
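One plausible reading of the Search-R3 idea: embed the query together with the model's reasoning about it, rather than the raw query alone. The `reason` callable and the hash-based stand-in encoder are assumptions for this sketch; the paper's actual embedding head is not described in the summary.

```python
import hashlib
from typing import Callable, List

def toy_encode(text: str, dim: int = 8) -> List[float]:
    """Deterministic stand-in for an embedding model."""
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255.0 for b in digest[:dim]]

def reasoning_embedding(query: str, reason: Callable[[str], str]) -> List[float]:
    trace = reason(query)                    # chain-of-thought over the query
    return toy_encode(query + "\n" + trace)  # embed query *plus* its analysis

reason = lambda q: f"Key terms for retrieval: {q.split()}"
print(reasoning_embedding("adversarial RL for abstraction", reason))
```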
- RLAD: Training LLMs to Discover Abstractions for Solving Reasoning Problems [98.98963933669751]
We train models to propose multiple abstractions for a given problem, followed by RL that incentivizes building a solution on top of them. This yields a two-player RL training paradigm, abbreviated RLAD, that jointly trains an abstraction generator and a solution generator (a toy sketch of this setup follows this entry). We show that allocating more test-time compute to generating abstractions is more beneficial for performance than generating more solutions at large test budgets.
arXiv Detail & Related papers (2025-10-02T17:44:23Z)
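A toy version of the two-player setup, under stated assumptions: both players are plain callables rather than jointly trained LLMs, and the shared success-rate reward below stands in for whatever reward shaping the paper actually uses.

```python
from typing import Callable, List, Tuple

AbstractionGen = Callable[[str], List[str]]  # problem -> candidate abstractions
SolutionGen = Callable[[str, str], str]      # (problem, abstraction) -> solution
Verifier = Callable[[str, str], bool]        # (problem, solution) -> correct?

def rlad_round(problem: str, abs_gen: AbstractionGen, sol_gen: SolutionGen,
               verify: Verifier) -> Tuple[float, float]:
    """One round: propose abstractions, then solve conditioned on each.
    Both players share credit when a solution built on an abstraction is
    correct; a simple shared success rate replaces the paper's reward."""
    abstractions = abs_gen(problem)
    hits = sum(verify(problem, sol_gen(problem, a)) for a in abstractions)
    reward = hits / max(len(abstractions), 1)
    return reward, reward  # abstraction-player reward, solution-player reward

# Toy stand-ins so the round runs end to end.
abs_gen = lambda p: ["think of it as interval merging", "sort, then sweep"]
sol_gen = lambda p, a: f"solution using '{a}'"
verify = lambda p, s: "sweep" in s
print(rlad_round("merge overlapping meetings", abs_gen, sol_gen, verify))  # (0.5, 0.5)
```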
- Computational Thinking Reasoning in Large Language Models [69.28428524878885]
The Computational Thinking Model (CTM) is a novel framework that incorporates computational thinking paradigms into large language models (LLMs). Live code execution is seamlessly integrated into the reasoning process, allowing CTM to think by computing (a minimal execute-while-reasoning loop is sketched after this entry). CTM outperforms conventional reasoning models and tool-augmented baselines in accuracy, interpretability, and generalizability.
arXiv Detail & Related papers (2025-06-03T09:11:15Z)
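A minimal execute-while-reasoning loop in the spirit of "thinking by computing". The `<compute>` tag, the single-expression restriction, and the `eval`-based toy sandbox are all assumptions for this sketch; a real system would run code in a proper sandbox.

```python
import re
from typing import Callable

# Whenever the model's draft contains a <compute>...</compute> span, the span
# is evaluated and its value is spliced back into the context before
# reasoning continues.
COMPUTE = re.compile(r"<compute>(.*?)</compute>", re.DOTALL)

def reason_with_execution(model: Callable[[str], str], prompt: str,
                          max_steps: int = 4) -> str:
    context = prompt
    for _ in range(max_steps):
        draft = model(context)
        match = COMPUTE.search(draft)
        if match is None:
            return draft  # model produced a final answer, no computation left
        result = eval(match.group(1), {"__builtins__": {}}, {})  # toy sandbox
        context += draft[:match.end()] + f"\n[result: {result}]\n"
    return context

# Toy model: requests one computation, then answers with the injected result.
def toy_model(ctx: str) -> str:
    if "[result:" not in ctx:
        return "I need the product first. <compute>13 * 17</compute>"
    return "13 * 17 = 221."

print(reason_with_execution(toy_model, "What is 13 * 17?"))
```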
- Learn to Think: Bootstrapping LLM Reasoning Capability Through Graph Representation Learning [19.75678229122211]
Large Language Models (LLMs) have achieved remarkable success across various domains, yet they still face significant challenges, including high computational costs for training and limitations in solving complex reasoning problems. We propose a novel framework that leverages graph learning to enable more flexible and adaptive reasoning.
arXiv Detail & Related papers (2025-05-09T02:51:22Z)
- OpenVLThinker: Complex Vision-Language Reasoning via Iterative SFT-RL Cycles [91.88062410741833]
We introduce OpenVLThinker, one of the first open-source large vision-language models (LVLMs) to exhibit sophisticated chain-of-thought reasoning. We show that OpenVLThinker-7B consistently advances performance across six benchmarks demanding mathematical and general reasoning (a schematic of the iterative SFT-RL cycle follows this entry).
arXiv Detail & Related papers (2025-03-21T17:52:43Z)
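A schematic of an iterative SFT-then-RL cycle as the title describes. Every callable below is a placeholder: the summary does not say how self-generated traces are filtered or which RL algorithm is used, so those details are toy stand-ins.

```python
import random
from typing import Callable, List

def iterative_sft_rl(model: float,
                     generate: Callable[[float], List[float]],
                     sft_step: Callable[[float, List[float]], float],
                     rl_step: Callable[[float], float],
                     rounds: int = 3) -> float:
    for _ in range(rounds):
        traces = generate(model)              # sample self-generated reasoning
        good = [t for t in traces if t > 0]   # keep verifiably correct traces
        model = sft_step(model, good)         # distill them back in (SFT)
        model = rl_step(model)                # then optimize a reward (RL)
    return model

# Plain numbers standing in for "model quality" so the loop runs end to end.
random.seed(0)
final = iterative_sft_rl(
    model=0.1,
    generate=lambda m: [m + random.gauss(0, 0.1) for _ in range(8)],
    sft_step=lambda m, good: m + 0.05 * len(good) / 8,
    rl_step=lambda m: m + 0.02,
)
print(f"toy model quality after 3 cycles: {final:.3f}")
```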
- R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning [87.30285670315334]
R1-Searcher is a novel two-stage outcome-based RL approach designed to enhance the search capabilities of Large Language Models. Our framework relies exclusively on RL, without requiring process rewards or distillation for a cold start (a toy outcome-reward rollout follows this entry). Our experiments demonstrate that our method significantly outperforms previous strong RAG methods, even when compared to the closed-source GPT-4o-mini.
arXiv Detail & Related papers (2025-03-07T17:14:44Z)
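A toy rollout showing what "outcome-based, no process rewards" means in practice: the policy may interleave search calls, but reward is computed only from the final answer. The `SEARCH:` convention and the reward values are assumptions for this sketch.

```python
from typing import Callable

def outcome_reward(final_answer: str, gold: str,
                   used_valid_format: bool) -> float:
    """Reward depends only on the outcome, never on intermediate steps."""
    if not used_valid_format:
        return -1.0  # e.g., malformed search/answer turns
    return 1.0 if final_answer.strip() == gold.strip() else 0.0

def rollout(policy: Callable[[str], str], search: Callable[[str], str],
            question: str, gold: str) -> float:
    context = question
    for _ in range(4):  # bounded number of search turns per episode
        step = policy(context)
        if step.startswith("SEARCH:"):
            context += "\n" + search(step[len("SEARCH:"):])
        else:
            return outcome_reward(step, gold, used_valid_format=True)
    return outcome_reward("", gold, used_valid_format=False)

# Toy policy: searches once, then answers from the retrieved context.
policy = lambda ctx: ("SEARCH:capital of France" if "Paris" not in ctx
                      else "Paris")
search = lambda q: "The capital of France is Paris."
print(rollout(policy, search, "What is the capital of France?", "Paris"))  # 1.0
```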
- LLM-based Cognitive Models of Students with Misconceptions [55.29525439159345]
This paper investigates whether Large Language Models (LLMs) can be instruction-tuned to act as faithful cognitive models of students who hold misconceptions.
We introduce MalAlgoPy, a novel Python library that generates datasets reflecting authentic student solution patterns.
Our insights enhance our understanding of AI-based student models and pave the way for effective adaptive learning systems (a hypothetical data-generation sketch follows this entry).
arXiv Detail & Related papers (2024-10-16T06:51:09Z)
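The sketch below shows the general shape of a misconception-driven dataset generator; every name in it (`Misconception`, `generate_trace`, the drop-the-carry rule) is hypothetical and not MalAlgoPy's actual API.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Misconception:
    """Hypothetical misconception rule: a buggy transformation a student
    might apply, mirroring the idea in the summary."""
    name: str
    apply: Callable[[int, int], int]  # buggy version of an operation

def generate_trace(problems: List[Tuple[int, int]],
                   bug: Misconception) -> List[dict]:
    """Produce (problem, student_answer, correct_answer) records for
    training or evaluating a cognitive student model."""
    return [{"problem": f"{a} + {b}",
             "student_answer": bug.apply(a, b),
             "correct_answer": a + b,
             "misconception": bug.name}
            for a, b in problems]

# Example rule: a student who ignores carries when adding two-digit numbers.
drop_carry = Misconception(
    name="drops the carry digit",
    apply=lambda a, b: (a % 10 + b % 10) % 10 + ((a // 10 + b // 10) % 10) * 10,
)
print(generate_trace([(47, 38), (25, 19)], drop_carry))  # 75 vs 85, 34 vs 44
```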
- Building Minimal and Reusable Causal State Abstractions for Reinforcement Learning [63.58935783293342]
Causal Bisimulation Modeling (CBM) is a method that learns the causal relationships in the dynamics and reward functions for each task to derive a minimal, task-specific abstraction.
CBM's learned implicit dynamics models identify the underlying causal relationships and state abstractions more accurately than explicit ones (a toy sketch of the abstraction step follows this entry).
arXiv Detail & Related papers (2024-01-23T05:43:15Z)
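A toy version of the abstraction step this summary describes: given learned causal parents for each state variable and for the reward, keep only the variables that can influence reward, directly or through the dynamics. The graph encoding and example variables are assumptions for this sketch.

```python
from typing import Dict, List, Set

def minimal_abstraction(parents: Dict[str, List[str]],
                        reward_parents: List[str]) -> Set[str]:
    """Keep every state variable that is a causal ancestor of the reward."""
    keep, frontier = set(), list(reward_parents)
    while frontier:
        v = frontier.pop()
        if v not in keep:
            keep.add(v)
            frontier.extend(parents.get(v, []))  # walk causal ancestors
    return keep

# Toy task: reward depends on gripper and object; lighting is causally inert.
parents = {"object_pos": ["gripper_pos", "object_pos"],
           "gripper_pos": ["gripper_pos"],
           "lighting": ["lighting"]}
print(minimal_abstraction(parents, reward_parents=["object_pos"]))
# -> {'object_pos', 'gripper_pos'}; lighting is dropped from the abstraction
```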