Real-Time Verification of Embodied Reasoning for Generative Skill Acquisition
- URL: http://arxiv.org/abs/2505.11175v2
- Date: Mon, 19 May 2025 05:14:55 GMT
- Title: Real-Time Verification of Embodied Reasoning for Generative Skill Acquisition
- Authors: Bo Yue, Shuqi Guo, Kaiyu Hu, Chujiao Wang, Benyou Wang, Kui Jia, Guiliang Liu
- Abstract summary: Generative skill acquisition enables embodied agents to actively learn a scalable and evolving repertoire of control skills. We propose VERGSA, a framework that systematically integrates real-time verification principles into embodied skill learning. To the best of our knowledge, this approach yields the first comprehensive training dataset for verification-driven generative skill acquisition.
- Score: 47.068088124436535
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative skill acquisition enables embodied agents to actively learn a scalable and evolving repertoire of control skills, crucial for the advancement of large decision models. While prior approaches often rely on supervision signals from generalist agents (e.g., LLMs), their effectiveness in complex 3D environments remains unclear; exhaustive evaluation incurs substantial computational costs, significantly hindering the efficiency of skill learning. Inspired by recent successes in verification models for mathematical reasoning, we propose VERGSA (Verifying Embodied Reasoning in Generative Skill Acquisition), a framework that systematically integrates real-time verification principles into embodied skill learning. VERGSA establishes 1) a seamless extension from verification of mathematical reasoning into embodied learning by dynamically incorporating contextually relevant tasks into prompts and defining success metrics for both subtasks and overall tasks, and 2) an automated, scalable reward labeling scheme that synthesizes dense reward signals by iteratively finalizing the contribution of scene configuration and subtask learning to overall skill acquisition. To the best of our knowledge, this approach yields the first comprehensive training dataset for verification-driven generative skill acquisition, eliminating arduous manual reward engineering. Experiments validate the efficacy of our approach: 1) the exemplar task pool improves average task success rates by 21%, 2) our verification model boosts success rates by 24% for novel tasks and 36% for encountered tasks, and 3) the verification model outperforms LLM-as-a-Judge baselines in verification quality.
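The abstract's reward labeling scheme can be pictured as a loop in which a verifier scores each subtask's reasoning and dense per-subtask rewards are synthesized from those scores and the overall task outcome. The sketch below is a hypothetical illustration only: the function names, the toy scoring heuristic, and the normalization rule are assumptions for exposition, not the paper's actual implementation.

```python
# Hypothetical sketch of verification-driven reward labeling, loosely
# following the abstract: a verifier scores each subtask, and dense
# rewards are derived from each subtask's contribution to overall
# task success. All names and the scoring rule are illustrative.

def verify_subtask(subtask: str) -> float:
    """Stand-in verifier returning a score in [0, 1] for a subtask.
    A real system would query a learned verification model instead."""
    # Toy heuristic: more specific (longer) subtask plans score higher.
    return min(len(subtask.split()) / 10.0, 1.0)

def label_rewards(subtasks: list[str], task_succeeded: bool) -> list[float]:
    """Synthesize dense per-subtask rewards from verifier scores,
    gated by whether the overall task succeeded."""
    scores = [verify_subtask(s) for s in subtasks]
    total = sum(scores) or 1.0
    overall = 1.0 if task_succeeded else 0.0
    # Each subtask's reward is its normalized contribution times
    # the overall task outcome, so rewards sum to the task result.
    return [overall * s / total for s in scores]

rewards = label_rewards(
    ["grasp the mug handle", "lift mug ten centimeters", "place mug on tray"],
    task_succeeded=True,
)
print([round(r, 2) for r in rewards])  # → [0.33, 0.33, 0.33]
```

Under this toy rule, the per-subtask rewards sum to the overall task outcome, giving a dense signal without manual reward engineering; the paper's actual scheme additionally accounts for scene configuration, which this sketch omits.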
Related papers
- Knowledge capture, adaptation and composition (KCAC): A framework for cross-task curriculum learning in robotic manipulation [6.683222869973898]
Reinforcement learning (RL) has demonstrated remarkable potential in robotic manipulation but faces challenges in sample inefficiency and lack of interpretability. This paper proposes a Knowledge Capture, Adaptation, and Composition framework to integrate knowledge transfer into RL through cross-task curriculum learning. As a result, our KCAC approach achieves a 40 percent reduction in training time while improving task success rates by 10 percent compared to traditional RL methods.
arXiv Detail & Related papers (2025-05-15T17:30:29Z)
- GROVE: A Generalized Reward for Learning Open-Vocabulary Physical Skill [44.95563610228887]
Learning open-vocabulary physical skills for simulated agents presents a significant challenge in artificial intelligence. We introduce GROVE, a generalized reward framework that enables open-vocabulary physical skill learning without manual engineering or task-specific demonstrations. To bridge the domain gap between simulation and natural images, we develop Pose2CLIP, a lightweight mapper that efficiently projects agent poses directly into semantic feature space.
arXiv Detail & Related papers (2025-04-05T14:44:47Z)
- ReVISE: Learning to Refine at Test-Time via Intrinsic Self-Verification [53.80183105328448]
Refine via Intrinsic Self-Verification (ReVISE) is an efficient framework that enables LLMs to self-correct their outputs through self-verification. Our experiments on various reasoning tasks demonstrate that ReVISE achieves efficient self-correction and significantly improves reasoning performance.
arXiv Detail & Related papers (2025-02-20T13:50:02Z)
- Re-TASK: Revisiting LLM Tasks from Capability, Skill, and Knowledge Perspectives [54.14429346914995]
Chain-of-Thought (CoT) has become a pivotal method for solving complex problems.
Large language models (LLMs) often struggle to accurately decompose domain-specific tasks.
This paper introduces the Re-TASK framework, a novel theoretical model that revisits LLM tasks from the perspectives of capability, skill, and knowledge.
arXiv Detail & Related papers (2024-08-13T13:58:23Z)
- LLM-Empowered State Representation for Reinforcement Learning [64.3351150030341]
State representations in reinforcement learning often omit critical task-related details.
We propose LLM-Empowered State Representation (LESR), a novel approach that utilizes LLM to autonomously generate task-related state representation codes.
LESR exhibits high sample efficiency and outperforms state-of-the-art baselines by an average of 29% in accumulated reward in Mujoco tasks and 30% in success rates in Gym-Robotics tasks.
arXiv Detail & Related papers (2024-07-18T07:47:51Z)
- Variational Curriculum Reinforcement Learning for Unsupervised Discovery of Skills [25.326624139426514]
We propose a novel approach to unsupervised skill discovery based on information theory, called Value Uncertainty Variational Curriculum (VUVC).
We prove that, under regularity conditions, VUVC accelerates the increase of entropy in the visited states compared to the uniform curriculum.
We also demonstrate that the skills discovered by our method successfully complete a real-world robot navigation task in a zero-shot setup.
arXiv Detail & Related papers (2023-10-30T10:34:25Z)
- Human-Timescale Adaptation in an Open-Ended Task Space [56.55530165036327]
We show that training an RL agent at scale leads to a general in-context learning algorithm that can adapt to open-ended novel embodied 3D problems as quickly as humans.
Our results lay the foundation for increasingly general and adaptive RL agents that perform well across ever-larger open-ended domains.
arXiv Detail & Related papers (2023-01-18T15:39:21Z)
- Task-Agnostic Continual Reinforcement Learning: Gaining Insights and Overcoming Challenges [27.474011433615317]
Continual learning (CL) enables the development of models and agents that learn from a sequence of tasks.
We investigate the factors that contribute to the performance differences between task-agnostic CL and multi-task learning (MTL) agents.
arXiv Detail & Related papers (2022-05-28T17:59:00Z)
- Combining Modular Skills in Multitask Learning [149.8001096811708]
A modular design encourages neural models to disentangle and recombine different facets of knowledge to generalise more systematically to new tasks.
In this work, we assume each task is associated with a subset of latent discrete skills from a (potentially small) inventory.
We find that the modular design of a network significantly increases sample efficiency in reinforcement learning and few-shot generalisation in supervised learning.
arXiv Detail & Related papers (2022-02-28T16:07:19Z)
- Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks [59.761411682238645]
Retrieval-augmented generation models have shown state-of-the-art performance across many knowledge-intensive NLP tasks.
We introduce a method to incorporate evidentiality of passages -- whether a passage contains correct evidence to support the output -- into training the generator.
arXiv Detail & Related papers (2021-12-16T08:18:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences arising from its use.