Related papers: LLM-Empowered State Representation for Reinforcement Learning

LLM-Empowered State Representation for Reinforcement Learning

URL: http://arxiv.org/abs/2407.13237v1
Date: Thu, 18 Jul 2024 07:47:51 GMT
Title: LLM-Empowered State Representation for Reinforcement Learning
Authors: Boyuan Wang, Yun Qu, Yuhang Jiang, Jianzhun Shao, Chang Liu, Wenming Yang, Xiangyang Ji,
Abstract summary: State representations in reinforcement learning often omit critical task-related details. We propose LLM-Empowered State Representation (LESR), a novel approach that utilizes LLM to autonomously generate task-related state representation codes. LESR exhibits high sample efficiency and outperforms state-of-the-art baselines by an average of 29% in accumulated reward in Mujoco tasks and 30% in success rates in Gym-Robotics tasks.
Score: 64.3351150030341
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Conventional state representations in reinforcement learning often omit critical task-related details, presenting a significant challenge for value networks in establishing accurate mappings from states to task rewards. Traditional methods typically depend on extensive sample learning to enrich state representations with task-specific information, which leads to low sample efficiency and high time costs. Recently, surging knowledgeable large language models (LLM) have provided promising substitutes for prior injection with minimal human intervention. Motivated by this, we propose LLM-Empowered State Representation (LESR), a novel approach that utilizes LLM to autonomously generate task-related state representation codes which help to enhance the continuity of network mappings and facilitate efficient training. Experimental results demonstrate LESR exhibits high sample efficiency and outperforms state-of-the-art baselines by an average of 29% in accumulated reward in Mujoco tasks and 30% in success rates in Gym-Robotics tasks.

Related papers

Enhancing Cross-task Transfer of Large Language Models via Activation Steering [75.41750053623298]
Cross-task in-context learning offers a direct solution for transferring knowledge across tasks.<n>We investigate whether cross-task transfer can be achieved via latent space steering without parameter updates or input expansion.<n>We propose a novel Cross-task Activation Steering Transfer framework that enables effective transfer by manipulating the model's internal activation states.
arXiv Detail & Related papers (2025-07-17T15:47:22Z)
Real-Time Verification of Embodied Reasoning for Generative Skill Acquisition [47.068088124436535]
Generative skill acquisition enables embodied agents to actively learn a scalable and evolving repertoire of control skills.<n>We propose VERGSA, a framework that systematically integrates real-time verification principles into embodied skill learning.<n>To the best of our knowledge, this approach constitutes the first comprehensive training dataset for verification-driven generative skill acquisition.
arXiv Detail & Related papers (2025-05-16T12:19:13Z)
Efficient Knowledge Transfer in Multi-Task Learning through Task-Adaptive Low-Rank Representation [11.955971931186006]
Pre-trained language models struggle with emerging tasks unseen during training in real-world applications.<n>We propose Task-Adaptive Low-Rank Representation (TA-LoRA), an MTL method built on prompt tuning.<n>Experiments on 16 tasks demonstrate that TA-LoRA achieves state-of-the-art performance in full-data and few-shot settings.
arXiv Detail & Related papers (2025-04-20T06:33:19Z)
GROVE: A Generalized Reward for Learning Open-Vocabulary Physical Skill [44.95563610228887]
Learning open-vocabulary physical skills for simulated agents presents a significant challenge in artificial intelligence. We introduce GROVE, a generalized reward framework that enables open-vocabulary physical skill learning without manual engineering or task-specific demonstrations. To bridge the domain gap between simulation and natural images, we develop Pose2CLIP, a lightweight mapper that efficiently projects agent poses directly into semantic feature space.
arXiv Detail & Related papers (2025-04-05T14:44:47Z)
Underlying Semantic Diffusion for Effective and Efficient In-Context Learning [113.4003355229632]
Underlying Semantic Diffusion (US-Diffusion) is an enhanced diffusion model that boosts underlying semantics learning, computational efficiency, and in-context learning capabilities. We present a Feedback-Aided Learning (FAL) framework, which leverages feedback signals to guide the model in capturing semantic details. We also propose a plug-and-play Efficient Sampling Strategy (ESS) for dense sampling at time steps with high-noise levels.
arXiv Detail & Related papers (2025-03-06T03:06:22Z)
Task-Aware Virtual Training: Enhancing Generalization in Meta-Reinforcement Learning for Out-of-Distribution Tasks [4.374837991804085]
Task-Aware Virtual Training (TAVT) is a novel algorithm that captures task characteristics for both training and out-of-distribution (OOD) scenarios. Numerical results demonstrate that TAVT significantly enhances generalization to OOD tasks across various MuJoCo and MetaWorld environments.
arXiv Detail & Related papers (2025-02-05T02:31:50Z)
Choices are More Important than Efforts: LLM Enables Efficient Multi-Agent Exploration [46.938186139700804]
This paper introduces LEMAE, choosing to channel informative task-relevant guidance from a knowledgeable Large Language Model (LLM) for Efficient Multi-Agent Exploration. Specifically, we ground linguistic knowledge from LLM into symbolic key states, that are critical for task fulfillment, in a discriminative manner at low inference costs. Benefiting from diminishing redundant explorations, LEMAE outperforms existing SOTA approaches on the challenging benchmarks (e.g., SMAC and MPE) by a large margin, achieving a 10x acceleration in certain scenarios.
arXiv Detail & Related papers (2024-10-03T14:21:23Z)
Language Models can Exploit Cross-Task In-context Learning for Data-Scarce Novel Tasks [22.66167973623777]
Large Language Models (LLMs) have transformed NLP with their remarkable In-context Learning (ICL) capabilities. This paper investigates whether LLMs can generalize from labeled examples of predefined tasks to novel tasks. We show that cross-task prompting leads to a remarkable performance boost of 107% for LLaMA-2 7B, 18.6% for LLaMA-2 13B, and 3.2% for GPT 3.5 on average over zero-shot prompting.
arXiv Detail & Related papers (2024-05-17T05:20:49Z)
Reverse Forward Curriculum Learning for Extreme Sample and Demonstration Efficiency in Reinforcement Learning [17.092640837991883]
Reinforcement learning (RL) presents a promising framework to learn policies through environment interaction. One direction includes augmenting RL with offline data demonstrating desired tasks, but past work often require a lot of high-quality demonstration data. We show how the combination of a reverse curriculum and forward curriculum in our method, RFCL, enables significant improvements in demonstration and sample efficiency.
arXiv Detail & Related papers (2024-05-06T11:33:12Z)
TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models [52.734140807634624]
Aligned large language models (LLMs) demonstrate exceptional capabilities in task-solving, following instructions, and ensuring safety. Existing continual learning benchmarks lack sufficient challenge for leading aligned LLMs. We introduce TRACE, a novel benchmark designed to evaluate continual learning in LLMs.
arXiv Detail & Related papers (2023-10-10T16:38:49Z)
Semantically Aligned Task Decomposition in Multi-Agent Reinforcement Learning [56.26889258704261]
We propose a novel "disentangled" decision-making method, Semantically Aligned task decomposition in MARL (SAMA) SAMA prompts pretrained language models with chain-of-thought that can suggest potential goals, provide suitable goal decomposition and subgoal allocation as well as self-reflection-based replanning. SAMA demonstrates considerable advantages in sample efficiency compared to state-of-the-art ASG methods.
arXiv Detail & Related papers (2023-05-18T10:37:54Z)
Provable Benefits of Representational Transfer in Reinforcement Learning [59.712501044999875]
We study the problem of representational transfer in RL, where an agent first pretrains in a number of source tasks to discover a shared representation. We show that given generative access to source tasks, we can discover a representation, using which subsequent linear RL techniques quickly converge to a near-optimal policy.
arXiv Detail & Related papers (2022-05-29T04:31:29Z)
Mask-based Latent Reconstruction for Reinforcement Learning [58.43247393611453]
Mask-based Latent Reconstruction (MLR) is proposed to predict the complete state representations in the latent space from the observations with spatially and temporally masked pixels. Extensive experiments show that our MLR significantly improves the sample efficiency in deep reinforcement learning.
arXiv Detail & Related papers (2022-01-28T13:07:11Z)
Learning Temporally-Consistent Representations for Data-Efficient Reinforcement Learning [3.308743964406687]
$k$-Step Latent (KSL) is a representation learning method that enforces temporal consistency of representations. KSL produces encoders that generalize better to new tasks unseen during training.
arXiv Detail & Related papers (2021-10-11T00:16:43Z)
Persistent Reinforcement Learning via Subgoal Curricula [114.83989499740193]
Value-accelerated Persistent Reinforcement Learning (VaPRL) generates a curriculum of initial states. VaPRL reduces the interventions required by three orders of magnitude compared to episodic reinforcement learning.
arXiv Detail & Related papers (2021-07-27T16:39:45Z)

This list is automatically generated from the titles and abstracts of the papers in this site.