Reward Design with Language Models
- URL: http://arxiv.org/abs/2303.00001v1
- Date: Mon, 27 Feb 2023 22:09:35 GMT
- Title: Reward Design with Language Models
- Authors: Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh
- Abstract summary: Reward design in reinforcement learning (RL) is challenging since specifying human notions of desired behavior may be difficult via reward functions or require many expert demonstrations.
Can we instead cheaply design rewards using a natural language interface?
This paper explores how to simplify reward design by prompting a large language model (LLM) such as GPT-3 as a proxy reward function.
- Score: 27.24197025688919
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reward design in reinforcement learning (RL) is challenging since specifying
human notions of desired behavior may be difficult via reward functions or
require many expert demonstrations. Can we instead cheaply design rewards using
a natural language interface? This paper explores how to simplify reward design
by prompting a large language model (LLM) such as GPT-3 as a proxy reward
function, where the user provides a textual prompt containing a few examples
(few-shot) or a description (zero-shot) of the desired behavior. Our approach
leverages this proxy reward function in an RL framework. Specifically, users
specify a prompt once at the beginning of training. During training, the LLM
evaluates an RL agent's behavior against the desired behavior described by the
prompt and outputs a corresponding reward signal. The RL agent then uses this
reward to update its behavior. We evaluate whether our approach can train
agents aligned with user objectives in the Ultimatum Game, matrix games, and
the DealOrNoDeal negotiation task. In all three tasks, we show that RL agents
trained with our framework are well-aligned with the user's objectives and
outperform RL agents trained with reward functions learned via supervised
learning.
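The loop described in the abstract (a prompt fixed once, then an LLM judging each episode and returning a reward) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the prompt layout, the yes/no parsing into a 0/1 reward, and the names `build_prompt`, `parse_reward`, `proxy_reward`, and `fake_llm` are all assumptions made for this example. `fake_llm` is a keyword-matching stand-in so the block runs without any model or API access.

```python
from typing import Callable, List, Tuple


def build_prompt(task_description: str,
                 examples: List[Tuple[str, str]],
                 episode_summary: str) -> str:
    """Assemble the prompt: task description, optional few-shot
    examples (empty list = zero-shot), then the episode to judge."""
    lines = [task_description, ""]
    for behavior, label in examples:
        lines.append(f"Behavior: {behavior}")
        lines.append(f"Matches the desired behavior? {label}")
        lines.append("")
    lines.append(f"Behavior: {episode_summary}")
    lines.append("Matches the desired behavior?")
    return "\n".join(lines)


def parse_reward(llm_output: str) -> float:
    """Assumed parsing rule: 1.0 if the answer starts with 'yes', else 0.0."""
    return 1.0 if llm_output.strip().lower().startswith("yes") else 0.0


def proxy_reward(query_llm: Callable[[str], str],
                 task_description: str,
                 examples: List[Tuple[str, str]],
                 episode_summary: str) -> float:
    """Score one episode by asking the language model about it."""
    prompt = build_prompt(task_description, examples, episode_summary)
    return parse_reward(query_llm(prompt))


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model: answers 'Yes' when the
    final behavior in the prompt mentions a rejection."""
    last_behavior = prompt.rsplit("Behavior: ", 1)[1]
    return "Yes" if "rejects" in last_behavior else "No"


# Hypothetical Ultimatum Game objective: reject low offers.
TASK = "The responder should reject unfair offers in the Ultimatum Game."
EXAMPLES = [
    ("Proposer offers 1 of 10 dollars; responder rejects.", "Yes"),
    ("Proposer offers 1 of 10 dollars; responder accepts.", "No"),
]

reward = proxy_reward(fake_llm, TASK, EXAMPLES,
                      "Proposer offers 2 of 10 dollars; responder rejects.")
print(reward)
```

In an actual training run, `query_llm` would wrap a call to a real language model, and the returned value would be passed to the RL algorithm's update step in place of a hand-written reward function.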
Related papers
- Exploring RL-based LLM Training for Formal Language Tasks with Programmed Rewards [49.7719149179179]
This paper investigates the feasibility of using PPO for reinforcement learning (RL) from explicitly programmed reward signals.
We focus on tasks expressed through formal languages, such as programming, where explicit reward functions can be programmed to automatically assess quality of generated outputs.
Our results show that pure RL-based training for the two formal language tasks is challenging, with success being limited even for the simple arithmetic task.
arXiv Detail & Related papers (2024-10-22T15:59:58Z) - A Large Language Model-Driven Reward Design Framework via Dynamic Feedback for Reinforcement Learning [25.82540393199001]
CARD is a Reward Design framework that iteratively generates and improves reward function code.
CARD includes a Coder that generates and verifies the code, while an Evaluator provides dynamic feedback to guide the Coder in improving the code.
arXiv Detail & Related papers (2024-10-18T17:51:51Z) - OCALM: Object-Centric Assessment with Language Models [33.10137796492542]
We propose Object-Centric Assessment with Language Models (OCALM) to derive inherently interpretable reward functions for reinforcement learning agents.
OCALM uses the extensive world-knowledge of language models to derive reward functions focused on relational concepts.
arXiv Detail & Related papers (2024-06-24T15:57:48Z) - FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning [18.60627708199452]
We investigate how to leverage pre-trained visual-language models (VLM) for online Reinforcement Learning (RL).
We first identify the problem of reward misalignment when applying VLM as a reward in RL tasks.
We introduce a lightweight fine-tuning method, named Fuzzy VLM reward-aided RL (FuRL).
arXiv Detail & Related papers (2024-06-02T07:20:08Z) - RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback [24.759613248409167]
Reward engineering has long been a challenge in Reinforcement Learning research.
We propose RL-VLM-F, a method that automatically generates reward functions for agents to learn new tasks.
We demonstrate that RL-VLM-F successfully produces effective rewards and policies across various domains.
arXiv Detail & Related papers (2024-02-06T04:06:06Z) - Deep Reinforcement Learning from Hierarchical Preference Design [99.46415116087259]
This paper shows that, by exploiting certain structures, one can ease the reward design process.
We propose a hierarchical reward modeling framework, HERON, for two scenarios: (I) the feedback signals naturally present a hierarchy; (II) the reward is sparse, but less important surrogate feedback is available to help policy learning.
arXiv Detail & Related papers (2023-09-06T00:44:29Z) - Iterative Reward Shaping using Human Feedback for Correcting Reward Misspecification [15.453123084827089]
ITERS is an iterative reward shaping approach using human feedback for mitigating the effects of a misspecified reward function.
We evaluate ITERS in three environments and show that it can successfully correct misspecified reward functions.
arXiv Detail & Related papers (2023-08-30T11:45:40Z) - Language Reward Modulation for Pretraining Reinforcement Learning [61.76572261146311]
We propose leveraging the capabilities of learned reward functions (LRFs) as a pretraining signal for reinforcement learning.
Our VLM pretraining approach, which is a departure from previous attempts to use LRFs, can warmstart sample-efficient learning on robot manipulation tasks.
arXiv Detail & Related papers (2023-08-23T17:37:51Z) - Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-oriented Dialogue Systems [111.80916118530398]
Reinforcement learning (RL) techniques can naturally be utilized to train dialogue strategies to achieve user-specific goals.
This paper aims at answering the question of how to efficiently learn and leverage a reward function for training end-to-end (E2E) ToD agents.
arXiv Detail & Related papers (2023-02-20T22:10:04Z) - Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience [89.30876995059168]
This paper addresses the problem of inverse reinforcement learning (IRL) -- inferring the reward function of an agent from observing its behavior.
arXiv Detail & Related papers (2022-08-09T17:29:49Z) - PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning.
We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
arXiv Detail & Related papers (2021-06-09T14:10:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.