Preference Transformer: Modeling Human Preferences using Transformers for RL
- URL: http://arxiv.org/abs/2303.00957v1
- Date: Thu, 2 Mar 2023 04:24:29 GMT
- Title: Preference Transformer: Modeling Human Preferences using Transformers for RL
- Authors: Changyeon Kim, Jongjin Park, Jinwoo Shin, Honglak Lee, Pieter Abbeel, Kimin Lee
- Abstract summary: Preference Transformer is a neural architecture that models human preferences using transformers.
We show that Preference Transformer can solve a variety of control tasks using real human preferences, while prior approaches fail to work.
- Score: 165.33887165572128
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Preference-based reinforcement learning (RL) provides a framework to train
agents using human preferences between two behaviors. However, preference-based
RL has been challenging to scale since it requires a large amount of human
feedback to learn a reward function aligned with human intent. In this paper,
we present Preference Transformer, a neural architecture that models human
preferences using transformers. Unlike prior approaches, which assume human judgment
is based on Markovian rewards that contribute equally to the decision, we
introduce a new preference model based on a weighted sum of non-Markovian
rewards. We then design the proposed preference model using a transformer
architecture that stacks causal and bidirectional self-attention layers. We
demonstrate that Preference Transformer can solve a variety of control tasks
using real human preferences, while prior approaches fail to work. We also show
that Preference Transformer can induce a well-specified reward and attend to
critical events in the trajectory by automatically capturing the temporal
dependencies in human decision-making. Code is available on the project
website: https://sites.google.com/view/preference-transformer.
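To make the abstract's preference model concrete, here is a minimal sketch, assuming a simplified PyTorch setup: per-step non-Markovian rewards and importance weights are produced by a causal transformer over the full history, and their weighted sum gives the segment score used in a Bradley-Terry preference probability. All module and argument names are illustrative, a single causal stack with linear heads stands in for the paper's causal plus bidirectional attention layers, and this is not the authors' released code.

```python
# Minimal sketch (not the authors' code) of a weighted-sum, non-Markovian
# preference model; names and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn


class PreferenceModelSketch(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int, d_model: int = 64, n_layers: int = 2):
        super().__init__()
        self.embed = nn.Linear(obs_dim + act_dim, d_model)
        # Causal self-attention stack: each step attends only to the past,
        # so the per-step reward can depend on the whole history (non-Markovian).
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.reward_head = nn.Linear(d_model, 1)   # per-step reward r_hat_t
        self.weight_head = nn.Linear(d_model, 1)   # per-step importance weight w_t

    def segment_score(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
        """obs: (B, T, obs_dim), act: (B, T, act_dim) -> one scalar score per segment."""
        x = self.embed(torch.cat([obs, act], dim=-1))                    # (B, T, d)
        T = x.size(1)
        causal_mask = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
        h = self.encoder(x, mask=causal_mask)                            # (B, T, d)
        r_hat = self.reward_head(h).squeeze(-1)                          # (B, T)
        w = torch.softmax(self.weight_head(h).squeeze(-1), dim=-1)       # (B, T), sums to 1
        return (w * r_hat).sum(dim=-1)        # weighted sum, not an equal-weight sum

    def preference_loss(self, seg0, seg1, prefer_1: torch.Tensor) -> torch.Tensor:
        """Bradley-Terry style loss; prefer_1 is 1.0 where segment 1 was preferred."""
        logits = self.segment_score(*seg1) - self.segment_score(*seg0)   # (B,)
        return nn.functional.binary_cross_entropy_with_logits(logits, prefer_1)
```

In the paper itself the importance weights come from an additional bidirectional self-attention layer on top of the causal stack; the linear weight head above is a deliberate simplification of that design.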
Related papers
- Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback [87.37721254914476]
We introduce a routing framework that combines inputs from humans and LMs to achieve better annotation quality.
We train a performance prediction model to predict a reward model's performance on an arbitrary combination of human and LM annotations.
We show that the selected hybrid mixture achieves better reward model performance compared to using either one exclusively.
arXiv Detail & Related papers (2024-10-24T20:04:15Z)
- LRHP: Learning Representations for Human Preferences via Preference Pairs [45.056558199304554]
We introduce a preference representation learning task that aims to construct a richer and more structured representation of human preferences.
We verify the utility of preference representations in two downstream tasks: preference data selection and preference margin prediction.
arXiv Detail & Related papers (2024-10-06T14:48:28Z)
- AlignDiff: Aligning Diverse Human Preferences via Behavior-Customisable Diffusion Model [69.12623428463573]
AlignDiff is a novel framework that quantifies human preferences, covering their abstractness, and uses them to guide diffusion planning.
It can accurately match user-customized behaviors and efficiently switch from one to another.
We demonstrate its superior performance on preference matching, switching, and covering compared to other baselines.
arXiv Detail & Related papers (2023-10-03T13:53:08Z)
- Emergent Agentic Transformer from Chain of Hindsight Experience [96.56164427726203]
We show, for the first time, that a simple transformer-based model performs competitively with both temporal-difference and imitation-learning-based approaches.
arXiv Detail & Related papers (2023-05-26T00:43:02Z)
- Models of human preference for learning reward functions [80.39289349661364]
We learn the reward function from human-generated preferences between pairs of trajectory segments.
Past work assumes these preferences arise only from each segment's partial return; we find this assumption to be flawed and instead propose modeling human preferences as informed by each segment's regret (see the worked comparison of preference models after this list).
Our proposed regret preference model better predicts real human preferences, and the reward functions learned from these preferences lead to policies that are better aligned with humans.
arXiv Detail & Related papers (2022-06-05T17:58:02Z)
- TransDreamer: Reinforcement Learning with Transformer World Models [30.387428559614186]
We propose a transformer-based Model-Based Reinforcement Learning agent, called TransDreamer.
We first introduce the Transformer State-Space Model, a world model that leverages a transformer for dynamics predictions. We then share this world model with a transformer-based policy network and obtain stability in training a transformer-based RL agent.
In experiments, we apply the proposed model to 2D visual RL and 3D first-person visual RL tasks both requiring long-range memory access for memory-based reasoning. We show that the proposed model outperforms Dreamer in these complex tasks.
arXiv Detail & Related papers (2022-02-19T00:30:52Z)
- Decision Transformer: Reinforcement Learning via Sequence Modeling [102.86873656751489]
We present a framework that abstracts Reinforcement Learning (RL) as a sequence modeling problem.
We present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling.
Despite its simplicity, Decision Transformer matches or exceeds the performance of state-of-the-art offline RL baselines on Atari, OpenAI Gym, and Key-to-Door tasks.
arXiv Detail & Related papers (2021-06-02T17:53:39Z)
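The worked comparison referenced above contrasts three preference models that appear in this list: the equal-weight Markovian-reward model assumed by prior preference-based RL work, the importance-weighted non-Markovian model of Preference Transformer, and the regret-based model from "Models of human preference for learning reward functions". The notation is illustrative rather than taken verbatim from the papers.

```latex
% Prior work: Bradley-Terry over an equal-weight sum of Markovian rewards
P[\sigma^1 \succ \sigma^0]
  = \frac{\exp\bigl(\sum_t r(s^1_t, a^1_t)\bigr)}
         {\exp\bigl(\sum_t r(s^0_t, a^0_t)\bigr) + \exp\bigl(\sum_t r(s^1_t, a^1_t)\bigr)}

% Preference Transformer: importance-weighted sum of non-Markovian rewards,
% where each reward and weight may depend on the history up to step t
P[\sigma^1 \succ \sigma^0]
  = \frac{\exp\bigl(\sum_t w^1_t\,\hat{r}(s^1_{\le t}, a^1_{\le t})\bigr)}
         {\sum_{i \in \{0,1\}} \exp\bigl(\sum_t w^i_t\,\hat{r}(s^i_{\le t}, a^i_{\le t})\bigr)}

% Regret-based model: preferences follow each segment's (negated) regret
P[\sigma^1 \succ \sigma^0]
  = \frac{\exp\bigl(-\mathrm{regret}(\sigma^1)\bigr)}
         {\exp\bigl(-\mathrm{regret}(\sigma^0)\bigr) + \exp\bigl(-\mathrm{regret}(\sigma^1)\bigr)}
```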