Symbol Guided Hindsight Priors for Reward Learning from Human Preferences
- URL: http://arxiv.org/abs/2210.09151v2
- Date: Wed, 19 Oct 2022 14:21:53 GMT
- Authors: Mudit Verma and Katherine Metcalf
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Specifying rewards for reinforcement learning (RL) agents is challenging.
Preference-based RL (PbRL) mitigates these challenges by inferring a reward
from feedback over sets of trajectories. However, the effectiveness of PbRL is
limited by the amount of feedback needed to reliably recover the structure of
the target reward. We present the PRIor Over Rewards (PRIOR) framework, which
incorporates priors about the structure of the reward function and the
preference feedback into the reward learning process. Imposing these priors as
soft constraints on the reward learning objective reduces the amount of
feedback required by half and improves overall reward recovery. Additionally,
we demonstrate that using an abstract state space for the computation of the
priors further improves the reward learning and the agent's performance.
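The abstract does not state the learning objective, so the following is only a minimal sketch of what "priors as soft constraints" could look like. It assumes the Bradley-Terry preference model that is common in PbRL, and it uses a hypothetical squared-error penalty that pulls per-step rewards toward a given set of prior weights. The function names, the penalty form, the weight `lam`, and the toy numbers are all illustrative assumptions, not the authors' method.

```python
import math


def preference_loss(rewards_a, rewards_b, a_preferred):
    """Bradley-Terry cross-entropy for one labeled trajectory pair."""
    # Each trajectory's return is the sum of its predicted per-step rewards.
    return_a, return_b = sum(rewards_a), sum(rewards_b)
    # Probability that trajectory A is preferred, as a sigmoid of the return gap.
    p_a = 1.0 / (1.0 + math.exp(return_b - return_a))
    return -math.log(p_a if a_preferred else 1.0 - p_a)


def prior_penalty(rewards, prior_weights):
    """Hypothetical soft constraint: zero when each step's reward equals
    its prior weight times the trajectory return."""
    total = sum(rewards)
    return sum((r - total * w) ** 2 for r, w in zip(rewards, prior_weights))


def soft_constrained_loss(rewards_a, rewards_b, a_preferred,
                          prior_a, prior_b, lam=0.1):
    """Preference loss plus a lam-weighted prior penalty on both trajectories."""
    penalty = prior_penalty(rewards_a, prior_a) + prior_penalty(rewards_b, prior_b)
    return preference_loss(rewards_a, rewards_b, a_preferred) + lam * penalty


# Toy usage: two 3-step trajectories, with the annotator preferring A.
rewards_a, rewards_b = [0.2, 1.0, 0.3], [0.1, 0.1, 0.4]
prior_a, prior_b = [0.1, 0.7, 0.2], [0.3, 0.3, 0.4]  # made-up prior weights
loss = soft_constrained_loss(rewards_a, rewards_b, True, prior_a, prior_b)
print(f"loss = {loss:.4f}")
```

In this sketch, `lam` sets how strongly the prior shapes the learned reward: with `lam=0.0` the objective reduces to the plain preference loss, and larger values trade agreement with the labels for agreement with the prior.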
Related papers
- Beyond Simple Sum of Delayed Rewards: Non-Markovian Reward Modeling for Reinforcement Learning [44.770495418026734]
Reinforcement Learning (RL) empowers agents to acquire various skills by learning from reward signals.
Traditional methods assume the existence of underlying Markovian rewards and that the observed delayed reward is simply the sum of instance-level rewards.
We propose Composite Delayed Reward Transformer (CoDeTr), which incorporates a specialized in-sequence attention mechanism.
arXiv Detail & Related papers (2024-10-26T13:12:27Z)
- Listwise Reward Estimation for Offline Preference-based Reinforcement Learning [20.151932308777553]
Listwise Reward Estimation (LiRE) is a novel approach for offline Preference-based Reinforcement Learning (PbRL).
LiRE builds on existing PbRL methods by constructing a Ranked List of Trajectories (RLT).
Our experiments demonstrate the superiority of LiRE, even with modest feedback budgets, and its robustness to the amount of feedback and to feedback noise.
arXiv Detail & Related papers (2024-08-08T03:18:42Z)
- Hindsight PRIORs for Reward Learning from Human Preferences [3.4990427823966828]
Preference based Reinforcement Learning (PbRL) removes the need to hand specify a reward function by learning a reward from preference feedback over policy behaviors.
Current approaches to PbRL do not address the credit assignment problem inherent in determining which parts of a behavior most contributed to a preference.
We introduce a credit assignment strategy (Hindsight PRIOR) that uses a world model to approximate state importance within a trajectory and then guides rewards to be proportional to state importance.
arXiv Detail & Related papers (2024-04-12T21:59:42Z)
- Dense Reward for Free in Reinforcement Learning from Human Feedback [64.92448888346125]
We leverage the fact that the reward model contains more information than just its scalar output.
We use these attention weights to redistribute the reward along the whole completion.
Empirically, we show that it stabilises training, accelerates the rate of learning, and, in practical cases, may lead to better local optima.
arXiv Detail & Related papers (2024-02-01T17:10:35Z)
- REBEL: A Regularization-Based Solution for Reward Overoptimization in Robotic Reinforcement Learning from Human Feedback [61.54791065013767]
A misalignment between the reward function and user intentions, values, or social norms can be catastrophic in the real world.
Current methods to mitigate this misalignment work by learning reward functions from human preferences.
We propose a novel concept of reward regularization within the robotic RLHF framework.
arXiv Detail & Related papers (2023-12-22T04:56:37Z)
- Deep Reinforcement Learning from Hierarchical Preference Design [99.46415116087259]
This paper shows that, by exploiting certain structures, one can ease the reward design process.
We propose HERON, a hierarchical reward modeling framework for two scenarios: (I) the feedback signals naturally present a hierarchy; (II) the reward is sparse, but less important surrogate feedback is available to help policy learning.
arXiv Detail & Related papers (2023-09-06T00:44:29Z)
- A State Augmentation based approach to Reinforcement Learning from Human Preferences [20.13307800821161]
Preference Based Reinforcement Learning attempts to solve the issue by utilizing binary feedback on queried trajectory pairs.
We present a state augmentation technique that allows the agent's reward model to be robust.
arXiv Detail & Related papers (2023-02-17T07:10:50Z)
- Reward Uncertainty for Exploration in Preference-based Reinforcement Learning [88.34958680436552]
We present an exploration method specifically for preference-based reinforcement learning algorithms.
Our main idea is to design an intrinsic reward by measuring novelty based on the learned reward.
Our experiments show that an exploration bonus from uncertainty in the learned reward improves both the feedback efficiency and the sample efficiency of preference-based RL algorithms.
arXiv Detail & Related papers (2022-05-24T23:22:10Z)
- Information Directed Reward Learning for Reinforcement Learning [64.33774245655401]
We learn a model of the reward function that allows standard RL algorithms to achieve high expected return with as few expert queries as possible.
In contrast to prior active reward learning methods designed for specific types of queries, IDRL naturally accommodates different query types.
We support our findings with extensive evaluations in multiple environments and with different types of queries.
arXiv Detail & Related papers (2021-02-24T18:46:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.