Invariance in Policy Optimisation and Partial Identifiability in Reward Learning
- URL: http://arxiv.org/abs/2203.07475v2
- Date: Wed, 7 Jun 2023 04:37:57 GMT
- Title: Invariance in Policy Optimisation and Partial Identifiability in Reward Learning
- Authors: Joar Skalse, Matthew Farrugia-Roberts, Stuart Russell, Alessandro Abate, Adam Gleave
- Abstract summary: We characterise the partial identifiability of the reward function given popular reward learning data sources.
We also analyse the impact of this partial identifiability for several downstream tasks, such as policy optimisation.
- Score: 67.4640841144101
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is often very challenging to manually design reward functions for complex,
real-world tasks. To solve this, one can instead use reward learning to infer a
reward function from data. However, there are often multiple reward functions
that fit the data equally well, even in the infinite-data limit. This means
that the reward function is only partially identifiable. In this work, we
formally characterise the partial identifiability of the reward function given
several popular reward learning data sources, including expert demonstrations
and trajectory comparisons. We also analyse the impact of this partial
identifiability for several downstream tasks, such as policy optimisation. We
unify our results in a framework for comparing data sources and downstream
tasks by their invariances, with implications for the design and selection of
data sources for reward learning.
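
The core phenomenon, that many different reward functions can generate identical observable behaviour, can be illustrated with potential shaping. The following minimal sketch (illustrative only, not code from the paper; the toy MDP and all names are assumptions) uses value iteration to check that a reward function and a potential-shaped variant of it induce the same optimal policy even though they differ pointwise:

import numpy as np

# Illustrative toy MDP (an assumption, not from the paper): transition tensor
# P[s, a, s'] and reward R[s, a].
rng = np.random.default_rng(0)
n_states, n_actions, gamma = 5, 3, 0.9
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.normal(size=(n_states, n_actions))

# Potential-shaped reward: R'(s, a) = R(s, a) + gamma * E_{s'}[phi(s')] - phi(s).
phi = rng.normal(size=n_states)
R_shaped = R + gamma * (P @ phi) - phi[:, None]

def optimal_policy(reward, tol=1e-10):
    # Value iteration; returns the greedy (optimal) policy for this reward.
    V = np.zeros(n_states)
    while True:
        Q = reward + gamma * (P @ V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return Q.argmax(axis=1)
        V = V_new

# The two rewards differ pointwise but are behaviourally indistinguishable.
assert (optimal_policy(R) == optimal_policy(R_shaped)).all()

Data sources such as optimal demonstrations cannot distinguish rewards within such an equivalence class; the paper characterises which transformations each data source leaves unidentified.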
Related papers
- Partial Identifiability and Misspecification in Inverse Reinforcement Learning [64.13583792391783]
The aim of Inverse Reinforcement Learning is to infer a reward function $R$ from a policy $\pi$.
This paper provides a comprehensive analysis of partial identifiability and misspecification in IRL.
arXiv Detail & Related papers (2024-11-24T18:35:46Z) - Automated Feature Selection for Inverse Reinforcement Learning [7.278033100480175]
Inverse reinforcement learning (IRL) is an imitation learning approach to learning reward functions from expert demonstrations.
We propose a method that employs basis functions to form a candidate set of features.
We demonstrate the approach's effectiveness by recovering reward functions that capture expert policies.
arXiv Detail & Related papers (2024-03-22T10:05:21Z) - Transductive Reward Inference on Graph [53.003245457089406]
We develop a reward inference method based on the contextual properties of information propagation on graphs.
We leverage both the available data and limited reward annotations to construct a reward propagation graph.
We employ the constructed graph for transductive reward inference, thereby estimating rewards for unlabelled data.
arXiv Detail & Related papers (2024-02-06T03:31:28Z) - Dynamics-Aware Comparison of Learned Reward Functions [21.159457412742356]
The ability to learn reward functions plays an important role in enabling the deployment of intelligent agents in the real world.
Reward functions are typically compared by considering the behavior of optimized policies, but this approach conflates deficiencies in the reward function with those of the policy search algorithm used to optimize it.
We propose the Dynamics-Aware Reward Distance (DARD), a new reward pseudometric.
arXiv Detail & Related papers (2022-01-25T03:48:00Z) - Reward function shape exploration in adversarial imitation learning: an
empirical study [9.817069267241575]
In adversarial imitation learning algorithms (AILs), no true reward is obtained from the environment to guide policy learning.
We design several representative reward function shapes and compare their performances by large-scale experiments.
arXiv Detail & Related papers (2021-04-14T08:21:49Z) - Replacing Rewards with Examples: Example-Based Policy Search via
Recursive Classification [133.20816939521941]
In the standard Markov decision process formalism, users specify tasks by writing down a reward function.
In many scenarios, the user is unable to describe the task in words or numbers, but can readily provide examples of what the world would look like if the task were solved.
Motivated by this observation, we derive a control algorithm that aims to visit states that have a high probability of leading to successful outcomes, given only examples of successful outcome states.
arXiv Detail & Related papers (2021-03-23T16:19:55Z) - Information Directed Reward Learning for Reinforcement Learning [64.33774245655401]
We learn a model of the reward function that allows standard RL algorithms to achieve high expected return with as few expert queries as possible.
In contrast to prior active reward learning methods designed for specific types of queries, IDRL naturally accommodates different query types.
We support our findings with extensive evaluations in multiple environments and with different types of queries.
arXiv Detail & Related papers (2021-02-24T18:46:42Z) - Efficient Exploration of Reward Functions in Inverse Reinforcement
Learning via Bayesian Optimization [43.51553742077343]
Inverse reinforcement learning (IRL) is relevant to a variety of tasks including value alignment and robot learning from demonstration.
This paper presents an IRL framework called Bayesian optimization-IRL (BO-IRL) which identifies multiple solutions consistent with the expert demonstrations.
arXiv Detail & Related papers (2020-11-17T10:17:45Z) - Removing Bias in Multi-modal Classifiers: Regularization by Maximizing
Functional Entropies [88.0813215220342]
Some modalities can more easily contribute to the classification results than others.
We develop a method based on the log-Sobolev inequality, which bounds the functional entropy with the functional-Fisher-information.
On the two challenging multi-modal datasets VQA-CPv2 and SocialIQ, we obtain state-of-the-art results while more uniformly exploiting the modalities.
arXiv Detail & Related papers (2020-10-21T07:40:33Z) - Quantifying Differences in Reward Functions [24.66221171351157]
We introduce the Equivalent-Policy Invariant Comparison (EPIC) distance to quantify the difference between two reward functions directly.
We prove EPIC is invariant on an equivalence class of reward functions that always induce the same optimal policy.
arXiv Detail & Related papers (2020-06-24T17:35:15Z)