Interpretable Preference-based Reinforcement Learning with
Tree-Structured Reward Functions
- URL: http://arxiv.org/abs/2112.11230v1
- Date: Mon, 20 Dec 2021 09:53:23 GMT
- Title: Interpretable Preference-based Reinforcement Learning with
Tree-Structured Reward Functions
- Authors: Tom Bewley, Freddy Lecue
- Abstract summary: We propose an online, active preference learning algorithm that constructs reward functions with the intrinsically interpretable, compositional structure of a tree.
We demonstrate sample-efficient learning of tree-structured reward functions in several environments, then harness the enhanced interpretability to explore and debug for alignment.
- Score: 2.741266294612776
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The potential of reinforcement learning (RL) to deliver aligned and
performant agents is partially bottlenecked by the reward engineering problem.
One alternative to heuristic trial-and-error is preference-based RL (PbRL),
where a reward function is inferred from sparse human feedback. However, prior
PbRL methods lack interpretability of the learned reward structure, which
hampers the ability to assess robustness and alignment. We propose an online,
active preference learning algorithm that constructs reward functions with the
intrinsically interpretable, compositional structure of a tree. Using both
synthetic and human-provided feedback, we demonstrate sample-efficient learning
of tree-structured reward functions in several environments, then harness the
enhanced interpretability to explore and debug for alignment.
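As a rough illustration of the idea only (this abstract does not specify the authors' online, active tree-growing procedure), the sketch below fits a Bradley-Terry preference model to synthetic trajectory pairs and then distils the learned per-step reward into a shallow decision tree. The feature dimensions, synthetic data, and the two-stage fit-then-distil shortcut are all assumptions made for illustration, not the paper's method.

```python
# Minimal sketch (assumed setup, not the paper's algorithm): learn a reward
# model from pairwise trajectory preferences via a Bradley-Terry objective,
# then distil it into an interpretable decision tree over per-step features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)
d = 4                                     # toy feature dimension
true_w = np.array([1.0, -0.5, 0.0, 2.0])  # hidden "true" reward weights

def sample_trajectory(T=20):
    """A trajectory is a (T, d) array of state-action features (toy data)."""
    return rng.normal(size=(T, d))

# Synthetic preferences: label 1 if trajectory a has the higher true return.
pairs, labels = [], []
for _ in range(500):
    a, b = sample_trajectory(), sample_trajectory()
    pairs.append((a, b))
    labels.append(int(a.sum(0) @ true_w > b.sum(0) @ true_w))

# Bradley-Terry reduces to logistic regression on return-feature differences:
# P(a preferred over b) = sigmoid(w . (phi(a) - phi(b))), phi = summed features.
X = np.stack([a.sum(0) - b.sum(0) for a, b in pairs])
y = np.array(labels)
bt = LogisticRegression(fit_intercept=False).fit(X, y)
w_hat = bt.coef_.ravel()

# Distil the learned per-step reward r(s) = w_hat . s into a shallow tree,
# giving a compositional, human-readable reward structure.
states = np.concatenate([np.concatenate(p) for p in pairs])
reward_tree = DecisionTreeRegressor(max_depth=3).fit(states, states @ w_hat)
print(export_text(reward_tree, feature_names=[f"f{i}" for i in range(d)]))
```

The printed tree (axis-aligned splits on named features with a constant reward at each leaf) is the kind of intrinsically interpretable structure the abstract refers to; the paper itself grows such a tree directly from actively queried preferences rather than by distillation.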
Related papers
- Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning [53.241569810013836]
We propose a new framework based on large language models (LLMs) and decision tree reasoning (OCTree).
Our key idea is to leverage LLMs' reasoning capabilities to find good feature generation rules without manually specifying the search space.
Our empirical results demonstrate that this simple framework consistently enhances the performance of various prediction models.
arXiv Detail & Related papers (2024-06-12T08:31:34Z) - A Unified Linear Programming Framework for Offline Reward Learning from Human Demonstrations and Feedback [6.578074497549894]
Inverse Reinforcement Learning (IRL) and Reinforcement Learning from Human Feedback (RLHF) are pivotal methodologies in reward learning.
This paper introduces a novel linear programming (LP) framework tailored for offline reward learning.
arXiv Detail & Related papers (2024-05-20T23:59:26Z) - Deep Reinforcement Learning from Hierarchical Preference Design [99.46415116087259]
This paper shows that, by exploiting certain structures, one can ease the reward design process.
We propose a hierarchical reward modeling framework, HERON, for two scenarios: (I) the feedback signals naturally present a hierarchy; (II) the reward is sparse, but less important surrogate feedback is available to help policy learning.
arXiv Detail & Related papers (2023-09-06T00:44:29Z) - Provable Reward-Agnostic Preference-Based Reinforcement Learning [61.39541986848391]
Preference-based Reinforcement Learning (PbRL) is a paradigm in which an RL agent learns to optimize a task using pairwise preference feedback over trajectories.
We propose a theoretical reward-agnostic PbRL framework where exploratory trajectories that enable accurate learning of hidden reward functions are acquired.
arXiv Detail & Related papers (2023-05-29T15:00:09Z) - Learning Interpretable Models of Aircraft Handling Behaviour by
Reinforcement Learning from Human Feedback [12.858982225307809]
We use pairwise preferences over simulated flight trajectories to learn an interpretable rule-based model called a reward tree.
We train an RL agent to execute high-quality handling behaviour by using the reward tree as the objective.
arXiv Detail & Related papers (2023-05-26T13:37:59Z) - Reward Learning with Trees: Methods and Evaluation [10.473362152378979]
We propose a method for learning reward trees from preference labels.
We show it to be broadly competitive with neural networks on challenging high-dimensional tasks.
Having found that reward tree learning can be done effectively in complex settings, we then consider why it should be used.
arXiv Detail & Related papers (2022-10-03T15:17:25Z) - Offline Reinforcement Learning with Differentiable Function
Approximation is Provably Efficient [65.08966446962845]
Offline reinforcement learning, which aims to optimize decision-making strategies using historical data, has been extensively applied in real-life applications.
We take a step forward by considering offline reinforcement learning with differentiable function class approximation (DFA).
Most importantly, we show that offline RL with differentiable function approximation is provably efficient by analyzing the pessimistic fitted Q-learning algorithm.
arXiv Detail & Related papers (2022-10-03T07:59:42Z) - Reward Uncertainty for Exploration in Preference-based Reinforcement
Learning [88.34958680436552]
We present an exploration method specifically for preference-based reinforcement learning algorithms.
Our main idea is to design an intrinsic reward that measures novelty via uncertainty in the learned reward.
Our experiments show that this uncertainty-based exploration bonus improves both the feedback-efficiency and sample-efficiency of preference-based RL algorithms.
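A minimal sketch of that idea, under the assumption that the learned reward is represented by a small bootstrap ensemble whose disagreement serves as the bonus (the summary does not specify the paper's exact reward models or scaling):

```python
# Rough sketch (assumed details): add the disagreement of an ensemble of
# learned reward models to the mean learned reward as an exploration bonus.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
d = 6

# Hypothetical reward-model ensemble, each member fit on a bootstrap resample
# of (state-feature, reward-label) data from some preference-learning step.
X = rng.normal(size=(200, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=200)
ensemble = []
for _ in range(5):
    idx = rng.integers(0, len(X), len(X))
    ensemble.append(Ridge(alpha=1.0).fit(X[idx], y[idx]))

def shaped_reward(state, beta=0.5):
    """Mean learned reward plus an uncertainty bonus (ensemble std)."""
    preds = np.array([m.predict(state[None])[0] for m in ensemble])
    return preds.mean() + beta * preds.std()

print(shaped_reward(rng.normal(size=d)))
```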
arXiv Detail & Related papers (2022-05-24T23:22:10Z) - Provable Hierarchy-Based Meta-Reinforcement Learning [50.17896588738377]
We analyze HRL in the meta-RL setting, where the learner learns a latent hierarchical structure during meta-training for use in a downstream task.
We provide "diversity conditions" which, together with a tractable optimism-based algorithm, guarantee sample-efficient recovery of this natural hierarchy.
Our bounds incorporate common notions in HRL literature such as temporal and state/action abstractions, suggesting that our setting and analysis capture important features of HRL in practice.
arXiv Detail & Related papers (2021-10-18T17:56:02Z) - Measure Inducing Classification and Regression Trees for Functional Data [0.0]
We propose a tree-based algorithm for classification and regression problems in the context of functional data analysis.
This is achieved by learning a weighted functional $L^2$ space by means of constrained convex optimization.
arXiv Detail & Related papers (2020-10-30T18:49:53Z)