Reward Learning using Structural Motifs in Inverse Reinforcement
Learning
- URL: http://arxiv.org/abs/2209.13489v1
- Date: Sun, 25 Sep 2022 18:34:59 GMT
- Title: Reward Learning using Structural Motifs in Inverse Reinforcement
Learning
- Authors: Raeid Saqur
- Abstract summary: The Inverse Reinforcement Learning (IRL) problem has seen rapid evolution in the past few years, with important applications in domains like robotics, cognition, and health.
We explore the inefficacy of current IRL methods in learning an agent's reward function from expert trajectories depicting long-horizon, complex sequential tasks.
We propose a novel IRL method, SMIRL, that first learns the (approximate) structure of a task as a finite-state-automaton (FSA), then uses the structural motif to solve the IRL problem.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Inverse Reinforcement Learning (IRL) problem has seen rapid
evolution in the past few years, with important applications in domains like
robotics, cognition, and health. In this work, we explore the inefficacy of
current IRL methods in learning an agent's reward function from expert
trajectories depicting long-horizon, complex sequential tasks. We hypothesize
that imbuing IRL models with structural motifs capturing underlying tasks can
enable and enhance their performance. Subsequently, we propose a novel IRL
method, SMIRL, that first learns the (approximate) structure of a task as a
finite-state-automaton (FSA), then uses the structural motif to solve the IRL
problem. We test our model on both discrete grid world and high-dimensional
continuous domain environments. We empirically show that our proposed approach
successfully learns all four complex tasks, where two foundational IRL
baselines fail. Our model also outperforms the baselines in sample efficiency
on a simpler toy task. We further show promising test results in a modified
continuous domain on tasks with compositional reward functions.
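The abstract's core idea is to learn a task's structure as a finite-state automaton (FSA) and then use that structural motif when learning rewards. A minimal, hypothetical sketch of what an FSA task representation might look like is shown below; the class, event names, and reward values are illustrative assumptions, not the paper's actual SMIRL implementation.

```python
# Illustrative sketch only: a task's structure as a finite-state automaton
# (FSA) whose states are sub-task stages and whose transitions fire on
# symbolic events observed along a trajectory. All names and reward values
# here are hypothetical, not SMIRL's actual implementation.

class TaskFSA:
    def __init__(self, start, accepting, transitions):
        self.start = start              # initial FSA state
        self.accepting = accepting      # set of goal (task-complete) states
        self.transitions = transitions  # {(state, event): next_state}

    def run(self, events):
        """Advance through the FSA on a sequence of symbolic events."""
        state = self.start
        for e in events:
            state = self.transitions.get((state, e), state)
        return state

    def progress_reward(self, events):
        """Toy motif-conditioned reward: +1 per FSA transition taken,
        plus a bonus for ending in an accepting state."""
        state, reward = self.start, 0.0
        for e in events:
            nxt = self.transitions.get((state, e), state)
            if nxt != state:
                reward += 1.0
            state = nxt
        if state in self.accepting:
            reward += 10.0
        return reward

# Example: a two-stage "fetch key, then open door" task
fsa = TaskFSA(
    start="start",
    accepting={"done"},
    transitions={
        ("start", "got_key"): "has_key",
        ("has_key", "opened_door"): "done",
    },
)
```

The point of such a motif is that long-horizon progress becomes visible as discrete FSA-state changes, which a trajectory that never advances the automaton (e.g. opening the door before fetching the key) does not receive credit for.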
Related papers
- Curricular Subgoals for Inverse Reinforcement Learning [21.038691420095525]
Inverse Reinforcement Learning (IRL) aims to reconstruct the reward function from expert demonstrations to facilitate policy learning.
Existing IRL methods mainly focus on learning global reward functions to minimize the trajectory difference between the imitator and the expert.
We propose a novel Curricular Subgoal-based Inverse Reinforcement Learning framework that explicitly decomposes one task into several local subgoals to guide agent imitation.
arXiv Detail & Related papers (2023-06-14T04:06:41Z)
- Self-Supervised Reinforcement Learning that Transfers using Random Features [41.00256493388967]
We propose a self-supervised reinforcement learning method that enables the transfer of behaviors across tasks with different rewards.
Our method is self-supervised in that it can be trained on offline datasets without reward labels, but can then be quickly deployed on new tasks.
arXiv Detail & Related papers (2023-05-26T20:37:06Z)
- Efficient Learning of High Level Plans from Play [57.29562823883257]
We present Efficient Learning of High-Level Plans from Play (ELF-P), a framework for robotic learning that bridges motion planning and deep RL.
We demonstrate that ELF-P has significantly better sample efficiency than relevant baselines over multiple realistic manipulation tasks.
arXiv Detail & Related papers (2023-03-16T20:09:47Z)
- Investigating the role of model-based learning in exploration and transfer [11.652741003589027]
In this paper, we investigate transfer learning in the context of model-based agents.
We find that a model-based approach outperforms controlled model-free baselines for transfer learning.
Our results show that intrinsic exploration combined with environment models presents a viable direction towards agents that are self-supervised and able to generalize to novel reward functions.
arXiv Detail & Related papers (2023-02-08T11:49:58Z)
- Human-Timescale Adaptation in an Open-Ended Task Space [56.55530165036327]
We show that training an RL agent at scale leads to a general in-context learning algorithm that can adapt to open-ended novel embodied 3D problems as quickly as humans.
Our results lay the foundation for increasingly general and adaptive RL agents that perform well across ever-larger open-ended domains.
arXiv Detail & Related papers (2023-01-18T15:39:21Z)
- Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate few-shot task generalization as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
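As a rough illustration of the subtask-graph idea, a task's structure can be encoded as precondition sets over subtasks and queried for which subtasks are currently executable. The graph, subtask names, and helper below are hypothetical assumptions for illustration, not MTSGI's actual inference procedure.

```python
# Hypothetical illustration of a subtask graph: subtasks as nodes with
# precondition edges, plus a helper that lists which subtasks are
# currently eligible given the set already completed. Names are invented.

def eligible_subtasks(preconditions, completed):
    """preconditions: {subtask: set of prerequisite subtasks};
    returns the subtasks whose prerequisites are all completed."""
    return {
        s for s, pre in preconditions.items()
        if s not in completed and pre <= completed
    }

# Example graph: chop wood -> make plank -> build door
graph = {
    "chop_wood": set(),
    "make_plank": {"chop_wood"},
    "build_door": {"make_plank"},
}
```

Inferring such a graph from a handful of training trajectories is what lets an agent skip exploration of orderings the structure already rules out.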
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
- Context-Hierarchy Inverse Reinforcement Learning [30.71220625227959]
An inverse reinforcement learning (IRL) agent learns to act intelligently by observing expert demonstrations and learning the expert's underlying reward function.
We present Context Hierarchy IRL (CHIRL), a new IRL algorithm that exploits the context to scale up IRL and learn reward functions of complex behaviors.
Experiments on benchmark tasks, including a large scale autonomous driving task in the CARLA simulator, show promising results in scaling up IRL for tasks with complex reward functions.
arXiv Detail & Related papers (2022-02-25T10:29:05Z)
- Online reinforcement learning with sparse rewards through an active inference capsule [62.997667081978825]
This paper introduces an active inference agent which minimizes the novel free energy of the expected future.
Our model is capable of solving sparse-reward problems with a very high sample efficiency.
We also introduce a novel method for approximating the prior model from the reward function, which simplifies the expression of complex objectives.
arXiv Detail & Related papers (2021-06-04T10:03:36Z)
- Demonstration-efficient Inverse Reinforcement Learning in Procedurally Generated Environments [137.86426963572214]
Inverse Reinforcement Learning can extrapolate reward functions from expert demonstrations.
We show that our approach, DE-AIRL, is demonstration-efficient and still able to extrapolate reward functions which generalize to the fully procedural domain.
arXiv Detail & Related papers (2020-12-04T11:18:02Z)
- Critic PI2: Master Continuous Planning via Policy Improvement with Path Integrals and Deep Actor-Critic Reinforcement Learning [23.25444331531546]
Tree-based planning methods have enjoyed huge success in discrete domains, such as chess and Go.
In this paper, we present Critic PI2, which combines the benefits from trajectory optimization, deep actor-critic learning, and model-based reinforcement learning.
Our work opens a new direction toward learning the components of a model-based planning system and how to use them.
arXiv Detail & Related papers (2020-11-13T04:14:40Z)
- Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experiment results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.