A Novel Approach to Curiosity and Explainable Reinforcement Learning via
Interpretable Sub-Goals
- URL: http://arxiv.org/abs/2104.06630v1
- Date: Wed, 14 Apr 2021 05:21:13 GMT
- Title: A Novel Approach to Curiosity and Explainable Reinforcement Learning via
Interpretable Sub-Goals
- Authors: Connor van Rossum, Candice Feinberg, Adam Abu Shumays, Kyle Baxter,
Benedek Bartha
- Abstract summary: Two key challenges within Reinforcement Learning involve improving (a) agent learning within environments with sparse extrinsic rewards and (b) the explainability of agent actions.
We describe a curious subgoal focused agent to address both these challenges.
We use a novel method for curiosity produced from a Generative Adversarial Network (GAN) based model of environment transitions that is robust to stochastic environment transitions.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Two key challenges within Reinforcement Learning involve improving (a) agent
learning within environments with sparse extrinsic rewards and (b) the
explainability of agent actions. We describe a curious subgoal focused agent to
address both these challenges. We use a novel method for curiosity produced
from a Generative Adversarial Network (GAN) based model of environment
transitions that is robust to stochastic environment transitions. Additionally,
we use a subgoal generating network to guide navigation. The explainability of
the agent's behavior is increased by decomposing complex tasks into a sequence
of interpretable subgoals that do not require any manual design. We show that
this method also enables the agent to solve challenging procedurally-generated
tasks with stochastic transitions, outperforming other state-of-the-art methods.
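The abstract describes deriving curiosity from a GAN-based model of environment transitions. A minimal sketch of the general idea (not the authors' implementation; the scoring and shaping functions below are illustrative assumptions): a discriminator trained on observed transitions assigns a realism score to each new transition, and transitions it finds unfamiliar earn a larger intrinsic bonus on top of the sparse extrinsic reward.

```python
import numpy as np

def curiosity_reward(d_score: float, eps: float = 1e-8) -> float:
    """Intrinsic reward from a GAN discriminator's realism score.

    d_score is the discriminator's probability that an observed
    transition (s, a, s') resembles transitions seen during training.
    Unfamiliar transitions (low score) yield high curiosity reward.
    """
    return float(-np.log(d_score + eps))

def shaped_reward(r_extrinsic: float, d_score: float, beta: float = 0.1) -> float:
    """Combine a sparse extrinsic reward with the curiosity bonus."""
    return r_extrinsic + beta * curiosity_reward(d_score)

# A familiar transition earns little bonus; a novel one earns a large bonus.
familiar_bonus = curiosity_reward(0.95)
novel_bonus = curiosity_reward(0.05)
```

Because the discriminator models the transition distribution rather than predicting exact next states, this style of bonus degrades more gracefully under stochastic transitions than prediction-error curiosity.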
Related papers
- Towards Deviation-Robust Agent Navigation via Perturbation-Aware Contrastive Learning [125.61772424068903]
Vision-and-language navigation (VLN) asks an agent to follow a given language instruction to navigate through a real 3D environment.
We present a model-agnostic training paradigm, called Progressive Perturbation-aware Contrastive Learning (PROPER) to enhance the generalization ability of existing VLN agents.
arXiv Detail & Related papers (2024-03-09T02:34:13Z)
- Self-Supervised Reinforcement Learning that Transfers using Random Features [41.00256493388967]
We propose a self-supervised reinforcement learning method that enables the transfer of behaviors across tasks with different rewards.
Our method is self-supervised in that it can be trained on offline datasets without reward labels, but can then be quickly deployed on new tasks.
arXiv Detail & Related papers (2023-05-26T20:37:06Z)
- Semantically Aligned Task Decomposition in Multi-Agent Reinforcement Learning [56.26889258704261]
We propose a novel "disentangled" decision-making method, Semantically Aligned task decomposition in MARL (SAMA).
SAMA prompts pretrained language models with chain-of-thought that can suggest potential goals, provide suitable goal decomposition and subgoal allocation as well as self-reflection-based replanning.
SAMA demonstrates considerable advantages in sample efficiency compared to state-of-the-art ASG methods.
arXiv Detail & Related papers (2023-05-18T10:37:54Z)
- A Closer Look at Reward Decomposition for High-Level Robotic Explanations [18.019811754800767]
We propose an explainable Q-Map learning framework that combines reward decomposition with abstracted action spaces.
We demonstrate the effectiveness of our framework through quantitative and qualitative analysis of two robotic scenarios.
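The summary above mentions combining reward decomposition with abstracted action spaces. A toy sketch of the general reward-decomposition idea (the component names and Q-values are hypothetical, not from the paper): the total Q-value is the sum of per-component Q-values, so an action choice can be explained by reporting which component contributed most to it.

```python
import numpy as np

# Hypothetical Q-values for 3 abstract actions, decomposed over
# named reward components; the total Q is their sum.
q_components = {
    "reach_goal":     np.array([0.2, 0.9, 0.4]),
    "avoid_obstacle": np.array([0.1, -0.1, 0.1]),
    "energy_cost":    np.array([-0.1, -0.2, -0.4]),
}

def explain_action(q_components):
    """Pick the greedy action under the summed Q-values and report
    each component's contribution to that action's value."""
    q_total = sum(q_components.values())
    best = int(np.argmax(q_total))
    contributions = {name: float(q[best]) for name, q in q_components.items()}
    return best, contributions

action, why = explain_action(q_components)
```

Here the greedy action is the one maximizing the summed values, and the `why` dictionary provides the per-component breakdown that serves as the explanation.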
arXiv Detail & Related papers (2023-04-25T16:01:42Z)
- Investigating the role of model-based learning in exploration and transfer [11.652741003589027]
In this paper, we investigate transfer learning in the context of model-based agents.
We find that a model-based approach outperforms controlled model-free baselines for transfer learning.
Our results show that intrinsic exploration combined with environment models present a viable direction towards agents that are self-supervised and able to generalize to novel reward functions.
arXiv Detail & Related papers (2023-02-08T11:49:58Z)
- Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate few-shot task generalization as a reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
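A task characterized by a subtask graph can be represented minimally as a mapping from each subtask to its preconditions. The sketch below (subtask names are illustrative, not from the paper) shows how such a graph determines which subtasks are currently eligible to attempt:

```python
# Toy subtask graph: each subtask lists the precondition subtasks
# that must be completed before it becomes eligible.
subtask_preconditions = {
    "get_wood": [],
    "get_stone": [],
    "make_axe": ["get_wood", "get_stone"],
    "chop_tree": ["make_axe"],
}

def eligible_subtasks(preconditions, completed):
    """Return subtasks not yet completed whose preconditions
    are all satisfied, in sorted order."""
    return sorted(
        task for task, pre in preconditions.items()
        if task not in completed and all(p in completed for p in pre)
    )
```

Inferring such a graph from training tasks lets an agent skip exploration of subtasks whose preconditions cannot yet be met, which is what enables the faster adaptation reported above.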
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
- Language-guided Navigation via Cross-Modal Grounding and Alternate Adversarial Learning [66.9937776799536]
The emerging vision-and-language navigation (VLN) problem aims at learning to navigate an agent to the target location in unseen photo-realistic environments.
The main challenges of VLN arise from two aspects: first, the agent needs to attend to the meaningful paragraphs of the language instruction corresponding to the dynamically-varying visual environments.
We propose a cross-modal grounding module to equip the agent with a better ability to track the correspondence between the textual and visual modalities.
arXiv Detail & Related papers (2020-11-22T09:13:46Z)
- Planning to Explore via Self-Supervised World Models [120.31359262226758]
Plan2Explore is a self-supervised reinforcement learning agent.
We present a new approach to self-supervised exploration and fast adaptation to new tasks.
Without any training supervision or task-specific interaction, Plan2Explore outperforms prior self-supervised exploration methods.
arXiv Detail & Related papers (2020-05-12T17:59:45Z)
- Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experiment results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.