Open-Ended Reinforcement Learning with Neural Reward Functions
- URL: http://arxiv.org/abs/2202.08266v1
- Date: Wed, 16 Feb 2022 15:55:22 GMT
- Title: Open-Ended Reinforcement Learning with Neural Reward Functions
- Authors: Robert Meier and Asier Mujika
- Abstract summary: In high-dimensional robotic environments, our approach learns a wide range of interesting skills, including front-flips for Half-Cheetah and one-legged running for Humanoid.
In the pixel-based Montezuma's Revenge environment, our method also works with minimal changes, and it learns complex skills that involve interacting with items and visiting diverse locations.
- Score: 2.4366811507669115
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Inspired by the great success of unsupervised learning in Computer Vision and
Natural Language Processing, the Reinforcement Learning community has recently
started to focus more on unsupervised discovery of skills. Most current
approaches, like DIAYN or DADS, optimize some form of mutual information
objective. We propose a different approach that uses reward functions encoded
by neural networks. These are trained iteratively to reward more complex
behavior. In high-dimensional robotic environments, our approach learns a wide
range of interesting skills, including front-flips for Half-Cheetah and
one-legged running for Humanoid. In the pixel-based Montezuma's Revenge
environment, our method also works with minimal changes, and it learns complex
skills that involve interacting with items and visiting diverse locations. A
web version of this paper, which shows animations of the different skills, is
available at https://as.inf.ethz.ch/research/open_ended_RL/main.html
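For context, the mutual-information objective the abstract contrasts against is a standard result from the DIAYN line of work: it ties skills z to the states s they visit by maximizing I(S;Z), optimized in practice through a variational intrinsic reward in which q_phi is a learned skill discriminator and p(z) a fixed skill prior:

```latex
\max \; I(S;Z) \;=\; H(Z) - H(Z \mid S)
\qquad\Longrightarrow\qquad
r(s,z) \;=\; \log q_\phi(z \mid s) \;-\; \log p(z)
```

The abstract describes the proposed alternative only at a high level: reward functions are encoded by neural networks and retrained iteratively to reward ever more complex behavior. The following is a minimal sketch of one plausible reading of that loop, not the authors' exact algorithm; the discriminator-style reward update, the stub rollout function collect_states, the network sizes, and all hyperparameters are illustrative assumptions.

```python
# Illustrative sketch only -- not the authors' algorithm. Hypothetical pieces:
# the RewardNet sizes, the visited-vs-novel reward update, collect_states.
import torch
import torch.nn as nn

STATE_DIM = 8  # assumed toy state dimensionality


class RewardNet(nn.Module):
    """Small MLP encoding the current neural reward function r_phi(s)."""

    def __init__(self, state_dim: int) -> None:
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, 1)
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state).squeeze(-1)


def collect_states(n: int) -> torch.Tensor:
    """Stub rollout: a real implementation would run the current policy in
    the environment and return the states it actually visits."""
    return torch.randn(n, STATE_DIM)


reward_net = RewardNet(STATE_DIM)
optimizer = torch.optim.Adam(reward_net.parameters(), lr=1e-3)

for phase in range(5):  # outer iterations, one per increasingly complex skill
    # 1) Train a policy to maximize the current neural reward (elided here;
    #    any RL algorithm would do), then record the states it now reaches.
    visited = collect_states(256)
    # 2) Gather states the policy does not yet reach (stand-in sample here).
    novel = collect_states(256)
    # 3) Retrain the reward net to score mastered states low and unreached
    #    states high, so the next policy is pushed toward new behavior.
    for _ in range(100):
        loss = (
            reward_net(visited).sigmoid().mean()
            - reward_net(novel).sigmoid().mean()
        )
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    gap = (reward_net(novel).mean() - reward_net(visited).mean()).item()
    print(f"phase {phase}: novel-minus-visited reward gap = {gap:.3f}")
```

The point the sketch tries to capture is that each retraining phase devalues states the current policy already reaches, so maximizing the updated reward forces the next policy toward qualitatively new behavior.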
Related papers
- RH20T: A Comprehensive Robotic Dataset for Learning Diverse Skills in One-Shot [56.130215236125224]
A key challenge in open-domain robotic manipulation is acquiring diverse and generalizable robot skills.
Recent research in one-shot imitation learning has shown promise in transferring trained policies to new tasks based on demonstrations.
This paper aims to unlock the potential for an agent to generalize to hundreds of real-world skills with multi-modal perception.
arXiv Detail & Related papers (2023-07-02T15:33:31Z)
- Skill Reinforcement Learning and Planning for Open-World Long-Horizon Tasks [31.084848672383185]
We study building multi-task agents in open-world environments.
We convert the multi-task learning problem into learning basic skills and planning over the skills.
Our method accomplishes 40 diverse Minecraft tasks, where many tasks require sequentially executing more than 10 skills.
arXiv Detail & Related papers (2023-03-29T09:45:50Z)
- Choreographer: Learning and Adapting Skills in Imagination [60.09911483010824]
We present Choreographer, a model-based agent that exploits its world model to learn and adapt skills in imagination.
Our method decouples the exploration and skill learning processes, being able to discover skills in the latent state space of the model.
Choreographer is able to learn skills both from offline data and by collecting data simultaneously with an exploration policy.
arXiv Detail & Related papers (2022-11-23T23:31:14Z)
- Lipschitz-constrained Unsupervised Skill Discovery [91.51219447057817]
Lipschitz-constrained Skill Discovery (LSD) encourages the agent to discover more diverse, dynamic, and far-reaching skills.
LSD outperforms previous approaches in terms of skill diversity, state space coverage, and performance on seven downstream tasks.
arXiv Detail & Related papers (2022-02-02T08:29:04Z)
- Inducing Structure in Reward Learning by Learning Features [31.413656752926208]
We introduce a novel type of human input for teaching features and an algorithm that utilizes it to learn complex features from the raw state space.
We demonstrate our method in settings where all features have to be learned from scratch, as well as where some of the features are known.
arXiv Detail & Related papers (2022-01-18T16:02:29Z)
- Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills [93.12417203541948]
We propose the objective of learning a functional understanding of the environment by learning to reach any goal state in a given dataset.
We find that our method can operate on high-dimensional camera images and learn a variety of skills on real robots that generalize to previously unseen scenes and objects.
arXiv Detail & Related papers (2021-04-15T20:10:11Z)
- Learning Affordance Landscapes for Interaction Exploration in 3D Environments [101.90004767771897]
Embodied agents must be able to master how their environment works.
We introduce a reinforcement learning approach for exploration for interaction.
We demonstrate our idea with AI2-iTHOR.
arXiv Detail & Related papers (2020-08-21T00:29:36Z)
- ELSIM: End-to-end learning of reusable skills through intrinsic motivation [0.0]
We present a novel reinforcement learning architecture which hierarchically learns and represents self-generated skills in an end-to-end way.
With this architecture, an agent focuses only on task-rewarded skills while keeping the learning process of skills bottom-up.
arXiv Detail & Related papers (2020-06-23T11:20:46Z) - Emergent Real-World Robotic Skills via Unsupervised Off-Policy
Reinforcement Learning [81.12201426668894]
We develop efficient reinforcement learning methods that acquire diverse skills without any reward function, and then repurpose these skills for downstream tasks.
We show that our proposed algorithm provides substantial improvement in learning efficiency, making reward-free real-world training feasible.
We also demonstrate that the learned skills can be composed using model predictive control for goal-oriented navigation, without any additional training.
arXiv Detail & Related papers (2020-04-27T17:38:53Z) - Learning as Reinforcement: Applying Principles of Neuroscience for More
General Reinforcement Learning Agents [1.0742675209112622]
We implement an architecture founded in principles of experimental neuroscience by combining computationally efficient abstractions of biological algorithms.
Our approach is inspired by research on spike-timing-dependent plasticity, the transition between short- and long-term memory, and the role of various neurotransmitters in rewarding curiosity.
The Neurons-in-a-Box architecture can learn in a wholly generalizable manner, and demonstrates an efficient way to build and apply representations without explicitly optimizing over a set of criteria or actions.
arXiv Detail & Related papers (2020-04-20T04:06:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.