Model-Based Quality-Diversity Search for Efficient Robot Learning
- URL: http://arxiv.org/abs/2008.04589v1
- Date: Tue, 11 Aug 2020 09:02:18 GMT
- Title: Model-Based Quality-Diversity Search for Efficient Robot Learning
- Authors: Leon Keller, Daniel Tanneberg, Svenja Stark, Jan Peters
- Abstract summary: novelty based Quality-Diversity(QD) algorithm.
Network is trained concurrently to the repertoire and is used to avoid executing unpromising actions in the novelty search process.
Experiments show that enhancing a QD algorithm with such a forward model improves the sample-efficiency and performance of the evolutionary process and the skill adaptation.
- Score: 28.049034339935933
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite recent progress in robot learning, it still remains a challenge to
program a robot to deal with open-ended object manipulation tasks. One approach
that was recently used to autonomously generate a repertoire of diverse skills
is a novelty based Quality-Diversity~(QD) algorithm. However, as most
evolutionary algorithms, QD suffers from sample-inefficiency and, thus, it is
challenging to apply it in real-world scenarios. This paper tackles this
problem by integrating a neural network that predicts the behavior of the
perturbed parameters into a novelty based QD algorithm. In the proposed
Model-based Quality-Diversity search (M-QD), the network is trained
concurrently to the repertoire and is used to avoid executing unpromising
actions in the novelty search process. Furthermore, it is used to adapt the
skills of the final repertoire in order to generalize the skills to different
scenarios. Our experiments show that enhancing a QD algorithm with such a
forward model improves the sample-efficiency and performance of the
evolutionary process and the skill adaptation.
Related papers
- Switchable Decision: Dynamic Neural Generation Networks [98.61113699324429]
We propose a switchable decision to accelerate inference by dynamically assigning resources for each data instance.
Our method benefits from less cost during inference while keeping the same accuracy.
arXiv Detail & Related papers (2024-05-07T17:44:54Z) - Active Exploration in Bayesian Model-based Reinforcement Learning for Robot Manipulation [8.940998315746684]
We propose a model-based reinforcement learning (RL) approach for robotic arm end-tasks.
We employ Bayesian neural network models to represent, in a probabilistic way, both the belief and information encoded in the dynamic model during exploration.
Our experiments show the advantages of our Bayesian model-based RL approach, with similar quality in the results than relevant alternatives.
arXiv Detail & Related papers (2024-04-02T11:44:37Z) - SERL: A Software Suite for Sample-Efficient Robotic Reinforcement
Learning [85.21378553454672]
We develop a library containing a sample efficient off-policy deep RL method, together with methods for computing rewards and resetting the environment.
We find that our implementation can achieve very efficient learning, acquiring policies for PCB board assembly, cable routing, and object relocation.
These policies achieve perfect or near-perfect success rates, extreme robustness even under perturbations, and exhibit emergent robustness recovery and correction behaviors.
arXiv Detail & Related papers (2024-01-29T10:01:10Z) - On Robust Numerical Solver for ODE via Self-Attention Mechanism [82.95493796476767]
We explore training efficient and robust AI-enhanced numerical solvers with a small data size by mitigating intrinsic noise disturbances.
We first analyze the ability of the self-attention mechanism to regulate noise in supervised learning and then propose a simple-yet-effective numerical solver, Attr, which introduces an additive self-attention mechanism to the numerical solution of differential equations.
arXiv Detail & Related papers (2023-02-05T01:39:21Z) - Discovering Unsupervised Behaviours from Full-State Trajectories [1.827510863075184]
We propose an analysis of Autonomous Robots Realising their Abilities; a Quality-Diversity algorithm that autonomously finds behavioural characterisations.
We evaluate this approach on a simulated robotic environment, where the robot has to autonomously discover its abilities from its full-state trajectories.
More specifically, the analysed approach autonomously finds policies that make the robot move to diverse positions, but also utilise its legs in diverse ways, and even perform half-rolls.
arXiv Detail & Related papers (2022-11-22T16:57:52Z) - Learning to Walk Autonomously via Reset-Free Quality-Diversity [73.08073762433376]
Quality-Diversity algorithms can discover large and complex behavioural repertoires consisting of both diverse and high-performing skills.
Existing QD algorithms need large numbers of evaluations as well as episodic resets, which require manual human supervision and interventions.
This paper proposes Reset-Free Quality-Diversity optimization (RF-QD) as a step towards autonomous learning for robotics in open-ended environments.
arXiv Detail & Related papers (2022-04-07T14:07:51Z) - Dynamics-Aware Quality-Diversity for Efficient Learning of Skill
Repertoires [4.943054375935878]
Quality-Diversity (QD) algorithms are powerful exploration algorithms that allow robots to discover large repertoires of diverse and high-performing skills.
We propose Dynamics-Aware Quality-Diversity (DA-QD), a framework to improve the sample efficiency of QD algorithms.
arXiv Detail & Related papers (2021-09-16T08:35:35Z) - Few-shot Quality-Diversity Optimization [50.337225556491774]
Quality-Diversity (QD) optimization has been shown to be effective tools in dealing with deceptive minima and sparse rewards in Reinforcement Learning.
We show that, given examples from a task distribution, information about the paths taken by optimization in parameter space can be leveraged to build a prior population, which when used to initialize QD methods in unseen environments, allows for few-shot adaptation.
Experiments carried in both sparse and dense reward settings using robotic manipulation and navigation benchmarks show that it considerably reduces the number of generations that are required for QD optimization in these environments.
arXiv Detail & Related papers (2021-09-14T17:12:20Z) - Skill Preferences: Learning to Extract and Execute Robotic Skills from
Human Feedback [82.96694147237113]
We present Skill Preferences, an algorithm that learns a model over human preferences and uses it to extract human-aligned skills from offline data.
We show that SkiP enables a simulated kitchen robot to solve complex multi-step manipulation tasks.
arXiv Detail & Related papers (2021-08-11T18:04:08Z) - Unsupervised Behaviour Discovery with Quality-Diversity Optimisation [1.0152838128195467]
Quality-Diversity algorithms refer to a class of evolutionary algorithms designed to find a collection of diverse and high-performing solutions to a given problem.
In robotics, such algorithms can be used for generating a collection of controllers covering most of the possible behaviours of a robot.
In this paper, we introduce: Autonomous Robots Realising their Abilities, an algorithm that uses a dimensionality reduction technique to automatically learn behavioural descriptors.
arXiv Detail & Related papers (2021-06-10T10:40:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.