Quality Diversity under Sparse Reward and Sparse Interaction:
Application to Grasping in Robotics
- URL: http://arxiv.org/abs/2308.05483v2
- Date: Tue, 31 Oct 2023 10:15:31 GMT
- Title: Quality Diversity under Sparse Reward and Sparse Interaction:
Application to Grasping in Robotics
- Authors: J. Huber, F. Hélénon, M. Coninx, F. Ben Amar, S. Doncieux
- Abstract summary: Quality-Diversity (QD) methods are algorithms that aim to generate a set of diverse and high-performing solutions to a given problem.
The present work studies how QD can address grasping in robotics.
Experiments have been conducted on 15 different methods on 10 grasping domains, corresponding to 2 different robot-gripper setups and 5 standard objects.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Quality-Diversity (QD) methods are algorithms that aim to generate a set of
diverse and high-performing solutions to a given problem. Originally developed
for evolutionary robotics, most QD studies are conducted on a limited set of
domains - mainly applied to locomotion, where the fitness and the behavior
signal are dense. Grasping is a crucial task for manipulation in robotics.
Despite the efforts of many research communities, this task is yet to be
solved. Grasping cumulates unprecedented challenges in QD literature: it
suffers from reward sparsity, behavioral sparsity, and behavior space
misalignment. The present work studies how QD can address grasping. Experiments
have been conducted on 15 different methods on 10 grasping domains,
corresponding to 2 different robot-gripper setups and 5 standard objects. An
evaluation framework that distinguishes the evaluation of an algorithm from its
internal components has also been proposed for a fair comparison. The obtained
results show that MAP-Elites variants that select successful solutions in
priority outperform all the compared methods on the studied metrics by a large
margin. We also found experimental evidence that sparse interaction can lead to
deceptive novelty. To our knowledge, the ability to efficiently produce
examples of grasping trajectories demonstrated in this work has no precedent in
the literature.
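The selection idea highlighted in the abstract (MAP-Elites variants that pick successful solutions in priority) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy task, the 10x10 behavior grid, the Gaussian mutation, and every name below (evaluate, select_parent, mutate, map_elites) are assumptions made for illustration, and "priority" is interpreted here as sampling parents only among successful elites whenever at least one exists.

```python
import random


def evaluate(genome):
    """Hypothetical sparse-reward task: fitness is non-zero only on success."""
    x, y = genome
    # Behavior descriptor: cell index in an assumed 10x10 grid.
    behavior = (min(int(x * 10), 9), min(int(y * 10), 9))
    success = x > 0.8 and y > 0.8          # small successful region
    fitness = (x + y) / 2 if success else 0.0
    return behavior, fitness, success


def select_parent(archive, rng):
    """Sample a successful elite if any exists, otherwise any elite."""
    successful = [e for e in archive.values() if e["success"]]
    pool = successful if successful else list(archive.values())
    return rng.choice(pool)


def mutate(genome, rng, sigma=0.1):
    """Gaussian perturbation, clamped to the unit square."""
    return tuple(min(1.0, max(0.0, g + rng.gauss(0.0, sigma))) for g in genome)


def map_elites(iterations=2000, n_init=50, seed=0):
    rng = random.Random(seed)
    archive = {}  # behavior cell -> elite record
    for i in range(iterations):
        if i < n_init or not archive:
            genome = (rng.random(), rng.random())   # random initialization
        else:
            genome = mutate(select_parent(archive, rng)["genome"], rng)
        behavior, fitness, success = evaluate(genome)
        elite = archive.get(behavior)
        # Standard MAP-Elites replacement: keep the best solution per cell.
        if elite is None or fitness > elite["fitness"]:
            archive[behavior] = {
                "genome": genome,
                "fitness": fitness,
                "success": success,
            }
    return archive


if __name__ == "__main__":
    result = map_elites()
    n_success = sum(e["success"] for e in result.values())
    print(f"{len(result)} cells filled, {n_success} successful")
```

Under this reading, the archive still stores unsuccessful solutions (they keep a record of explored behaviors), but once a success is found, new candidates are generated only from successful elites.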
Related papers
- Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning [50.84938730450622]
We propose a trajectory-based method TV score, which uses trajectory volatility for OOD detection in mathematical reasoning.
Our method outperforms all traditional algorithms on GLMs under mathematical reasoning scenarios.
Our method can be extended to more applications with high-density features in output spaces, such as multiple-choice questions.
arXiv Detail & Related papers (2024-05-22T22:22:25Z)
- Assaying on the Robustness of Zero-Shot Machine-Generated Text Detectors [57.7003399760813]
We explore advanced Large Language Models (LLMs) and their specialized variants, contributing to this field in several ways.
We uncover a significant correlation between topics and detection performance.
These investigations shed light on the adaptability and robustness of these detection methods across diverse topics.
arXiv Detail & Related papers (2023-12-20T10:53:53Z)
- Efficient Quality-Diversity Optimization through Diverse Quality Species [3.428706362109921]
We show that a diverse population of solutions can be found without the limitation of needing an archive or defining the range of behaviors in advance.
We propose Diverse Quality Species (DQS) as an alternative to archive-based Quality-Diversity (QD) algorithms.
arXiv Detail & Related papers (2023-04-14T23:15:51Z)
- Assessing Quality-Diversity Neuro-Evolution Algorithms Performance in Hard Exploration Problems [10.871978893808533]
Quality-Diversity (QD) methods are evolutionary algorithms inspired by nature's ability to produce high-performing niche organisms.
In this paper, we highlight three candidate benchmarks exhibiting control problems in high dimension with exploration difficulties.
We also provide open-source implementations in Jax allowing practitioners to run fast and numerous experiments on few compute resources.
arXiv Detail & Related papers (2022-11-24T18:04:12Z)
- Planning for Sample Efficient Imitation Learning [52.44953015011569]
Current imitation algorithms struggle to achieve high performance and high in-environment sample efficiency simultaneously.
We propose EfficientImitate, a planning-based imitation learning method that can achieve high in-environment sample efficiency and performance simultaneously.
Experimental results show that EI achieves state-of-the-art results in performance and sample efficiency.
arXiv Detail & Related papers (2022-10-18T05:19:26Z)
- Relevance-guided Unsupervised Discovery of Abilities with Quality-Diversity Algorithms [1.827510863075184]
We introduce Relevance-guided Unsupervised Discovery of Abilities, a Quality-Diversity algorithm that autonomously finds a behavioural characterisation tailored to the task at hand.
We evaluate our approach on a simulated robotic environment, where the robot has to autonomously discover its abilities based on its full sensory data.
arXiv Detail & Related papers (2022-04-21T00:29:38Z)
- Learning to Walk Autonomously via Reset-Free Quality-Diversity [73.08073762433376]
Quality-Diversity algorithms can discover large and complex behavioural repertoires consisting of both diverse and high-performing skills.
Existing QD algorithms need large numbers of evaluations as well as episodic resets, which require manual human supervision and interventions.
This paper proposes Reset-Free Quality-Diversity optimization (RF-QD) as a step towards autonomous learning for robotics in open-ended environments.
arXiv Detail & Related papers (2022-04-07T14:07:51Z)
- Few-shot Quality-Diversity Optimization [50.337225556491774]
Quality-Diversity (QD) optimization has been shown to be an effective tool for dealing with deceptive minima and sparse rewards in Reinforcement Learning.
We show that, given examples from a task distribution, information about the paths taken by optimization in parameter space can be leveraged to build a prior population, which when used to initialize QD methods in unseen environments, allows for few-shot adaptation.
Experiments carried out in both sparse and dense reward settings using robotic manipulation and navigation benchmarks show that it considerably reduces the number of generations required for QD optimization in these environments.
arXiv Detail & Related papers (2021-09-14T17:12:20Z)
- MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning [65.52675802289775]
We show that an uncertainty aware classifier can solve challenging reinforcement learning problems.
We propose a novel method for computing the normalized maximum likelihood (NML) distribution.
We show that the resulting algorithm has a number of intriguing connections to both count-based exploration methods and prior algorithms for learning reward functions.
arXiv Detail & Related papers (2021-07-15T08:19:57Z)
- Unsupervised Behaviour Discovery with Quality-Diversity Optimisation [1.0152838128195467]
Quality-Diversity algorithms refer to a class of evolutionary algorithms designed to find a collection of diverse and high-performing solutions to a given problem.
In robotics, such algorithms can be used for generating a collection of controllers covering most of the possible behaviours of a robot.
In this paper, we introduce Autonomous Robots Realising their Abilities, an algorithm that uses a dimensionality reduction technique to automatically learn behavioural descriptors.
arXiv Detail & Related papers (2021-06-10T10:40:18Z)
- Fast and stable MAP-Elites in noisy domains using deep grids [1.827510863075184]
Deep-Grid MAP-Elites is a variant of the MAP-Elites algorithm that uses an archive of similar previously encountered solutions to approximate the performance of a solution.
We show that this simple approach is significantly more resilient to noise on the behavioural descriptors, while achieving competitive performances in terms of fitness optimisation.
arXiv Detail & Related papers (2020-06-25T08:47:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.