Unsupervised Behaviour Discovery with Quality-Diversity Optimisation
- URL: http://arxiv.org/abs/2106.05648v1
- Date: Thu, 10 Jun 2021 10:40:18 GMT
- Title: Unsupervised Behaviour Discovery with Quality-Diversity Optimisation
- Authors: Luca Grillotti and Antoine Cully
- Abstract summary: Quality-Diversity algorithms refer to a class of evolutionary algorithms designed to find a collection of diverse and high-performing solutions to a given problem.
In robotics, such algorithms can be used for generating a collection of controllers covering most of the possible behaviours of a robot.
In this paper, we introduce Autonomous Robots Realising their Abilities, an algorithm that uses a dimensionality reduction technique to automatically learn behavioural descriptors.
- Score: 1.0152838128195467
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Quality-Diversity algorithms refer to a class of evolutionary algorithms
designed to find a collection of diverse and high-performing solutions to a
given problem. In robotics, such algorithms can be used for generating a
collection of controllers covering most of the possible behaviours of a robot.
To do so, these algorithms associate a behavioural descriptor with each of these
behaviours. Each behavioural descriptor is used for estimating the novelty of
one behaviour compared to the others. In most existing algorithms, the
behavioural descriptor needs to be hand-coded, thus requiring prior knowledge
about the task to solve. In this paper, we introduce Autonomous Robots
Realising their Abilities, an algorithm that uses a dimensionality reduction
technique to automatically learn behavioural descriptors based on raw sensory
data. The performance of this algorithm is assessed on three robotic tasks in
simulation. The experimental results show that it performs similarly to
traditional hand-coded approaches without the requirement to provide any
hand-coded behavioural descriptor. In the collection of diverse and
high-performing solutions, it also manages to find behaviours that are novel
with respect to more features than its hand-coded baselines. Finally, we
introduce a variant of the algorithm which is robust to the dimensionality of
the behavioural descriptor space.
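The idea in the abstract, learning the behavioural descriptor from raw sensory data and using it to judge novelty, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes PCA as the dimensionality reduction technique (the abstract does not name one), a hypothetical `toy_rollout` function standing in for a robot simulation, a distance threshold `min_dist` for deciding novelty, and a periodic refit of the projection on the data held in the archive. A fuller implementation would presumably also re-check archive entries after each refit; that step is omitted here.

```python
import numpy as np


def fit_pca(data, n_components=2):
    """Learn a linear projection (mean, components) from rows of raw data."""
    mean = data.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:n_components]


def encode(sensory, mean, components):
    """Project raw sensory data into the learned descriptor space."""
    return (sensory - mean) @ components.T


def toy_rollout(params):
    """Hypothetical stand-in for a simulation: returns (sensory data, fitness)."""
    t = np.linspace(0.0, 1.0, 20)
    sensory = np.concatenate([np.sin(params[0] * t), np.cos(params[1] * t)])
    fitness = -float(np.sum(params ** 2))
    return sensory, fitness


def make_entry(params):
    sensory, fitness = toy_rollout(params)
    return {"params": params, "sensory": sensory, "fitness": fitness}


def stacked_sensory(archive):
    return np.stack([entry["sensory"] for entry in archive])


def run_qd(iterations=200, min_dist=0.3, refit_every=50, seed=0):
    rng = np.random.default_rng(seed)

    # Bootstrap the archive with random solutions, then learn a first descriptor.
    archive = [make_entry(rng.uniform(-3.0, 3.0, size=2)) for _ in range(10)]
    mean, components = fit_pca(stacked_sensory(archive))

    for it in range(1, iterations + 1):
        # Select a parent at random and mutate its parameters.
        parent = archive[rng.integers(len(archive))]
        child = make_entry(parent["params"] + rng.normal(0.0, 0.3, size=2))

        # Novelty is measured in the learned descriptor space.
        descriptors = encode(stacked_sensory(archive), mean, components)
        child_desc = encode(child["sensory"], mean, components)
        dists = np.linalg.norm(descriptors - child_desc, axis=1)
        nearest = int(np.argmin(dists))

        if dists[nearest] > min_dist:
            # Novel enough: add to the collection.
            archive.append(child)
        elif child["fitness"] > archive[nearest]["fitness"]:
            # Not novel, but better than its nearest neighbour: replace it.
            archive[nearest] = child

        # Periodically relearn the descriptor from the archive's sensory data.
        if it % refit_every == 0:
            mean, components = fit_pca(stacked_sensory(archive))

    return archive
```

Calling `run_qd()` returns a list of entries, each holding the controller parameters, the raw sensory data, and the fitness. No hand-coded descriptor appears anywhere in the loop; the only task-specific piece is the rollout function.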
Related papers
- Offline Imitation Learning Through Graph Search and Retrieval [57.57306578140857]
Imitation learning is a powerful machine learning algorithm for a robot to acquire manipulation skills.
We propose GSR, a simple yet effective algorithm that learns from suboptimal demonstrations through Graph Search and Retrieval.
GSR can achieve a 10% to 30% higher success rate and over 30% higher proficiency compared to baselines.
arXiv Detail & Related papers (2024-07-22T06:12:21Z)
- Multi-Dimensional Ability Diagnosis for Machine Learning Algorithms [88.93372675846123]
We propose a task-agnostic evaluation framework Camilla for evaluating machine learning algorithms.
We use cognitive diagnosis assumptions and neural networks to learn the complex interactions among algorithms, samples and the skills of each sample.
In our experiments, Camilla outperforms state-of-the-art baselines on the metric reliability, rank consistency and rank stability.
arXiv Detail & Related papers (2023-07-14T03:15:56Z)
- Discovering Unsupervised Behaviours from Full-State Trajectories [1.827510863075184]
We propose an analysis of Autonomous Robots Realising their Abilities, a Quality-Diversity algorithm that autonomously finds behavioural characterisations.
We evaluate this approach on a simulated robotic environment, where the robot has to autonomously discover its abilities from its full-state trajectories.
More specifically, the analysed approach autonomously finds policies that make the robot move to diverse positions, but also utilise its legs in diverse ways, and even perform half-rolls.
arXiv Detail & Related papers (2022-11-22T16:57:52Z)
- Relevance-guided Unsupervised Discovery of Abilities with Quality-Diversity Algorithms [1.827510863075184]
We introduce Relevance-guided Unsupervised Discovery of Abilities, a Quality-Diversity algorithm that autonomously finds a behavioural characterisation tailored to the task at hand.
We evaluate our approach on a simulated robotic environment, where the robot has to autonomously discover its abilities based on its full sensory data.
arXiv Detail & Related papers (2022-04-21T00:29:38Z)
- A distributed, plug-n-play algorithm for multi-robot applications with a priori non-computable objective functions [2.2452191187045383]
In multi-robot applications, the user-defined objectives of the mission can be cast as a general optimization problem.
Standard gradient-descent-like algorithms are not applicable to these problems.
We introduce a new algorithm that carefully designs each robot's subcost function, the optimization of which can accomplish the overall team objective.
arXiv Detail & Related papers (2021-11-14T20:40:00Z)
- Machine Learning for Online Algorithm Selection under Censored Feedback [71.6879432974126]
In online algorithm selection (OAS), instances of an algorithmic problem class are presented to an agent one after another, and the agent has to quickly select a presumably best algorithm from a fixed set of candidate algorithms.
For decision problems such as satisfiability (SAT), quality typically refers to the algorithm's runtime.
In this work, we revisit multi-armed bandit algorithms for OAS and discuss their capability of dealing with the problem.
We adapt them towards runtime-oriented losses, allowing for partially censored data while keeping a space- and time-complexity independent of the time horizon.
arXiv Detail & Related papers (2021-09-13T18:10:52Z)
- Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms.
The learned algorithms are domain-agnostic and can generalize to new environments not seen during training.
We highlight two learned algorithms which obtain good generalization performance over other classical control tasks, gridworld type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z)
- A Systematic Characterization of Sampling Algorithms for Open-ended Language Generation [71.31905141672529]
We study the widely adopted ancestral sampling algorithms for auto-regressive language models.
We identify three key properties that are shared among them: entropy reduction, order preservation, and slope preservation.
We find that the set of sampling algorithms that satisfies these properties performs on par with the existing sampling algorithms.
arXiv Detail & Related papers (2020-09-15T17:28:42Z)
- A black-box adversarial attack for poisoning clustering [78.19784577498031]
We propose a black-box adversarial attack for crafting adversarial samples to test the robustness of clustering algorithms.
We show that our attacks are transferable even against supervised algorithms such as SVMs, random forests, and neural networks.
arXiv Detail & Related papers (2020-09-09T18:19:31Z)
- Model-Based Quality-Diversity Search for Efficient Robot Learning [28.049034339935933]
A novelty-based Quality-Diversity (QD) algorithm.
A network is trained concurrently with the repertoire and is used to avoid executing unpromising actions in the novelty search process.
Experiments show that enhancing a QD algorithm with such a forward model improves the sample-efficiency and performance of the evolutionary process and the skill adaptation.
arXiv Detail & Related papers (2020-08-11T09:02:18Z)
- Fast and stable MAP-Elites in noisy domains using deep grids [1.827510863075184]
Deep-Grid MAP-Elites is a variant of the MAP-Elites algorithm that uses an archive of similar previously encountered solutions to approximate the performance of a solution.
We show that this simple approach is significantly more resilient to noise on the behavioural descriptors, while achieving competitive performances in terms of fitness optimisation.
arXiv Detail & Related papers (2020-06-25T08:47:23Z)