Learning Diverse Skills for Local Navigation under Multi-constraint Optimality
- URL: http://arxiv.org/abs/2310.02440v1
- Date: Tue, 3 Oct 2023 21:21:21 GMT
- Title: Learning Diverse Skills for Local Navigation under Multi-constraint Optimality
- Authors: Jin Cheng, Marin Vlastelica, Pavel Kolev, Chenhao Li, Georg Martius
- Abstract summary: In this work, we take a constrained optimization viewpoint on the quality-diversity trade-off.
We show that we can obtain diverse policies while imposing constraints on their value functions which are defined through distinct rewards.
Our trained policies transfer well to the real 12-DoF quadruped robot, Solo12.
- Score: 27.310655303502305
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite many successful applications of data-driven control in robotics,
extracting meaningful diverse behaviors remains a challenge. Typically, task
performance needs to be compromised in order to achieve diversity. In many
scenarios, task requirements are specified as a multitude of reward terms, each
requiring a different trade-off. In this work, we take a constrained
optimization viewpoint on the quality-diversity trade-off and show that we can
obtain diverse policies while imposing constraints on their value functions
which are defined through distinct rewards. In line with previous work, further
control of the diversity level can be achieved through an attract-repel reward
term motivated by the Van der Waals force. We demonstrate the effectiveness of
our method on a local navigation task where a quadruped robot needs to reach
the target within a finite horizon. Finally, our trained policies transfer well
to the real 12-DoF quadruped robot, Solo12, and exhibit diverse agile behaviors
with successful obstacle traversal.
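The attract-repel reward described in the abstract can be illustrated with a short sketch. This is a hypothetical simplification, not the paper's implementation: it assumes each policy is summarized by an embedding vector and uses a Lennard-Jones-style potential (the classic model of the Van der Waals force) so that nearby policies repel each other while distant ones weakly attract; the function name, exponents, and equilibrium distance `r0` are illustrative assumptions.

```python
import numpy as np

def vdw_attract_repel(embeddings, r0=1.0, alpha=1.0):
    """Hypothetical attract-repel diversity reward inspired by the
    Van der Waals force: policies closer than the equilibrium
    distance r0 are strongly penalized (repulsion), while policies
    at moderate distances receive a weak positive reward (attraction).
    """
    n = len(embeddings)
    rewards = np.zeros(n)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = np.linalg.norm(embeddings[i] - embeddings[j])
            # Lennard-Jones-style potential: the repulsive term (r0/d)^6
            # dominates at short range, the attractive term (r0/d)^2 at
            # long range; reward is the negative potential.
            rewards[i] -= alpha * ((r0 / d) ** 6 - (r0 / d) ** 2)
    return rewards
```

With this shape, two policies that collapse onto nearly the same behavior receive a large penalty, while well-separated policies are left almost unaffected, which is the diversity-control effect the abstract describes.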
Related papers
- Dual-Force: Enhanced Offline Diversity Maximization under Imitation Constraints [24.544586300399843]
In this work, we present a novel offline algorithm that enhances diversity using an objective based on the Van der Waals (VdW) force.
Our algorithm provides a zero-shot recall of all skills encountered during training, significantly expanding the set of skills learned in prior work.
arXiv Detail & Related papers (2025-01-08T11:20:48Z)
- GRAPE: Generalizing Robot Policy via Preference Alignment [58.419992317452376]
We present GRAPE: Generalizing Robot Policy via Preference Alignment.
We show GRAPE increases success rates on in-domain and unseen manipulation tasks by 51.79% and 58.20%, respectively.
GRAPE can be aligned with various objectives, such as safety and efficiency, reducing collision rates by 37.44% and rollout step-length by 11.15%, respectively.
arXiv Detail & Related papers (2024-11-28T18:30:10Z)
- QuadrupedGPT: Towards a Versatile Quadruped Agent in Open-ended Worlds [51.05639500325598]
We introduce QuadrupedGPT, designed to follow diverse commands with agility comparable to that of a pet.
Our agent shows proficiency in handling diverse tasks and intricate instructions, representing a significant step toward the development of versatile quadruped agents.
arXiv Detail & Related papers (2024-06-24T12:14:24Z)
- Offline Diversity Maximization Under Imitation Constraints [23.761620064055897]
We propose a principled offline algorithm for unsupervised skill discovery.
Our main analytical contribution is to connect Fenchel duality, reinforcement learning, and unsupervised skill discovery.
We demonstrate the effectiveness of our method on the standard offline benchmark D4RL.
arXiv Detail & Related papers (2023-07-21T06:12:39Z)
- Robust and Versatile Bipedal Jumping Control through Reinforcement Learning [141.56016556936865]
This work aims to push the limits of agility for bipedal robots by enabling a torque-controlled bipedal robot to perform robust and versatile dynamic jumps in the real world.
We present a reinforcement learning framework for training a robot to accomplish a large variety of jumping tasks, such as jumping to different locations and directions.
We develop a new policy structure that encodes the robot's long-term input/output (I/O) history while also providing direct access to a short-term I/O history.
arXiv Detail & Related papers (2023-02-19T01:06:09Z)
- Learning Options via Compression [62.55893046218824]
We propose a new objective that combines the maximum likelihood objective with a penalty on the description length of the skills.
Our objective learns skills that solve downstream tasks in fewer samples compared to skills learned from only maximizing likelihood.
arXiv Detail & Related papers (2022-12-08T22:34:59Z)
- Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies [37.09286945259353]
We propose an approach to learn abstract motor skills from data using a hierarchical mixture latent variable model.
We demonstrate in manipulation domains that the method can effectively cluster offline data into distinct, executable behaviours.
arXiv Detail & Related papers (2021-12-09T17:37:14Z)
- Diversity-based Trajectory and Goal Selection with Hindsight Experience Replay [8.259694128526112]
We propose diversity-based trajectory and goal selection with HER (DTGSH)
We show that our method can learn more quickly and reach higher performance than other state-of-the-art approaches on all tasks.
arXiv Detail & Related papers (2021-08-17T21:34:24Z)
- Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum for goals that the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
- Gradient Surgery for Multi-Task Learning [119.675492088251]
Multi-task learning has emerged as a promising approach for sharing structure across multiple tasks.
The reasons why multi-task learning is so challenging compared to single-task learning are not fully understood.
We propose a form of gradient surgery that projects a task's gradient onto the normal plane of the gradient of any other task that has a conflicting gradient.
arXiv Detail & Related papers (2020-01-19T06:33:47Z)
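The projection step described in the Gradient Surgery entry can be sketched in a few lines. This is a simplified illustration under stated assumptions (flattened per-task gradient vectors, a fixed iteration order rather than the paper's random ordering); the function name is hypothetical:

```python
import numpy as np

def gradient_surgery(grads):
    """Sketch of gradient surgery for multi-task learning: when a
    task's gradient conflicts with another task's gradient (negative
    dot product), project it onto the normal plane of the conflicting
    gradient, i.e. subtract the conflicting component.
    """
    projected = [g.astype(float).copy() for g in grads]
    for i, g_i in enumerate(projected):
        for j, g_j in enumerate(grads):
            if i == j:
                continue
            dot = g_i @ g_j
            if dot < 0:  # gradients point in conflicting directions
                # Remove the component of g_i along g_j (in place).
                g_i -= (dot / (g_j @ g_j)) * g_j
    return sum(projected)  # combined update direction
```

After the projection, no task's adjusted gradient has a negative component along any other task's original gradient, which is the mechanism the abstract credits for reducing destructive interference between tasks.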
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all listed content) and is not responsible for any consequences of its use.