Practice Makes Perfect: Planning to Learn Skill Parameter Policies
- URL: http://arxiv.org/abs/2402.15025v2
- Date: Sat, 18 May 2024 15:54:21 GMT
- Title: Practice Makes Perfect: Planning to Learn Skill Parameter Policies
- Authors: Nishanth Kumar, Tom Silver, Willie McClinton, Linfeng Zhao, Stephen Proulx, Tomás Lozano-Pérez, Leslie Pack Kaelbling, Jennifer Barry,
- Abstract summary: In this work, we focus on the active learning problem of choosing which skills to practice to maximize expected future task success.
We propose that the robot should estimate the competence of each skill, extrapolate the competence, and situate the skill in the task distribution through competence-aware planning.
This approach is implemented within a fully autonomous system where the robot repeatedly plans, practices, and learns without any environment resets.
- Score: 34.51008914846429
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One promising approach towards effective robot decision making in complex, long-horizon tasks is to sequence together parameterized skills. We consider a setting where a robot is initially equipped with (1) a library of parameterized skills, (2) an AI planner for sequencing together the skills given a goal, and (3) a very general prior distribution for selecting skill parameters. Once deployed, the robot should rapidly and autonomously learn to improve its performance by specializing its skill parameter selection policy to the particular objects, goals, and constraints in its environment. In this work, we focus on the active learning problem of choosing which skills to practice to maximize expected future task success. We propose that the robot should estimate the competence of each skill, extrapolate the competence (asking: "how much would the competence improve through practice?"), and situate the skill in the task distribution through competence-aware planning. This approach is implemented within a fully autonomous system where the robot repeatedly plans, practices, and learns without any environment resets. Through experiments in simulation, we find that our approach learns effective parameter policies more sample-efficiently than several baselines. Experiments in the real-world demonstrate our approach's ability to handle noise from perception and control and improve the robot's ability to solve two long-horizon mobile-manipulation tasks after a few hours of autonomous practice. Project website: http://ees.csail.mit.edu
Related papers
- RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation [68.70755196744533]
RoboGen is a generative robotic agent that automatically learns diverse robotic skills at scale via generative simulation.
Our work attempts to extract the extensive and versatile knowledge embedded in large-scale models and transfer them to the field of robotics.
arXiv Detail & Related papers (2023-11-02T17:59:21Z) - Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for
Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our insights are to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z) - Discovering Unsupervised Behaviours from Full-State Trajectories [1.827510863075184]
We propose an analysis of Autonomous Robots Realising their Abilities; a Quality-Diversity algorithm that autonomously finds behavioural characterisations.
We evaluate this approach on a simulated robotic environment, where the robot has to autonomously discover its abilities from its full-state trajectories.
More specifically, the analysed approach autonomously finds policies that make the robot move to diverse positions, but also utilise its legs in diverse ways, and even perform half-rolls.
arXiv Detail & Related papers (2022-11-22T16:57:52Z) - Autonomous Open-Ended Learning of Tasks with Non-Stationary
Interdependencies [64.0476282000118]
Intrinsic motivations have proven to generate a task-agnostic signal to properly allocate the training time amongst goals.
While the majority of works in the field of intrinsically motivated open-ended learning focus on scenarios where goals are independent from each other, only few of them studied the autonomous acquisition of interdependent tasks.
In particular, we first deepen the analysis of a previous system, showing the importance of incorporating information about the relationships between tasks at a higher level of the architecture.
Then we introduce H-GRAIL, a new system that extends the previous one by adding a new learning layer to store the autonomously acquired sequences
arXiv Detail & Related papers (2022-05-16T10:43:01Z) - Skill-based Multi-objective Reinforcement Learning of Industrial Robot
Tasks with Planning and Knowledge Integration [0.4949816699298335]
We introduce an approach that provides a combination of task-level planning with targeted learning of scenario-specific parameters for skill-based systems.
We demonstrate the efficacy and versatility of our approach by learning skill parameters for two different contact-rich tasks.
arXiv Detail & Related papers (2022-03-18T16:03:27Z) - Robot Skill Adaptation via Soft Actor-Critic Gaussian Mixture Models [29.34375999491465]
A core challenge for an autonomous agent acting in the real world is to adapt its repertoire of skills to cope with its noisy perception and dynamics.
To scale learning of skills to long-horizon tasks, robots should be able to learn and later refine their skills in a structured manner.
We proposeSAC-GMM, a novel hybrid approach that learns robot skills through a dynamical system and adapts the learned skills in their own trajectory distribution space.
arXiv Detail & Related papers (2021-11-25T15:36:11Z) - Lifelong Robotic Reinforcement Learning by Retaining Experiences [61.79346922421323]
Many multi-task reinforcement learning efforts assume the robot can collect data from all tasks at all times.
In this work, we study a practical sequential multi-task RL problem motivated by the practical constraints of physical robotic systems.
We derive an approach that effectively leverages the data and policies learned for previous tasks to cumulatively grow the robot's skill-set.
arXiv Detail & Related papers (2021-09-19T18:00:51Z) - Discovering Generalizable Skills via Automated Generation of Diverse
Tasks [82.16392072211337]
We propose a method to discover generalizable skills via automated generation of a diverse set of tasks.
As opposed to prior work on unsupervised discovery of skills, our method pairs each skill with a unique task produced by a trainable task generator.
A task discriminator defined on the robot behaviors in the generated tasks is jointly trained to estimate the evidence lower bound of the diversity objective.
The learned skills can then be composed in a hierarchical reinforcement learning algorithm to solve unseen target tasks.
arXiv Detail & Related papers (2021-06-26T03:41:51Z) - Scalable Multi-Task Imitation Learning with Autonomous Improvement [159.9406205002599]
We build an imitation learning system that can continuously improve through autonomous data collection.
We leverage the robot's own trials as demonstrations for tasks other than the one that the robot actually attempted.
In contrast to prior imitation learning approaches, our method can autonomously collect data with sparse supervision for continuous improvement.
arXiv Detail & Related papers (2020-02-25T18:56:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.