Continual Policy Distillation of Reinforcement Learning-based Controllers for Soft Robotic In-Hand Manipulation
- URL: http://arxiv.org/abs/2404.04219v1
- Date: Fri, 5 Apr 2024 17:05:45 GMT
- Title: Continual Policy Distillation of Reinforcement Learning-based Controllers for Soft Robotic In-Hand Manipulation
- Authors: Lanpei Li, Enrico Donato, Vincenzo Lomonaco, Egidio Falotico
- Abstract summary: Soft robotic hands offer flexibility and adaptability during object grasping and manipulation.
We introduce a Continual Policy Distillation framework to acquire a versatile controller for in-hand manipulation.
- Score: 5.601529531526852
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Dexterous manipulation, often facilitated by multi-fingered robotic hands, has substantial impact on real-world applications. Soft robotic hands, due to their compliant nature, offer flexibility and adaptability during object grasping and manipulation. Yet these benefits come with challenges, particularly in developing controllers for finger coordination. Reinforcement Learning (RL) can be employed to train object-specific in-hand manipulation policies, but such policies have limited adaptability and generalizability. We introduce a Continual Policy Distillation (CPD) framework to acquire a versatile controller for in-hand manipulation that rotates objects of different shapes and sizes within a four-fingered soft gripper. The framework leverages Policy Distillation (PD) to transfer knowledge from expert policies to a continually evolving student policy network. Exemplar-based rehearsal methods are then integrated to mitigate catastrophic forgetting and enhance generalization. The performance of the CPD framework over various replay strategies demonstrates its effectiveness in consolidating knowledge from multiple experts and achieving versatile and adaptive behaviours for in-hand manipulation tasks.
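The abstract's combination of policy distillation and exemplar-based rehearsal can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the experts are fixed random linear maps standing in for trained RL policies, the student is a linear policy trained with a squared-error distillation loss, the two "objects" produce observations in partly overlapping subspaces, and the rehearsal memory simply stores a fixed number of (observation, expert action) exemplars per task. The sizes, learning rate, and function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT, ACT_DIM = 3, 4  # hypothetical sizes, not from the paper

# Each task (object) yields observations in its own, partly overlapping subspace.
I3, Z3 = np.eye(LATENT), np.zeros((LATENT, LATENT))
BASES = [np.hstack([I3, Z3]), np.hstack([0.6 * I3, 0.8 * I3])]
# Stand-ins for trained RL experts: one fixed linear policy per task.
EXPERTS = [rng.normal(size=(2 * LATENT, ACT_DIM)) for _ in BASES]


def sample_obs(task, n):
    """Draw n observations for the given task."""
    return rng.normal(size=(n, LATENT)) @ BASES[task]


def distill_loss(W, obs, target):
    """Mean squared error between student actions and expert actions."""
    return float(np.mean((obs @ W - target) ** 2))


def train(rehearsal, per_task=100, steps=1000, batch=32, lr=0.2):
    """Distill the experts into one linear student, one task after another."""
    W = np.zeros((2 * LATENT, ACT_DIM))   # student policy weights
    mem_obs = np.empty((0, 2 * LATENT))   # exemplar memory: observations
    mem_act = np.empty((0, ACT_DIM))      # exemplar memory: expert actions
    for task, expert in enumerate(EXPERTS):
        for _ in range(steps):
            obs = sample_obs(task, batch)
            act = obs @ expert            # query the current expert
            if rehearsal and len(mem_obs):
                # Mix stored exemplars from earlier tasks into the batch.
                idx = rng.integers(0, len(mem_obs), size=batch)
                obs = np.vstack([obs, mem_obs[idx]])
                act = np.vstack([act, mem_act[idx]])
            # Gradient of the squared error, up to a constant factor.
            W -= lr * obs.T @ (obs @ W - act) / len(obs)
        # After finishing a task, store a few exemplars for later rehearsal.
        ex = sample_obs(task, per_task)
        mem_obs = np.vstack([mem_obs, ex])
        mem_act = np.vstack([mem_act, ex @ expert])
    return W


def evaluate(W):
    """Distillation loss of the student on fresh data from every task."""
    losses = []
    for task, expert in enumerate(EXPERTS):
        obs = sample_obs(task, 500)
        losses.append(distill_loss(W, obs, obs @ expert))
    return losses


naive_losses = evaluate(train(rehearsal=False))
replay_losses = evaluate(train(rehearsal=True))
print("no rehearsal:", naive_losses)
print("rehearsal:   ", replay_losses)
```

In this toy setting, the student trained without rehearsal fits the second expert but drifts away from the first, while the student that replays stored exemplars can match both. The paper's actual networks, losses, and replay strategies may differ from this sketch.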
Related papers
- Multi-Goal Dexterous Hand Manipulation using Probabilistic Model-based Reinforcement Learning [2.34860173297653]
This paper tackles the challenge of learning multi-goal dexterous hand manipulation tasks using model-based Reinforcement Learning.
We propose Goal-Conditioned Probabilistic Model Predictive Control (GC-PMPC) to describe the high-dimensional dexterous hand dynamics.
It successfully drives a cable-driven dexterous hand, DexHand 021, with 12 active DOFs and 5 tactile sensors, to learn to manipulate a cubic die to three goal poses within approximately 80 minutes of interaction.
arXiv Detail & Related papers (2025-04-30T12:44:38Z) - ForceGrip: Reference-Free Curriculum Learning for Realistic Grip Force Control in VR Hand Manipulation [0.10995326465245926]
We present ForceGrip, a deep learning agent that synthesizes realistic hand manipulation motions.
We employ a three-phase curriculum learning framework comprising Finger Positioning, Intention Adaptation, and Dynamic Stabilization.
Our evaluations reveal ForceGrip's superior force controllability and plausibility compared to state-of-the-art methods.
arXiv Detail & Related papers (2025-03-11T05:39:07Z) - COMBO-Grasp: Learning Constraint-Based Manipulation for Bimanual Occluded Grasping [56.907940167333656]
Occluded robot grasping is where the desired grasp poses are kinematically infeasible due to environmental constraints such as surface collisions.
Traditional robot manipulation approaches struggle with the complexity of non-prehensile or bimanual strategies commonly used by humans.
We introduce Constraint-based Manipulation for Bimanual Occluded Grasping (COMBO-Grasp), a learning-based approach which leverages two coordinated policies.
arXiv Detail & Related papers (2025-02-12T01:31:01Z) - FDPP: Fine-tune Diffusion Policy with Human Preference [57.44575105114056]
Fine-tuning Diffusion Policy with Human Preference learns a reward function through preference-based learning.
This reward is then used to fine-tune the pre-trained policy with reinforcement learning.
Experiments demonstrate that FDPP effectively customizes policy behavior without compromising performance.
arXiv Detail & Related papers (2025-01-14T17:15:27Z) - DexHandDiff: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation [78.60543357822957]
Dexterous manipulation with contact-rich interactions is crucial for advanced robotics.
We introduce DexHandDiff, an interaction-aware diffusion planning framework for adaptive dexterous manipulation.
Our framework achieves an average of 70.7% success rate on goal adaptive dexterous tasks, highlighting its robustness and flexibility in contact-rich manipulation.
arXiv Detail & Related papers (2024-11-27T18:03:26Z) - Learning Diffusion Policies from Demonstrations For Compliant Contact-rich Manipulation [5.1245307851495]
This paper introduces Diffusion Policies For Compliant Manipulation (DIPCOM), a novel diffusion-based framework for compliant control tasks.
By leveraging generative diffusion models, we develop a policy that predicts Cartesian end-effector poses and adjusts arm stiffness to maintain the necessary force.
Our approach enhances force control through multimodal distribution modeling, improves the integration of diffusion policies in compliance control, and extends our previous work by demonstrating its effectiveness in real-world tasks.
arXiv Detail & Related papers (2024-10-25T00:56:15Z) - Guided Reinforcement Learning for Robust Multi-Contact Loco-Manipulation [12.377289165111028]
Reinforcement learning (RL) often necessitates a meticulous Markov Decision Process (MDP) design tailored to each task.
This work proposes a systematic approach to behavior synthesis and control for multi-contact loco-manipulation tasks.
We define a task-independent MDP to train RL policies using only a single demonstration per task generated from a model-based trajectory.
arXiv Detail & Related papers (2024-10-17T17:46:27Z) - Twisting Lids Off with Two Hands [82.21668778600414]
We show how policies trained in simulation can be effectively and efficiently transferred to the real world.
Specifically, we consider the problem of twisting lids of various bottle-like objects with two hands.
This is the first sim-to-real RL system that enables such capabilities on bimanual multi-fingered hands.
arXiv Detail & Related papers (2024-03-04T18:59:30Z) - Robust Driving Policy Learning with Guided Meta Reinforcement Learning [49.860391298275616]
We introduce an efficient method to train diverse driving policies for social vehicles as a single meta-policy.
By randomizing the interaction-based reward functions of social vehicles, we can generate diverse objectives and efficiently train the meta-policy.
We propose a training strategy to enhance the robustness of the ego vehicle's driving policy using the environment where social vehicles are controlled by the learned meta-policy.
arXiv Detail & Related papers (2023-07-19T17:42:36Z) - Dexterous Manipulation from Images: Autonomous Real-World RL via Substep
Guidance [71.36749876465618]
We describe a system for vision-based dexterous manipulation that provides a "programming-free" approach for users to define new tasks.
Our system includes a framework for users to define a final task and intermediate sub-tasks with image examples.
We report experimental results with a four-finger robotic hand learning multi-stage object manipulation tasks directly in the real world.
arXiv Detail & Related papers (2022-12-19T22:50:40Z) - Personalized Rehabilitation Robotics based on Online Learning Control [62.6606062732021]
We propose a novel online learning control architecture, which is able to personalize the control force at run time to each individual user.
We evaluate our method in an experimental user study, where the learning controller is shown to provide personalized control, while also obtaining safe interaction forces.
arXiv Detail & Related papers (2021-10-01T15:28:44Z) - On the Emergence of Whole-body Strategies from Humanoid Robot Push-recovery Learning [32.070068456106895]
We apply model-free Deep Reinforcement Learning for training a general and robust humanoid push-recovery policy in a simulation environment.
Our method targets high-dimensional whole-body humanoid control and is validated on the iCub humanoid.
arXiv Detail & Related papers (2021-04-29T17:49:20Z) - Closing the Closed-Loop Distribution Shift in Safe Imitation Learning [80.05727171757454]
We treat safe optimization-based control strategies as experts in an imitation learning problem.
We train a learned policy that can be cheaply evaluated at run-time and that provably satisfies the same safety guarantees as the expert.
arXiv Detail & Related papers (2021-02-18T05:11:41Z) - Neural Dynamic Policies for End-to-End Sensorimotor Learning [51.24542903398335]
The current dominant paradigm in sensorimotor control, whether imitation or reinforcement learning, is to train policies directly in raw action spaces.
We propose Neural Dynamic Policies (NDPs) that make predictions in trajectory distribution space.
NDPs outperform the prior state-of-the-art in terms of either efficiency or performance across several robotic control tasks.
arXiv Detail & Related papers (2020-12-04T18:59:32Z) - Learning Whole-body Motor Skills for Humanoids [25.443880385966114]
This paper presents a hierarchical framework for Deep Reinforcement Learning that acquires motor skills for a variety of push recovery and balancing behaviors.
The policy is trained in a physics simulator with a realistic robot model and low-level impedance control, which makes the learned skills easy to transfer to real robots.
arXiv Detail & Related papers (2020-02-07T19:40:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.