Learning-based Cooperative Robotic Paper Wrapping: A Unified Control Policy with Residual Force Control
- URL: http://arxiv.org/abs/2511.03181v1
- Date: Wed, 05 Nov 2025 04:55:35 GMT
- Authors: Rewida Ali, Cristian C. Beltran-Hernandez, Weiwei Wan, Kensuke Harada
- Abstract summary: We propose a learning-based framework that integrates a high-level task planner powered by a large language model with a low-level hybrid imitation learning and reinforcement learning policy. At its core is a Sub-task Aware Robotic Transformer (START) that learns a unified policy from human demonstrations. We show that the unified transformer-based policy reduces the need for specialized models, allows controlled human supervision, and effectively bridges high-level intent with the fine-grained force control required for deformable object manipulation.
- Score: 11.21445976755808
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human-robot cooperation is essential in environments such as warehouses and retail stores, where workers frequently handle deformable objects like paper, bags, and fabrics. Coordinating robotic actions with human assistance remains difficult due to the unpredictable dynamics of deformable materials and the need for adaptive force control. To explore this challenge, we focus on the task of gift wrapping, which exemplifies a long-horizon manipulation problem involving precise folding, controlled creasing, and secure fixation of paper. Success is achieved when the robot completes the sequence to produce a neatly wrapped package with clean folds and no tears. We propose a learning-based framework that integrates a high-level task planner powered by a large language model (LLM) with a low-level hybrid imitation learning (IL) and reinforcement learning (RL) policy. At its core is a Sub-task Aware Robotic Transformer (START) that learns a unified policy from human demonstrations. The key novelty lies in capturing long-range temporal dependencies across the full wrapping sequence within a single model. Unlike vanilla Action Chunking with Transformer (ACT), typically applied to short tasks, our method introduces sub-task IDs that provide explicit temporal grounding. This enables robust performance across the entire wrapping process and supports flexible execution, as the policy learns sub-goals rather than merely replicating motion sequences. Our framework achieves a 97% success rate on real-world wrapping tasks. We show that the unified transformer-based policy reduces the need for specialized models, allows controlled human supervision, and effectively bridges high-level intent with the fine-grained force control required for deformable object manipulation.
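The abstract names two mechanisms: conditioning a single policy on sub-task IDs, and a hybrid IL/RL policy with residual force control. The abstract does not give implementation details, so the following is only a minimal sketch of how such pieces are commonly wired together, not the authors' method. The sub-task names, the one-hot encoding, the `scale` and `limit` parameters, and both function names are hypothetical and chosen purely for illustration.

```python
from typing import List, Sequence

# Hypothetical sub-task labels; the actual sub-task set is not listed in the abstract.
SUBTASKS = ["fold", "crease", "fix"]


def one_hot(index: int, size: int) -> List[float]:
    """Encode a sub-task ID as a one-hot vector."""
    if not 0 <= index < size:
        raise ValueError(f"sub-task id {index} out of range 0..{size - 1}")
    return [1.0 if i == index else 0.0 for i in range(size)]


def build_policy_input(observation: Sequence[float], subtask_id: int) -> List[float]:
    """Append the sub-task encoding to the observation.

    Assumption: the policy receives the sub-task ID as extra input features,
    so one model can behave differently in each phase of the sequence.
    """
    return list(observation) + one_hot(subtask_id, len(SUBTASKS))


def combine_with_residual(
    base_action: Sequence[float],
    residual: Sequence[float],
    scale: float = 0.1,
    limit: float = 1.0,
) -> List[float]:
    """Add a scaled correction to a base action and clamp the result.

    Assumption: the imitation-learned policy supplies `base_action`, an
    RL-trained policy supplies `residual`, and the sum is clipped to
    [-limit, limit] for safety.
    """
    if len(base_action) != len(residual):
        raise ValueError("base_action and residual must have the same length")
    return [
        max(-limit, min(limit, b + scale * r))
        for b, r in zip(base_action, residual)
    ]


if __name__ == "__main__":
    policy_input = build_policy_input([0.2, -0.1], subtask_id=1)
    print(policy_input)
    print(combine_with_residual([0.25, 0.9], [1.0, 1.0], scale=0.5))
```

Keeping the residual small relative to the base action (here through `scale`) is a common design choice in residual learning setups, because the correction can then refine contact forces without overriding the demonstrated motion.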
Related papers
- Ctrl-World: A Controllable Generative World Model for Robot Manipulation [53.71061464925014]
Generalist robot policies can perform a wide range of manipulation skills. Evaluating and improving their ability with unfamiliar objects and instructions remains a significant challenge. World models offer a promising, scalable alternative by enabling policies to roll out within imagination space.
arXiv Detail & Related papers (2025-10-11T09:13:10Z)
- ImaginationPolicy: Towards Generalizable, Precise and Reliable End-to-End Policy for Robotic Manipulation [46.06124092071133]
We propose a novel Chain of Moving Oriented Keypoints (CoMOK) formulation for robotic manipulation. Our formulation is used as the action representation of a neural policy, which can be trained in an end-to-end fashion.
arXiv Detail & Related papers (2025-09-25T07:29:07Z)
- Sim-to-Real Reinforcement Learning for Vision-Based Dexterous Manipulation on Humanoids [56.892520712892804]
We introduce a practical sim-to-real RL recipe that trains a humanoid robot to perform three dexterous manipulation tasks. We demonstrate high success rates on unseen objects and robust, adaptive policy behaviors.
arXiv Detail & Related papers (2025-02-27T18:59:52Z)
- CAIMAN: Causal Action Influence Detection for Sample-efficient Loco-manipulation [17.94272840532448]
We present CAIMAN, a reinforcement learning framework that encourages robots to gain control over other entities in the environment. We empirically demonstrate CAIMAN's superior sample efficiency and adaptability to diverse scenarios in simulation.
arXiv Detail & Related papers (2025-02-02T16:16:53Z)
- Learning Diffusion Policies from Demonstrations For Compliant Contact-rich Manipulation [5.1245307851495]
This paper introduces Diffusion Policies For Compliant Manipulation (DIPCOM), a novel diffusion-based framework for compliant control tasks.
By leveraging generative diffusion models, we develop a policy that predicts Cartesian end-effector poses and adjusts arm stiffness to maintain the necessary force.
Our approach enhances force control through multimodal distribution modeling, improves the integration of diffusion policies in compliance control, and extends our previous work by demonstrating its effectiveness in real-world tasks.
arXiv Detail & Related papers (2024-10-25T00:56:15Z)
- Robotic Control via Embodied Chain-of-Thought Reasoning [86.6680905262442]
A key limitation of learned robot control policies is their inability to generalize outside their training data. Recent works on vision-language-action models (VLAs) have shown that the use of large, internet pre-trained vision-language models can substantially improve their robustness and generalization ability. We introduce Embodied Chain-of-Thought Reasoning (ECoT) for VLAs, in which we train VLAs to perform multiple steps of reasoning about plans, sub-tasks, motions, and visually grounded features before predicting the robot action.
arXiv Detail & Related papers (2024-07-11T17:31:01Z)
- Nonprehensile Planar Manipulation through Reinforcement Learning with Multimodal Categorical Exploration [8.343657309038285]
Reinforcement Learning is a powerful framework for developing such robot controllers.
We propose a multimodal exploration approach through categorical distributions, which enables us to train planar pushing RL policies.
We show that the learned policies are robust to external disturbances and observation noise, and scale to tasks with multiple pushers.
arXiv Detail & Related papers (2023-08-04T16:55:00Z)
- Efficient Skill Acquisition for Complex Manipulation Tasks in Obstructed Environments [18.348489257164356]
We propose a system for efficient skill acquisition that leverages an object-centric generative model (OCGM) for versatile goal identification.
OCGM enables one-shot target object identification and re-identification in new scenes, allowing motion planning (MP) to guide the robot to the target object while avoiding obstacles.
arXiv Detail & Related papers (2023-03-06T18:49:59Z)
- Leveraging Sequentiality in Reinforcement Learning from a Single Demonstration [68.94506047556412]
We propose to leverage a sequential bias to learn control policies for complex robotic tasks using a single demonstration.
We show that DCIL-II can solve with unprecedented sample efficiency some challenging simulated tasks such as humanoid locomotion and stand-up.
arXiv Detail & Related papers (2022-11-09T10:28:40Z)
- Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in Latent Space [76.46113138484947]
General-purpose robots require diverse repertoires of behaviors to complete challenging tasks in real-world unstructured environments.
To address this issue, goal-conditioned reinforcement learning aims to acquire policies that can reach goals for a wide range of tasks on command.
We propose Planning to Practice, a method that makes it practical to train goal-conditioned policies for long-horizon tasks.
arXiv Detail & Related papers (2022-05-17T06:58:17Z)
- Robot Learning of Mobile Manipulation with Reachability Behavior Priors [38.49783454634775]
Mobile Manipulation (MM) systems are ideal candidates for taking up the role of a personal assistant in unstructured real-world environments.
Among other challenges, MM requires effective coordination of the robot's embodiments for executing tasks that require both mobility and manipulation.
We study the integration of robotic reachability priors in actor-critic RL methods for accelerating the learning of MM for reaching and fetching tasks.
arXiv Detail & Related papers (2022-03-08T12:44:42Z)
- Towards Coordinated Robot Motions: End-to-End Learning of Motion Policies on Transform Trees [63.31965375413414]
We propose to solve multi-task problems through learning structured policies from human demonstrations.
Our structured policy is inspired by RMPflow, a framework for combining subtask policies on different spaces.
We derive an end-to-end learning objective function that is suitable for the multi-task problem.
arXiv Detail & Related papers (2020-12-24T22:46:22Z)