Hierarchical and Modular Network on Non-prehensile Manipulation in General Environments
- URL: http://arxiv.org/abs/2502.20843v1
- Date: Fri, 28 Feb 2025 08:42:00 GMT
- Title: Hierarchical and Modular Network on Non-prehensile Manipulation in General Environments
- Authors: Yoonyoung Cho, Junhyek Han, Jisu Han, Beomjoon Kim
- Abstract summary: Non-prehensile manipulation is important for robots to operate in general environments like households. However, prior works on non-prehensile manipulation cannot yet generalize across environments with diverse geometries. We propose a modular and reconfigurable architecture that adaptively reconfigures network modules based on task requirements. We additionally release a simulation-based benchmark featuring nine digital twins of real-world scenes with 353 objects.
- Score: 1.3299507495084417
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For robots to operate in general environments like households, they must be able to perform non-prehensile manipulation actions such as toppling and rolling to manipulate ungraspable objects. However, prior works on non-prehensile manipulation cannot yet generalize across environments with diverse geometries. The main challenge lies in adapting to varying environmental constraints: within a cabinet, the robot must avoid walls and ceilings; to lift objects to the top of a step, the robot must account for the step's pose and extent. While deep reinforcement learning (RL) has demonstrated impressive success in non-prehensile manipulation, accounting for such variability presents a challenge for the generalist policy, as it must learn diverse strategies for each new combination of constraints. To address this, we propose a modular and reconfigurable architecture that adaptively reconfigures network modules based on task requirements. To capture the geometric variability in environments, we extend the contact-based object representation (CORN) to environment geometries, and propose a procedural algorithm for generating diverse environments to train our agent. Taken together, the resulting policy can zero-shot transfer to novel real-world environments and objects despite training entirely within a simulator. We additionally release a simulation-based benchmark featuring nine digital twins of real-world scenes with 353 objects to facilitate non-prehensile manipulation research in realistic domains.
Related papers
- ManipGPT: Is Affordance Segmentation by Large Vision Models Enough for Articulated Object Manipulation? [17.356760351203715]
This paper introduces ManipGPT, a framework designed to predict optimal interaction areas for articulated objects.
We created a dataset of 9.9k simulated and real images to bridge the sim-to-real gap.
We significantly improved part-level affordance segmentation, adapting the model's in-context segmentation capabilities to robot manipulation scenarios.
(2024-12-13T11:22:01Z)
- HACMan++: Spatially-Grounded Motion Primitives for Manipulation [28.411361363637006]
We introduce spatially-grounded parameterized motion primitives in our method HACMan++.
By grounding the primitives on a spatial location in the environment, our method is able to effectively generalize across object shape and pose variations.
Our approach significantly outperforms existing methods, particularly in complex scenarios demanding both high-level sequential reasoning and object generalization.
(2024-07-11T15:10:14Z)
- Evaluating Real-World Robot Manipulation Policies in Simulation [91.55267186958892]
Control and visual disparities between real and simulated environments are key challenges for reliable simulated evaluation.
We propose approaches for mitigating these gaps without needing to craft full-fidelity digital twins of real-world environments.
We create SIMPLER, a collection of simulated environments for manipulation policy evaluation on common real robot setups.
(2024-05-09T17:30:16Z)
- Task and Domain Adaptive Reinforcement Learning for Robot Control [0.34137115855910755]
We present a novel adaptive agent to dynamically adapt policy in response to different tasks and environmental conditions.
The agent is trained using a custom, highly parallelized simulator built on IsaacGym.
We perform zero-shot transfer to fly the blimp in the real world to solve various tasks.
(2024-04-29T14:02:02Z)
- Learning Extrinsic Dexterity with Parameterized Manipulation Primitives [8.7221770019454]
We learn a sequence of actions that utilize the environment to change the object's pose.
Our approach can control the object's state through exploiting interactions between the object, the gripper, and the environment.
We evaluate our approach on picking box-shaped objects of various weight, shape, and friction properties from a constrained table-top workspace.
(2023-10-26T21:28:23Z)
- AI planning in the imagination: High-level planning on learned abstract search spaces [68.75684174531962]
We propose a new method, called PiZero, that gives an agent the ability to plan in an abstract search space that the agent learns during training.
We evaluate our method on multiple domains, including the traveling salesman problem, Sokoban, 2048, the facility location problem, and Pacman.
(2023-08-16T22:47:16Z)
- Transferring Foundation Models for Generalizable Robotic Manipulation [82.12754319808197]
We propose a novel paradigm that effectively leverages language-reasoning segmentation mask generated by internet-scale foundation models.
Our approach can effectively and robustly perceive object pose and enable sample-efficient generalization learning.
Demos can be found in our submitted video, and more comprehensive ones can be found in link1 or link2.
(2023-06-09T07:22:12Z)
- Policy Architectures for Compositional Generalization in Control [71.61675703776628]
We introduce a framework for modeling entity-based compositional structure in tasks.
Our policies are flexible and can be trained end-to-end without requiring any action primitives.
(2022-03-10T06:44:24Z)
- Learning to Regrasp by Learning to Place [19.13976401970985]
Regrasping is needed when a robot's current grasp pose fails to perform desired manipulation tasks.
We propose a system for robots to take partial point clouds of an object and the supporting environment as inputs and output a sequence of pick-and-place operations.
We show that our system achieves a 73.3% success rate in regrasping diverse objects.
(2021-09-18T03:07:06Z)
- CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning [138.40338621974954]
CausalWorld is a benchmark for causal structure and transfer learning in a robotic manipulation environment.
Tasks consist of constructing 3D shapes from a given set of blocks - inspired by how children learn to build complex structures.
(2020-10-08T23:01:13Z)
- PackIt: A Virtual Environment for Geometric Planning [68.79816936618454]
PackIt is a virtual environment to evaluate and potentially learn the ability to do geometric planning.
We construct a set of challenging packing tasks using an evolutionary algorithm.
(2020-07-21T22:51:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.