Modality-Driven Design for Multi-Step Dexterous Manipulation: Insights from Neuroscience
- URL: http://arxiv.org/abs/2412.11337v1
- Date: Sun, 15 Dec 2024 23:05:16 GMT
- Title: Modality-Driven Design for Multi-Step Dexterous Manipulation: Insights from Neuroscience
- Authors: Naoki Wake, Atsushi Kanehira, Daichi Saito, Jun Takamatsu, Kazuhiro Sasabuchi, Hideki Koike, Katsushi Ikeuchi
- Abstract summary: Multi-step dexterous manipulation is a fundamental skill in household scenarios, yet remains an underexplored area in robotics.
This paper proposes a modular approach, where each step of the manipulation process is addressed with dedicated policies based on effective modality input.
- Score: 14.49331945543691
- Abstract: Multi-step dexterous manipulation is a fundamental skill in household scenarios, yet remains an underexplored area in robotics. This paper proposes a modular approach, where each step of the manipulation process is addressed with dedicated policies based on effective modality input, rather than relying on a single end-to-end model. To demonstrate this, a dexterous robotic hand performs a manipulation task involving picking up and rotating a box. Guided by insights from neuroscience, the task is decomposed into three sub-skills, 1) reaching, 2) grasping and lifting, and 3) in-hand rotation, based on the dominant sensory modalities employed in the human brain. Each sub-skill is addressed using distinct methods from a practical perspective: a classical controller, a Vision-Language-Action model, and a reinforcement learning policy with force feedback, respectively. We tested the pipeline on a real robot to demonstrate the feasibility of our approach. The key contribution of this study lies in presenting a neuroscience-inspired, modality-driven methodology for multi-step dexterous manipulation.
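The abstract's modular structure suggests a thin dispatch layer that runs the three sub-skill policies in sequence. Below is a minimal Python sketch of that idea; the interface names (`SubSkillPolicy`, `env.observe`, `env.step`) are hypothetical, as the abstract does not describe an implementation:

```python
from dataclasses import dataclass
from typing import Protocol


class SubSkillPolicy(Protocol):
    """Interface assumed for each per-step policy."""

    def act(self, observation: dict) -> list: ...

    def done(self, observation: dict) -> bool: ...


@dataclass
class ModalityDrivenPipeline:
    """Runs the three sub-skills in sequence, each driven by its
    dominant sensory modality:
      reach  -- classical controller on proprioception,
      grasp  -- Vision-Language-Action model on camera images,
      rotate -- RL policy with fingertip force feedback.
    """
    reach: SubSkillPolicy
    grasp: SubSkillPolicy
    rotate: SubSkillPolicy

    def run(self, env) -> None:
        for policy in (self.reach, self.grasp, self.rotate):
            obs = env.observe()
            while not policy.done(obs):
                env.step(policy.act(obs))
                obs = env.observe()
```

Keeping the policies behind one interface is what lets each sub-skill be swapped independently, which is the point of the modular design.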
Related papers
- Offline Imitation Learning Through Graph Search and Retrieval [57.57306578140857]
Imitation learning is a powerful machine learning approach for robots to acquire manipulation skills.
We propose GSR, a simple yet effective algorithm that learns from suboptimal demonstrations through Graph Search and Retrieval.
GSR can achieve a 10% to 30% higher success rate and over 30% higher proficiency compared to baselines.
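The summary names the pattern but not the mechanics; a generic graph-search-and-retrieval sketch over discretized demonstration states might look like the following (the `demos` format and both function names are assumptions, not GSR's actual interface):

```python
from collections import deque


def build_graph(demos):
    """demos: lists of (state_id, action) pairs; edges follow the
    demonstrated transitions, annotated with the action taken."""
    graph = {}
    for demo in demos:
        for (s, a), (s_next, _) in zip(demo, demo[1:]):
            graph.setdefault(s, []).append((s_next, a))
    return graph


def retrieve_plan(graph, start, goal):
    """Breadth-first search over the demonstration graph; returns the
    action sequence along the shortest demonstrated path, or None."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, actions = queue.popleft()
        if state == goal:
            return actions
        for nxt, act in graph.get(state, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, actions + [act]))
    return None
```

Searching across transitions pooled from many demonstrations is one way a retrieved plan can improve on any single suboptimal demonstration.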
arXiv Detail & Related papers (2024-07-22T06:12:21Z)
- Learning a Universal Human Prior for Dexterous Manipulation from Human Preference [35.54663426598218]
We propose a framework that learns a universal human prior using direct human preference feedback over videos.
A task-agnostic reward model is trained by iteratively generating diverse policies and collecting human preferences over the trajectories.
Our method empirically demonstrates more human-like behaviors on robot hands across diverse tasks, including unseen tasks.
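Learning a reward model from pairwise human preferences over trajectories is commonly done with a Bradley-Terry style objective; a minimal PyTorch sketch, with illustrative architecture and names rather than the paper's code:

```python
import torch
import torch.nn as nn


class TrajectoryRewardModel(nn.Module):
    """Scores a trajectory of state features; trained so preferred
    trajectories receive higher cumulative reward."""

    def __init__(self, state_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, traj: torch.Tensor) -> torch.Tensor:
        # traj: (T, state_dim); sum per-step rewards over the trajectory.
        return self.net(traj).sum()


def preference_loss(model, preferred, rejected):
    """Negative log-probability that `preferred` beats `rejected`
    under the Bradley-Terry model."""
    logit = model(preferred) - model(rejected)
    return -nn.functional.logsigmoid(logit)
```

The learned reward can then stand in for hand-designed task rewards when training new policies, which is what makes the prior task-agnostic.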
arXiv Detail & Related papers (2023-04-10T14:17:33Z)
- Zero-Shot Robot Manipulation from Passive Human Videos [59.193076151832145]
We develop a framework for extracting agent-agnostic action representations from human videos.
Our framework is based on predicting plausible human hand trajectories.
We deploy the trained model zero-shot for physical robot manipulation tasks.
arXiv Detail & Related papers (2023-02-03T21:39:52Z)
- Dexterous Manipulation from Images: Autonomous Real-World RL via Substep Guidance [71.36749876465618]
We describe a system for vision-based dexterous manipulation that provides a "programming-free" approach for users to define new tasks.
Our system includes a framework for users to define a final task and intermediate sub-tasks with image examples.
We present experimental results with a four-finger robotic hand learning multi-stage object manipulation tasks directly in the real world.
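One plausible reading of image-example sub-task guidance is a per-substep success classifier that both shapes the reward and advances the stage; a minimal sketch under that assumption (the system's actual mechanism may differ):

```python
def substep_reward(classifiers, image, current_substep, threshold=0.9):
    """classifiers: one image-success classifier per sub-task, each
    mapping an image to the probability that the sub-task is achieved.
    Reward the active sub-task's score and advance once it is
    confidently satisfied."""
    score = classifiers[current_substep](image)
    if score > threshold and current_substep + 1 < len(classifiers):
        current_substep += 1
    return score, current_substep
```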
arXiv Detail & Related papers (2022-12-19T22:50:40Z)
- Silver-Bullet-3D at ManiSkill 2021: Learning-from-Demonstrations and Heuristic Rule-based Methods for Object Manipulation [118.27432851053335]
This paper presents an overview and comparative analysis of our systems designed for two tracks of the SAPIEN ManiSkill Challenge 2021, including the No Interaction Track.
The No Interaction Track targets learning policies from pre-collected demonstration trajectories.
In this track, we design a Heuristic Rule-based Method (HRM) to trigger high-quality object manipulation by decomposing the task into a series of sub-tasks.
For each sub-task, simple rule-based control strategies are adopted to predict actions for the robotic arms.
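That decomposition-plus-rules pattern can be sketched compactly; the controller form, observation key, and `env` interface below are illustrative assumptions:

```python
def move_toward(target, gain=0.5):
    """Rule-based controller: proportional step toward a target pose."""
    def controller(obs):
        return [gain * (t - x) for t, x in zip(target, obs["ee_pose"])]
    return controller


def run_subtasks(subtasks, env, max_steps=1000):
    """subtasks: ordered (controller, is_done) pairs, one per sub-task."""
    obs = env.observe()
    for controller, is_done in subtasks:
        for _ in range(max_steps):
            if is_done(obs):
                break
            obs = env.step(controller(obs))
    return obs
```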
arXiv Detail & Related papers (2022-06-13T16:20:42Z)
- Bottom-Up Skill Discovery from Unsegmented Demonstrations for Long-Horizon Robot Manipulation [55.31301153979621]
We tackle real-world long-horizon robot manipulation tasks through skill discovery.
We present a bottom-up approach to learning a library of reusable skills from unsegmented demonstrations.
Our method has shown superior performance over state-of-the-art imitation learning methods in multi-stage manipulation tasks.
arXiv Detail & Related papers (2021-09-28T16:18:54Z)
- Learning by Watching: Physical Imitation of Manipulation Skills from Human Videos [28.712673809577076]
We present an approach for physical imitation from human videos for robot manipulation tasks.
We design a perception module that learns to translate human videos into the robot domain, followed by unsupervised keypoint detection.
We evaluate the effectiveness of our approach on five robot manipulation tasks, including reaching, pushing, sliding, coffee making, and drawer closing.
arXiv Detail & Related papers (2021-01-18T18:50:32Z)
- Neural Dynamic Policies for End-to-End Sensorimotor Learning [51.24542903398335]
The current dominant paradigm in sensorimotor control, whether imitation or reinforcement learning, is to train policies directly in raw action spaces.
We propose Neural Dynamic Policies (NDPs) that make predictions in trajectory distribution space.
NDPs outperform the prior state-of-the-art in terms of either efficiency or performance across several robotic control tasks.
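NDPs embed a dynamical system (a DMP-style second-order attractor) as the policy's output layer: a network predicts the goal and forcing-term weights, and the trajectory comes from rolling that system out. A standard 1-D DMP rollout, with illustrative constants rather than the paper's exact parameterization:

```python
import numpy as np


def dmp_rollout(goal, weights, y0, steps=100, dt=0.01,
                alpha=25.0, beta=6.25, alpha_x=3.0):
    """Spring-damper system pulled toward `goal`, shaped by a learned
    forcing term; in an NDP, `goal` and `weights` come from a network."""
    n = len(weights)
    centers = np.exp(-alpha_x * np.linspace(0.0, 1.0, n))  # basis centers
    widths = np.full(n, n ** 1.5)
    y, yd, x = float(y0), 0.0, 1.0
    traj = []
    for _ in range(steps):
        psi = np.exp(-widths * (x - centers) ** 2)          # RBF activations
        forcing = (psi @ weights) * x / (psi.sum() + 1e-8)  # forcing term
        ydd = alpha * (beta * (goal - y) - yd) + forcing    # attractor dynamics
        yd += ydd * dt
        y += yd * dt
        x += -alpha_x * x * dt  # canonical phase decays from 1 toward 0
        traj.append(y)
    return np.array(traj)
```

With zero weights the rollout converges smoothly to the goal; nonzero weights shape the path, so supervision can flow through the rollout into the predicted parameters.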
arXiv Detail & Related papers (2020-12-04T18:59:32Z)
- Robotic self-representation improves manipulation skills and transfer learning [14.863872352905629]
We develop a model that learns bidirectional action-effect associations to encode the representations of body schema and the peripersonal space from multisensory information.
We demonstrate that this approach significantly stabilizes the learning-based problem-solving under noisy conditions and that it improves transfer learning of robotic manipulation skills.
arXiv Detail & Related papers (2020-11-13T16:04:58Z)
- Understanding Contexts Inside Robot and Human Manipulation Tasks through a Vision-Language Model and Ontology System in a Video Stream [4.450615100675747]
We present a vision dataset under a strictly constrained knowledge domain for both robot and human manipulations.
We propose a scheme to generate a combination of visual attentions and an evolving knowledge graph filled with commonsense knowledge.
The proposed scheme allows the robot to mimic human-like intentional behaviors by watching real-time videos.
arXiv Detail & Related papers (2020-03-02T19:48:59Z)