NeuralGrasps: Learning Implicit Representations for Grasps of Multiple
Robotic Hands
- URL: http://arxiv.org/abs/2207.02959v1
- Date: Wed, 6 Jul 2022 20:33:32 GMT
- Title: NeuralGrasps: Learning Implicit Representations for Grasps of Multiple
Robotic Hands
- Authors: Ninad Khargonkar, Neil Song, Zesheng Xu, Balakrishnan Prabhakaran, Yu
Xiang
- Abstract summary: We introduce a neural implicit representation for grasps of objects from multiple robotic hands.
Different grasps across multiple robotic hands are encoded into a shared latent space.
This grasp transfer has the potential to share grasping skills between robots and to enable robots to learn grasping skills from humans.
- Score: 15.520158510964757
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a neural implicit representation for grasps of objects from
multiple robotic hands. Different grasps across multiple robotic hands are
encoded into a shared latent space. Each latent vector is learned to decode to
the 3D shape of an object and the 3D shape of a robotic hand in a grasping pose
in terms of the signed distance functions of the two 3D shapes. In addition,
the distance metric in the latent space is learned to preserve the similarity
between grasps across different robotic hands, where the similarity of grasps
is defined according to contact regions of the robotic hands. This property
enables our method to transfer grasps between different grippers including a
human hand, and grasp transfer has the potential to share grasping skills
between robots and enable robots to learn grasping skills from humans.
Furthermore, the encoded signed distance functions of objects and grasps in our
implicit representation can be used for 6D object pose estimation with grasping
contact optimization from partial point clouds, which enables robotic grasping
in the real world.
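The core idea in the abstract, a grasp latent code decoded into two signed distance functions (one for the object, one for the hand in its grasping pose), can be illustrated with a minimal sketch. The PyTorch module below is an assumption-laden illustration, not the authors' implementation: the MLP trunk, layer widths, and 64-dimensional latent are placeholders.

```python
import torch
import torch.nn as nn


class DualSDFDecoder(nn.Module):
    """Latent-conditioned decoder that predicts two signed distances per 3D
    query point: distance to the object surface and to the hand surface."""

    def __init__(self, latent_dim: int = 64, hidden_dim: int = 256):
        super().__init__()
        # Shared trunk conditioned on the grasp latent code and the query point.
        self.trunk = nn.Sequential(
            nn.Linear(latent_dim + 3, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # Separate heads for the object SDF and the hand SDF.
        self.object_head = nn.Linear(hidden_dim, 1)
        self.hand_head = nn.Linear(hidden_dim, 1)

    def forward(self, latent: torch.Tensor, query: torch.Tensor):
        # latent: (B, latent_dim); query: (B, N, 3) points in the object frame.
        b, n, _ = query.shape
        z = latent.unsqueeze(1).expand(b, n, latent.shape[-1])
        feat = self.trunk(torch.cat([z, query], dim=-1))
        return self.object_head(feat).squeeze(-1), self.hand_head(feat).squeeze(-1)


# Usage: one grasp latent, 1024 query points sampled around the object.
decoder = DualSDFDecoder()
z = torch.randn(1, 64)
pts = torch.rand(1, 1024, 3) - 0.5
sdf_object, sdf_hand = decoder(z, pts)  # each has shape (1, 1024)
```

A decoder of this form can be queried at arbitrary 3D points with the same latent code, so both the object and the grasping hand can be reconstructed, for example by running marching cubes over a grid of query points.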
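The abstract also states that the latent distance metric is learned to preserve grasp similarity defined by contact regions. One simple way to impose such a constraint is sketched below: pairwise latent distances are regressed toward target distances computed from per-point contact maps. The cosine-similarity target and MSE objective are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F


def contact_target_distance(contact_maps: torch.Tensor) -> torch.Tensor:
    """Toy target: contact_maps holds per-object-point contact indicators of
    shape (B, P); grasp similarity is their cosine similarity, and the target
    latent distance is one minus that similarity."""
    maps = F.normalize(contact_maps, dim=-1)
    return 1.0 - maps @ maps.t()


def latent_metric_loss(latents: torch.Tensor, target_dist: torch.Tensor) -> torch.Tensor:
    """Pull pairwise latent distances toward contact-based target distances.

    latents:     (B, D) latent codes of B grasps, possibly from different hands.
    target_dist: (B, B) symmetric matrix of target distances.
    """
    diff = latents.unsqueeze(1) - latents.unsqueeze(0)         # (B, B, D)
    latent_dist = diff.pow(2).sum(-1).clamp_min(1e-12).sqrt()  # safe pairwise L2
    return F.mse_loss(latent_dist, target_dist)


# Usage: 8 grasps, 2048 object surface points, 64-D latent codes.
contacts = (torch.rand(8, 2048) < 0.05).float()
z = torch.randn(8, 64, requires_grad=True)
loss = latent_metric_loss(z, contact_target_distance(contacts))
loss.backward()
```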
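Finally, the abstract mentions 6D object pose estimation with grasping contact optimization from partial point clouds. A stripped-down version of the general recipe is sketched below: transform the observed points into the object's canonical frame with a candidate pose and drive their object-SDF values toward zero. The axis-angle parameterization, plain Adam optimizer, and omission of the contact term are simplifications, and the sketch reuses the hypothetical DualSDFDecoder from above.

```python
import torch


def axis_angle_to_matrix(r: torch.Tensor) -> torch.Tensor:
    """Rodrigues' formula: axis-angle 3-vector -> 3x3 rotation matrix."""
    theta = r.pow(2).sum().clamp_min(1e-12).sqrt()  # safe norm near zero
    k = r / theta
    zero = torch.zeros((), dtype=r.dtype)
    K = torch.stack([
        torch.stack([zero, -k[2], k[1]]),
        torch.stack([k[2], zero, -k[0]]),
        torch.stack([-k[1], k[0], zero]),
    ])
    return torch.eye(3) + torch.sin(theta) * K + (1.0 - torch.cos(theta)) * (K @ K)


def estimate_pose(decoder, latent, partial_cloud, iters=300, lr=1e-2):
    """Fit a rigid pose (R, t) so the observed partial cloud lies on the
    object's zero level set: minimize |SDF| of the points mapped back into
    the canonical object frame."""
    r = torch.zeros(3, requires_grad=True)   # axis-angle rotation parameters
    t = torch.zeros(3, requires_grad=True)   # translation
    opt = torch.optim.Adam([r, t], lr=lr)
    for _ in range(iters):
        opt.zero_grad()
        R = axis_angle_to_matrix(r)
        canonical = (partial_cloud - t) @ R          # apply R^T to (x - t)
        sdf_object, _ = decoder(latent, canonical.unsqueeze(0))
        loss = sdf_object.abs().mean()               # push points onto the surface
        loss.backward()
        opt.step()
    return axis_angle_to_matrix(r).detach(), t.detach()


# Usage with the (untrained) DualSDFDecoder and a random "partial" cloud.
cloud = torch.rand(500, 3) - 0.5
R_est, t_est = estimate_pose(DualSDFDecoder(), torch.randn(1, 64), cloud)
```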
Related papers
- VidBot: Learning Generalizable 3D Actions from In-the-Wild 2D Human Videos for Zero-Shot Robotic Manipulation [53.63540587160549]
VidBot is a framework enabling zero-shot robotic manipulation using 3D affordances learned from in-the-wild monocular RGB-only human videos.
VidBot paves the way for leveraging everyday human videos to make robot learning more scalable.
arXiv Detail & Related papers (2025-03-10T10:04:58Z)
- AnyDexGrasp: General Dexterous Grasping for Different Hands with Human-level Learning Efficiency [49.868970174484204]
We introduce an efficient approach for learning dexterous grasping with minimal data.
Our method achieves high performance with human-level learning efficiency: only hundreds of grasp attempts on 40 training objects.
This method demonstrates promising applications for humanoid robots, prosthetics, and other domains requiring robust, versatile robotic manipulation.
arXiv Detail & Related papers (2025-02-23T03:26:06Z)
- Learning to Transfer Human Hand Skills for Robot Manipulations [12.797862020095856]
We present a method for teaching dexterous manipulation tasks to robots from human hand motion demonstrations.
Our approach learns a joint motion manifold that maps human hand movements, robot hand actions, and object movements in 3D, enabling us to infer one motion from the others.
arXiv Detail & Related papers (2025-01-07T22:33:47Z)
- Robot See Robot Do: Imitating Articulated Object Manipulation with Monocular 4D Reconstruction [51.49400490437258]
This work develops a method for imitating articulated object manipulation from a single monocular RGB human demonstration.
We first propose 4D Differentiable Part Models (4D-DPM), a method for recovering 3D part motion from a monocular video.
Given this 4D reconstruction, the robot replicates object trajectories by planning bimanual arm motions that induce the demonstrated object part motion.
We evaluate the 3D tracking accuracy of 4D-DPM on ground-truth-annotated 3D part trajectories and the physical execution performance of RSRD (Robot See Robot Do) on 9 objects across 10 trials each on a bimanual YuMi robot.
arXiv Detail & Related papers (2024-09-26T17:57:16Z)
- 3D Foundation Models Enable Simultaneous Geometry and Pose Estimation of Grasped Objects [13.58353565350936]
We contribute methodology to jointly estimate the geometry and pose of objects grasped by a robot.
Our method transforms the estimated geometry into the robot's coordinate frame.
We empirically evaluate our approach on a robot manipulator holding a diverse set of real-world objects.
arXiv Detail & Related papers (2024-07-14T21:02:55Z)
- SUGAR: Pre-training 3D Visual Representations for Robotics [85.55534363501131]
We introduce a novel 3D pre-training framework for robotics named SUGAR.
SUGAR captures semantic, geometric and affordance properties of objects through 3D point clouds.
We show that SUGAR's 3D representation outperforms state-of-the-art 2D and 3D representations.
arXiv Detail & Related papers (2024-04-01T21:23:03Z)
- Exploring 3D Human Pose Estimation and Forecasting from the Robot's Perspective: The HARPER Dataset [52.22758311559]
We introduce HARPER, a novel dataset for 3D body pose estimation and forecasting in dyadic interactions between users and the Spot robot.
The key novelty is the focus on the robot's perspective, i.e., on the data captured by the robot's sensors.
The scenario underlying HARPER includes 15 actions, of which 10 involve physical contact between the robot and users.
arXiv Detail & Related papers (2024-03-21T14:53:50Z)
- Social-Transmotion: Promptable Human Trajectory Prediction [65.80068316170613]
Social-Transmotion is a generic Transformer-based model that exploits diverse and numerous visual cues to predict human behavior.
Our approach is validated on multiple datasets, including JTA, JRDB, Pedestrians and Cyclists in Road Traffic, and ETH-UCY.
arXiv Detail & Related papers (2023-12-26T18:56:49Z)
- DexArt: Benchmarking Generalizable Dexterous Manipulation with Articulated Objects [8.195608430584073]
We propose a new benchmark called DexArt, which involves Dexterous manipulation with Articulated objects in a physical simulator.
Our main focus is to evaluate the generalizability of the learned policy on unseen articulated objects.
We use Reinforcement Learning with 3D representation learning to achieve generalization.
arXiv Detail & Related papers (2023-05-09T18:30:58Z)
- Learning 6-DoF Fine-grained Grasp Detection Based on Part Affordance Grounding [42.04502185508723]
We propose a new large Language-guided SHape grAsPing datasEt to promote 3D part-level affordance and grasping ability learning.
From the perspective of robotic cognition, we design a two-stage fine-grained robotic grasping framework (named LangPartGPD).
Our method combines the advantages of human-robot collaboration and large language models (LLMs).
Results show our method achieves competitive performance in 3D geometry fine-grained grounding, object affordance inference, and 3D part-aware grasping tasks.
arXiv Detail & Related papers (2023-01-27T07:00:54Z)
- Learning Reward Functions for Robotic Manipulation by Observing Humans [92.30657414416527]
We use unlabeled videos of humans solving a wide range of manipulation tasks to learn a task-agnostic reward function for robotic manipulation policies.
The learned rewards are based on distances to a goal in an embedding space learned using a time-contrastive objective.
arXiv Detail & Related papers (2022-11-16T16:26:48Z)
- DexTransfer: Real World Multi-fingered Dexterous Grasping with Minimal Human Demonstrations [51.87067543670535]
We propose a robot-learning system that can take a small number of human demonstrations and learn to grasp unseen object poses.
We train a dexterous grasping policy that takes the point clouds of the object as input and predicts continuous actions to grasp objects from different initial robot states.
The policy learned from our dataset can generalize well on unseen object poses in both simulation and the real world.
arXiv Detail & Related papers (2022-09-28T17:51:49Z)
- Grasping Field: Learning Implicit Representations for Human Grasps [16.841780141055505]
We propose an expressive representation for human grasp modelling that is efficient and easy to integrate with deep neural networks.
We name this 3D-to-2D mapping the Grasping Field, parameterize it with a deep neural network, and learn it from data.
Our generative model is able to synthesize high-quality human grasps given only a 3D object point cloud.
arXiv Detail & Related papers (2020-08-10T23:08:26Z)