Teaching Robots Novel Objects by Pointing at Them
- URL: http://arxiv.org/abs/2012.13620v1
- Date: Fri, 25 Dec 2020 20:01:25 GMT
- Title: Teaching Robots Novel Objects by Pointing at Them
- Authors: Sagar Gubbi Venkatesh and Raviteja Upadrashta and Shishir Kolathaya
and Bharadwaj Amrutur
- Abstract summary: We propose teaching a robot novel objects it has not encountered before by pointing a hand at the new object of interest.
An end-to-end neural network is used to attend to the novel object of interest indicated by the pointing hand and then to localize the object in new scenes.
We show that a robot arm can manipulate novel objects that are highlighted by pointing a hand at them.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Robots that must operate in novel environments and collaborate with humans
must be capable of acquiring new knowledge from human experts during operation.
We propose teaching a robot novel objects it has not encountered before by
pointing a hand at the new object of interest. An end-to-end neural network is
used to attend to the novel object of interest indicated by the pointing hand
and then to localize the object in new scenes. In order to attend to the novel
object indicated by the pointing hand, we propose a spatial attention
modulation mechanism that learns to focus on the highlighted object while
ignoring the other objects in the scene. We show that a robot arm can
manipulate novel objects that are highlighted by pointing a hand at them. We
also evaluate the performance of the proposed architecture on a synthetic
dataset constructed using emojis and on a real-world dataset of common objects.
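The paper itself does not include code, but the flavor of the spatial attention modulation it describes can be sketched as follows; the module name, feature dimensions, and the simple dot-product scoring are illustrative assumptions rather than the authors' implementation.

```python
# Minimal sketch (not the authors' code): a pointing-hand cue embedding
# modulates spatial attention over a scene feature map.
import torch
import torch.nn as nn

class SpatialAttentionModulation(nn.Module):
    def __init__(self, feat_dim=256, cue_dim=128):
        super().__init__()
        # Project the pointing-hand cue into the same space as scene features.
        self.cue_proj = nn.Linear(cue_dim, feat_dim)

    def forward(self, scene_feats, cue):
        # scene_feats: (B, C, H, W) features of the scene image
        # cue:         (B, cue_dim) embedding of the pointing-hand crop
        q = self.cue_proj(cue)                                  # (B, C)
        scores = torch.einsum("bchw,bc->bhw", scene_feats, q)   # per-location match
        attn = torch.softmax(scores.flatten(1), dim=1).view_as(scores)
        # Re-weight scene features so the highlighted object dominates.
        return scene_feats * attn.unsqueeze(1), attn

model = SpatialAttentionModulation()
feats = torch.randn(2, 256, 32, 32)
cue = torch.randn(2, 128)
modulated, attn_map = model(feats, cue)
```

In such a setup the modulated features would then feed a localization head that predicts the highlighted object's position in new scenes.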
Related papers
- FOCUS: Object-Centric World Models for Robotics Manipulation
FOCUS is a model-based agent that learns an object-centric world model.
We show that object-centric world models allow the agent to solve tasks more efficiently.
We also showcase how FOCUS could be adopted in real-world settings.
arXiv Detail & Related papers (2023-07-05T16:49:06Z)
- Compositional 3D Human-Object Neural Animation
Human-object interactions (HOIs) are crucial for human-centric scene understanding applications such as human-centric visual generation, AR/VR, and robotics.
In this paper, we address this challenge in HOI animation from a compositional perspective.
We adopt neural human-object deformation to model and render HOI dynamics based on implicit neural representations.
arXiv Detail & Related papers (2023-04-27T10:04:56Z)
- Synthesis and Execution of Communicative Robotic Movements with Generative Adversarial Networks
We focus on how to transfer to two different robotic platforms the same kinematics modulation that humans adopt when manipulating delicate objects.
We choose to modulate the velocity profile adopted by the robots' end-effector, inspired by what humans do when transporting objects with different characteristics.
We exploit a novel Generative Adversarial Network architecture, trained with human kinematics examples, to generalize over them and generate new and meaningful velocity profiles.
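As a rough sketch of this idea (not the paper's architecture), a conditional generator could map a noise vector and object properties to a fixed-length end-effector velocity profile; in the full GAN, a discriminator trained on human demonstration profiles would supply the adversarial signal. All sizes and names below are assumptions.

```python
# Illustrative sketch only: a conditional generator that outputs a 1-D
# end-effector velocity profile of fixed length.
import torch
import torch.nn as nn

class VelocityProfileGenerator(nn.Module):
    def __init__(self, noise_dim=16, cond_dim=4, profile_len=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim + cond_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, profile_len),
            nn.Softplus(),  # velocities are kept non-negative
        )

    def forward(self, z, cond):
        # z: (B, noise_dim) latent noise; cond: (B, cond_dim) object properties
        return self.net(torch.cat([z, cond], dim=1))

gen = VelocityProfileGenerator()
profiles = gen(torch.randn(8, 16), torch.rand(8, 4))  # (8, 100) velocity curves
```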
arXiv Detail & Related papers (2022-03-29T15:03:05Z)
- Bi-directional Object-context Prioritization Learning for Saliency Ranking
Existing approaches focus on learning either object-object or object-scene relations.
We observe that spatial attention works concurrently with object-based attention in the human visual recognition system.
We propose a novel bi-directional method to unify spatial attention and object-based attention for saliency ranking.
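A naive illustration of combining spatial and object-based attention (far simpler than the paper's bi-directional architecture) is to rank object masks by the spatial saliency they contain:

```python
# Naive illustration: rank object masks by mean spatial saliency inside
# each mask.  The paper's bi-directional learning goes well beyond this.
import numpy as np

def rank_objects(saliency_map, object_masks):
    """saliency_map: (H, W) float array; object_masks: list of (H, W) bool arrays."""
    scores = [saliency_map[m].mean() if m.any() else 0.0 for m in object_masks]
    # Higher mean saliency -> earlier in the ranking.
    return np.argsort(scores)[::-1]

sal = np.random.rand(64, 64)
masks = [np.zeros((64, 64), bool) for _ in range(3)]
masks[0][10:20, 10:20] = True
masks[1][30:40, 30:50] = True
masks[2][50:60, 5:15] = True
print(rank_objects(sal, masks))
```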
arXiv Detail & Related papers (2022-03-17T16:16:03Z)
- DemoGrasp: Few-Shot Learning for Robotic Grasping with Human Demonstration
We propose to teach a robot how to grasp an object with a simple and short human demonstration.
We first present a small sequence of RGB-D images displaying a human-object interaction.
This sequence is then leveraged to build associated hand and object meshes that represent the interaction.
arXiv Detail & Related papers (2021-12-06T08:17:12Z)
- INVIGORATE: Interactive Visual Grounding and Grasping in Clutter
INVIGORATE is a robot system that interacts with humans through natural language and grasps a specified object in clutter.
We train separate neural networks for object detection, for visual grounding, for question generation, and for object blocking relationship (OBR) detection and grasping.
We build a partially observable Markov decision process (POMDP) that integrates the learned neural network modules.
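The flavor of such an integration can be sketched as a belief over which detected object is the referred target, updated from grounding scores and from yes/no answers to generated questions; the simple Bayes update and the fixed answer-noise parameter below are illustrative assumptions, not INVIGORATE's observation model.

```python
# Illustrative sketch: Bayesian belief over which detected object is the
# referred target, updated from grounding scores and yes/no answers.
import numpy as np

def normalize(p):
    return p / p.sum()

def init_belief(grounding_scores):
    # grounding_scores: per-object scores from the visual grounding network
    return normalize(np.asarray(grounding_scores, dtype=float))

def update_belief(belief, asked_obj, answer_yes, p_correct=0.9):
    # Observation model: the user answers "yes" with prob p_correct if
    # asked_obj is the true target, and "no" with prob p_correct otherwise.
    likelihood = np.full_like(belief, 1.0 - p_correct if answer_yes else p_correct)
    likelihood[asked_obj] = p_correct if answer_yes else 1.0 - p_correct
    return normalize(belief * likelihood)

belief = init_belief([0.2, 0.5, 0.3])
belief = update_belief(belief, asked_obj=1, answer_yes=False)
print(belief)  # probability mass shifts away from object 1
```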
arXiv Detail & Related papers (2021-08-25T07:35:21Z)
- Property-Aware Robot Object Manipulation: a Generative Approach
In this work, we focus on how to generate robot motion adapted to the hidden properties of the manipulated objects.
We explore the possibility of leveraging Generative Adversarial Networks to synthesize new actions coherent with the properties of the object.
Our results show that Generative Adversarial Nets can be a powerful tool for the generation of novel and meaningful transportation actions.
arXiv Detail & Related papers (2021-06-08T14:15:36Z)
- Simultaneous Multi-View Object Recognition and Grasping in Open-Ended Domains
We propose a deep learning architecture with augmented memory capacities to handle open-ended object recognition and grasping simultaneously.
We demonstrate the ability of our approach to grasp never-seen-before objects and to rapidly learn new object categories using very few examples on-site in both simulation and real-world settings.
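One simple way to picture an augmented memory for open-ended recognition (a placeholder, not the paper's architecture) is an external store of example embeddings queried by cosine similarity, so new categories can be taught on-site from a few examples:

```python
# Illustrative sketch: an external memory of per-category embeddings that
# supports adding new categories from very few examples.
import numpy as np

class ObjectMemory:
    def __init__(self):
        self.keys, self.labels = [], []

    def teach(self, embedding, label):
        # Store one example embedding for a (possibly new) category.
        self.keys.append(embedding / np.linalg.norm(embedding))
        self.labels.append(label)

    def recognize(self, embedding):
        if not self.keys:
            return None
        q = embedding / np.linalg.norm(embedding)
        sims = np.stack(self.keys) @ q          # cosine similarities
        return self.labels[int(np.argmax(sims))]

mem = ObjectMemory()
mem.teach(np.random.rand(128), "mug")
mem.teach(np.random.rand(128), "spatula")
print(mem.recognize(np.random.rand(128)))
```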
arXiv Detail & Related papers (2021-06-03T14:12:11Z)
- Affordance Transfer Learning for Human-Object Interaction Detection
We introduce an affordance transfer learning approach to jointly detect HOIs with novel objects and recognize affordances.
Specifically, HOI representations can be decoupled into a combination of affordance and object representations.
With the proposed affordance transfer learning, the model is also capable of inferring the affordances of novel objects from known affordance representations.
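A minimal sketch of the decoupling idea: score an HOI by combining an affordance embedding with an object embedding, so pairing a known affordance with a novel object's embedding yields a prediction for that object. The concatenation-based composition and the names below are assumptions for illustration.

```python
# Illustrative sketch: compose affordance and object embeddings to score
# HOIs with novel objects.
import torch
import torch.nn as nn

class HOICompositionScorer(nn.Module):
    def __init__(self, aff_dim=64, obj_dim=64):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(aff_dim + obj_dim, 128), nn.ReLU(), nn.Linear(128, 1)
        )

    def forward(self, affordance_emb, object_emb):
        # Returns a logit for "this object affords this interaction".
        return self.scorer(torch.cat([affordance_emb, object_emb], dim=-1))

scorer = HOICompositionScorer()
known_affordance = torch.randn(1, 64)   # embedding of a known affordance
novel_object = torch.randn(1, 64)       # embedding of a novel object
print(scorer(known_affordance, novel_object).item())
```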
arXiv Detail & Related papers (2021-04-07T02:37:04Z)
- Object Detection and Pose Estimation from RGB and Depth Data for Real-time, Adaptive Robotic Grasping
We propose a system that performs real-time object detection and pose estimation, for the purpose of dynamic robot grasping.
The proposed approach allows the robot to detect the object's identity and its actual pose, and then adapt a canonical grasp so that it can be used with the new pose.
For training, the system defines a canonical grasp by capturing the relative pose of an object with respect to the gripper attached to the robot's wrist.
During testing, once a new pose is detected, a canonical grasp for the object is identified and then dynamically adapted by adjusting the robot arm's joint angles.
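The relative-pose bookkeeping described here can be sketched with homogeneous transforms: record the gripper pose in the object frame during training, then compose it with the newly estimated object pose at test time (an inverse-kinematics solver would then produce the joint angles). The 4x4-matrix convention below is an assumption for illustration.

```python
# Illustrative sketch: adapt a canonical grasp to a newly detected object pose
# using 4x4 homogeneous transforms (world <- object <- gripper).
import numpy as np

def adapt_grasp(T_world_obj_new, T_world_obj_canon, T_world_grip_canon):
    # Grasp pose expressed in the object frame at teaching time:
    T_obj_grip = np.linalg.inv(T_world_obj_canon) @ T_world_grip_canon
    # Re-attach that relative grasp to the object's new pose:
    return T_world_obj_new @ T_obj_grip

# Toy example: the object translates by (0.1, 0, 0); the grasp follows it.
T_obj_canon = np.eye(4)
T_grip_canon = np.eye(4); T_grip_canon[:3, 3] = [0.0, 0.0, 0.2]
T_obj_new = np.eye(4);    T_obj_new[:3, 3] = [0.1, 0.0, 0.0]
print(adapt_grasp(T_obj_new, T_obj_canon, T_grip_canon)[:3, 3])  # -> [0.1 0.  0.2]
```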
arXiv Detail & Related papers (2021-01-18T22:22:47Z)
- One-Shot Object Localization Using Learnt Visual Cues via Siamese Networks
In this work, a visual cue is used to specify a novel object of interest which must be localized in new environments.
An end-to-end neural network equipped with a Siamese network is used to learn the cue, infer the object of interest, and then to localize it in new environments.
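A minimal sketch of the Siamese idea, with a toy shared-weight encoder standing in for the paper's network: embed the cue crop and the scene with the same encoder and take the peak of their correlation map as the localization.

```python
# Illustrative sketch: a shared-weight (Siamese) encoder; the pooled cue
# embedding is correlated with the scene feature map to localize the object.
import torch
import torch.nn as nn

encoder = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
)

def localize(scene, cue):
    # scene: (1, 3, H, W) image; cue: (1, 3, h, w) crop showing the cued object
    scene_f = encoder(scene)                             # (1, 64, H, W)
    cue_f = encoder(cue).mean(dim=(2, 3))                # (1, 64) pooled cue embedding
    score = torch.einsum("bchw,bc->bhw", scene_f, cue_f)
    H, W = score.shape[1:]
    idx = torch.argmax(score.flatten(1), dim=1)
    return (idx // W).item(), (idx % W).item()           # (row, col) of the peak

row, col = localize(torch.randn(1, 3, 64, 64), torch.randn(1, 3, 16, 16))
print(row, col)
```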
arXiv Detail & Related papers (2020-12-26T07:40:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.