Learning secondary tool affordances of human partners using iCub robot's egocentric data
- URL: http://arxiv.org/abs/2407.11922v1
- Date: Tue, 16 Jul 2024 17:14:13 GMT
- Title: Learning secondary tool affordances of human partners using iCub robot's egocentric data
- Authors: Bosong Ding, Erhan Oztop, Giacomo Spigler, Murat Kirtay
- Abstract summary: We address the problem of learning the secondary tool affordances of human partners.
We use the iCub robot to observe human partners with three cameras while they perform actions on twenty objects using four different tools.
Our results indicate that deep learning architectures enable the iCub robot to predict secondary tool affordances.
- Score: 2.583237671350984
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Objects, and tools in particular, offer several action possibilities to the agents that can act on them; these possibilities are generally referred to as affordances. A tool is typically designed for a specific purpose, such as driving a nail in the case of a hammer, which we call its primary affordance. A tool can also be used beyond its primary purpose, in which case we associate this auxiliary use with the term secondary affordance. Previous work on affordance perception and learning has mostly focused on primary affordances. Here, we address the less explored problem of learning the secondary tool affordances of human partners. To do this, we use the iCub robot to observe human partners with three cameras while they perform actions on twenty objects using four different tools. In our experiments, human partners utilize tools to perform actions that do not correspond to their primary affordances. For example, the iCub robot observes a human partner using a ruler for pushing, pulling, and moving objects instead of measuring their lengths. In this setting, we constructed a dataset by taking images of objects before and after each action was executed. We then model secondary affordance learning by training three neural networks (ResNet-18, ResNet-50, and ResNet-101), each on three tasks, using raw images showing the "initial" and "final" positions of objects as input: (1) predicting the tool used to move an object, (2) predicting the tool used with an additional categorical input that encodes the action performed, and (3) jointly predicting both the tool used and the action performed. Our results indicate that deep learning architectures enable the iCub robot to predict secondary tool affordances, thereby paving the way for human-robot collaborative object manipulation involving complex affordances.
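As an illustration of the training setup described in the abstract, here is a minimal PyTorch sketch of the three prediction tasks. The class name, the 6-channel concatenation of the initial/final images, the action-embedding size, and the default number of action classes are illustrative assumptions; only the ResNet backbones, the four tools, and the three tasks come from the abstract.

```python
import torch
import torch.nn as nn
from torchvision import models

class SecondaryAffordancePredictor(nn.Module):
    """Illustrative model (not the authors' exact architecture): predicts the
    tool, and optionally the action, from raw 'initial' and 'final' images."""

    def __init__(self, num_tools=4, num_actions=3, use_action_input=False):
        super().__init__()
        self.use_action_input = use_action_input
        # ResNet-18 backbone (ResNet-50/101 are drop-in alternatives); the first
        # conv layer is widened to accept the 6-channel initial/final image pair.
        backbone = models.resnet18(weights=None)
        backbone.conv1 = nn.Conv2d(6, 64, kernel_size=7, stride=2, padding=3, bias=False)
        feat_dim = backbone.fc.in_features
        backbone.fc = nn.Identity()
        self.backbone = backbone
        # Task 2: an extra categorical input encoding the action performed.
        self.action_embed = nn.Embedding(num_actions, 32) if use_action_input else None
        self.tool_head = nn.Linear(feat_dim + (32 if use_action_input else 0), num_tools)
        # Task 3: joint prediction of the action alongside the tool.
        self.action_head = nn.Linear(feat_dim, num_actions)

    def forward(self, img_initial, img_final, action_id=None):
        feat = self.backbone(torch.cat([img_initial, img_final], dim=1))  # (B, 512)
        if self.use_action_input and action_id is not None:
            feat_for_tool = torch.cat([feat, self.action_embed(action_id)], dim=1)
        else:
            feat_for_tool = feat
        return self.tool_head(feat_for_tool), self.action_head(feat)

# Task 1 usage: predict the tool from a batch of 8 image pairs.
model = SecondaryAffordancePredictor(num_tools=4, num_actions=3)
tool_logits, action_logits = model(torch.randn(8, 3, 224, 224), torch.randn(8, 3, 224, 224))
```

Swapping resnet18 for resnet50 or resnet101 reproduces the other two backbones mentioned in the abstract; concatenating the image pair along the channel dimension is only one plausible way to present the initial and final views to the network.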
Related papers
- Hand-Object Interaction Pretraining from Videos [77.92637809322231]
We learn general robot manipulation priors from 3D hand-object interaction trajectories.
We do so by lifting both the human hand and the manipulated object into a shared 3D space and retargeting human motions to robot actions.
We empirically demonstrate that finetuning this policy, with both reinforcement learning (RL) and behavior cloning (BC), enables sample-efficient adaptation to downstream tasks and simultaneously improves robustness and generalizability compared to prior approaches.
arXiv Detail & Related papers (2024-09-12T17:59:07Z) - Learning Granularity-Aware Affordances from Human-Object Interaction for Tool-Based Functional Grasping in Dexterous Robotics [27.124273762587848]
Affordance features of objects serve as a bridge in the functional interaction between agents and objects.
We propose a granularity-aware affordance feature extraction method for locating functional affordance areas.
We also use highly activated coarse-grained affordance features in hand-object interaction regions to predict grasp gestures.
Together, these form a complete dexterous robotic functional grasping framework, GAAF-Dex.
arXiv Detail & Related papers (2024-06-30T07:42:57Z) - Towards Completeness-Oriented Tool Retrieval for Large Language Models [60.733557487886635]
Real-world systems often incorporate a wide array of tools, making it impractical to input all tools into Large Language Models.
Existing tool retrieval methods primarily focus on semantic matching between user queries and tool descriptions.
We propose a novel model-agnostic COllaborative Learning-based Tool Retrieval approach, COLT, which captures not only the semantic similarities between user queries and tool descriptions but also the collaborative information of tools.
arXiv Detail & Related papers (2024-05-25T06:41:23Z) - Learning Generalizable Tool-use Skills through Trajectory Generation [13.879860388944214]
We train a single model on four different deformable object manipulation tasks.
The model generalizes to various novel tools, significantly outperforming baselines.
We further test our trained policy in the real world with unseen tools, where it achieves performance comparable to humans.
arXiv Detail & Related papers (2023-09-29T21:32:42Z) - Learning Generalizable Tool Use with Non-rigid Grasp-pose Registration [29.998917158604694]
We present a novel method to enable reinforcement learning of tool use behaviors.
Our approach provides a scalable way to learn the operation of tools in a new category using only a single demonstration.
The learned policies solve complex tool use tasks and generalize to unseen tools at test time.
arXiv Detail & Related papers (2023-07-31T08:49:11Z) - Learning Reward Functions for Robotic Manipulation by Observing Humans [92.30657414416527]
We use unlabeled videos of humans solving a wide range of manipulation tasks to learn a task-agnostic reward function for robotic manipulation policies.
The learned rewards are based on distances to a goal in an embedding space learned using a time-contrastive objective.
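The reward described in that sentence can be written as the negative distance between the current frame and a goal frame in the learned embedding space. The sketch below assumes a separately trained time-contrastive encoder and an L2 distance; the class name and these details are illustrative, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class EmbeddingGoalReward:
    """Illustrative reward: negative distance to the goal in an embedding
    space; the encoder is assumed to be pre-trained with a time-contrastive
    objective on human videos (training code not shown)."""

    def __init__(self, encoder: nn.Module):
        self.encoder = encoder.eval()

    @torch.no_grad()
    def __call__(self, obs_image: torch.Tensor, goal_image: torch.Tensor) -> torch.Tensor:
        z_obs = self.encoder(obs_image)    # (B, D) embedding of the current frame
        z_goal = self.encoder(goal_image)  # (B, D) embedding of the goal frame
        # Reward grows as the observation approaches the goal in embedding space.
        return -torch.norm(z_obs - z_goal, dim=-1)
```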
arXiv Detail & Related papers (2022-11-16T16:26:48Z) - How to select and use tools? : Active Perception of Target Objects Using
Multimodal Deep Learning [9.677391628613025]
We focus on active perception using multimodal sensorimotor data while a robot interacts with objects.
We construct a deep neural network (DNN) model that learns to recognize object characteristics.
We also examine the contributions of images, force, and tactile data and show that learning a variety of multimodal information results in rich perception for tool use.
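A minimal sketch of that kind of multimodal model, assuming the image, force, and tactile streams are encoded separately and fused by concatenation; all dimensions, names, and the fusion scheme are illustrative assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class MultimodalEncoder(nn.Module):
    """Illustrative fusion of image, force, and tactile inputs into a shared
    feature used to recognize object characteristics (sizes are placeholders)."""

    def __init__(self, force_dim=6, tactile_dim=16, feat_dim=64):
        super().__init__()
        self.image_enc = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        self.force_enc = nn.Sequential(nn.Linear(force_dim, feat_dim), nn.ReLU())
        self.tactile_enc = nn.Sequential(nn.Linear(tactile_dim, feat_dim), nn.ReLU())
        self.fusion = nn.Linear(3 * feat_dim, feat_dim)

    def forward(self, image, force, tactile):
        # Encode each modality, concatenate, and project to a shared feature.
        feats = torch.cat(
            [self.image_enc(image), self.force_enc(force), self.tactile_enc(tactile)], dim=-1
        )
        return self.fusion(feats)
```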
arXiv Detail & Related papers (2021-06-04T12:49:30Z) - TANGO: Commonsense Generalization in Predicting Tool Interactions for
Mobile Manipulators [15.61285199988595]
We introduce TANGO, a novel neural model for predicting task-specific tool interactions.
TANGO encodes the world state, consisting of objects and the symbolic relationships between them, using a graph neural network.
We show that by augmenting the representation of the environment with pre-trained embeddings derived from a knowledge-base, the model can generalize effectively to novel environments.
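A rough sketch of that type of encoder: object nodes whose features are concatenated with pre-trained knowledge-base embeddings, typed edges for the symbolic relations, and one round of message passing. Every size, name, and the aggregation rule below is an illustrative assumption, not TANGO's actual design.

```python
import torch
import torch.nn as nn

class RelationalWorldEncoder(nn.Module):
    """Illustrative graph encoding of a world state: object nodes, symbolic
    relations as typed edges, and a single round of message passing."""

    def __init__(self, obj_dim=32, kb_dim=50, rel_types=8, hidden=64):
        super().__init__()
        self.rel_embed = nn.Embedding(rel_types, hidden)
        self.node_proj = nn.Linear(obj_dim + kb_dim, hidden)  # object + KB embedding
        self.msg = nn.Linear(3 * hidden, hidden)              # sender, receiver, relation
        self.update = nn.GRUCell(hidden, hidden)

    def forward(self, obj_feats, kb_embeds, edges, rel_ids):
        # obj_feats: (N, obj_dim), kb_embeds: (N, kb_dim)
        # edges: (E, 2) long tensor of sender/receiver indices, rel_ids: (E,) long tensor
        h = torch.relu(self.node_proj(torch.cat([obj_feats, kb_embeds], dim=-1)))
        send, recv = edges[:, 0], edges[:, 1]
        m = torch.relu(self.msg(torch.cat([h[send], h[recv], self.rel_embed(rel_ids)], dim=-1)))
        agg = torch.zeros_like(h).index_add_(0, recv, m)  # sum incoming messages per node
        return self.update(agg, h)                        # updated node states (N, hidden)
```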
arXiv Detail & Related papers (2021-05-05T18:11:57Z) - Learning Dexterous Grasping with Object-Centric Visual Affordances [86.49357517864937]
Dexterous robotic hands are appealing for their agility and human-like morphology.
We introduce an approach for learning dexterous grasping.
Our key idea is to embed an object-centric visual affordance model within a deep reinforcement learning loop.
arXiv Detail & Related papers (2020-09-03T04:00:40Z) - Human Grasp Classification for Reactive Human-to-Robot Handovers [50.91803283297065]
We propose an approach for human-to-robot handovers in which the robot meets the human halfway.
We collect a human grasp dataset which covers typical ways of holding objects with various hand shapes and poses.
We present a planning and execution approach that takes the object from the human hand according to the detected grasp and hand position.
arXiv Detail & Related papers (2020-03-12T19:58:03Z) - Learning Predictive Models From Observation and Interaction [137.77887825854768]
Learning predictive models from interaction with the world allows an agent, such as a robot, to learn about how the world works.
However, learning a model that captures the dynamics of complex skills represents a major challenge.
We propose a method to augment the training set with observational data of other agents, such as humans.
arXiv Detail & Related papers (2019-12-30T01:10:41Z)