How to select and use tools? : Active Perception of Target Objects Using
Multimodal Deep Learning
- URL: http://arxiv.org/abs/2106.02445v1
- Date: Fri, 4 Jun 2021 12:49:30 GMT
- Title: How to select and use tools? : Active Perception of Target Objects Using
Multimodal Deep Learning
- Authors: Namiko Saito, Tetsuya Ogata, Satoshi Funabashi, Hiroki Mori and
Shigeki Sugano
- Abstract summary: We focus on active perception using multimodal sensorimotor data while a robot interacts with objects.
We construct a deep neural network (DNN) model that learns to recognize object characteristics.
We also examine the contributions of images, force, and tactile data and show that learning a variety of multimodal information results in rich perception for tool use.
- Score: 9.677391628613025
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Selection of appropriate tools and use of them when performing daily tasks is
a critical function for introducing robots for domestic applications. In
previous studies, however, adaptability to target objects was limited, making
it difficult to change tools and adjust actions accordingly. To manipulate
various objects with tools, robots must both understand tool functions and
recognize object characteristics to discern a tool-object-action relation. We
focus on active perception using multimodal sensorimotor data while a robot
interacts with objects, and allow the robot to recognize their extrinsic and
intrinsic characteristics. We construct a deep neural network (DNN) model that
learns to recognize object characteristics, acquires tool-object-action
relations, and generates motions for tool selection and handling. As an example
tool-use situation, the robot performs an ingredients transfer task, using a
turner or ladle to transfer an ingredient from a pot to a bowl. The results
confirm that the robot recognizes object characteristics and servings even when
the target ingredients are unknown. We also examine the contributions of
images, force, and tactile data and show that learning a variety of multimodal
information results in rich perception for tool use.
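As a rough illustration of the kind of model the abstract describes, the sketch below fuses image, force, and tactile streams and recurrently generates the next motion command. All module names, dimensions, sensor sizes, and the overall layout are assumptions for illustration, not the authors' published architecture.

```python
# Minimal PyTorch sketch: encode image, force, and tactile inputs, fuse them,
# and let a recurrent core predict the next motor command. Purely illustrative.
import torch
import torch.nn as nn

class MultimodalToolUseModel(nn.Module):
    def __init__(self, img_feat=64, force_dim=6, tactile_dim=16,
                 hidden=128, motor_dim=8):
        super().__init__()
        # Small CNN encoder for camera images (3 x 64 x 64 assumed).
        self.image_enc = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, img_feat),
        )
        # MLP encoders for force/torque and tactile-array readings.
        self.force_enc = nn.Sequential(nn.Linear(force_dim, 32), nn.ReLU())
        self.tactile_enc = nn.Sequential(nn.Linear(tactile_dim, 32), nn.ReLU())
        # Recurrent core over the fused features; its hidden state plays the
        # role of the learned object-characteristics representation.
        self.rnn = nn.LSTMCell(img_feat + 32 + 32, hidden)
        # Decoder that outputs the next joint/motor command.
        self.motor_head = nn.Linear(hidden, motor_dim)

    def forward(self, image, force, tactile, state=None):
        z = torch.cat([self.image_enc(image),
                       self.force_enc(force),
                       self.tactile_enc(tactile)], dim=-1)
        h, c = self.rnn(z, state)
        return self.motor_head(h), (h, c)

# Usage: one time step of closed-loop prediction with dummy sensor data.
model = MultimodalToolUseModel()
motor, state = model(torch.randn(1, 3, 64, 64),   # image
                     torch.randn(1, 6),           # force/torque
                     torch.randn(1, 16))          # tactile
```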
Related papers
- Learning secondary tool affordances of human partners using iCub robot's egocentric data [2.583237671350984]
We address the problem of learning the secondary tool affordances of human partners.
We use the iCub robot to observe human partners with three cameras while they perform actions on twenty objects using four different tools.
Our results indicate that deep learning architectures enable the iCub robot to predict secondary tool affordances.
arXiv Detail & Related papers (2024-07-16T17:14:13Z)
- Interactive Learning of Physical Object Properties Through Robot Manipulation and Database of Object Measurements [20.301193437161867]
The framework involves exploratory action selection to maximize learning about objects on a table.
A robot pipeline integrates with a logging module and an online database of objects, containing over 24,000 measurements of 63 objects with different grippers.
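A hedged sketch of one common way to implement exploratory action selection that maximizes learning: pick the action whose predicted measurement outcome is currently most uncertain (highest entropy). The candidate actions and probabilities below are purely illustrative, not taken from the paper.

```python
# Choose the exploratory action expected to be most informative about an object.
import numpy as np

def entropy(p: np.ndarray) -> float:
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

# Hypothetical predicted outcome distributions for each candidate action.
predicted_outcomes = {
    "squeeze": np.array([0.5, 0.5]),   # very uncertain -> most informative
    "push":    np.array([0.9, 0.1]),
    "lift":    np.array([0.8, 0.2]),
}

best_action = max(predicted_outcomes, key=lambda a: entropy(predicted_outcomes[a]))
```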
arXiv Detail & Related papers (2024-04-10T20:59:59Z)
- Creative Robot Tool Use with Large Language Models [47.11935262923095]
This paper investigates the feasibility of imbuing robots with the ability to creatively use tools in tasks that involve implicit physical constraints and long-term planning.
We develop RoboTool, a system that accepts natural language instructions and outputs executable code for controlling robots in both simulated and real-world environments.
arXiv Detail & Related papers (2023-10-19T18:02:15Z)
- Learning Reward Functions for Robotic Manipulation by Observing Humans [92.30657414416527]
We use unlabeled videos of humans solving a wide range of manipulation tasks to learn a task-agnostic reward function for robotic manipulation policies.
The learned rewards are based on distances to a goal in an embedding space learned using a time-contrastive objective.
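A minimal sketch of the reward idea stated above, assuming an embedding network phi has been trained elsewhere (e.g., with a time-contrastive objective on human videos); the encoder and image size here are placeholders, not the paper's model.

```python
# Reward = negative distance to the goal image in a learned embedding space.
import torch
import torch.nn as nn

phi = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128))  # stand-in encoder

def embedding_reward(obs_image: torch.Tensor, goal_image: torch.Tensor) -> torch.Tensor:
    """Reward = -||phi(obs) - phi(goal)|| in the embedding space."""
    with torch.no_grad():
        return -torch.norm(phi(obs_image) - phi(goal_image), dim=-1)

reward = embedding_reward(torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64))
```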
arXiv Detail & Related papers (2022-11-16T16:26:48Z)
- Learning Tool Morphology for Contact-Rich Manipulation Tasks with Differentiable Simulation [27.462052737553055]
We present an end-to-end framework to automatically learn tool morphology for contact-rich manipulation tasks by leveraging differentiable physics simulators.
In our approach, we instead only need to define the objective with respect to task performance, and we enable learning of a robust morphology by randomizing the task variations.
We demonstrate the effectiveness of our method for designing new tools in several scenarios such as winding ropes, flipping a box and pushing peas onto a scoop in simulation.
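A toy sketch of the underlying idea of optimizing tool morphology by differentiating through a simulated task loss; the "simulator" below is a stand-in quadratic and the parameter is hypothetical, so it only illustrates gradient-based morphology optimization, not the paper's framework.

```python
# Optimize a tool shape parameter by backpropagating through a differentiable
# task objective (here a toy quadratic standing in for a physics simulator).
import torch

scoop_width = torch.tensor(0.05, requires_grad=True)   # hypothetical morphology parameter
optimizer = torch.optim.Adam([scoop_width], lr=0.01)

for _ in range(100):
    task_loss = (scoop_width - 0.12) ** 2               # differentiable "simulated" task loss
    optimizer.zero_grad()
    task_loss.backward()
    optimizer.step()
```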
arXiv Detail & Related papers (2022-11-04T00:57:36Z)
- Deep Active Visual Attention for Real-time Robot Motion Generation: Emergence of Tool-body Assimilation and Adaptive Tool-use [9.141661467673817]
This paper proposes a novel robot motion generation model, inspired by a human cognitive structure.
The model incorporates a state-driven active top-down visual attention module, which acquires attentions that can actively change targets based on task states.
The results suggested improved flexibility in the model's visual perception, which sustained stable attention and motion even when the robot was provided with untrained tools or exposed to the experimenter's distractions.
arXiv Detail & Related papers (2022-06-29T10:55:32Z)
- Synthesis and Execution of Communicative Robotic Movements with Generative Adversarial Networks [59.098560311521034]
We focus on how to transfer to two different robotic platforms the same kinematics modulation that humans adopt when manipulating delicate objects.
We choose to modulate the velocity profile adopted by the robots' end-effector, inspired by what humans do when transporting objects with different characteristics.
We exploit a novel Generative Adversarial Network architecture, trained with human kinematics examples, to generalize over them and generate new and meaningful velocity profiles.
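An illustrative sketch of a generator mapping noise to an end-effector velocity profile, in the spirit of the GAN-based synthesis summarized above; the architecture and profile length are assumptions, and the adversarial training loop is omitted.

```python
# Toy generator: map a noise vector to a non-negative end-effector speed profile.
import torch
import torch.nn as nn

generator = nn.Sequential(
    nn.Linear(16, 64), nn.ReLU(),
    nn.Linear(64, 50),            # 50 time steps of scalar end-effector speed
    nn.Softplus(),                # keep generated speeds non-negative
)

velocity_profile = generator(torch.randn(1, 16))   # shape (1, 50)
```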
arXiv Detail & Related papers (2022-03-29T15:03:05Z)
- INVIGORATE: Interactive Visual Grounding and Grasping in Clutter [56.00554240240515]
INVIGORATE is a robot system that interacts with humans through natural language and grasps a specified object in clutter.
We train separate neural networks for object detection, for visual grounding, for question generation, and for OBR detection and grasping.
We build a partially observable Markov decision process (POMDP) that integrates the learned neural network modules.
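A generic sketch of the kind of Bayesian belief update a POMDP-based pipeline might use to fuse noisy module outputs (e.g., grounding scores) over candidate objects; it is purely illustrative and not the INVIGORATE implementation.

```python
# Posterior over candidate target objects, updated with a module's likelihoods.
import numpy as np

def update_belief(belief: np.ndarray, likelihood: np.ndarray) -> np.ndarray:
    """Bayes rule: posterior is proportional to prior times observation likelihood."""
    posterior = belief * likelihood
    return posterior / posterior.sum()

belief = np.full(4, 0.25)                                   # uniform prior over 4 candidates
belief = update_belief(belief, np.array([0.7, 0.1, 0.1, 0.1]))
```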
arXiv Detail & Related papers (2021-08-25T07:35:21Z)
- Property-Aware Robot Object Manipulation: a Generative Approach [57.70237375696411]
In this work, we focus on how to generate robot motion adapted to the hidden properties of the manipulated objects.
We explore the possibility of leveraging Generative Adversarial Networks to synthesize new actions coherent with the properties of the object.
Our results show that Generative Adversarial Nets can be a powerful tool for the generation of novel and meaningful transportation actions.
arXiv Detail & Related papers (2021-06-08T14:15:36Z)
- TANGO: Commonsense Generalization in Predicting Tool Interactions for Mobile Manipulators [15.61285199988595]
We introduce TANGO, a novel neural model for predicting task-specific tool interactions.
TANGO encodes the world state, comprising objects and the symbolic relationships between them, using a graph neural network.
We show that by augmenting the representation of the environment with pre-trained embeddings derived from a knowledge-base, the model can generalize effectively to novel environments.
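A minimal sketch of encoding a scene graph (objects as nodes, symbolic relations as edges) with one round of message passing, as a graph-neural-network model like the one described above might; the feature sizes and the update rule are assumptions.

```python
# One GNN message-passing step over a tiny scene graph.
import numpy as np

def message_passing(node_feats: np.ndarray, adjacency: np.ndarray) -> np.ndarray:
    """Each node aggregates (mean) its neighbours' features and applies a residual update."""
    degree = np.maximum(adjacency.sum(axis=1, keepdims=True), 1)
    messages = adjacency @ node_feats / degree
    return np.tanh(node_feats + messages)

nodes = np.random.randn(3, 8)                          # 3 objects, 8-dim features
edges = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])    # symbolic relations as adjacency
encoded = message_passing(nodes, edges)
```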
arXiv Detail & Related papers (2021-05-05T18:11:57Z)
- Learning Predictive Models From Observation and Interaction [137.77887825854768]
Learning predictive models from interaction with the world allows an agent, such as a robot, to learn about how the world works.
However, learning a model that captures the dynamics of complex skills represents a major challenge.
We propose a method to augment the training set with observational data of other agents, such as humans.
arXiv Detail & Related papers (2019-12-30T01:10:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.