How to select and use tools? : Active Perception of Target Objects Using
Multimodal Deep Learning
- URL: http://arxiv.org/abs/2106.02445v1
- Date: Fri, 4 Jun 2021 12:49:30 GMT
- Title: How to select and use tools? : Active Perception of Target Objects Using
Multimodal Deep Learning
- Authors: Namiko Saito, Tetsuya Ogata, Satoshi Funabashi, Hiroki Mori and
Shigeki Sugano
- Abstract summary: We focus on active perception using multimodal sensorimotor data while a robot interacts with objects.
We construct a deep neural network (DNN) model that learns to recognize object characteristics.
We also examine the contributions of images, force, and tactile data and show that learning a variety of multimodal information results in rich perception for tool use.
- Score: 9.677391628613025
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Selection of appropriate tools and use of them when performing daily tasks is
a critical function for introducing robots for domestic applications. In
previous studies, however, adaptability to target objects was limited, making
it difficult to change tools and adjust actions accordingly. To manipulate
various objects with tools, robots must both understand tool functions and
recognize object characteristics to discern a tool-object-action relation. We
focus on active perception using multimodal sensorimotor data while a robot
interacts with objects, and allow the robot to recognize their extrinsic and
intrinsic characteristics. We construct a deep neural network (DNN) model that
learns to recognize object characteristics, acquires tool-object-action
relations, and generates motions for tool selection and handling. As an example
tool-use situation, the robot performs an ingredients transfer task, using a
turner or ladle to transfer an ingredient from a pot to a bowl. The results
confirm that the robot recognizes object characteristics and servings even when
the target ingredients are unknown. We also examine the contributions of
images, force, and tactile data and show that learning a variety of multimodal
information results in rich perception for tool use.
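As a rough illustration of the kind of model the abstract describes, the sketch below fuses image, force, and tactile streams and recurrently generates the next motion command. All module names, dimensions, sensor sizes, and the overall layout are assumptions for illustration, not the authors' published architecture.

```python
# Minimal PyTorch sketch: encode image, force, and tactile inputs, fuse them,
# and let a recurrent core predict the next motor command. Purely illustrative.
import torch
import torch.nn as nn

class MultimodalToolUseModel(nn.Module):
    def __init__(self, img_feat=64, force_dim=6, tactile_dim=16,
                 hidden=128, motor_dim=8):
        super().__init__()
        # Small CNN encoder for camera images (3 x 64 x 64 assumed).
        self.image_enc = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, img_feat),
        )
        # MLP encoders for force/torque and tactile-array readings.
        self.force_enc = nn.Sequential(nn.Linear(force_dim, 32), nn.ReLU())
        self.tactile_enc = nn.Sequential(nn.Linear(tactile_dim, 32), nn.ReLU())
        # Recurrent core over the fused features; its hidden state plays the
        # role of the learned object-characteristics representation.
        self.rnn = nn.LSTMCell(img_feat + 32 + 32, hidden)
        # Decoder that outputs the next joint/motor command.
        self.motor_head = nn.Linear(hidden, motor_dim)

    def forward(self, image, force, tactile, state=None):
        z = torch.cat([self.image_enc(image),
                       self.force_enc(force),
                       self.tactile_enc(tactile)], dim=-1)
        h, c = self.rnn(z, state)
        return self.motor_head(h), (h, c)

# Usage: one time step of closed-loop prediction with dummy sensor data.
model = MultimodalToolUseModel()
motor, state = model(torch.randn(1, 3, 64, 64),   # image
                     torch.randn(1, 6),           # force/torque
                     torch.randn(1, 16))          # tactile
```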
Related papers
- Learning secondary tool affordances of human partners using iCub robot's egocentric data [2.583237671350984]
We address the problem of learning the secondary tool affordances of human partners.
We use the iCub robot to observe human partners with three cameras while they perform actions on twenty objects using four different tools.
Our results indicate that deep learning architectures enable the iCub robot to predict secondary tool affordances.
arXiv Detail & Related papers (2024-07-16T17:14:13Z)
- Interactive Learning of Physical Object Properties Through Robot Manipulation and Database of Object Measurements [20.301193437161867]
The framework involves exploratory action selection to maximize learning about objects on a table.
A robot pipeline integrates with a logging module and an online database of objects, containing over 24,000 measurements of 63 objects with different grippers.
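A hedged sketch of one common way to implement exploratory action selection that maximizes learning: pick the action whose predicted measurement outcome is currently most uncertain (highest entropy). The candidate actions and probabilities below are purely illustrative, not taken from the paper.

```python
# Choose the exploratory action expected to be most informative about an object.
import numpy as np

def entropy(p: np.ndarray) -> float:
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

# Hypothetical predicted outcome distributions for each candidate action.
predicted_outcomes = {
    "squeeze": np.array([0.5, 0.5]),   # very uncertain -> most informative
    "push":    np.array([0.9, 0.1]),
    "lift":    np.array([0.8, 0.2]),
}

best_action = max(predicted_outcomes, key=lambda a: entropy(predicted_outcomes[a]))
```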
arXiv Detail & Related papers (2024-04-10T20:59:59Z)
- Creative Robot Tool Use with Large Language Models [47.11935262923095]
This paper investigates the feasibility of imbuing robots with the ability to creatively use tools in tasks that involve implicit physical constraints and long-term planning.
We develop RoboTool, a system that accepts natural language instructions and outputs executable code for controlling robots in both simulated and real-world environments.
arXiv Detail & Related papers (2023-10-19T18:02:15Z)
- Learning Reward Functions for Robotic Manipulation by Observing Humans [92.30657414416527]
We use unlabeled videos of humans solving a wide range of manipulation tasks to learn a task-agnostic reward function for robotic manipulation policies.
The learned rewards are based on distances to a goal in an embedding space learned using a time-contrastive objective.
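A minimal sketch of the reward idea stated above, assuming an embedding network phi has been trained elsewhere (e.g., with a time-contrastive objective on human videos); the encoder and image size here are placeholders, not the paper's model.

```python
# Reward = negative distance to the goal image in a learned embedding space.
import torch
import torch.nn as nn

phi = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128))  # stand-in encoder

def embedding_reward(obs_image: torch.Tensor, goal_image: torch.Tensor) -> torch.Tensor:
    """Reward = -||phi(obs) - phi(goal)|| in the embedding space."""
    with torch.no_grad():
        return -torch.norm(phi(obs_image) - phi(goal_image), dim=-1)

reward = embedding_reward(torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64))
```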
arXiv Detail & Related papers (2022-11-16T16:26:48Z)
- Learning Tool Morphology for Contact-Rich Manipulation Tasks with Differentiable Simulation [27.462052737553055]
We present an end-to-end framework to automatically learn tool morphology for contact-rich manipulation tasks by leveraging differentiable physics simulators.
In our approach, we instead only need to define the objective with respect to task performance, and we enable learning of a robust morphology by randomizing the task variations.
We demonstrate the effectiveness of our method for designing new tools in several scenarios such as winding ropes, flipping a box and pushing peas onto a scoop in simulation.
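A toy sketch of the underlying idea of optimizing tool morphology by differentiating through a simulated task loss; the "simulator" below is a stand-in quadratic and the parameter is hypothetical, so it only illustrates gradient-based morphology optimization, not the paper's framework.

```python
# Optimize a tool shape parameter by backpropagating through a differentiable
# task objective (here a toy quadratic standing in for a physics simulator).
import torch

scoop_width = torch.tensor(0.05, requires_grad=True)   # hypothetical morphology parameter
optimizer = torch.optim.Adam([scoop_width], lr=0.01)

for _ in range(100):
    task_loss = (scoop_width - 0.12) ** 2               # differentiable "simulated" task loss
    optimizer.zero_grad()
    task_loss.backward()
    optimizer.step()
```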
arXiv Detail & Related papers (2022-11-04T00:57:36Z)
- Deep Active Visual Attention for Real-time Robot Motion Generation: Emergence of Tool-body Assimilation and Adaptive Tool-use [9.141661467673817]
This paper proposes a novel robot motion generation model, inspired by a human cognitive structure.
The model incorporates a state-driven active top-down visual attention module, which acquires attentions that can actively change targets based on task states.
The results suggested improved flexibility in the model's visual perception, which sustained stable attention and motion even when the robot was provided with untrained tools or exposed to the experimenter's distractions.
arXiv Detail & Related papers (2022-06-29T10:55:32Z)
- Synthesis and Execution of Communicative Robotic Movements with Generative Adversarial Networks [59.098560311521034]
We focus on how to transfer to two different robotic platforms the same kinematics modulation that humans adopt when manipulating delicate objects.
We choose to modulate the velocity profile adopted by the robots' end-effector, inspired by what humans do when transporting objects with different characteristics.
We exploit a novel Generative Adversarial Network architecture, trained with human kinematics examples, to generalize over them and generate new and meaningful velocity profiles.
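An illustrative sketch of a generator mapping noise to an end-effector velocity profile, in the spirit of the GAN-based synthesis summarized above; the architecture and profile length are assumptions, and the adversarial training loop is omitted.

```python
# Toy generator: map a noise vector to a non-negative end-effector speed profile.
import torch
import torch.nn as nn

generator = nn.Sequential(
    nn.Linear(16, 64), nn.ReLU(),
    nn.Linear(64, 50),            # 50 time steps of scalar end-effector speed
    nn.Softplus(),                # keep generated speeds non-negative
)

velocity_profile = generator(torch.randn(1, 16))   # shape (1, 50)
```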
arXiv Detail & Related papers (2022-03-29T15:03:05Z)
- INVIGORATE: Interactive Visual Grounding and Grasping in Clutter [56.00554240240515]
INVIGORATE is a robot system that interacts with humans through natural language and grasps a specified object in clutter.
We train separate neural networks for object detection, for visual grounding, for question generation, and for OBR detection and grasping.
We build a partially observable Markov decision process (POMDP) that integrates the learned neural network modules.
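A generic sketch of the kind of Bayesian belief update a POMDP-based pipeline might use to fuse noisy module outputs (e.g., grounding scores) over candidate objects; it is purely illustrative and not the INVIGORATE implementation.

```python
# Posterior over candidate target objects, updated with a module's likelihoods.
import numpy as np

def update_belief(belief: np.ndarray, likelihood: np.ndarray) -> np.ndarray:
    """Bayes rule: posterior is proportional to prior times observation likelihood."""
    posterior = belief * likelihood
    return posterior / posterior.sum()

belief = np.full(4, 0.25)                                   # uniform prior over 4 candidates
belief = update_belief(belief, np.array([0.7, 0.1, 0.1, 0.1]))
```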
arXiv Detail & Related papers (2021-08-25T07:35:21Z)
- Property-Aware Robot Object Manipulation: a Generative Approach [57.70237375696411]
In this work, we focus on how to generate robot motion adapted to the hidden properties of the manipulated objects.
We explore the possibility of leveraging Generative Adversarial Networks to synthesize new actions coherent with the properties of the object.
Our results show that Generative Adversarial Nets can be a powerful tool for the generation of novel and meaningful transportation actions.
arXiv Detail & Related papers (2021-06-08T14:15:36Z)
- TANGO: Commonsense Generalization in Predicting Tool Interactions for Mobile Manipulators [15.61285199988595]
We introduce TANGO, a novel neural model for predicting task-specific tool interactions.
TANGO encodes the world state, comprising objects and the symbolic relationships between them, using a graph neural network.
We show that by augmenting the representation of the environment with pre-trained embeddings derived from a knowledge-base, the model can generalize effectively to novel environments.
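A minimal sketch of encoding a scene graph (objects as nodes, symbolic relations as edges) with one round of message passing, as a graph-neural-network model like the one described above might; the feature sizes and the update rule are assumptions.

```python
# One GNN message-passing step over a tiny scene graph.
import numpy as np

def message_passing(node_feats: np.ndarray, adjacency: np.ndarray) -> np.ndarray:
    """Each node aggregates (mean) its neighbours' features and applies a residual update."""
    degree = np.maximum(adjacency.sum(axis=1, keepdims=True), 1)
    messages = adjacency @ node_feats / degree
    return np.tanh(node_feats + messages)

nodes = np.random.randn(3, 8)                          # 3 objects, 8-dim features
edges = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])    # symbolic relations as adjacency
encoded = message_passing(nodes, edges)
```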
arXiv Detail & Related papers (2021-05-05T18:11:57Z)
- Learning Predictive Models From Observation and Interaction [137.77887825854768]
Learning predictive models from interaction with the world allows an agent, such as a robot, to learn about how the world works.
However, learning a model that captures the dynamics of complex skills represents a major challenge.
We propose a method to augment the training set with observational data of other agents, such as humans.
arXiv Detail & Related papers (2019-12-30T01:10:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.