TANGO: Commonsense Generalization in Predicting Tool Interactions for
Mobile Manipulators
- URL: http://arxiv.org/abs/2105.04556v1
- Date: Wed, 5 May 2021 18:11:57 GMT
- Title: TANGO: Commonsense Generalization in Predicting Tool Interactions for
Mobile Manipulators
- Authors: Shreshth Tuli and Rajas Bansal and Rohan Paul and Mausam
- Abstract summary: We introduce TANGO, a novel neural model for predicting task-specific tool interactions.
TANGO encodes the world state, comprising objects and the symbolic relationships between them, using a graph neural network.
We show that by augmenting the representation of the environment with pre-trained embeddings derived from a knowledge-base, the model can generalize effectively to novel environments.
- Score: 15.61285199988595
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Robots assisting us in factories or homes must learn to make use of objects
as tools to perform tasks, e.g., a tray for carrying objects. We consider the
problem of learning commonsense knowledge of when a tool may be useful and how
its use may be composed with other tools to accomplish a high-level task
instructed by a human. We introduce TANGO, a novel neural model for predicting
task-specific tool interactions. TANGO is trained using demonstrations obtained
from human teachers instructing a virtual robot in a physics simulator. TANGO
encodes the world state, comprising objects and the symbolic relationships
between them, using a graph neural network. The model learns to attend over the
scene using knowledge of the goal and the action history, finally decoding the
symbolic action to execute. Crucially, we address generalization to unseen
environments where some known tools are missing, but alternative unseen tools
are present. We show that by augmenting the representation of the environment
with pre-trained embeddings derived from a knowledge-base, the model can
generalize effectively to novel environments. Experimental results show a
60.5-78.9% improvement over the baseline in predicting successful symbolic
plans in unseen settings for a simulated mobile manipulator.
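As a rough illustration of the pipeline the abstract describes, the following minimal PyTorch sketch (our illustration, not the authors' released code; all module names, dimensions, and the single message-passing round are assumptions) combines the three named ingredients: object features augmented with pre-trained knowledge-base embeddings, message passing over symbolic relations, and goal/history-conditioned attention feeding a symbolic-action head.

```python
import torch
import torch.nn as nn

class ToolInteractionSketch(nn.Module):
    """Illustrative sketch (not the authors' code) of the pipeline the
    abstract describes: GNN scene encoding -> goal/history attention ->
    symbolic action decoding."""

    def __init__(self, obj_dim, kb_dim, hid, n_actions):
        super().__init__()
        # Object features are augmented with pre-trained knowledge-base
        # embeddings (e.g., ConceptNet-style vectors) before encoding.
        self.embed = nn.Linear(obj_dim + kb_dim, hid)
        self.msg = nn.Linear(2 * hid, hid)       # one message-passing round
        self.goal = nn.Linear(hid, hid)          # goal/history conditioning
        self.attn = nn.MultiheadAttention(hid, num_heads=4, batch_first=True)
        self.decode = nn.Linear(hid, n_actions)  # scores over symbolic actions

    def forward(self, obj_feats, kb_embeds, adj, goal_hist):
        # obj_feats: (N, obj_dim), kb_embeds: (N, kb_dim),
        # adj: (N, N) 0/1 symbolic-relation matrix, goal_hist: (hid,)
        h = torch.relu(self.embed(torch.cat([obj_feats, kb_embeds], dim=-1)))
        # Aggregate neighbour messages along symbolic relations.
        neigh = adj @ h / adj.sum(-1, keepdim=True).clamp(min=1)
        h = torch.relu(self.msg(torch.cat([h, neigh], dim=-1)))
        # Attend over the scene with the goal/action-history as the query.
        q = torch.relu(self.goal(goal_hist)).view(1, 1, -1)
        ctx, _ = self.attn(q, h.unsqueeze(0), h.unsqueeze(0))
        return self.decode(ctx.squeeze())        # logits over actions

model = ToolInteractionSketch(obj_dim=16, kb_dim=300, hid=64, n_actions=10)
feats, kb = torch.randn(5, 16), torch.randn(5, 300)
adj = (torch.rand(5, 5) > 0.5).float()
print(model(feats, kb, adj, torch.randn(64)).shape)  # torch.Size([10])
```

A full decoder would produce a structured symbolic action (an action applied to specific objects) rather than the flat logit head used here for brevity.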
Related papers
- Symbolic Learning Enables Self-Evolving Agents [55.625275970720374]
We introduce agent symbolic learning, a systematic framework that enables language agents to optimize themselves.
Agent symbolic learning is designed to optimize the symbolic network within language agents by mimicking two fundamental algorithms in connectionist learning.
We conduct proof-of-concept experiments on both standard benchmarks and complex real-world tasks.
arXiv Detail & Related papers (2024-06-26T17:59:18Z)
- Interactive Planning Using Large Language Models for Partially Observable Robotics Tasks [54.60571399091711]
Large Language Models (LLMs) have achieved impressive results in creating robotic agents that perform open-vocabulary tasks.
We present an interactive planning technique that uses LLMs for partially observable tasks; a sketch of such a plan-act-observe loop follows this entry.
arXiv Detail & Related papers (2023-12-11T22:54:44Z)
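For intuition only, here is a minimal observe-plan-act loop of the kind such interactive planners use. The `llm` function is a stub standing in for a real model call, and `StubEnv` is a toy environment; neither reflects the paper's actual prompts or interface.

```python
def llm(prompt: str) -> str:
    """Placeholder for a real LLM call; here it just asks to look around."""
    return "explore" if "unknown" in prompt else "pick_up(cup)"

def interactive_plan(env, goal: str, max_steps: int = 10) -> list[str]:
    """Replan after every observation so missing state can be gathered."""
    history: list[str] = []
    for _ in range(max_steps):
        obs = env.observe()                     # partial observation
        prompt = (f"Goal: {goal}\nObservation: {obs}\n"
                  f"History: {history}\nNext action:")
        action = llm(prompt)                    # may be info-gathering
        history.append(action)
        if env.execute(action):                 # True when goal reached
            break
    return history

class StubEnv:
    def __init__(self): self.seen = False
    def observe(self): return "cup visible" if self.seen else "unknown scene"
    def execute(self, action):
        self.seen = True
        return action.startswith("pick_up")

print(interactive_plan(StubEnv(), "fetch the cup"))
```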
- Learning Generalizable Tool-use Skills through Trajectory Generation [13.879860388944214]
We train a single model on four different deformable object manipulation tasks.
The model generalizes to various novel tools, significantly outperforming baselines.
We further test our trained policy in the real world with unseen tools, where it achieves performance comparable to a human.
arXiv Detail & Related papers (2023-09-29T21:32:42Z)
- Learning Generalizable Tool Use with Non-rigid Grasp-pose Registration [29.998917158604694]
We present a novel method to enable reinforcement learning of tool use behaviors.
Our approach provides a scalable way to learn the operation of tools in a new category using only a single demonstration.
The learned policies solve complex tool use tasks and generalize to unseen tools at test time.
arXiv Detail & Related papers (2023-07-31T08:49:11Z)
- Tool Learning with Foundation Models [158.8640687353623]
With the advent of foundation models, AI systems have the potential to become as adept at tool use as humans.
Despite this immense potential, the field still lacks a comprehensive understanding of its key challenges, opportunities, and future directions.
arXiv Detail & Related papers (2023-04-17T15:16:10Z)
- Planning for Learning Object Properties [117.27898922118946]
We formalize the problem of automatically training a neural network to recognize object properties as a symbolic planning problem.
We use planning techniques to produce a strategy for automating the training dataset creation and the learning process.
We provide an experimental evaluation in both a simulated and a real environment.
arXiv Detail & Related papers (2023-01-15T09:37:55Z)
- Learning Tool Morphology for Contact-Rich Manipulation Tasks with Differentiable Simulation [27.462052737553055]
We present an end-to-end framework that automatically learns tool morphology for contact-rich manipulation tasks by leveraging differentiable physics simulators.
Our approach only requires defining an objective with respect to task performance; a robust morphology is learned by randomizing the task variations during optimization.
We demonstrate the effectiveness of our method for designing new tools in several simulated scenarios, such as winding ropes, flipping a box, and pushing peas onto a scoop; a sketch of this optimization loop follows this entry.
arXiv Detail & Related papers (2022-11-04T00:57:36Z)
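As a hedged illustration of this idea (not the paper's implementation), the sketch below optimizes a tool-morphology parameter by back-propagating a task loss through a toy differentiable "simulator", randomizing the task each step for robustness; the quadratic stand-in physics and all names are assumptions.

```python
import torch

# Toy differentiable "simulator": returns a task loss given tool morphology
# parameters and a sampled task variation. A real setup would use a
# differentiable physics engine instead of this quadratic stand-in.
def simulate(morphology: torch.Tensor, task: torch.Tensor) -> torch.Tensor:
    return ((morphology - task) ** 2).sum()

morphology = torch.randn(4, requires_grad=True)   # e.g., lengths/angles
opt = torch.optim.Adam([morphology], lr=0.05)

for step in range(200):
    task = torch.randn(4) * 0.1 + 1.0   # randomized task variation
    loss = simulate(morphology, task)   # objective = task performance only
    opt.zero_grad()
    loss.backward()                     # gradients flow through the sim
    opt.step()

print(morphology.detach())              # converges near the mean task (~1.0)
```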
- How to select and use tools?: Active Perception of Target Objects Using Multimodal Deep Learning [9.677391628613025]
We focus on active perception using multimodal sensorimotor data while a robot interacts with objects.
We construct a deep neural network (DNN) model that learns to recognize object characteristics.
We also examine the contributions of image, force, and tactile data, and show that learning from a variety of multimodal information results in rich perception for tool use; a sketch of such a fusion model follows this entry.
arXiv Detail & Related papers (2021-06-04T12:49:30Z)
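To make the fusion idea concrete, here is a minimal PyTorch sketch (our illustration, not the paper's architecture) that encodes image, force, and tactile streams separately and concatenates them for a shared prediction head; all dimensions and names are assumptions.

```python
import torch
import torch.nn as nn

class MultimodalPerception(nn.Module):
    """Illustrative fusion of image, force, and tactile encodings."""
    def __init__(self, n_classes=8):
        super().__init__()
        self.img = nn.Sequential(nn.Conv2d(3, 8, 3, stride=2), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.force = nn.Sequential(nn.Linear(6, 16), nn.ReLU())    # wrench
        self.tactile = nn.Sequential(nn.Linear(12, 16), nn.ReLU()) # taxels
        self.head = nn.Linear(8 + 16 + 16, n_classes)  # fused features

    def forward(self, image, force, tactile):
        fused = torch.cat([self.img(image), self.force(force),
                           self.tactile(tactile)], dim=-1)
        return self.head(fused)  # object-characteristic logits

net = MultimodalPerception()
out = net(torch.randn(2, 3, 32, 32), torch.randn(2, 6), torch.randn(2, 12))
print(out.shape)  # torch.Size([2, 8])
```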
- myGym: Modular Toolkit for Visuomotor Robotic Tasks [0.0]
myGym is a novel virtual robotic toolkit developed for reinforcement learning (RL), intrinsic motivation, and imitation learning tasks trained in a 3D simulator.
The modular structure of the simulator enables users to train and validate their algorithms on a large number of scenarios with various robots, environments, and tasks.
The toolkit provides pretrained visual modules for visuomotor tasks, allowing rapid prototyping; moreover, users can customize the visual submodules and retrain them with their own set of objects.
arXiv Detail & Related papers (2020-12-21T19:15:05Z)
- Visual Imitation Made Easy [102.36509665008732]
We present an alternate interface for imitation that simplifies the data collection process while allowing for easy transfer to robots.
We use commercially available reacher-grabber assistive tools both as data collection devices and as the robot's end-effector.
We experimentally evaluate the approach on two challenging tasks: non-prehensile pushing and prehensile stacking, with 1000 diverse demonstrations for each task.
arXiv Detail & Related papers (2020-08-11T17:58:50Z)
- Learning Predictive Models From Observation and Interaction [137.77887825854768]
Learning predictive models from interaction with the world allows an agent, such as a robot, to learn about how the world works.
However, learning a model that captures the dynamics of complex skills represents a major challenge.
We propose a method to augment the training set with observational data of other agents, such as humans; a sketch of this idea follows this entry.
arXiv Detail & Related papers (2019-12-30T01:10:41Z)
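A minimal sketch, under our own assumptions, of how observation-only data can join interaction data: when actions are missing (e.g., in human videos), an inverse model infers them, and the same forward dynamics model trains on both streams. All names, dimensions, and the synthetic data are illustrative, not the paper's method.

```python
import torch
import torch.nn as nn

state_dim, action_dim = 8, 2
forward_model = nn.Linear(state_dim + action_dim, state_dim)  # (s, a) -> s'
inverse_model = nn.Linear(2 * state_dim, action_dim)          # (s, s') -> a
opt = torch.optim.Adam([*forward_model.parameters(),
                        *inverse_model.parameters()], lr=1e-3)

def loss_on(s, s_next, a=None):
    if a is None:                       # observation-only (e.g., human) data:
        a = inverse_model(torch.cat([s, s_next], -1))  # infer missing action
    pred = forward_model(torch.cat([s, a], -1))
    return ((pred - s_next) ** 2).mean()

for step in range(100):
    # Interaction batch (robot data with actions) + observation-only batch.
    s, a = torch.randn(16, state_dim), torch.randn(16, action_dim)
    s_next = s + 0.1 * nn.functional.pad(a, (0, state_dim - action_dim))
    so, so_next = torch.randn(16, state_dim), torch.randn(16, state_dim)
    loss = loss_on(s, s_next, a) + loss_on(so, so_next)
    opt.zero_grad()
    loss.backward()
    opt.step()

print(float(loss))
```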
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.