The ThreeDWorld Transport Challenge: A Visually Guided Task-and-Motion
Planning Benchmark for Physically Realistic Embodied AI
- URL: http://arxiv.org/abs/2103.14025v1
- Date: Thu, 25 Mar 2021 17:59:08 GMT
- Title: The ThreeDWorld Transport Challenge: A Visually Guided Task-and-Motion
Planning Benchmark for Physically Realistic Embodied AI
- Authors: Chuang Gan, Siyuan Zhou, Jeremy Schwartz, Seth Alter, Abhishek
Bhandwaldar, Dan Gutfreund, Daniel L.K. Yamins, James J DiCarlo, Josh
McDermott, Antonio Torralba, Joshua B. Tenenbaum
- Abstract summary: We introduce a visually-guided and physics-driven task-and-motion planning benchmark, which we call the ThreeDWorld Transport Challenge.
In this challenge, an embodied agent equipped with two 9-DOF articulated arms is spawned randomly in a simulated physical home environment.
The agent is required to find a small set of objects scattered around the house, pick them up, and transport them to a desired final location.
- Score: 96.86091264553613
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a visually-guided and physics-driven task-and-motion planning
benchmark, which we call the ThreeDWorld Transport Challenge. In this
challenge, an embodied agent equipped with two 9-DOF articulated arms is
spawned randomly in a simulated physical home environment. The agent is
required to find a small set of objects scattered around the house, pick them
up, and transport them to a desired final location. We also position containers
around the house that can be used as tools to assist with transporting objects
efficiently. To complete the task, an embodied agent must plan a sequence of
actions to change the state of a large number of objects in the face of
realistic physical constraints. We build this benchmark challenge using the
ThreeDWorld simulation: a virtual 3D environment where all objects respond to
physics, and where agents can be controlled using a fully physics-driven
navigation and interaction API. We evaluate several existing agents on this benchmark.
Experimental results suggest that: 1) a pure RL model struggles on this
challenge; 2) hierarchical planning-based agents can transport some objects but
are still far from solving this task. We anticipate that this benchmark will
empower researchers to develop more intelligent physics-driven robots for the
physical world.
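The hierarchical planning baselines mentioned above separate high-level task planning (which object to fetch next, where to deliver it) from low-level motion. A minimal sketch of that decomposition, on a toy grid abstraction of the house rather than TDW's physics-driven API, might look as follows; all names here are illustrative, not the paper's implementation:

```python
# Hypothetical sketch of a hierarchical transport agent: a high-level
# planner sequences sub-goals (reach object -> pick -> carry to goal),
# while a low-level planner (BFS here) produces the motion between them.
from collections import deque

def shortest_path(grid, start, goal):
    """BFS over free cells (0 = free, 1 = obstacle); returns the cell
    sequence from start to goal, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])
    frontier = deque([start])
    came_from = {start: None}
    while frontier:
        cell = frontier.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in came_from):
                came_from[(nr, nc)] = cell
                frontier.append((nr, nc))
    return None

def transport_plan(grid, agent, objects, goal):
    """High level: visit each object in turn, then carry it to the goal.
    Emits a flat list of ("move", cell), ("pick", obj), ("place", goal)."""
    actions = []
    pos = agent
    for obj in objects:
        leg = shortest_path(grid, pos, obj)
        actions += [("move", cell) for cell in leg[1:]] + [("pick", obj)]
        leg = shortest_path(grid, obj, goal)
        actions += [("move", cell) for cell in leg[1:]] + [("place", goal)]
        pos = goal
    return actions
```

In the actual challenge the low-level layer must contend with physics (grasp stability, collisions, container dynamics) rather than grid moves, which is precisely where the evaluated baselines fall short.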
Related papers
- M3Bench: Benchmarking Whole-body Motion Generation for Mobile Manipulation in 3D Scenes [66.44171200767839]
We propose M3Bench, a new benchmark of whole-body motion generation for mobile manipulation tasks.
M3Bench requires an embodied agent to understand its configuration, environmental constraints and task objectives.
M3Bench features 30k object rearrangement tasks across 119 diverse scenes, providing expert demonstrations generated by our newly developed M3BenchMaker.
arXiv Detail & Related papers (2024-10-09T08:38:21Z)
- PlaMo: Plan and Move in Rich 3D Physical Environments [68.75982381673869]
We present PlaMo, a scene-aware path planner and a robust physics-based controller.
The planner produces a sequence of motion paths, considering the various limitations the scene imposes on the motion.
Our control policy generates rich and realistic physical motion adhering to the plan.
arXiv Detail & Related papers (2024-06-26T10:41:07Z)
- Planning for Complex Non-prehensile Manipulation Among Movable Objects by Interleaving Multi-Agent Pathfinding and Physics-Based Simulation [23.62057790524675]
Real-world manipulation problems in heavy clutter require robots to reason about potential contacts with objects in the environment.
We focus on pick-and-place style tasks to retrieve a target object from a shelf where some 'movable' objects must be rearranged in order to solve the task.
In particular, our motivation is to allow the robot to reason over and consider non-prehensile rearrangement actions that lead to complex robot-object and object-object interactions.
arXiv Detail & Related papers (2023-03-23T15:29:27Z)
- Out of the Box: Embodied Navigation in the Real World [45.97756658635314]
We show how to transfer knowledge acquired in simulation into the real world.
We deploy our models on a LoCoBot equipped with a single Intel RealSense camera.
Our experiments indicate that it is possible to achieve satisfying results when deploying the obtained model in the real world.
arXiv Detail & Related papers (2021-05-12T18:00:14Z)
- GeoSim: Photorealistic Image Simulation with Geometry-Aware Composition [81.24107630746508]
We present GeoSim, a geometry-aware image composition process that synthesizes novel urban driving scenes.
We first build a diverse bank of 3D objects with both realistic geometry and appearance from sensor data.
The resulting synthetic images are photorealistic, traffic-aware, and geometrically consistent, allowing image simulation to scale to complex use cases.
arXiv Detail & Related papers (2021-01-16T23:00:33Z)
- Kinematics-Guided Reinforcement Learning for Object-Aware 3D Ego-Pose Estimation [25.03715978502528]
We propose a method for incorporating object interaction and human body dynamics into the task of 3D ego-pose estimation.
We use a kinematics model of the human body to represent the entire range of human motion, and a dynamics model of the body to interact with objects inside a physics simulator.
This is the first work to estimate a physically valid 3D full-body interaction sequence with objects from egocentric videos.
arXiv Detail & Related papers (2020-11-10T00:06:43Z)
- ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation [75.0278287071591]
ThreeDWorld (TDW) is a platform for interactive multi-modal physical simulation.
TDW enables simulation of high-fidelity sensory data and physical interactions between mobile agents and objects in rich 3D environments.
We present initial experiments enabled by TDW in emerging research directions in computer vision, machine learning, and cognitive science.
arXiv Detail & Related papers (2020-07-09T17:33:27Z)
- LiDARsim: Realistic LiDAR Simulation by Leveraging the Real World [84.57894492587053]
We develop a novel simulator that captures both the power of physics-based and learning-based simulation.
We first utilize ray casting over the 3D scene and then use a deep neural network to produce deviations from the physics-based simulation.
We showcase LiDARsim's usefulness for perception algorithms-testing on long-tail events and end-to-end closed-loop evaluation on safety-critical scenarios.
arXiv Detail & Related papers (2020-06-16T17:44:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.