Habitat 2.0: Training Home Assistants to Rearrange their Habitat
- URL: http://arxiv.org/abs/2106.14405v1
- Date: Mon, 28 Jun 2021 05:42:15 GMT
- Title: Habitat 2.0: Training Home Assistants to Rearrange their Habitat
- Authors: Andrew Szot, Alex Clegg, Eric Undersander, Erik Wijmans, Yili Zhao,
John Turner, Noah Maestre, Mustafa Mukadam, Devendra Chaplot, Oleksandr
Maksymets, Aaron Gokaslan, Vladimir Vondrus, Sameer Dharur, Franziska Meier,
Wojciech Galuba, Angel Chang, Zsolt Kira, Vladlen Koltun, Jitendra Malik,
Manolis Savva, Dhruv Batra
- Abstract summary: We introduce Habitat 2.0 (H2.0), a simulation platform for training virtual robots in interactive 3D environments.
We make contributions to all levels of the embodied AI stack - data, simulation, and benchmark tasks.
- Score: 122.54624752876276
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce Habitat 2.0 (H2.0), a simulation platform for training virtual
robots in interactive 3D environments and complex physics-enabled scenarios. We
make comprehensive contributions to all levels of the embodied AI stack - data,
simulation, and benchmark tasks. Specifically, we present: (i) ReplicaCAD: an
artist-authored, annotated, reconfigurable 3D dataset of apartments (matching
real spaces) with articulated objects (e.g. cabinets and drawers that can
open/close); (ii) H2.0: a high-performance physics-enabled 3D simulator with
speeds exceeding 25,000 simulation steps per second (850x real-time) on an
8-GPU node, representing 100x speed-ups over prior work; and, (iii) Home
Assistant Benchmark (HAB): a suite of common tasks for assistive robots (tidy
the house, prepare groceries, set the table) that test a range of mobile
manipulation capabilities. These large-scale engineering contributions allow us
to systematically compare deep reinforcement learning (RL) at scale and
classical sense-plan-act (SPA) pipelines in long-horizon structured tasks, with
an emphasis on generalization to new objects, receptacles, and layouts. We find
that (1) flat RL policies struggle on HAB compared to hierarchical ones; (2) a
hierarchy with independent skills suffers from 'hand-off problems'; and (3) SPA
pipelines are more brittle than RL policies.
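The headline throughput numbers above (25,000 simulation steps per second, 850x real-time) come from benchmarking a simulator's step loop against wall-clock time. A minimal sketch of that measurement, using a hypothetical `ToySim` placeholder rather than the actual Habitat 2.0 API:

```python
import time


class ToySim:
    """Placeholder simulator: one 'physics step' just advances a counter.

    A real simulator (e.g. Habitat 2.0) would integrate rigid-body and
    articulated-object physics and render sensor observations here.
    """

    def __init__(self):
        self.t = 0

    def step(self, action=None):
        self.t += 1
        return self.t


def steps_per_second(sim, n_steps=100_000):
    """Time a tight stepping loop and report raw steps per second."""
    start = time.perf_counter()
    for _ in range(n_steps):
        sim.step()
    elapsed = time.perf_counter() - start
    return n_steps / elapsed


sps = steps_per_second(ToySim())
# At a 30 Hz control rate, real-time factor = SPS / 30,
# so 850x real-time corresponds to roughly 25,500 steps/sec.
print(f"{sps:,.0f} steps/sec ≈ {sps / 30:.0f}x real-time at 30 Hz")
```

The 100x speed-up over prior work reported in the abstract is measured on an 8-GPU node with many simulator instances stepping in parallel; this single-process loop only illustrates how the metric itself is computed.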
Related papers
- RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots [25.650235551519952]
We present RoboCasa, a large-scale simulation framework for training generalist robots in everyday environments.
We provide thousands of 3D assets across over 150 object categories and dozens of interactable furniture and appliances.
Our experiments show a clear scaling trend in using synthetically generated robot data for large-scale imitation learning.
arXiv Detail & Related papers (2024-06-04T17:41:31Z)
- SPOC: Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World [46.02807945490169]
We show that imitating shortest-path planners in simulation produces agents that can proficiently navigate, explore, and manipulate objects both in simulation and in the real world using only RGB sensors (no depth maps or GPS coordinates).
This surprising result is enabled by SPOC, our end-to-end transformer-based architecture, which pairs powerful visual encoders with extensive image augmentation.
arXiv Detail & Related papers (2023-12-05T18:59:45Z)
- ManiSkill2: A Unified Benchmark for Generalizable Manipulation Skills [24.150758623016195]
We present ManiSkill2, the next generation of the SAPIEN ManiSkill benchmark for generalizable manipulation skills.
ManiSkill2 includes 20 manipulation task families with 2000+ object models and 4M+ demonstration frames.
It defines a unified interface and evaluation protocol to support a wide range of algorithms.
It supports fast visual-input learning: a CNN-based policy can collect samples at about 2,000 FPS.
arXiv Detail & Related papers (2023-02-09T14:24:01Z)
- Parallel Reinforcement Learning Simulation for Visual Quadrotor Navigation [4.597465975849579]
Reinforcement learning (RL) is an agent-based approach for teaching robots to navigate within the physical world.
We present a simulation framework, built on AirSim, which provides efficient parallel training.
Building on this framework, Ape-X is modified to incorporate decentralised training of AirSim environments.
arXiv Detail & Related papers (2022-09-22T15:27:42Z)
- ProcTHOR: Large-Scale Embodied AI Using Procedural Generation [55.485985317538194]
ProcTHOR is a framework for procedural generation of Embodied AI environments.
We demonstrate state-of-the-art results across 6 embodied AI benchmarks for navigation, rearrangement, and arm manipulation.
arXiv Detail & Related papers (2022-06-14T17:09:35Z)
- BEHAVIOR in Habitat 2.0: Simulator-Independent Logical Task Description for Benchmarking Embodied AI Agents [31.499374840833124]
We bring a subset of BEHAVIOR activities into Habitat 2.0 to benefit from its fast simulation speed.
Inspired by the catalyzing effect that benchmarks have had in other AI fields, the community is looking for new benchmarks for embodied AI.
arXiv Detail & Related papers (2022-06-13T21:37:31Z)
- Megaverse: Simulating Embodied Agents at One Million Experiences per Second [75.1191260838366]
We present Megaverse, a new 3D simulation platform for reinforcement learning and embodied AI research.
Megaverse is up to 70x faster than DeepMind Lab in fully-shaded 3D scenes with interactive objects.
We use Megaverse to build a new benchmark that consists of several single-agent and multi-agent tasks.
arXiv Detail & Related papers (2021-07-17T03:16:25Z)
- The ThreeDWorld Transport Challenge: A Visually Guided Task-and-Motion Planning Benchmark for Physically Realistic Embodied AI [96.86091264553613]
We introduce a visually-guided and physics-driven task-and-motion planning benchmark, which we call the ThreeDWorld Transport Challenge.
In this challenge, an embodied agent equipped with two 9-DOF articulated arms is spawned randomly in a simulated physical home environment.
The agent is required to find a small set of objects scattered around the house, pick them up, and transport them to a desired final location.
arXiv Detail & Related papers (2021-03-25T17:59:08Z)
- Large Batch Simulation for Deep Reinforcement Learning [101.01408262583378]
We accelerate deep reinforcement learning-based training in visually complex 3D environments by two orders of magnitude over prior work.
We realize end-to-end training speeds of over 19,000 frames of experience per second on a single GPU and up to 72,000 frames per second on a single eight-GPU machine.
By combining batch simulation and performance optimizations, we demonstrate that PointGoal navigation agents can be trained in complex 3D environments on a single GPU in 1.5 days to 97% of the accuracy of agents trained on a prior state-of-the-art system.
arXiv Detail & Related papers (2021-03-12T00:22:50Z)
- Reactive Long Horizon Task Execution via Visual Skill and Precondition Models [59.76233967614774]
We describe an approach for sim-to-real training that can accomplish unseen robotic tasks using models learned in simulation to ground components of a simple task planner.
We show an increase in success rate from 91.6% to 98% in simulation, and from 10% to 80% in the real world, compared with naive baselines.
arXiv Detail & Related papers (2020-11-17T15:24:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.