Phone2Proc: Bringing Robust Robots Into Our Chaotic World
- URL: http://arxiv.org/abs/2212.04819v1
- Date: Thu, 8 Dec 2022 18:52:27 GMT
- Title: Phone2Proc: Bringing Robust Robots Into Our Chaotic World
- Authors: Matt Deitke, Rose Hendrix, Luca Weihs, Ali Farhadi, Kiana Ehsani,
Aniruddha Kembhavi
- Abstract summary: Phone2Proc is a method that uses a 10-minute phone scan and conditional procedural generation to create a distribution of training scenes.
The generated scenes are conditioned on the wall layout and arrangement of large objects from the scan.
Phone2Proc improves sim-to-real ObjectNav success rates from 34.7% to 70.7%.
- Score: 50.51598304564075
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training embodied agents in simulation has become mainstream for the embodied
AI community. However, these agents often struggle when deployed in the
physical world due to their inability to generalize to real-world environments.
In this paper, we present Phone2Proc, a method that uses a 10-minute phone scan
and conditional procedural generation to create a distribution of training
scenes that are semantically similar to the target environment. The generated
scenes are conditioned on the wall layout and arrangement of large objects from
the scan, while also sampling lighting, clutter, surface textures, and
instances of smaller objects with randomized placement and materials.
Leveraging just a simple RGB camera, training with Phone2Proc shows massive
improvements from 34.7% to 70.7% success rate in sim-to-real ObjectNav
performance across a test suite of over 200 trials in diverse real-world
environments, including homes, offices, and RoboTHOR. Furthermore, Phone2Proc's
diverse distribution of generated scenes makes agents remarkably robust to
changes in the real world, such as human movement, object rearrangement,
lighting changes, or clutter.
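To make the conditioning described in the abstract concrete, the sketch below illustrates the split between what the phone scan pins down (wall layout, large-object arrangement) and what procedural generation resamples per scene (lighting, textures, clutter, small objects). This is a minimal illustration only, not the authors' code: all names (`generate_scene`, the scan fields, the texture and object lists) are hypothetical, though the paper's pipeline operates in the AI2-THOR ecosystem.

```python
import random

# Hypothetical sketch of conditional procedural generation.
# The scanned layout stays fixed across samples; everything
# else is randomized per generated training scene.

WALL_TEXTURES = ["plaster_white", "paint_beige", "wallpaper_gray"]
SMALL_OBJECTS = ["mug", "book", "laptop", "plant", "bowl"]

def generate_scene(scan, rng=random):
    """Sample one training scene conditioned on a phone scan."""
    return {
        # Conditioned (held fixed): walls and large furniture from the scan.
        "walls": scan["wall_layout"],
        "large_objects": scan["large_object_poses"],
        # Randomized per sample: appearance, lighting, and clutter.
        "wall_texture": rng.choice(WALL_TEXTURES),
        "light_intensity": rng.uniform(0.4, 1.2),
        "clutter": [
            {
                "type": rng.choice(SMALL_OBJECTS),
                "surface": rng.choice(scan["support_surfaces"]),
                "material_seed": rng.randrange(2**16),
            }
            for _ in range(rng.randint(5, 25))
        ],
    }

# A training distribution is then just many such samples:
# scenes = [generate_scene(scan) for _ in range(10_000)]
```

Training on thousands of such sampled variants, rather than a single faithful reconstruction of the target environment, is what the abstract credits for the agents' robustness to rearrangement, lighting changes, and clutter.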
Related papers
- ReALFRED: An Embodied Instruction Following Benchmark in Photo-Realistic Environments [13.988804095409133]
We propose the ReALFRED benchmark, which employs real-world scenes, objects, and room layouts for training agents to complete household tasks.
Specifically, we extend the ALFRED benchmark with updates for larger environmental spaces with smaller visual domain gaps.
With ReALFRED, we analyze previously crafted methods for the ALFRED benchmark and observe that they consistently yield lower performance in all metrics.
arXiv Detail & Related papers (2024-07-26T07:00:27Z)
- Towards Open-World Mobile Manipulation in Homes: Lessons from the NeurIPS 2023 HomeRobot Open Vocabulary Mobile Manipulation Challenge [93.4434417387526]
We propose Open Vocabulary Mobile Manipulation as a key benchmark task for robotics.
We organized a NeurIPS 2023 competition featuring both simulation and real-world components to evaluate solutions to this task.
We detail the results and methodologies used, both in simulation and real-world settings.
arXiv Detail & Related papers (2024-07-09T15:15:01Z)
- RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots [25.650235551519952]
We present RoboCasa, a large-scale simulation framework for training generalist robots in everyday environments.
We provide thousands of 3D assets across over 150 object categories and dozens of interactable furniture and appliances.
Our experiments show a clear scaling trend in using synthetically generated robot data for large-scale imitation learning.
arXiv Detail & Related papers (2024-06-04T17:41:31Z)
- Learning to navigate efficiently and precisely in real environments [14.52507964172957]
The embodied AI literature focuses on end-to-end agents trained in simulators like Habitat or AI2-THOR.
In this work, we explore end-to-end training of agents in simulation under settings that minimize the sim2real gap.
arXiv Detail & Related papers (2024-01-25T17:50:05Z)
- HomeRobot: Open-Vocabulary Mobile Manipulation [107.05702777141178]
Open-Vocabulary Mobile Manipulation (OVMM) is the problem of picking any object in any unseen environment, and placing it in a commanded location.
HomeRobot has two components: a simulation component, which uses a large and diverse curated object set in new, high-quality multi-room home environments; and a real-world component, providing a software stack for the low-cost Hello Robot Stretch.
arXiv Detail & Related papers (2023-06-20T14:30:32Z)
- CorNav: Autonomous Agent with Self-Corrected Planning for Zero-Shot Vision-and-Language Navigation [73.78984332354636]
CorNav is a novel zero-shot framework for vision-and-language navigation.
It incorporates environmental feedback for refining future plans and adjusting its actions.
It consistently outperforms all baselines in a zero-shot multi-task setting.
arXiv Detail & Related papers (2023-06-17T11:44:04Z)
- QuestEnvSim: Environment-Aware Simulated Motion Tracking from Sparse Sensors [69.75711933065378]
We show that headset and controller poses can be used to generate realistic full-body poses even in highly constrained environments.
We discuss three features crucial to the method's performance: the environment representation, the contact reward, and scene randomization.
arXiv Detail & Related papers (2023-06-09T04:40:38Z) - Robot Active Neural Sensing and Planning in Unknown Cluttered
Environments [0.0]
Active sensing and planning in unknown, cluttered environments is an open challenge for robots intending to provide home service, search and rescue, narrow-passage inspection, and medical assistance.
We present an active neural sensing approach that generates kinematically feasible viewpoint sequences for a robot manipulator with an in-hand camera, gathering the minimum number of observations needed to reconstruct the underlying environment.
Our framework actively collects visual RGBD observations, aggregates them into a scene representation, and performs object shape inference to avoid unnecessary robot interactions with the environment.
arXiv Detail & Related papers (2022-08-23T16:56:54Z) - iGibson, a Simulation Environment for Interactive Tasks in Large
Realistic Scenes [54.04456391489063]
iGibson is a novel simulation environment to develop robotic solutions for interactive tasks in large-scale realistic scenes.
Our environment contains fifteen fully interactive home-sized scenes populated with rigid and articulated objects.
We show that iGibson's features enable the generalization of navigation agents, and that the human-iGibson interface and integrated motion planners facilitate efficient imitation learning of simple human-demonstrated behaviors.
arXiv Detail & Related papers (2020-12-05T02:14:17Z)