Related papers: iGibson, a Simulation Environment for Interactive Tasks in Large Realistic Scenes

iGibson, a Simulation Environment for Interactive Tasks in Large Realistic Scenes

URL: http://arxiv.org/abs/2012.02924v2
Date: Tue, 8 Dec 2020 02:44:59 GMT
Title: iGibson, a Simulation Environment for Interactive Tasks in Large Realistic Scenes
Authors: Bokui Shen, Fei Xia, Chengshu Li, Roberto Mart\'in-Mart\'in, Linxi Fan, Guanzhi Wang, Shyamal Buch, Claudia D'Arpino, Sanjana Srivastava, Lyne P. Tchapmi, Micael E. Tchapmi, Kent Vainio, Li Fei-Fei, Silvio Savarese
Abstract summary: iGibson is a novel simulation environment to develop robotic solutions for interactive tasks in large-scale realistic scenes. Our environment contains fifteen fully interactive home-sized scenes populated with rigid and articulated objects. iGibson features enable the generalization of navigation agents, and that the human-iGibson interface and integrated motion planners facilitate efficient imitation learning of simple human demonstrated behaviors.
Score: 54.04456391489063
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We present iGibson, a novel simulation environment to develop robotic solutions for interactive tasks in large-scale realistic scenes. Our environment contains fifteen fully interactive home-sized scenes populated with rigid and articulated objects. The scenes are replicas of 3D scanned real-world homes, aligning the distribution of objects and layout to that of the real world. iGibson integrates several key features to facilitate the study of interactive tasks: i) generation of high-quality visual virtual sensor signals (RGB, depth, segmentation, LiDAR, flow, among others), ii) domain randomization to change the materials of the objects (both visual texture and dynamics) and/or their shapes, iii) integrated sampling-based motion planners to generate collision-free trajectories for robot bases and arms, and iv) intuitive human-iGibson interface that enables efficient collection of human demonstrations. Through experiments, we show that the full interactivity of the scenes enables agents to learn useful visual representations that accelerate the training of downstream manipulation tasks. We also show that iGibson features enable the generalization of navigation agents, and that the human-iGibson interface and integrated motion planners facilitate efficient imitation learning of simple human demonstrated behaviors. iGibson is open-sourced with comprehensive examples and documentation. For more information, visit our project website: http://svl.stanford.edu/igibson/

Related papers

HUMOTO: A 4D Dataset of Mocap Human Object Interactions [27.573065832588554]
Human Motions with Objects is a high-fidelity dataset of human-object interactions for motion generation, computer vision, and robotics applications. Humoto captures interactions with 63 precisely modeled objects and 72 articulated parts. Professional artists rigorously clean and verify each sequence, minimizing foot sliding and object penetrations.
arXiv Detail & Related papers (2025-04-14T16:59:29Z)
Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes [90.39860012099393]
Sitcom-Crafter is a system for human motion generation in 3D space. Central to the function generation modules is our novel 3D scene-aware human-human interaction module. Augmentation modules encompass plot comprehension for command generation, motion synchronization for seamless integration of different motion types.
arXiv Detail & Related papers (2024-10-14T17:56:19Z)
EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting [95.44545809256473]
EgoGaussian is a method capable of simultaneously reconstructing 3D scenes and dynamically tracking 3D object motion from RGB egocentric input alone. We show significant improvements in terms of both dynamic object and background reconstruction quality compared to the state-of-the-art.
arXiv Detail & Related papers (2024-06-28T10:39:36Z)
Generating Human Interaction Motions in Scenes with Text Control [66.74298145999909]
We present TeSMo, a method for text-controlled scene-aware motion generation based on denoising diffusion models. Our approach begins with pre-training a scene-agnostic text-to-motion diffusion model. To facilitate training, we embed annotated navigation and interaction motions within scenes.
arXiv Detail & Related papers (2024-04-16T16:04:38Z)
ParaHome: Parameterizing Everyday Home Activities Towards 3D Generative Modeling of Human-Object Interactions [11.32229757116179]
We introduce the ParaHome system, designed to capture dynamic 3D movements of humans and objects within a common home environment. By leveraging the ParaHome system, we collect a novel large-scale dataset of human-object interaction.
arXiv Detail & Related papers (2024-01-18T18:59:58Z)
Revisit Human-Scene Interaction via Space Occupancy [55.67657438543008]
Human-scene Interaction (HSI) generation is a challenging task and crucial for various downstream tasks. In this work, we argue that interaction with a scene is essentially interacting with the space occupancy of the scene from an abstract physical perspective. By treating pure motion sequences as records of humans interacting with invisible scene occupancy, we can aggregate motion-only data into a large-scale paired human-occupancy interaction database.
arXiv Detail & Related papers (2023-12-05T12:03:00Z)
Synthesizing Diverse Human Motions in 3D Indoor Scenes [16.948649870341782]
We present a novel method for populating 3D indoor scenes with virtual humans that can navigate in the environment and interact with objects in a realistic manner. Existing approaches rely on training sequences that contain captured human motions and the 3D scenes they interact with. We propose a reinforcement learning-based approach that enables virtual humans to navigate in 3D scenes and interact with objects realistically and autonomously.
arXiv Detail & Related papers (2023-05-21T09:22:24Z)
CIRCLE: Capture In Rich Contextual Environments [69.97976304918149]
We propose a novel motion acquisition system in which the actor perceives and operates in a highly contextual virtual world. We present CIRCLE, a dataset containing 10 hours of full-body reaching motion from 5 subjects across nine scenes. We use this dataset to train a model that generates human motion conditioned on scene information.
arXiv Detail & Related papers (2023-03-31T09:18:12Z)
Locomotion-Action-Manipulation: Synthesizing Human-Scene Interactions in Complex 3D Environments [11.87902527509297]
We present LAMA, Locomotion-Action-MAnipulation, to synthesize natural and plausible long-term human movements in complex indoor environments. Unlike existing methods that require motion data "paired" with scanned 3D scenes for supervision, we formulate the problem as a test-time optimization by using human motion capture data only for synthesis.
arXiv Detail & Related papers (2023-01-09T18:59:16Z)
Hindsight for Foresight: Unsupervised Structured Dynamics Models from Physical Interaction [24.72947291987545]
Key challenge for an agent learning to interact with the world is to reason about physical properties of objects. We propose a novel approach for modeling the dynamics of a robot's interactions directly from unlabeled 3D point clouds and images.
arXiv Detail & Related papers (2020-08-02T11:04:49Z)
ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation [75.0278287071591]
ThreeDWorld (TDW) is a platform for interactive multi-modal physical simulation. TDW enables simulation of high-fidelity sensory data and physical interactions between mobile agents and objects in rich 3D environments. We present initial experiments enabled by TDW in emerging research directions in computer vision, machine learning, and cognitive science.
arXiv Detail & Related papers (2020-07-09T17:33:27Z)

This list is automatically generated from the titles and abstracts of the papers in this site.