ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation
- URL: http://arxiv.org/abs/2007.04954v2
- Date: Tue, 28 Dec 2021 17:03:21 GMT
- Title: ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation
- Authors: Chuang Gan, Jeremy Schwartz, Seth Alter, Damian Mrowca, Martin
Schrimpf, James Traer, Julian De Freitas, Jonas Kubilius, Abhishek
Bhandwaldar, Nick Haber, Megumi Sano, Kuno Kim, Elias Wang, Michael
Lingelbach, Aidan Curtis, Kevin Feigelis, Daniel M. Bear, Dan Gutfreund,
David Cox, Antonio Torralba, James J. DiCarlo, Joshua B. Tenenbaum, Josh H.
McDermott, Daniel L.K. Yamins
- Abstract summary: ThreeDWorld (TDW) is a platform for interactive multi-modal physical simulation.
TDW enables simulation of high-fidelity sensory data and physical interactions between mobile agents and objects in rich 3D environments.
We present initial experiments enabled by TDW in emerging research directions in computer vision, machine learning, and cognitive science.
- Score: 75.0278287071591
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce ThreeDWorld (TDW), a platform for interactive multi-modal
physical simulation. TDW enables simulation of high-fidelity sensory data and
physical interactions between mobile agents and objects in rich 3D
environments. Unique properties include: real-time near-photo-realistic image
rendering; a library of objects and environments, and routines for their
customization; generative procedures for efficiently building classes of new
environments; high-fidelity audio rendering; realistic physical interactions
for a variety of material types, including cloths, liquid, and deformable
objects; customizable agents that embody AI agents; and support for human
interactions with VR devices. TDW's API enables multiple agents to interact
within a simulation and returns a range of sensor and physics data representing
the state of the world. We present initial experiments enabled by TDW in
emerging research directions in computer vision, machine learning, and
cognitive science, including multi-modal physical scene understanding, physical
dynamics predictions, multi-agent interactions, models that learn like a child,
and attention studies in humans and neural networks.
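As a concrete illustration of the command-driven API described in the abstract, below is a minimal controller-loop sketch, assuming the publicly released `tdw` Python package. The model name ("iron_box"), room dimensions, drop height, and frame count are illustrative choices, not prescriptions from the paper.
```python
# Minimal TDW controller sketch (assumes the `tdw` pip package and a local build).
# "iron_box", the room size, and the 100-frame loop are illustrative, not canonical.
from tdw.controller import Controller
from tdw.tdw_utils import TDWUtils
from tdw.output_data import OutputData, Transforms

c = Controller()  # launches the simulation build and opens a socket to it
object_id = Controller.get_unique_id()

# Commands are JSON-serializable dicts keyed by "$type"; helpers build common ones.
resp = c.communicate([
    TDWUtils.create_empty_room(12, 12),                   # procedural empty room
    c.get_add_object(model_name="iron_box",               # object from the model library
                     object_id=object_id,
                     position={"x": 0, "y": 3, "z": 0}),  # dropped from 3 m
    {"$type": "send_transforms", "frequency": "always"},  # stream physics state
])

# Step the simulation and read back the falling object's position each frame.
for _ in range(100):
    for i in range(len(resp) - 1):                        # last element is the frame number
        if OutputData.get_data_type_id(resp[i]) == "tran":
            transforms = Transforms(resp[i])
            for j in range(transforms.get_num()):
                if transforms.get_id(j) == object_id:
                    print(transforms.get_position(j))
    resp = c.communicate([])                              # advance one physics frame

c.communicate({"$type": "terminate"})                     # shut down the build
```
Because every command is a plain dictionary, a single `communicate()` call can batch commands addressed to multiple agents per frame, which is how the API supports the multi-agent interactions mentioned in the abstract.
Related papers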
- EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting [95.44545809256473]
EgoGaussian is a method capable of simultaneously reconstructing 3D scenes and dynamically tracking 3D object motion from RGB egocentric input alone.
We show significant improvements in terms of both dynamic object and background reconstruction quality compared to the state-of-the-art.
arXiv Detail & Related papers (2024-06-28T10:39:36Z)
- Haptic Repurposing with GenAI [5.424247121310253]
Mixed Reality aims to merge the digital and physical worlds to create immersive human-computer interactions.
This paper introduces Haptic Repurposing with GenAI, an innovative approach to enhance MR interactions by transforming any physical objects into adaptive haptic interfaces for AI-generated virtual assets.
arXiv Detail & Related papers (2024-06-11T13:06:28Z)
- Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion [35.71595369663293]
We propose Physics3D, a novel method for learning various physical properties of 3D objects through a video diffusion model.
Our approach involves designing a highly generalizable physical simulation system based on a viscoelastic material model.
Experiments demonstrate the effectiveness of our method with both elastic and plastic materials.
arXiv Detail & Related papers (2024-06-06T17:59:47Z)
- PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation [62.53760963292465]
PhysDreamer is a physics-based approach that endows static 3D objects with interactive dynamics.
We present our approach on diverse examples of elastic objects and evaluate the realism of the synthesized interactions through a user study.
arXiv Detail & Related papers (2024-04-19T17:41:05Z)
- ParaHome: Parameterizing Everyday Home Activities Towards 3D Generative Modeling of Human-Object Interactions [11.32229757116179]
We introduce the ParaHome system, designed to capture dynamic 3D movements of humans and objects within a common home environment.
By leveraging the ParaHome system, we collect a novel large-scale dataset of human-object interaction.
arXiv Detail & Related papers (2024-01-18T18:59:58Z)
- HSPACE: Synthetic Parametric Humans Animated in Complex Environments [67.8628917474705]
We build a large-scale photo-realistic dataset, Human-SPACE, of animated humans placed in complex indoor and outdoor environments.
We combine a hundred diverse individuals of varying ages, genders, proportions, and ethnicities with hundreds of motions and scenes to generate an initial dataset of over 1 million frames.
Assets are generated automatically, at scale, and are compatible with existing real time rendering and game engines.
arXiv Detail & Related papers (2021-12-23T22:27:55Z)
- 3D Neural Scene Representations for Visuomotor Control [78.79583457239836]
We learn models for dynamic 3D scenes purely from 2D visual observations.
A dynamics model, constructed over the learned representation space, enables visuomotor control for challenging manipulation tasks.
arXiv Detail & Related papers (2021-07-08T17:49:37Z)
- Learning Intuitive Physics with Multimodal Generative Models [24.342994226226786]
This paper presents a perception framework that fuses visual and tactile feedback to make predictions about the expected motion of objects in dynamic scenes.
We use a novel See-Through-your-Skin (STS) sensor that provides high resolution multimodal sensing of contact surfaces.
We validate through simulated and real-world experiments in which the resting state of an object is predicted from given initial conditions.
arXiv Detail & Related papers (2021-01-12T12:55:53Z)
- iGibson, a Simulation Environment for Interactive Tasks in Large Realistic Scenes [54.04456391489063]
iGibson is a novel simulation environment to develop robotic solutions for interactive tasks in large-scale realistic scenes.
Our environment contains fifteen fully interactive home-sized scenes populated with rigid and articulated objects.
We show that iGibson's features enable the generalization of navigation agents, and that the human-iGibson interface and integrated motion planners facilitate efficient imitation learning of simple human-demonstrated behaviors.
arXiv Detail & Related papers (2020-12-05T02:14:17Z)
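For comparison with TDW's command-based API, iGibson exposes a gym-style environment loop. The sketch below assumes the `igibson` pip package and its downloadable assets; the config filename is illustrative, standing in for any of the example navigation configs shipped with the package.
```python
# Gym-style iGibson loop sketch (assumes the `igibson` pip package and its assets).
# The config filename is illustrative; iGibson ships example configs with the package.
from igibson.envs.igibson_env import iGibsonEnv

env = iGibsonEnv(config_file="turtlebot_nav.yaml", mode="headless")
state = env.reset()                     # returns the initial observation dict
for _ in range(100):
    action = env.action_space.sample()  # random action, for illustration only
    state, reward, done, info = env.step(action)
    if done:
        state = env.reset()
env.close()
```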
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.