ASSIST: Interactive Scene Nodes for Scalable and Realistic Indoor
Simulation
- URL: http://arxiv.org/abs/2311.06211v1
- Date: Fri, 10 Nov 2023 17:56:43 GMT
- Title: ASSIST: Interactive Scene Nodes for Scalable and Realistic Indoor
Simulation
- Authors: Zhide Zhong, Jiakai Cao, Songen Gu, Sirui Xie, Weibo Gao, Liyi Luo,
Zike Yan, Hao Zhao, Guyue Zhou
- Abstract summary: We present ASSIST, an object-wise neural radiance field as a panoptic representation for compositional and realistic simulation.
A novel scene node data structure that stores the information of each object in a unified fashion allows online interaction in both intra- and cross-scene settings.
- Score: 17.34617771579733
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present ASSIST, an object-wise neural radiance field as a panoptic
representation for compositional and realistic simulation. Central to our
approach is a novel scene node data structure that stores the information of
each object in a unified fashion, allowing online interaction in both intra-
and cross-scene settings. By incorporating a differentiable neural network
along with the associated bounding box and semantic features, the proposed
structure guarantees user-friendly interaction on independent objects to scale
up novel view simulation. Objects in the scene can be queried, added,
duplicated, deleted, transformed, or swapped simply through mouse/keyboard
controls or language instructions. Experiments demonstrate the efficacy of the
proposed method, where scaled realistic simulation can be achieved through
interactive editing and compositional rendering, with color images, depth
images, and panoptic segmentation masks generated in a 3D consistent manner.
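The abstract describes a per-object scene node and a set of editing operations (query, add, duplicate, delete, transform, swap). The sketch below is one way such a structure could be organized. It is not the authors' implementation: the names `SceneNode`, `Scene`, `radiance_fn`, `bbox_size`, and `position` are hypothetical, the object's network is a placeholder callable, and "transform" is reduced to translation for brevity.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class SceneNode:
    """Hypothetical per-object record; all field names are illustrative."""
    node_id: int
    label: str                               # semantic class, e.g. "chair"
    bbox_size: Vec3                          # extent of the object's bounding box
    position: Vec3                           # placement of the box in the scene
    radiance_fn: Optional[Callable] = None   # placeholder for the object's network


class Scene:
    """Hypothetical container that edits objects independently of each other."""

    def __init__(self) -> None:
        self.nodes: Dict[int, SceneNode] = {}
        self._next_id = 0

    def add(self, label: str, bbox_size: Vec3, position: Vec3,
            radiance_fn: Optional[Callable] = None) -> int:
        node_id = self._next_id
        self._next_id += 1
        self.nodes[node_id] = SceneNode(node_id, label, bbox_size, position, radiance_fn)
        return node_id

    def query(self, label: str) -> List[int]:
        # Return the ids of all objects with the given semantic label.
        return sorted(i for i, n in self.nodes.items() if n.label == label)

    def delete(self, node_id: int) -> None:
        del self.nodes[node_id]

    def translate(self, node_id: int, offset: Vec3) -> None:
        # Translation only; a fuller version would also carry rotation and scale.
        n = self.nodes[node_id]
        moved = tuple(p + d for p, d in zip(n.position, offset))
        self.nodes[node_id] = replace(n, position=moved)

    def duplicate(self, node_id: int, offset: Vec3 = (0.0, 0.0, 0.0)) -> int:
        # The copy shares the same radiance_fn and gets its own placement.
        n = self.nodes[node_id]
        new_id = self.add(n.label, n.bbox_size, n.position, n.radiance_fn)
        self.translate(new_id, offset)
        return new_id

    def swap(self, id_a: int, id_b: int) -> None:
        # Exchange the placements of two objects; each keeps its own box size.
        a, b = self.nodes[id_a], self.nodes[id_b]
        self.nodes[id_a] = replace(a, position=b.position)
        self.nodes[id_b] = replace(b, position=a.position)


if __name__ == "__main__":
    scene = Scene()
    chair = scene.add("chair", (0.5, 0.5, 1.0), (0.0, 0.0, 0.0))
    table = scene.add("table", (1.0, 2.0, 0.8), (2.0, 0.0, 0.0))
    scene.duplicate(chair, offset=(1.0, 0.0, 0.0))
    scene.swap(chair, table)
    print(scene.query("chair"))
```

Keeping each object's data in one immutable record means an edit touches only that record, which is the property the abstract relies on for independent, per-object interaction.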
Related papers
- LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Rendering and Control [45.1230495980299]
We extend the interactive object reconstruction from single object level to complex scene level.
We propose LiveScene, the first scene-level language-embedded interactive neural radiance field.
LiveScene efficiently reconstructs and controls multiple interactive objects in complex scenes.
arXiv Detail & Related papers (2024-06-23T07:26:13Z)
- SERF: Fine-Grained Interactive 3D Segmentation and Editing with Radiance Fields [97.63648347686456]
We introduce a novel fine-grained interactive 3D segmentation and editing algorithm with radiance fields, which we refer to as SERF.
Our method entails creating a neural mesh representation by integrating multi-view algorithms with pre-trained 2D models.
Building upon this representation, we introduce a novel surface rendering technique that preserves local information and is robust to deformation.
arXiv Detail & Related papers (2023-12-26T02:50:42Z)
- Compositional Human-Scene Interaction Synthesis with Semantic Control [16.93177243590465]
We aim to synthesize humans interacting with a given 3D scene controlled by high-level semantic specifications.
We design a novel transformer-based generative model, in which the articulated 3D human body surface points and 3D objects are jointly encoded.
Inspired by the compositional nature of interactions that humans can simultaneously interact with multiple objects, we define interaction semantics as the composition of varying numbers of atomic action-object pairs.
arXiv Detail & Related papers (2022-07-26T11:37:44Z)
- Neural Groundplans: Persistent Neural Scene Representations from a Single Image [90.04272671464238]
We present a method to map 2D image observations of a scene to a persistent 3D scene representation.
We propose conditional neural groundplans as persistent and memory-efficient scene representations.
arXiv Detail & Related papers (2022-07-22T17:41:24Z)
- Object Scene Representation Transformer [56.40544849442227]
We introduce Object Scene Representation Transformer (OSRT), a 3D-centric model in which individual object representations naturally emerge through novel view synthesis.
OSRT scales to significantly more complex scenes with larger diversity of objects and backgrounds than existing methods.
It is multiple orders of magnitude faster at compositional rendering thanks to its light field parametrization and the novel Slot Mixer decoder.
arXiv Detail & Related papers (2022-06-14T15:40:47Z)
- Compositional Mixture Representations for Vision and Text [43.2292923754127]
A common representation space between vision and language allows deep networks to relate objects in the image to the corresponding semantic meaning.
We present a model that learns a shared Gaussian mixture representation imposing the compositionality of the text onto the visual domain without having explicit location supervision.
arXiv Detail & Related papers (2022-06-13T18:16:40Z)
- Reconstructing Interactive 3D Scenes by Panoptic Mapping and CAD Model Alignments [81.38641691636847]
We rethink the problem of scene reconstruction from an embodied agent's perspective.
We reconstruct an interactive scene using RGB-D data stream.
This reconstructed scene replaces the object meshes in the dense panoptic map with part-based articulated CAD models.
arXiv Detail & Related papers (2021-03-30T05:56:58Z)
- Neural Scene Graphs for Dynamic Scenes [57.65413768984925]
We present the first neural rendering method that decomposes dynamic scenes into scene graphs.
We learn implicitly encoded scenes, combined with a jointly learned latent representation to describe objects with a single implicit function.
arXiv Detail & Related papers (2020-11-20T12:37:10Z)
- OpenRooms: An End-to-End Open Framework for Photorealistic Indoor Scene Datasets [103.54691385842314]
We propose a novel framework for creating large-scale photorealistic datasets of indoor scenes.
Our goal is to make the dataset creation process widely accessible.
This enables important applications in inverse rendering, scene understanding and robotics.
arXiv Detail & Related papers (2020-07-25T06:48:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.