AssetField: Assets Mining and Reconfiguration in Ground Feature Plane Representation
- URL: http://arxiv.org/abs/2303.13953v1
- Date: Fri, 24 Mar 2023 12:18:10 GMT
- Title: AssetField: Assets Mining and Reconfiguration in Ground Feature Plane Representation
- Authors: Yuanbo Xiangli, Linning Xu, Xingang Pan, Nanxuan Zhao, Bo Dai, Dahua Lin
- Abstract summary: AssetField is a novel neural scene representation that learns a set of object-aware ground feature planes to represent the scene.
We show that AssetField achieves competitive performance for novel-view synthesis and generates realistic renderings for new scene configurations.
- Score: 111.59786941545774
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Both indoor and outdoor environments are inherently structured and
repetitive. Traditional modeling pipelines keep an asset library storing unique
object templates, which is both versatile and memory efficient in practice.
Inspired by this observation, we propose AssetField, a novel neural scene
representation that learns a set of object-aware ground feature planes to
represent the scene, where an asset library storing template feature patches
can be constructed in an unsupervised manner. Unlike existing methods that
require object masks to query spatial points for object editing, our ground
feature plane representation offers a natural bird's-eye-view visualization of
the scene, allowing a variety of operations (e.g. translation, duplication,
deformation) on objects to configure a new scene. With the template feature
patches, group editing is enabled for scenes with many recurring items,
avoiding repetitive edits on individual objects. We show that AssetField not
only achieves competitive performance for novel-view synthesis but also
generates realistic renderings for new scene configurations.
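To make the ground feature plane idea concrete, below is a minimal sketch, assuming a Python/NumPy setting: a 2D feature grid laid out on the ground plane, queried by bilinear interpolation and edited at the patch level by duplicating an "asset" patch at a new location. The class and method names are hypothetical, the features here are random rather than learned, and the neural decoder and volume renderer that AssetField uses for novel-view synthesis are omitted.

```python
import numpy as np

class GroundFeaturePlane:
    """Sketch of a ground feature plane with patch-level 'asset' editing.

    The scene is an (H, W, C) grid of latent features on the ground (x-y)
    plane; a 3D point is projected down to (x, y) and its feature is read by
    bilinear interpolation. Names are hypothetical placeholders, not the
    paper's API, and the learned decoder that maps features to density and
    color for rendering is omitted.
    """

    def __init__(self, height, width, channels, seed=0):
        rng = np.random.default_rng(seed)
        # In AssetField these features are learned from posed images; random
        # values keep the sketch self-contained.
        self.features = rng.normal(size=(height, width, channels)).astype(np.float32)

    def query(self, xy):
        """Bilinearly interpolate features at continuous (x, y) points; xy has shape (N, 2)."""
        h, w, _ = self.features.shape
        x = np.clip(xy[:, 0], 0.0, w - 1 - 1e-6)
        y = np.clip(xy[:, 1], 0.0, h - 1 - 1e-6)
        x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
        x1, y1 = x0 + 1, y0 + 1
        wx, wy = (x - x0)[:, None], (y - y0)[:, None]
        f = self.features
        return ((1 - wx) * (1 - wy) * f[y0, x0] + wx * (1 - wy) * f[y0, x1]
                + (1 - wx) * wy * f[y1, x0] + wx * wy * f[y1, x1])

    def copy_patch(self, src_yx, size_hw, dst_yx):
        """Duplicate a rectangular feature patch (an asset template) at a new location."""
        (sy, sx), (ph, pw), (dy, dx) = src_yx, size_hw, dst_yx
        self.features[dy:dy + ph, dx:dx + pw] = self.features[sy:sy + ph, sx:sx + pw].copy()


# Duplicate one object-sized patch elsewhere on the plane, then query near the copy.
plane = GroundFeaturePlane(height=128, width=128, channels=16)
plane.copy_patch(src_yx=(10, 10), size_hw=(8, 8), dst_yx=(60, 60))
feats = plane.query(np.array([[63.5, 62.0]]))  # -> (1, 16) feature vector
```

Under this picture, group editing over recurring items would amount to re-stamping the same template patch at every instance location instead of editing each object's features separately.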
Related papers
- Personalized Instance-based Navigation Toward User-Specific Objects in Realistic Environments [44.6372390798904]
We propose a new task, Personalized Instance-based Navigation (PIN), in which an embodied agent must locate and reach a specific personal object.
In each episode, the target object is presented to the agent using two modalities: a set of visual reference images on a neutral background and manually annotated textual descriptions.
arXiv Detail & Related papers (2024-10-23T18:01:09Z)
- Object-level Scene Deocclusion [92.39886029550286]
We present a new self-supervised PArallel visible-to-COmplete diffusion framework, named PACO, for object-level scene deocclusion.
To train PACO, we create a large-scale dataset with 500k samples to enable self-supervised learning.
Experiments on COCOA and various real-world scenes demonstrate the superior capability of PACO for scene deocclusion, surpassing the state of the arts by a large margin.
arXiv Detail & Related papers (2024-06-11T20:34:10Z)
- MOST: Multiple Object localization with Self-supervised Transformers for object discovery [97.47075050779085]
We present Multiple Object localization with Self-supervised Transformers (MOST).
MOST uses features of transformers trained with self-supervised learning to localize multiple objects in real-world images.
We show that MOST can be used for self-supervised pre-training of object detectors, and yields consistent improvements on fully and semi-supervised object detection and unsupervised region proposal generation.
arXiv Detail & Related papers (2023-04-11T17:57:27Z)
- Structure-Guided Image Completion with Image-level and Object-level Semantic Discriminators [97.12135238534628]
We propose a learning paradigm that consists of semantic discriminators and object-level discriminators for improving the generation of complex semantics and objects.
Specifically, the semantic discriminators leverage pretrained visual features to improve the realism of the generated visual concepts.
Our proposed scheme significantly improves the generation quality and achieves state-of-the-art results on various tasks.
arXiv Detail & Related papers (2022-12-13T01:36:56Z)
- Scene-level Tracking and Reconstruction without Object Priors [14.068026331380844]
We present the first real-time system capable of tracking and reconstructing, individually, every visible object in a given scene.
Our proposed system provides the live geometry and deformation of all visible objects in a novel scene in real time.
arXiv Detail & Related papers (2022-10-07T20:56:14Z)
- Discovering Objects that Can Move [55.743225595012966]
We study the problem of object discovery -- separating objects from the background without manual labels.
Existing approaches utilize appearance cues, such as color, texture, and location, to group pixels into object-like regions.
We choose to focus on dynamic objects -- entities that can move independently in the world.
arXiv Detail & Related papers (2022-03-18T21:13:56Z)
- TSDF++: A Multi-Object Formulation for Dynamic Object Tracking and Reconstruction [57.1209039399599]
We propose a map representation that allows maintaining a single volume for the entire scene and all the objects therein.
In a multiple dynamic object tracking and reconstruction scenario, our representation allows maintaining accurate reconstruction of surfaces even while they become temporarily occluded by other objects moving in their proximity.
We evaluate the proposed TSDF++ formulation on a public synthetic dataset and demonstrate its ability to preserve reconstructions of occluded surfaces when compared to the standard TSDF map representation.
arXiv Detail & Related papers (2021-05-16T16:15:05Z)
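For background on the map representation mentioned in the TSDF++ entry above, here is a generic sketch of truncated signed distance function (TSDF) fusion using the standard per-voxel weighted-average update. It is illustrative only: the class name and interface are invented, and it does not reproduce the TSDF++ formulation, which keeps the whole scene and all objects in a single volume.

```python
import numpy as np

class ObjectTSDF:
    """Minimal sketch of TSDF fusion with the usual weighted-average update.

    This is a generic illustration, not the TSDF++ formulation; the camera
    projection that turns a depth image into per-voxel signed distances is
    omitted for brevity.
    """

    def __init__(self, resolution, truncation):
        self.truncation = truncation                              # truncation distance (meters)
        self.tsdf = np.ones((resolution,) * 3, np.float32)        # signed distances, init to +1
        self.weight = np.zeros((resolution,) * 3, np.float32)     # per-voxel fusion weights

    def integrate(self, signed_dist, obs_weight=1.0):
        """Fuse one observation: running weighted average of truncated distances."""
        d = np.clip(signed_dist / self.truncation, -1.0, 1.0)     # truncate to [-1, 1]
        w_new = self.weight + obs_weight
        self.tsdf = (self.weight * self.tsdf + obs_weight * d) / w_new
        self.weight = w_new


# Fuse one synthetic observation (a sphere of radius 0.5) into a 64^3 grid.
res, trunc = 64, 0.05
grid = np.stack(np.meshgrid(*[np.linspace(-1, 1, res)] * 3, indexing="ij"))
dist_to_sphere = np.linalg.norm(grid, axis=0) - 0.5
vol = ObjectTSDF(resolution=res, truncation=trunc)
vol.integrate(dist_to_sphere)
```

In a multi-object setting, tracking and reconstruction would typically keep one such volume (plus a pose) per object and integrate only the depth measurements associated with that object, which is why reconstructions can survive temporary occlusion.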