PanoContext-Former: Panoramic Total Scene Understanding with a
Transformer
- URL: http://arxiv.org/abs/2305.12497v2
- Date: Mon, 5 Jun 2023 04:43:41 GMT
- Title: PanoContext-Former: Panoramic Total Scene Understanding with a
Transformer
- Authors: Yuan Dong, Chuan Fang, Liefeng Bo, Zilong Dong, Ping Tan
- Abstract summary: Panoramic image enables deeper understanding and more holistic perception of $360^\circ$ surrounding environment.
In this paper, we propose a novel method using depth prior for holistic indoor scene understanding.
In addition, we introduce a real-world dataset for scene understanding, including photo-realistic panoramas, high-fidelity depth images, accurately annotated room layouts, and oriented object bounding boxes and shapes.
- Score: 37.51637352106841
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Panoramic image enables deeper understanding and more holistic perception of
$360^\circ$ surrounding environment, which can naturally encode enriched scene
context information compared to standard perspective image. Previous work has
made lots of effort to solve the scene understanding task in a bottom-up form,
thus each sub-task is processed separately and few correlations are explored in
this procedure. In this paper, we propose a novel method using depth prior for
holistic indoor scene understanding which recovers the objects' shapes,
oriented bounding boxes and the 3D room layout simultaneously from a single
panorama. In order to fully utilize the rich context information, we design a
transformer-based context module to predict the representation and relationship
among each component of the scene. In addition, we introduce a real-world
dataset for scene understanding, including photo-realistic panoramas,
high-fidelity depth images, accurately annotated room layouts, and oriented
object bounding boxes and shapes. Experiments on the synthetic and real-world
datasets demonstrate that our method outperforms previous panoramic scene
understanding methods in terms of both layout estimation and 3D object
detection.
Related papers
- Object-level Scene Deocclusion [92.39886029550286]
We present a new self-supervised PArallel visible-to-COmplete diffusion framework, named PACO, for object-level scene deocclusion.
To train PACO, we create a large-scale dataset with 500k samples to enable self-supervised learning.
Experiments on COCOA and various real-world scenes demonstrate the superior capability of PACO for scene deocclusion, surpassing the state of the art by a large margin.
(2024-06-11T20:34:10Z)
- PanoViT: Vision Transformer for Room Layout Estimation from a Single Panoramic Image [11.053777620735175]
PanoViT is a panorama vision transformer to estimate the room layout from a single panoramic image.
Compared to CNN models, our PanoViT is more proficient in learning global information from the panoramic image.
Our method outperforms state-of-the-art solutions in room layout prediction accuracy.
(2022-12-23T05:37:11Z)
- Scene Representation Transformer: Geometry-Free Novel View Synthesis Through Set-Latent Scene Representations [48.05445941939446]
A classical problem in computer vision is to infer a 3D scene representation from few images that can be used to render novel views at interactive rates.
We propose the Scene Representation Transformer (SRT), a method which processes posed or unposed RGB images of a new area.
We show that this method outperforms recent baselines in terms of PSNR and speed on synthetic datasets.
(2021-11-25T16:18:56Z)
- DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization [66.25948693095604]
We propose a novel method for panoramic 3D scene understanding which recovers the 3D room layout and the shape, pose, position, and semantic category for each object from a single full-view panorama image.
Experiments demonstrate that our method outperforms existing methods on panoramic scene understanding in terms of both geometry accuracy and object arrangement.
(2021-08-24T13:55:29Z)
- IBRNet: Learning Multi-View Image-Based Rendering [67.15887251196894]
We present a method that synthesizes novel views of complex scenes by interpolating a sparse set of nearby views.
By drawing on source views at render time, our method hearkens back to classic work on image-based rendering.
(2021-02-25T18:56:21Z)
- Weakly Supervised Learning of Multi-Object 3D Scene Decompositions Using Deep Shape Priors [69.02332607843569]
PriSMONet is a novel approach for learning Multi-Object 3D scene decomposition and representations from single images.
A recurrent encoder regresses a latent representation of 3D shape, pose and texture of each object from an input RGB image.
We evaluate the accuracy of our model in inferring 3D scene layout, demonstrate its generative capabilities, assess its generalization to real images, and point out benefits of the learned representation.
(2020-10-08T14:49:23Z)
- Perspective Plane Program Induction from a Single Image [85.28956922100305]
We study the inverse graphics problem of inferring a holistic representation for natural images.
We formulate this problem as jointly finding the camera pose and scene structure that best describe the input image.
Our proposed framework, Perspective Plane Program Induction (P3I), combines search-based and gradient-based algorithms to efficiently solve the problem.
(2020-06-25T21:18:58Z)
- Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image [24.99186733297264]
We propose an end-to-end solution to jointly reconstruct room layout, object bounding boxes and meshes from a single image.
Our method builds upon a holistic scene context and proposes a coarse-to-fine hierarchy with three components.
Experiments on the SUN RGB-D and Pix3D datasets demonstrate that our method consistently outperforms existing methods.
(2020-02-27T16:00:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.