OccDepth: A Depth-Aware Method for 3D Semantic Scene Completion
- URL: http://arxiv.org/abs/2302.13540v1
- Date: Mon, 27 Feb 2023 06:35:03 GMT
- Title: OccDepth: A Depth-Aware Method for 3D Semantic Scene Completion
- Authors: Ruihang Miao, Weizhou Liu, Mingrui Chen, Zheng Gong, Weixin Xu, Chen
Hu, Shuchang Zhou
- Abstract summary: 3D Semantic Scene Completion (SSC) can provide dense geometric and semantic scene representations, which can be applied in the field of autonomous driving and robotic systems.
We propose the first stereo SSC method named OccDepth, which fully exploits implicit depth information from stereo images (or RGBD images) to help the recovery of 3D geometric structures.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D Semantic Scene Completion (SSC) can provide dense geometric and semantic
scene representations, which can be applied in the field of autonomous driving
and robotic systems. It is challenging to estimate the complete geometry and
semantics of a scene solely from visual images, and accurate depth information
is crucial for restoring 3D geometry. In this paper, we propose the first
stereo SSC method named OccDepth, which fully exploits implicit depth
information from stereo images (or RGBD images) to help the recovery of 3D
geometric structures. The Stereo Soft Feature Assignment (Stereo-SFA) module is
proposed to better fuse 3D depth-aware features by implicitly learning the
correlation between stereo images. In particular, when the input is an RGBD
image, a virtual stereo pair can be generated from the original RGB image and
depth map. Furthermore, the Occupancy Aware Depth (OAD) module is used to obtain
geometry-aware 3D features by knowledge distillation using pre-trained depth
models. In addition, a reformed TartanAir benchmark, named SemanticTartanAir,
is provided in this paper for further testing our OccDepth method on SSC task.
Extensive experiments on SemanticKITTI show that, compared with the
state-of-the-art RGB-inferred SSC method, our OccDepth method achieves superior
performance, improving mIoU by +4.82%, of which +2.49% comes from stereo
images and +2.33% comes from our proposed depth-aware method. Our code and
trained models are available at https://github.com/megvii-research/OccDepth.
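As a rough illustration of the virtual-stereo idea mentioned in the abstract, a right view can be synthesized from an RGB image and its depth map by disparity-based forward warping. This is a minimal sketch, not the paper's implementation: the function names, baseline, and focal length below are hypothetical, and occlusion handling is reduced to a simple disparity z-buffer.

```python
import numpy as np

def depth_to_disparity(depth, focal_px, baseline_m, eps=1e-6):
    """Convert a metric depth map to horizontal disparity for a
    rectified stereo pair: disparity = f * B / depth."""
    return focal_px * baseline_m / np.maximum(depth, eps)

def warp_to_virtual_right(rgb, depth, focal_px, baseline_m):
    """Forward-warp each pixel of the left RGB image by its disparity
    to synthesize a virtual right view. Occlusions are resolved by
    keeping the nearest (largest-disparity) pixel; holes stay zero."""
    h, w, _ = rgb.shape
    disp = depth_to_disparity(depth, focal_px, baseline_m)
    right = np.zeros_like(rgb)
    # z-buffer of disparities already written to each target pixel
    zbuf = np.full((h, w), -np.inf)
    ys, xs = np.mgrid[0:h, 0:w]
    xt = np.round(xs - disp).astype(int)  # the right view shifts content left
    valid = (xt >= 0) & (xt < w)
    for y, x_src, x_dst, d in zip(ys[valid], xs[valid], xt[valid], disp[valid]):
        if d > zbuf[y, x_dst]:  # nearer surface wins
            zbuf[y, x_dst] = d
            right[y, x_dst] = rgb[y, x_src]
    return right
```

With a constant depth map the whole image shifts by a uniform disparity, which makes the geometry easy to sanity-check before feeding real depth maps in.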
Related papers
- SSR-2D: Semantic 3D Scene Reconstruction from 2D Images [54.46126685716471]
In this work, we explore a central 3D scene modeling task, namely, semantic scene reconstruction without using any 3D annotations.
The key idea of our approach is to design a trainable model that employs both incomplete 3D reconstructions and their corresponding source RGB-D images.
Our method achieves state-of-the-art semantic scene completion performance on two large-scale benchmark datasets, MatterPort3D and ScanNet.
arXiv Detail & Related papers (2023-02-07T17:47:52Z)
- Learning Pseudo Front Depth for 2D Forward-Looking Sonar-based Multi-view Stereo [5.024813922014977]
Retrieving the missing dimension information in acoustic images from 2D forward-looking sonar is a well-known problem in the field of underwater robotics.
We propose a novel learning-based multi-view stereo method to estimate 3D information.
arXiv Detail & Related papers (2022-07-30T14:35:21Z)
- Beyond Visual Field of View: Perceiving 3D Environment with Echoes and Vision [51.385731364529306]
This paper focuses on perceiving and navigating 3D environments using echoes and RGB images.
In particular, we perform depth estimation by fusing the RGB image with echoes received from multiple orientations.
We show that the echoes provide holistic and inexpensive information about the 3D structures, complementing the RGB image.
arXiv Detail & Related papers (2022-07-03T22:31:47Z)
- DSGN++: Exploiting Visual-Spatial Relation for Stereo-based 3D Detectors [60.88824519770208]
Camera-based 3D object detectors are attractive due to their wider applicability and lower price compared with LiDAR sensors.
We revisit the stereo volume construction of the prior stereo model DSGN for representing both 3D geometry and semantics.
We propose DSGN++, which aims to improve information flow throughout the 2D-to-3D pipeline.
arXiv Detail & Related papers (2022-04-06T18:43:54Z)
- 3D-Aware Indoor Scene Synthesis with Depth Priors [62.82867334012399]
Existing methods fail to model indoor scenes due to the large diversity of room layouts and the objects inside.
We argue that indoor scenes do not have a shared intrinsic structure, and hence using only 2D images cannot adequately guide the model with 3D geometry.
arXiv Detail & Related papers (2022-02-17T09:54:29Z)
- Neural Radiance Fields Approach to Deep Multi-View Photometric Stereo [103.08512487830669]
We present a modern solution to the multi-view photometric stereo (MVPS) problem.
We procure the surface orientation using a photometric stereo (PS) image formation model and blend it with a multi-view neural radiance field representation to recover the object's surface geometry.
Our method performs neural rendering of multi-view images while utilizing surface normals estimated by a deep photometric stereo network.
arXiv Detail & Related papers (2021-10-11T20:20:03Z)
- RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving [3.222802562733787]
We propose an efficient and accurate 3D object detection method from stereo images, named RTS3D.
Experiments on the KITTI benchmark show that RTS3D is the first true real-time system for stereo image 3D detection.
arXiv Detail & Related papers (2020-12-30T07:56:37Z)
- MeshMVS: Multi-View Stereo Guided Mesh Reconstruction [35.763452474239955]
Deep learning based 3D shape generation methods generally utilize latent features extracted from color images to encode the semantics of objects.
We propose a multi-view mesh generation method which incorporates geometry information explicitly by using the features from intermediate depth representations of multi-view stereo.
We achieve superior results than state-of-the-art multi-view shape generation methods with 34% decrease in Chamfer distance to ground truth and 14% increase in F1-score on ShapeNet dataset.
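The Chamfer distance reported above is a standard point-set similarity metric: the mean nearest-neighbor squared distance, accumulated in both directions between prediction and ground truth. A minimal NumPy sketch (the function name is ours, not from the paper; a brute-force O(N·M) pairwise computation, fine for small point sets):

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3):
    for each point, the squared distance to its nearest neighbor in the
    other set, averaged over both directions."""
    # pairwise squared Euclidean distances, shape (N, M)
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1) ** 2
    return d.min(axis=1).mean() + d.min(axis=0).mean()
```

A lower value means the predicted surface samples lie closer to the ground-truth samples, which is why the 34% decrease above indicates a better reconstruction.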
arXiv Detail & Related papers (2020-10-17T00:51:21Z)
- DSGN: Deep Stereo Geometry Network for 3D Object Detection [79.16397166985706]
There is a large performance gap between image-based and LiDAR-based 3D object detectors.
Our method, called Deep Stereo Geometry Network (DSGN), significantly reduces this gap.
For the first time, we provide a simple and effective one-stage stereo-based 3D detection pipeline.
arXiv Detail & Related papers (2020-01-10T11:44:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.