Disentangling and Vectorization: A 3D Visual Perception Approach for
Autonomous Driving Based on Surround-View Fisheye Cameras
- URL: http://arxiv.org/abs/2107.08862v1
- Date: Mon, 19 Jul 2021 13:24:21 GMT
- Title: Disentangling and Vectorization: A 3D Visual Perception Approach for
Autonomous Driving Based on Surround-View Fisheye Cameras
- Authors: Zizhang Wu, Wenkai Zhang, Jizheng Wang, Man Wang, Yuanzhu Gan, Xinchao
Gou, Muqing Fang, Jing Song
- Abstract summary: A Multidimensional Vector is proposed to collect the usable information generated in different dimensions and stages.
Experiments on real fisheye images demonstrate that our solution achieves state-of-the-art accuracy while running in real time in practice.
- Score: 3.485767750936058
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D visual perception with a surround-view fisheye camera system is
a critical and challenging task for low-cost urban autonomous driving. Existing
monocular 3D object detection methods do not perform well enough on fisheye
images for mass production, partly due to the lack of 3D datasets of such
images. In this paper, we avoid the difficulty of acquiring large-scale,
accurately 3D-labeled ground-truth data by breaking the 3D object detection
task down into sub-tasks such as the vehicle's contact point detection, type
classification, re-identification and unit assembling. In particular, we
propose the concept of a Multidimensional Vector that collects the usable
information generated in different dimensions and stages, instead of describing
an object as a bird's eye view (BEV) box or a cube of eight points. Experiments
on real fisheye images demonstrate that our solution achieves state-of-the-art
accuracy while running in real time in practice.
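The abstract does not specify a concrete layout for the Multidimensional Vector, so the sketch below is only an illustration under assumed names: a per-vehicle record in Python that accumulates the outputs of the sub-tasks (contact-point detection, type classification, re-identification, unit assembling) instead of a BEV box or an eight-point cube.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class MultidimensionalVector:
    """Hypothetical per-vehicle record that accumulates sub-task outputs
    across cameras and pipeline stages (all names are illustrative only)."""
    camera_id: str                                   # which fisheye camera observed the vehicle
    contact_points: List[Tuple[float, float]] = field(default_factory=list)  # ground-contact points in image coords
    vehicle_type: Optional[str] = None               # e.g. "car" or "truck", from the classification stage
    reid_embedding: Optional[List[float]] = None     # appearance feature for cross-camera re-identification
    ground_position: Optional[Tuple[float, float]] = None  # (x, y) on the ground plane after unit assembling
    yaw: Optional[float] = None                      # heading estimated during assembling

    def is_complete(self) -> bool:
        """True once every stage has contributed its part of the vector."""
        return (bool(self.contact_points)
                and self.vehicle_type is not None
                and self.reid_embedding is not None
                and self.ground_position is not None)


# Usage sketch: each stage fills in its slice of the vector as it runs.
vec = MultidimensionalVector(camera_id="front_fisheye")
vec.contact_points = [(412.0, 655.0), (530.0, 662.0)]   # from contact-point detection
vec.vehicle_type = "car"                                 # from type classification
vec.reid_embedding = [0.12, -0.58, 0.33]                 # from re-identification
vec.ground_position, vec.yaw = (4.2, 1.1), 0.35          # from unit assembling
assert vec.is_complete()
```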
Related papers
- Weakly Supervised Monocular 3D Detection with a Single-View Image [58.57978772009438]
Monocular 3D detection aims for precise 3D object localization from a single-view image.
We propose SKD-WM3D, a weakly supervised monocular 3D detection framework.
We show that SKD-WM3D clearly surpasses the state of the art and is even on par with many fully supervised methods.
arXiv Detail & Related papers (2024-02-29T13:26:47Z)
- MagicDrive: Street View Generation with Diverse 3D Geometry Control [82.69871576797166]
We introduce MagicDrive, a novel street view generation framework, offering diverse 3D geometry controls.
Our design incorporates a cross-view attention module, ensuring consistency across multiple camera views.
arXiv Detail & Related papers (2023-10-04T06:14:06Z)
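The MagicDrive entry above mentions a cross-view attention module for keeping multiple camera views consistent. The paper's exact design is not reproduced here; the snippet below is only a generic sketch of attending across per-view feature tokens with PyTorch's stock multi-head attention, and all shapes and names are assumptions.

```python
import torch
import torch.nn as nn


class CrossViewAttention(nn.Module):
    """Generic sketch: tokens of every view attend to the tokens of all
    other views so that per-view features stay mutually consistent."""

    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, num_views, tokens_per_view, dim)
        b, v, t, d = feats.shape
        tokens = feats.reshape(b, v * t, d)           # flatten views into one token sequence
        fused, _ = self.attn(tokens, tokens, tokens)  # every token sees every view
        fused = self.norm(tokens + fused)             # residual connection + layer norm
        return fused.reshape(b, v, t, d)


# Example: 6 surround cameras, 100 tokens per view, 256-dim features.
x = torch.randn(2, 6, 100, 256)
y = CrossViewAttention()(x)
assert y.shape == x.shape
```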
- NeurOCS: Neural NOCS Supervision for Monocular 3D Object Localization [80.3424839706698]
We present NeurOCS, a framework that uses instance masks and 3D boxes as input to learn 3D object shapes by means of differentiable rendering.
Our approach rests on insights in learning a category-level shape prior directly from real driving scenes.
We make critical design choices to learn object coordinates more effectively from an object-centric view.
arXiv Detail & Related papers (2023-05-28T16:18:41Z)
- 3D Data Augmentation for Driving Scenes on Camera [50.41413053812315]
We propose a 3D data augmentation approach termed Drive-3DAug, aiming at augmenting the driving scenes on camera in the 3D space.
We first utilize Neural Radiance Field (NeRF) to reconstruct the 3D models of background and foreground objects.
Augmented driving scenes are then obtained by placing the 3D objects, with adapted locations and orientations, in pre-defined valid regions of the backgrounds.
arXiv Detail & Related papers (2023-03-18T05:51:05Z)
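The NeRF reconstruction step of Drive-3DAug is beyond a short snippet, but the placement step described above, dropping reconstructed objects into pre-defined valid regions, can be sketched as rejection sampling over a BEV validity mask. Everything below, including the mask convention and the separation threshold, is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_placements(valid_mask: np.ndarray, num_objects: int,
                      min_separation: float = 4.0) -> list[tuple[int, int, float]]:
    """Rejection-sample (row, col, yaw) placements inside a BEV validity mask,
    keeping placed objects at least `min_separation` cells apart."""
    rows, cols = np.nonzero(valid_mask)          # all valid BEV cells
    placements: list[tuple[int, int, float]] = []
    for _ in range(1000):                        # bounded number of attempts
        if len(placements) == num_objects:
            break
        i = rng.integers(len(rows))
        r, c = int(rows[i]), int(cols[i])
        if all(np.hypot(r - pr, c - pc) >= min_separation for pr, pc, _ in placements):
            placements.append((r, c, float(rng.uniform(0, 2 * np.pi))))
    return placements

# Toy BEV mask: a 200 x 200 grid whose central band is drivable.
mask = np.zeros((200, 200), dtype=bool)
mask[80:120, :] = True
print(sample_placements(mask, num_objects=3))
```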
- Surround-View Vision-based 3D Detection for Autonomous Driving: A Survey [0.6091702876917281]
We provide a literature survey of existing vision-based 3D detection methods, focused on autonomous driving.
We highlight how literature and industry trends have moved towards surround-view image-based methods, and note the special cases these methods address.
arXiv Detail & Related papers (2023-02-13T19:30:17Z)
- Learning Ego 3D Representation as Ray Tracing [42.400505280851114]
We present a novel end-to-end architecture for ego 3D representation learning from unconstrained camera views.
Inspired by the ray tracing principle, we design a polarized grid of "imaginary eyes" as the learnable ego 3D representation.
We show that our model outperforms all state-of-the-art alternatives significantly.
arXiv Detail & Related papers (2022-06-08T17:55:50Z)
- Image-to-Lidar Self-Supervised Distillation for Autonomous Driving Data [80.14669385741202]
We propose a self-supervised pre-training method for 3D perception models tailored to autonomous driving data.
We leverage the availability of synchronized and calibrated image and Lidar sensors in autonomous driving setups.
Our method does not require any point cloud or image annotations.
arXiv Detail & Related papers (2022-03-30T12:40:30Z)
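The distillation method above relies on synchronized, calibrated image and lidar sensors, which makes it possible to pair each lidar point with the pixel it projects onto. The snippet below sketches only that projection and pairing step with assumed calibration matrices, not the paper's training objective.

```python
import numpy as np

def project_lidar_to_image(points_lidar: np.ndarray, T_cam_lidar: np.ndarray,
                           K: np.ndarray, img_hw: tuple[int, int]) -> np.ndarray:
    """Project Nx3 lidar points into the image and return an Nx3 array of
    (u, v, valid), where valid marks points in front of the camera and inside
    the image. T_cam_lidar is a 4x4 extrinsic, K a 3x3 intrinsic matrix."""
    n = points_lidar.shape[0]
    pts_h = np.hstack([points_lidar, np.ones((n, 1))])      # homogeneous lidar coords
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]              # transform into the camera frame
    z = pts_cam[:, 2]
    uv = (K @ pts_cam.T).T
    uv = uv[:, :2] / np.clip(z[:, None], 1e-6, None)        # perspective division
    h, w = img_hw
    valid = (z > 0) & (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    return np.hstack([uv, valid[:, None].astype(float)])

# Toy example with an identity extrinsic and a simple pinhole intrinsic.
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
pts = np.array([[1.0, 0.5, 10.0], [0.0, 0.0, -5.0]])        # second point lies behind the camera
print(project_lidar_to_image(pts, np.eye(4), K, (480, 640)))
```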
- Monocular 3D Vehicle Detection Using Uncalibrated Traffic Cameras through Homography [12.062095895630563]
This paper proposes a method to extract the position and pose of vehicles in the 3D world from a single traffic camera.
We observe that the homography between the road plane and the image plane is essential to 3D vehicle detection.
We propose a new regression target called tailed r-box and a dual-view network architecture, which boosts the detection accuracy on warped BEV images.
arXiv Detail & Related papers (2021-03-29T02:57:37Z)
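The homography entry above treats the mapping between the road plane and the image plane as the key to 3D vehicle detection. The snippet below is a minimal illustration of that idea with OpenCV's standard homography routines; the point correspondences are invented for the example, not taken from the paper.

```python
import cv2
import numpy as np

# Four image points (pixels) and their known road-plane positions (metres).
# The correspondences here are purely illustrative.
img_pts = np.array([[320, 480], [960, 480], [820, 300], [460, 300]], dtype=np.float32)
road_pts = np.array([[-2.0, 5.0], [2.0, 5.0], [2.0, 20.0], [-2.0, 20.0]], dtype=np.float32)

# Homography from the image plane to the road plane (z = 0).
H, _ = cv2.findHomography(img_pts, road_pts)

# Map a detected vehicle contact point from the image onto the road plane.
contact_px = np.array([[[640.0, 400.0]]], dtype=np.float32)   # shape (1, 1, 2) for cv2
contact_road = cv2.perspectiveTransform(contact_px, H)
print("road-plane position (x, y) in metres:", contact_road.ravel())
```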
- Monocular Differentiable Rendering for Self-Supervised 3D Object Detection [21.825158925459732]
3D object detection from monocular images is an ill-posed problem due to the projective entanglement of depth and scale.
We present a novel self-supervised method for textured 3D shape reconstruction and pose estimation of rigid objects.
Our method predicts the 3D location and meshes of each object in an image using differentiable rendering and a self-supervised objective.
arXiv Detail & Related papers (2020-09-30T09:21:43Z)
- Kinematic 3D Object Detection in Monocular Video [123.7119180923524]
We propose a novel method for monocular video-based 3D object detection which carefully leverages kinematic motion to improve precision of 3D localization.
We achieve state-of-the-art performance on monocular 3D object detection and the Bird's Eye View tasks within the KITTI self-driving dataset.
arXiv Detail & Related papers (2020-07-19T01:15:12Z)
- 3D Object Detection from a Single Fisheye Image Without a Single Fisheye Training Image [7.86363825307044]
We show how to use existing monocular 3D object detection models, trained only on rectilinear images, to detect 3D objects in images from fisheye cameras.
We outperform the only existing method for monocular 3D object detection in panoramas on a benchmark of synthetic data.
arXiv Detail & Related papers (2020-03-08T11:03:05Z)
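The last entry applies detectors trained on rectilinear images to fisheye cameras. One standard way to bridge the two projections, which is not necessarily the authors' exact procedure, is to resample the fisheye image into one or more virtual rectilinear views; the sketch below uses OpenCV's fisheye model with placeholder calibration values.

```python
import cv2
import numpy as np

# Placeholder fisheye intrinsics K and 4-parameter distortion D; real values
# come from calibration of the surround-view cameras.
K = np.array([[400.0, 0, 640], [0, 400.0, 480], [0, 0, 1]])
D = np.array([0.05, -0.01, 0.001, 0.0])

# Precompute a remap from the fisheye image to a virtual rectilinear view.
# Reusing K as the pinhole matrix of the virtual view is a simplification.
map1, map2 = cv2.fisheye.initUndistortRectifyMap(
    K, D, np.eye(3), K, (1280, 960), cv2.CV_16SC2)

fisheye_frame = np.zeros((960, 1280, 3), dtype=np.uint8)   # stand-in for a camera frame
rectilinear = cv2.remap(fisheye_frame, map1, map2, interpolation=cv2.INTER_LINEAR)
# `rectilinear` can now be fed to a detector trained on ordinary pinhole images.
```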