SVDistNet: Self-Supervised Near-Field Distance Estimation on Surround
View Fisheye Cameras
- URL: http://arxiv.org/abs/2104.04420v1
- Date: Fri, 9 Apr 2021 15:20:20 GMT
- Title: SVDistNet: Self-Supervised Near-Field Distance Estimation on Surround
View Fisheye Cameras
- Authors: Varun Ravi Kumar, Marvin Klingner, Senthil Yogamani, Markus Bach,
Stefan Milz, Tim Fingscheidt and Patrick Mäder
- Abstract summary: A 360° perception of scene geometry is essential for automated driving, notably for parking and urban driving scenarios.
We present novel camera-geometry adaptive multi-scale convolutions which utilize the camera parameters as a conditional input.
We evaluate our approach on the Fisheye WoodScape surround-view dataset, significantly improving over previous approaches.
- Score: 30.480562747903186
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A 360° perception of scene geometry is essential for automated driving,
notably for parking and urban driving scenarios. Typically, it is achieved
using surround-view fisheye cameras, focusing on the near-field area around the
vehicle. The majority of current depth estimation approaches focus on employing
just a single camera, which cannot be straightforwardly generalized to multiple
cameras. In practice, the depth estimation model must handle the wide variety
of cameras fitted to millions of cars, each with its own camera geometry. Even
within a single car, intrinsics vary due to manufacturing tolerances. Deep learning
models are sensitive to these changes, and it is practically infeasible to
train and test on each camera variant. As a result, we present novel
camera-geometry adaptive multi-scale convolutions which utilize the camera
parameters as a conditional input, enabling the model to generalize to
previously unseen fisheye cameras. Additionally, we improve the distance
estimation by pairwise and patchwise vector-based self-attention encoder
networks. We evaluate our approach on the Fisheye WoodScape surround-view
dataset, significantly improving over previous approaches. We also show a
generalization of our approach across different camera viewing angles and
perform extensive experiments to support our contributions. To enable
comparison with other approaches, we evaluate the front camera data on the
KITTI dataset (pinhole camera images) and achieve state-of-the-art performance
among self-supervised monocular methods. An overview video with qualitative
results is provided at https://youtu.be/bmX0UcU9wtA. Baseline code and dataset
will be made public.
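The camera-geometry adaptive conditioning described in the abstract can be illustrated with a minimal sketch (not the authors' exact architecture): per-pixel maps derived from the intrinsics, here principal-point-centered normalized coordinates and an incident-angle map, are concatenated to the feature tensor so that a following convolution sees the camera parameters as a conditional input. The helper names and the choice of channels are assumptions for illustration only.

```python
import numpy as np

def camera_geometry_channels(h, w, fx, fy, cx, cy):
    """Hypothetical helper: build per-pixel camera-geometry maps from the
    intrinsics (fx, fy, cx, cy). Returns a (3, h, w) tensor of centered
    normalized x/y coordinates and the incident angle per pixel."""
    u, v = np.meshgrid(np.arange(w, dtype=np.float32),
                       np.arange(h, dtype=np.float32))  # pixel grid (h, w)
    x_c = (u - cx) / fx   # normalized, principal-point-centered x
    y_c = (v - cy) / fy   # normalized, principal-point-centered y
    theta = np.arctan(np.sqrt(x_c**2 + y_c**2))  # incident angle per pixel
    return np.stack([x_c, y_c, theta], axis=0)   # (3, h, w)

def condition_features(features, intrinsics):
    """Concatenate camera-geometry channels to a (C, H, W) feature map so
    the next convolution is conditioned on the camera parameters."""
    c, h, w = features.shape
    cam = camera_geometry_channels(h, w, *intrinsics)
    return np.concatenate([features, cam], axis=0)  # (C + 3, H, W)

feats = np.zeros((16, 48, 64), dtype=np.float32)
out = condition_features(feats, (120.0, 120.0, 32.0, 24.0))
print(out.shape)  # prints (19, 48, 64)
```

Because the geometry channels are recomputed from whatever intrinsics are passed in, the same convolution weights can be reused across cameras with different calibrations, which is the intuition behind generalizing to previously unseen fisheye cameras.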
Related papers
- FisheyeDetNet: 360° Surround view Fisheye Camera based Object Detection System for Autonomous Driving [4.972459365804512]
Object detection is a mature problem in autonomous driving with pedestrian detection being one of the first deployed algorithms.
Standard bounding box representation fails in fisheye cameras due to heavy radial distortion, particularly in the periphery.
We design rotated bounding boxes, ellipse, generic polygon as polar arc/angle representations and define an instance segmentation mIOU metric to analyze these representations.
The proposed FisheyeDetNet model with the polygon representation outperforms the others, achieving a mAP of 49.5% on the Valeo fisheye surround-view dataset for automated driving applications.
arXiv Detail & Related papers (2024-04-20T18:50:57Z)
- SGDE: Stereo Guided Depth Estimation for 360° Camera Sets [65.64958606221069]
Multi-camera systems are often used in autonomous driving to achieve a 360° perception.
These 360° camera sets often have limited or low-quality overlap regions, making multi-view stereo methods infeasible for the entire image.
We propose the Stereo Guided Depth Estimation (SGDE) method, which enhances depth estimation of the full image by explicitly utilizing multi-view stereo results on the overlap.
arXiv Detail & Related papers (2024-02-19T02:41:37Z)
- FoVA-Depth: Field-of-View Agnostic Depth Estimation for Cross-Dataset Generalization [57.98448472585241]
We propose a method to train a stereo depth estimation model on the widely available pinhole data.
We show strong generalization ability of our approach on both indoor and outdoor datasets.
arXiv Detail & Related papers (2024-01-24T20:07:59Z)
- Towards Viewpoint Robustness in Bird's Eye View Segmentation [85.99907496019972]
We study how AV perception models are affected by changes in camera viewpoint.
Small changes to pitch, yaw, depth, or height of the camera at inference time lead to large drops in performance.
We introduce a technique for novel view synthesis and use it to transform collected data to the viewpoint of target rigs.
arXiv Detail & Related papers (2023-09-11T02:10:07Z)
- Surround-view Fisheye Camera Perception for Automated Driving: Overview, Survey and Challenges [1.4452405977630436]
Four fisheye cameras, one on each side of the vehicle, are sufficient to cover 360° around the vehicle, capturing the entire near-field region.
Some primary use cases are automated parking, traffic jam assist, and urban driving.
Due to the large radial distortion of fisheye cameras, standard algorithms cannot be easily extended to the surround-view use case.
arXiv Detail & Related papers (2022-05-26T11:38:04Z)
- SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation [101.55622133406446]
We propose the SurroundDepth method to incorporate information from multiple surrounding views and predict depth maps across cameras.
Specifically, we employ a joint network to process all the surrounding views and propose a cross-view transformer to effectively fuse the information from multiple views.
In experiments, our method achieves the state-of-the-art performance on the challenging multi-camera depth estimation datasets.
arXiv Detail & Related papers (2022-04-07T17:58:47Z)
- Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task [48.555440807415664]
We present Rope3D, the first high-diversity, challenging roadside perception 3D dataset, captured from a novel view.
The dataset consists of 50k images and over 1.5M 3D objects in various scenes.
We propose to leverage geometry constraints to resolve the inherent ambiguities caused by varying sensors and viewpoints.
arXiv Detail & Related papers (2022-03-25T12:13:23Z)
- MetaPose: Fast 3D Pose from Multiple Views without 3D Supervision [72.5863451123577]
We show how to train a neural model that can perform accurate 3D pose and camera estimation.
Our method outperforms both classical bundle adjustment and weakly-supervised monocular 3D baselines.
arXiv Detail & Related papers (2021-08-10T18:39:56Z)
- OmniDet: Surround View Cameras based Multi-task Visual Perception Network for Autonomous Driving [10.3540046389057]
This work presents a multi-task visual perception network on unrectified fisheye images.
It consists of six primary tasks necessary for an autonomous driving system.
We demonstrate that the jointly trained model performs better than the respective single task versions.
arXiv Detail & Related papers (2021-02-15T10:46:24Z)
- Generalized Object Detection on Fisheye Cameras for Autonomous Driving: Dataset, Representations and Baseline [5.1450366450434295]
We explore better representations like oriented bounding box, ellipse, and generic polygon for object detection in fisheye images.
We design a novel curved bounding box model that has optimal properties for fisheye distortion models.
It is the first detailed study on object detection on fisheye cameras for autonomous driving scenarios.
arXiv Detail & Related papers (2020-12-03T18:00:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.