Automatic Map Update Using Dashcam Videos
- URL: http://arxiv.org/abs/2109.12131v1
- Date: Fri, 24 Sep 2021 18:00:57 GMT
- Title: Automatic Map Update Using Dashcam Videos
- Authors: Aziza Zhanabatyrova, Clayton Souza Leite, Yu Xiao
- Abstract summary: We propose an SfM-based solution for automatic map update, with a focus on real-time change detection and localization.
Our system can locate the objects detected from 2D images in a 3D space, utilizing sparse SfM point clouds.
- Score: 1.6911482053867475
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autonomous driving requires 3D maps that provide accurate and up-to-date
information about semantic landmarks. Due to the wider availability and lower
cost of cameras compared with laser scanners, vision-based mapping has
attracted much attention from academia and industry. Among the existing
solutions, Structure-from-Motion (SfM) technology has proved to be feasible for
building 3D maps from crowdsourced data, since it allows unordered images as
input. Previous works on SfM have mainly focused on issues related to building
3D point clouds and calculating camera poses, leaving the issues of automatic
change detection and localization open.
We propose in this paper an SfM-based solution for automatic map update, with
a focus on real-time change detection and localization. Our solution builds on
comparison of semantic map data (e.g. types and locations of traffic signs).
Through a novel design of the pixel-wise 3D localization algorithm, our system
can locate the objects detected from 2D images in a 3D space, utilizing sparse
SfM point clouds. Experiments with dashcam videos collected from two urban
areas show that the system is able to locate traffic signs visible ahead
along the driving direction with a median distance error of 1.52 meters.
Moreover, it can detect up to 80% of the changes with a median distance error
of 2.21 meters. The result analysis also shows the potential of significantly
improving the system performance in the future by increasing the accuracy of
the background technology in use, in particular the object
detection and point cloud geo-registration algorithms.
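The two ideas named in the abstract, placing a 2D detection in 3D with the help of sparse SfM points and detecting changes by comparing semantic map data, can be illustrated with a minimal sketch. This is not the authors' pixel-wise algorithm, which the abstract does not describe. It swaps in a simple stand-in: take the per-axis median of the SfM points whose image projections fall inside the detection box, then match observed signs to mapped signs of the same type within a distance threshold. The function names, the data layout, and the 3.0 m threshold are all assumptions made for illustration.

```python
import math
from statistics import median


def locate_in_3d(box, projected_points):
    """Estimate a 3D position for a 2D detection box (illustrative stand-in).

    box: (x_min, y_min, x_max, y_max) in pixels.
    projected_points: list of ((u, v), (x, y, z)) pairs, i.e. each sparse
    SfM point together with its pixel projection in the same image.
    Returns the per-axis median of the points that project inside the box,
    or None if no point does.
    """
    x_min, y_min, x_max, y_max = box
    inside = [
        point
        for (u, v), point in projected_points
        if x_min <= u <= x_max and y_min <= v <= y_max
    ]
    if not inside:
        return None
    return tuple(median(axis) for axis in zip(*inside))


def detect_changes(map_signs, observed_signs, max_dist=3.0):
    """Compare mapped and observed signs, each a list of (type, (x, y, z)).

    Greedy matching: every observed sign is paired with the nearest unmatched
    map sign of the same type within max_dist meters (hypothetical threshold).
    Returns (added, removed): observed signs with no match in the map, and
    map signs that no observation matched.
    """
    unmatched = list(map_signs)
    added = []
    for sign_type, pos in observed_signs:
        candidates = [
            m for m in unmatched
            if m[0] == sign_type and math.dist(m[1], pos) <= max_dist
        ]
        if candidates:
            nearest = min(candidates, key=lambda m: math.dist(m[1], pos))
            unmatched.remove(nearest)
        else:
            added.append((sign_type, pos))
    return added, unmatched
```

In this sketch, a sign reported as removed only means that no observation matched it. A real system would also need to check that the sign's location was actually in view before removing it from the map.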
Related papers
- HeightFormer: A Semantic Alignment Monocular 3D Object Detection Method from Roadside Perspective [11.841338298700421]
We propose a novel 3D object detection framework integrating Spatial Former and Voxel Pooling Former to enhance 2D-to-3D projection based on height estimation.
Experiments were conducted using the Rope3D and DAIR-V2X-I datasets, and the results demonstrated the superior performance of the proposed algorithm in detecting both vehicles and cyclists.
arXiv Detail & Related papers (2024-10-10T09:37:33Z)
- OCTraN: 3D Occupancy Convolutional Transformer Network in Unstructured Traffic Scenarios [0.0]
We propose OCTraN, a transformer architecture that uses iterative-attention to convert 2D image features into 3D occupancy features.
We also develop a self-supervised training pipeline to generalize the model to any scene by eliminating the need for LiDAR ground truth.
arXiv Detail & Related papers (2023-07-20T15:06:44Z)
- 3D Data Augmentation for Driving Scenes on Camera [50.41413053812315]
We propose a 3D data augmentation approach termed Drive-3DAug, aiming at augmenting the driving scenes on camera in the 3D space.
We first utilize Neural Radiance Field (NeRF) to reconstruct the 3D models of background and foreground objects.
Then, augmented driving scenes can be obtained by placing the 3D objects with adapted location and orientation at the pre-defined valid region of backgrounds.
arXiv Detail & Related papers (2023-03-18T05:51:05Z)
- Unleash the Potential of Image Branch for Cross-modal 3D Object Detection [67.94357336206136]
We present a new cross-modal 3D object detector, namely UPIDet, which aims to unleash the potential of the image branch from two aspects.
First, UPIDet introduces a new 2D auxiliary task called normalized local coordinate map estimation.
Second, we discover that the representational capability of the point cloud backbone can be enhanced through the gradients backpropagated from the training objectives of the image branch.
arXiv Detail & Related papers (2023-01-22T08:26:58Z)
- Sparse Semantic Map-Based Monocular Localization in Traffic Scenes Using Learned 2D-3D Point-Line Correspondences [29.419138863851526]
Given a query image, the goal is to estimate the camera pose corresponding to the prior map.
Existing approaches rely heavily on dense point descriptors at the feature level to solve the registration problem.
We propose a sparse semantic map-based monocular localization method, which solves 2D-3D registration via a well-designed deep neural network.
arXiv Detail & Related papers (2022-10-10T10:29:07Z)
- Soft Expectation and Deep Maximization for Image Feature Detection [68.8204255655161]
We propose SEDM, an iterative semi-supervised learning process that flips the question and first looks for repeatable 3D points, then trains a detector to localize them in image space.
Our results show that this new model trained using SEDM is able to better localize the underlying 3D points in a scene.
arXiv Detail & Related papers (2021-04-21T00:35:32Z)
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
- Monocular Vision based Crowdsourced 3D Traffic Sign Positioning with Unknown Camera Intrinsics and Distortion Coefficients [11.38332845467423]
We demonstrate an approach to computing 3D traffic sign positions without knowing the camera focal lengths, principal point, and distortion coefficients a priori.
We achieve an average single journey relative and absolute positioning accuracy of 0.26 m and 1.38 m, respectively.
arXiv Detail & Related papers (2020-07-09T07:03:17Z)
- ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection [69.68263074432224]
We present a novel framework named ZoomNet for stereo imagery-based 3D detection.
The pipeline of ZoomNet begins with an ordinary 2D object detection model which is used to obtain pairs of left-right bounding boxes.
To further exploit the abundant texture cues in RGB images for more accurate disparity estimation, we introduce a conceptually straight-forward module -- adaptive zooming.
arXiv Detail & Related papers (2020-03-01T17:18:08Z)
- Road Curb Detection and Localization with Monocular Forward-view Vehicle Camera [74.45649274085447]
We propose a robust method for estimating road curb 3D parameters using a calibrated monocular camera equipped with a fisheye lens.
Our approach is able to estimate the vehicle to curb distance in real time with mean accuracy of more than 90%.
arXiv Detail & Related papers (2020-02-28T00:24:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.