Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data
Augmentation
- URL: http://arxiv.org/abs/2012.08055v2
- Date: Wed, 6 Jan 2021 09:04:58 GMT
- Title: Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data
Augmentation
- Authors: Feixiang Lu, Zongdai Liu, Hui Miao, Peng Wang, Liangjun Zhang, Ruigang
Yang, Dinesh Manocha, Bin Zhou
- Abstract summary: We propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images.
Our approach is fully automatic, requiring no human intervention.
We present a multi-task network for VUS parsing and a multi-stream network for VHI parsing.
- Score: 77.60050239225086
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Holistically understanding an object and its 3D movable parts through visual
perception models is essential for enabling an autonomous agent to interact
with the world. For autonomous driving, the dynamics and states of vehicle
parts such as doors, the trunk, and the bonnet can provide meaningful semantic
information and interaction states, which are essential to ensuring the safety
of the self-driving vehicle. Existing visual perception models mainly focus on
coarse parsing such as object bounding box detection or pose estimation and
rarely tackle these situations. In this paper, we address this important
autonomous driving problem by solving three critical issues. First, to deal
with data scarcity, we propose an effective training data generation process by
fitting a 3D car model with dynamic parts to vehicles in real images before
reconstructing human-vehicle interaction (VHI) scenarios. Our approach is fully
automatic, requiring no human intervention, and can generate a large number of
vehicles in uncommon states (VUS) for training deep neural networks (DNNs).
Second, to perform fine-grained vehicle perception, we present a multi-task
network for VUS parsing and a multi-stream network for VHI parsing. Third, to
quantitatively evaluate the effectiveness of our data augmentation approach, we
build the first VUS dataset in real traffic scenarios (e.g., getting on/out or
placing/removing luggage). Experimental results show that our approach
outperforms baseline methods in 2D detection and instance segmentation by a
large margin (over 8%). In addition, our network yields large improvements in
discovering and understanding these uncommon cases. Moreover, we have released
the source code, the dataset, and the trained model on GitHub
(https://github.com/zongdai/EditingForDNN).
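To make the data generation idea concrete, below is a minimal sketch of the final compositing step, assuming the 3D fitting and rendering stages produce an RGBA layer of the re-posed vehicle. The helper names (`fit_3d_model`, `render`) are hypothetical placeholders for illustration, not the released implementation:

```python
import numpy as np

def composite_rendered_vehicle(real_image: np.ndarray,
                               rendered_rgba: np.ndarray) -> np.ndarray:
    """Alpha-composite a rendered 3D car (with articulated parts, e.g. an
    opened trunk) onto a real traffic image to synthesize a VUS sample.

    real_image:    H x W x 3 uint8 background photo.
    rendered_rgba: H x W x 4 float32 render in [0, 1]; the alpha channel
                   masks the re-posed vehicle and its opened parts.
    """
    rgb = rendered_rgba[..., :3]
    alpha = rendered_rgba[..., 3:4]  # 0 = keep background, 1 = rendered pixel
    background = real_image.astype(np.float32) / 255.0
    out = alpha * rgb + (1.0 - alpha) * background
    return (out * 255.0).clip(0, 255).astype(np.uint8)

# Sketch of the overall generation loop (fitting/rendering not shown;
# `fit_3d_model` and `render` are hypothetical stand-ins):
# for image, vehicle_box in detections:
#     pose = fit_3d_model(image, vehicle_box)   # fit 3D car model to image
#     pose.open_part("trunk", angle_deg=70)     # articulate a dynamic part
#     rgba = render(pose, image.shape[:2])      # render the edited model
#     training_set.append(composite_rendered_vehicle(image, rgba))
```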
Related papers
- V2X-AHD: Vehicle-to-Everything Cooperation Perception via Asymmetric
Heterogenous Distillation Network [13.248981195106069]
We propose V2X-AHD, a multi-view vehicle-road cooperative perception system for vehicle-to-everything perception.
V2X-AHD improves the accuracy of 3D object detection while reducing the number of network parameters.
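As a rough illustration of the distillation idea, a vehicle-side student can be trained to mimic fused multi-view (vehicle plus infrastructure) teacher features. This is a generic feature-level distillation sketch, not necessarily the paper's exact asymmetric formulation:

```python
import torch
import torch.nn.functional as F

def feature_distillation_loss(student_feat: torch.Tensor,
                              teacher_feat: torch.Tensor) -> torch.Tensor:
    """Student (single-view, vehicle side) mimics the teacher (fused
    multi-view) feature map; the teacher is frozen via detach()."""
    return F.mse_loss(student_feat, teacher_feat.detach())

# Toy BEV feature maps (batch, channels, height, width):
student = torch.randn(2, 64, 128, 128, requires_grad=True)
teacher = torch.randn(2, 64, 128, 128)
loss = feature_distillation_loss(student, teacher)
loss.backward()  # gradients flow only into the student features
```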
arXiv Detail & Related papers (2023-10-10T13:12:03Z)
- ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
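A minimal sketch of such an occupancy-style pretext head is shown below; the architecture and sampling scheme are generic stand-ins, not the paper's exact design. Query points near the measured surface are labeled occupied, and points sampled along the free ray are labeled free:

```python
import torch
import torch.nn as nn

class OccupancyHead(nn.Module):
    """Pretext head: from a per-point backbone feature and a 3D query
    offset, predict whether the query location is occupied (on/inside the
    surface) or free. A generic stand-in for the reconstruction pretext."""
    def __init__(self, feat_dim: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 3, 128), nn.ReLU(),
            nn.Linear(128, 1),  # occupancy logit
        )

    def forward(self, point_feats: torch.Tensor,
                query_offsets: torch.Tensor) -> torch.Tensor:
        return self.mlp(torch.cat([point_feats, query_offsets], dim=-1))

# Toy pre-training step on 1024 lidar points:
head = OccupancyHead()
feats = torch.randn(1024, 64)         # backbone features, one per point
offsets = torch.randn(1024, 3) * 0.1  # query offsets around each point (m)
labels = torch.randint(0, 2, (1024, 1)).float()  # 1 = occupied, 0 = free
loss = nn.functional.binary_cross_entropy_with_logits(
    head(feats, offsets), labels)
```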
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
- DOLPHINS: Dataset for Collaborative Perception enabled Harmonious and Interconnected Self-driving [19.66714697653504]
Vehicle-to-Everything (V2X) networks have enabled collaborative perception in autonomous driving.
The lack of datasets, however, has severely hindered the development of collaborative perception algorithms.
We release DOLPHINS: dataset for cOllaborative Perception enabled Harmonious and INterconnected Self-driving.
arXiv Detail & Related papers (2022-07-15T17:07:07Z)
- Weakly Supervised Training of Monocular 3D Object Detectors Using Wide Baseline Multi-view Traffic Camera Data [19.63193201107591]
7DoF prediction of vehicles at an intersection is an important task for assessing potential conflicts between road users.
We develop a weakly supervised method for fine-tuning 3D object detectors using traffic observation cameras.
Our method achieves vehicle 7DoF pose prediction accuracy on our dataset comparable to the top performing monocular 3D object detectors on autonomous vehicle datasets.
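For reference, a common 7DoF parameterization in this setting is 3D center, 3D box dimensions, and yaw; the sketch below uses that convention as an assumption (the paper may parameterize the pose differently):

```python
from dataclasses import dataclass
import math

@dataclass
class VehiclePose7DoF:
    """7DoF vehicle pose: 3D center + 3D box dimensions + yaw (a common
    convention in monocular 3D detection; assumed, not paper-specific)."""
    x: float
    y: float
    z: float        # box center (m), e.g. in camera or ground coordinates
    length: float
    width: float
    height: float   # box dimensions (m)
    yaw: float      # heading around the vertical axis (rad)

    def bev_corners(self):
        """Bird's-eye-view corners, e.g. for assessing potential conflicts
        between road users at an intersection."""
        c, s = math.cos(self.yaw), math.sin(self.yaw)
        half = [(self.length / 2, self.width / 2),
                (self.length / 2, -self.width / 2),
                (-self.length / 2, -self.width / 2),
                (-self.length / 2, self.width / 2)]
        return [(self.x + c * dx - s * dy, self.z + s * dx + c * dy)
                for dx, dy in half]
```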
arXiv Detail & Related papers (2021-10-21T08:26:48Z)
- One Million Scenes for Autonomous Driving: ONCE Dataset [91.94189514073354]
We introduce the ONCE dataset for 3D object detection in the autonomous driving scenario.
The data is selected from 144 driving hours, 20x more than in the largest 3D autonomous driving dataset available.
We reproduce and evaluate a variety of self-supervised and semi-supervised methods on the ONCE dataset.
arXiv Detail & Related papers (2021-06-21T12:28:08Z)
- Learnable Online Graph Representations for 3D Multi-Object Tracking [156.58876381318402]
We propose a unified, learning-based approach to the 3D MOT problem.
We employ a Neural Message Passing network for data association that is fully trainable.
We show the merit of the proposed approach on the publicly available nuScenes dataset by achieving state-of-the-art performance of 65.6% AMOTA and 58% fewer ID-switches.
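The sketch below shows one generic round of neural message passing on a track-detection graph, where edge features score candidate associations; it illustrates the idea rather than reproducing the paper's exact architecture:

```python
import torch
import torch.nn as nn

class EdgeMessagePassing(nn.Module):
    """One message-passing round on a track-detection association graph:
    each edge (candidate match) is updated from the two node features it
    connects, then classified as match / no-match. A generic sketch."""
    def __init__(self, node_dim: int = 32, edge_dim: int = 32):
        super().__init__()
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * node_dim + edge_dim, 64), nn.ReLU(),
            nn.Linear(64, edge_dim),
        )
        self.classify = nn.Linear(edge_dim, 1)  # association logit per edge

    def forward(self, nodes, edge_index, edges):
        src, dst = edge_index                    # (2, E): track -> detection
        msg = torch.cat([nodes[src], nodes[dst], edges], dim=-1)
        edges = edges + self.edge_mlp(msg)       # residual edge update
        return edges, self.classify(edges)

# Toy graph: nodes 0-2 are existing tracks, nodes 3-5 are new detections.
nodes = torch.randn(6, 32)
edge_index = torch.tensor([[0, 1, 2], [3, 4, 5]])  # candidate associations
edges = torch.randn(3, 32)
edges, logits = EdgeMessagePassing()(nodes, edge_index, edges)
```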
arXiv Detail & Related papers (2021-04-23T17:59:28Z)
- PerMO: Perceiving More at Once from a Single Image for Autonomous Driving [76.35684439949094]
We present a novel approach to detect, segment, and reconstruct complete textured 3D models of vehicles from a single image.
Our approach combines the strengths of deep learning and the elegance of traditional techniques.
We have integrated these algorithms with an autonomous driving system.
arXiv Detail & Related papers (2020-07-16T05:02:45Z)
- VehicleNet: Learning Robust Visual Representation for Vehicle Re-identification [116.1587709521173]
We propose to build a large-scale vehicle dataset (called VehicleNet) by harnessing four public vehicle datasets.
We design a simple yet effective two-stage progressive approach to learn a more robust visual representation from VehicleNet.
We achieve state-of-the-art accuracy of 86.07% mAP on the private test set of the AICity Challenge.
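A minimal sketch of such a two-stage progressive recipe is given below (generic re-ID training with caller-supplied data loaders; in a full recipe the classifier head would also be re-initialized for the target ID space, which is omitted here):

```python
import torch

def progressive_train(model, merged_loader, target_loader,
                      stage1_epochs: int = 10, stage2_epochs: int = 5):
    """Stage 1: learn a broad representation from the merged large-scale
    data. Stage 2: fine-tune on the target benchmark at a lower learning
    rate. A generic sketch of a two-stage progressive recipe."""
    criterion = torch.nn.CrossEntropyLoss()
    for lr, loader, epochs in [(1e-2, merged_loader, stage1_epochs),
                               (1e-3, target_loader, stage2_epochs)]:
        opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
        for _ in range(epochs):
            for images, ids in loader:  # (batch images, identity labels)
                opt.zero_grad()
                criterion(model(images), ids).backward()
                opt.step()
```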
arXiv Detail & Related papers (2020-04-14T05:06:38Z)