DeepSTEP -- Deep Learning-Based Spatio-Temporal End-To-End Perception
for Autonomous Vehicles
- URL: http://arxiv.org/abs/2305.06820v1
- Date: Thu, 11 May 2023 14:13:37 GMT
- Authors: Sebastian Huch, Florian Sauerbeck, Johannes Betz
- Abstract summary: We present our concept for an end-to-end perception architecture, named DeepSTEP.
DeepSTEP processes raw sensor data from the camera, LiDAR, and RaDAR, and combines the extracted data in a deep fusion network.
The architecture's end-to-end design, time-aware attention mechanism, and integration of multiple perception tasks make it a promising solution for real-world deployment.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autonomous vehicles demand high accuracy and robustness of perception
algorithms. To develop efficient and scalable perception algorithms, the
maximum information should be extracted from the available sensor data. In this
work, we present our concept for an end-to-end perception architecture, named
DeepSTEP. The deep learning-based architecture processes raw sensor data from
the camera, LiDAR, and RaDAR, and combines the extracted data in a deep fusion
network. The output of this deep fusion network is a shared feature space,
which is used by perception head networks to fulfill several perception tasks,
such as object detection or local mapping. DeepSTEP incorporates multiple ideas
to advance the state of the art: first, combining detection and localization into a
single pipeline enables efficient processing, reducing computational overhead and
improving overall performance; second, the architecture leverages the temporal
domain through a self-attention mechanism that focuses on the most important
features. We believe that our concept of DeepSTEP will
advance the development of end-to-end perception systems. The network will be
deployed on our research vehicle, which will be used as a platform for data
collection, real-world testing, and validation. In conclusion, DeepSTEP
represents a significant advancement in the field of perception for autonomous
vehicles. The architecture's end-to-end design, time-aware attention mechanism,
and integration of multiple perception tasks make it a promising solution for
real-world deployment. This research is a work in progress and presents the
first concept of establishing a novel perception pipeline.
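The pipeline sketched in the abstract -- per-sensor feature extraction, deep fusion into a shared feature space, temporal self-attention, and task-specific perception heads -- can be illustrated with a minimal NumPy sketch. The concatenation-based fusion, the linear heads, and all dimensions below are illustrative assumptions for exposition, not the authors' implementation:

```python
import numpy as np

def deep_fusion(camera_feat, lidar_feat, radar_feat):
    """Toy 'deep fusion': concatenate per-sensor features into a shared space."""
    return np.concatenate([camera_feat, lidar_feat, radar_feat], axis=-1)

def temporal_self_attention(features):
    """Scaled dot-product self-attention over the time axis.

    features: array of shape (T, D), one shared feature vector per time step.
    Returns the same shape, where each step attends to all time steps.
    """
    T, D = features.shape
    scores = features @ features.T / np.sqrt(D)     # (T, T) similarities
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ features                       # (T, D) attended features

class PerceptionHead:
    """Linear stand-in for a task-specific head (detection, local mapping, ...)."""
    def __init__(self, d_in, d_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d_in, d_out)) / np.sqrt(d_in)

    def __call__(self, shared):
        return shared @ self.W

# Example: 4 time steps, 8-dim features per sensor -> 24-dim shared space.
rng = np.random.default_rng(42)
cam, lidar, radar = (rng.standard_normal((4, 8)) for _ in range(3))
shared = deep_fusion(cam, lidar, radar)          # (4, 24) shared feature space
attended = temporal_self_attention(shared)       # (4, 24) time-aware features
detections = PerceptionHead(24, 7)(attended)     # e.g. 7 box parameters
local_map = PerceptionHead(24, 16, seed=1)(attended)
print(shared.shape, attended.shape, detections.shape, local_map.shape)
```

The point of the sketch is structural: all heads consume the same attended feature space, which is what lets a single pipeline serve detection and localization at once.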
Related papers
- V2X-AHD: Vehicle-to-Everything Cooperation Perception via Asymmetric
Heterogenous Distillation Network [13.248981195106069]
We propose a multi-view vehicle-road cooperation perception system, vehicle-to-everything cooperative perception (V2X-AHD).
The V2X-AHD can effectively improve the accuracy of 3D object detection and reduce the number of network parameters, according to this study.
arXiv Detail & Related papers (2023-10-10T13:12:03Z) - Spatial-Temporal Graph Enhanced DETR Towards Multi-Frame 3D Object Detection [54.041049052843604]
We present STEMD, a novel end-to-end framework that enhances the DETR-like paradigm for multi-frame 3D object detection.
First, to model the inter-object spatial interaction and complex temporal dependencies, we introduce the spatial-temporal graph attention network.
Finally, it poses a challenge for the network to distinguish between the positive query and other highly similar queries that are not the best match.
arXiv Detail & Related papers (2023-07-01T13:53:14Z) - A Threefold Review on Deep Semantic Segmentation: Efficiency-oriented,
Temporal and Depth-aware design [77.34726150561087]
We conduct a survey on the most relevant and recent advances in deep semantic segmentation in the context of vision for autonomous vehicles.
Our main objective is to provide a comprehensive discussion on the main methods, advantages, limitations, results and challenges faced from each perspective.
arXiv Detail & Related papers (2023-03-08T01:29:55Z) - VINet: Lightweight, Scalable, and Heterogeneous Cooperative Perception
for 3D Object Detection [15.195933965761645]
Cooperative Perception (CP) has emerged to significantly advance the perception of automated driving.
We introduce VINet, a unified deep learning-based CP network for scalable, lightweight, and heterogeneous cooperative 3D object detection.
VINet reduces system-level computational cost by 84% and system-level communication cost by 94% while improving 3D detection accuracy.
arXiv Detail & Related papers (2022-12-14T07:03:23Z) - Lightweight Monocular Depth Estimation with an Edge Guided Network [34.03711454383413]
We present a novel lightweight Edge Guided Depth Estimation Network (EGD-Net).
In particular, we start out with a lightweight encoder-decoder architecture and embed an edge guidance branch.
In order to aggregate the context information and edge attention features, we design a transformer-based feature aggregation module.
arXiv Detail & Related papers (2022-09-29T14:45:47Z) - AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z) - ST-P3: End-to-end Vision-based Autonomous Driving via Spatial-Temporal
Feature Learning [132.20119288212376]
We propose a spatial-temporal feature learning scheme towards a set of more representative features for perception, prediction and planning tasks simultaneously.
To the best of our knowledge, we are the first to systematically investigate each part of an interpretable end-to-end vision-based autonomous driving system.
arXiv Detail & Related papers (2022-07-15T16:57:43Z) - RealNet: Combining Optimized Object Detection with Information Fusion
Depth Estimation Co-Design Method on IoT [2.9275056713717285]
We propose a co-design method combining the model-streamlined recognition algorithm, the depth estimation algorithm, and information fusion.
The proposed method is suitable for mobile platforms with strict real-time requirements.
arXiv Detail & Related papers (2022-04-24T08:35:55Z) - On Deep Learning Techniques to Boost Monocular Depth Estimation for
Autonomous Navigation [1.9007546108571112]
Inferring the depth of images is a fundamental inverse problem within the field of Computer Vision.
We propose a new lightweight and fast supervised CNN architecture combined with novel feature extraction models.
We also introduce an efficient surface normals module, jointly with a simple geometric 2.5D loss function, to solve single image depth estimation (SIDE) problems.
arXiv Detail & Related papers (2020-10-13T18:37:38Z) - Depthwise Non-local Module for Fast Salient Object Detection Using a
Single Thread [136.2224792151324]
We propose a new deep learning algorithm for fast salient object detection.
The proposed algorithm achieves competitive accuracy and high inference efficiency simultaneously with a single CPU thread.
arXiv Detail & Related papers (2020-01-22T15:23:48Z) - Deep Learning based Pedestrian Inertial Navigation: Methods, Dataset and
On-Device Inference [49.88536971774444]
Inertial measurement units (IMUs) are small, cheap, energy efficient, and widely employed in smart devices and mobile robots.
Exploiting inertial data for accurate and reliable pedestrian navigation is a key component of emerging Internet-of-Things applications and services.
We present and release the Oxford Inertial Odometry dataset (OxIOD), a first-of-its-kind public dataset for deep learning based inertial navigation research.
arXiv Detail & Related papers (2020-01-13T04:41:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.