Multi-Stream Attention Learning for Monocular Vehicle Velocity and
Inter-Vehicle Distance Estimation
- URL: http://arxiv.org/abs/2110.11608v1
- Date: Fri, 22 Oct 2021 06:14:12 GMT
- Title: Multi-Stream Attention Learning for Monocular Vehicle Velocity and
Inter-Vehicle Distance Estimation
- Authors: Kuan-Chih Huang, Yu-Kai Huang, Winston H. Hsu
- Abstract summary: Vehicle velocity and inter-vehicle distance estimation are essential for ADAS (Advanced driver-assistance systems) and autonomous vehicles.
Recent studies focus on using a low-cost monocular camera to perceive the environment around the vehicle in a data-driven fashion.
MSANet is proposed to extract different aspects of features, e.g., spatial and contextual features, for joint vehicle velocity and inter-vehicle distance estimation.
- Score: 25.103483428654375
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vehicle velocity and inter-vehicle distance estimation are essential for ADAS
(Advanced driver-assistance systems) and autonomous vehicles. To save the cost
of expensive ranging sensors, recent studies focus on using a low-cost
monocular camera to perceive the environment around the vehicle in a
data-driven fashion. Existing approaches treat each vehicle independently during
perception, which leads to inconsistent estimates. Furthermore, important
information such as the context and spatial relations available from 2D object
detection is often neglected in the velocity estimation pipeline. In this paper,
we explore the relationships among vehicles in the same frame with a
global-relative-constraint (GLC) loss to encourage consistent estimation. A
novel multi-stream attention network (MSANet) is proposed to extract different
aspects of features, e.g., spatial and contextual features, for joint vehicle
velocity and inter-vehicle distance estimation. Experiments show the
effectiveness and robustness of our proposed approach: MSANet outperforms
state-of-the-art algorithms on both the KITTI dataset and the TuSimple velocity
dataset.
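The abstract names two ingredients: a multi-stream (spatial plus contextual) attention fusion and a global-relative-constraint loss over vehicles in the same frame. The PyTorch sketch below illustrates one plausible reading of those two ideas under stated assumptions; it is not the authors' released code. The feature dimensions, the two-output head, and the exact pairwise form of the GLC loss are assumptions made only for this example.

```python
# Illustrative sketch only -- a plausible reading of a two-stream attention fusion
# and a GLC-style pairwise consistency loss, not the paper's actual MSANet/GLC code.
import torch
import torch.nn as nn


class TwoStreamAttentionFusion(nn.Module):
    """Fuses a spatial and a contextual feature vector with learned per-stream weights."""

    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.spatial_proj = nn.Linear(feat_dim, feat_dim)
        self.context_proj = nn.Linear(feat_dim, feat_dim)
        # One scalar attention weight per stream, normalized with softmax.
        self.attn = nn.Linear(2 * feat_dim, 2)
        self.head = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, 2))

    def forward(self, spatial_feat, context_feat):
        s = torch.relu(self.spatial_proj(spatial_feat))   # (N, D)
        c = torch.relu(self.context_proj(context_feat))   # (N, D)
        w = torch.softmax(self.attn(torch.cat([s, c], dim=-1)), dim=-1)  # (N, 2)
        fused = w[:, :1] * s + w[:, 1:] * c               # weighted sum of the two streams
        return self.head(fused)                           # (N, 2): assumed [velocity, distance]


def global_relative_constraint_loss(pred_dist, gt_dist):
    """One plausible GLC-style term: the predicted relative distance between every
    pair of vehicles in the same frame should match the ground-truth relative distance."""
    diff_pred = pred_dist.unsqueeze(0) - pred_dist.unsqueeze(1)  # (N, N) pairwise gaps
    diff_gt = gt_dist.unsqueeze(0) - gt_dist.unsqueeze(1)
    return torch.abs(diff_pred - diff_gt).mean()


if __name__ == "__main__":
    n_vehicles, d = 5, 256
    model = TwoStreamAttentionFusion(d)
    out = model(torch.randn(n_vehicles, d), torch.randn(n_vehicles, d))
    pred_distance = out[:, 1]
    loss = global_relative_constraint_loss(pred_distance, torch.rand(n_vehicles) * 50.0)
    print(out.shape, loss.item())
```

The intuition behind the pairwise term is that if the network over- or under-estimates one vehicle's distance, the gaps to the other vehicles in the same frame become inconsistent as well, and a frame-level relative constraint penalizes exactly that.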
Related papers
- CRASH: Crash Recognition and Anticipation System Harnessing with Context-Aware and Temporal Focus Attentions [13.981748780317329]
Accurately and promptly predicting accidents among surrounding traffic agents from camera footage is crucial for the safety of autonomous vehicles (AVs).
This study introduces a novel accident anticipation framework for AVs, termed CRASH.
It seamlessly integrates five components: object detector, feature extractor, object-aware module, context-aware module, and multi-layer fusion.
Our model surpasses existing top baselines in critical evaluation metrics such as Average Precision (AP) and mean Time-To-Accident (mTTA).
arXiv Detail & Related papers (2024-07-25T04:12:49Z)
- Spatial-Temporal Generative AI for Traffic Flow Estimation with Sparse Data of Connected Vehicles [48.32593099620544]
Traffic flow estimation (TFE) is crucial for intelligent transportation systems.
This paper introduces a novel and cost-effective spatial-temporal generative artificial intelligence (GAI) framework for TFE that leverages sparse data from connected vehicles.
Within this framework, the conditional encoder mines spatial-temporal correlations in the initial TFE results.
arXiv Detail & Related papers (2024-07-10T20:26:04Z)
- Robust and Fast Vehicle Detection using Augmented Confidence Map [10.261351772602543]
We introduce the concept of augmentation, which highlights the regions of interest containing the vehicles.
The output of MR-MSER is supplied to a fast CNN to generate a confidence map.
Unlike existing approaches that rely on complicated models for vehicle detection, we explore the combination of rough-set and fuzzy-based models.
arXiv Detail & Related papers (2023-04-27T18:41:16Z)
- Federated Deep Learning Meets Autonomous Vehicle Perception: Design and Verification [168.67190934250868]
Federated learning-empowered connected autonomous vehicles (FLCAV) have been proposed.
FLCAV preserves privacy while reducing communication and annotation costs.
It is challenging to determine the network resources and road sensor poses for multi-stage training.
arXiv Detail & Related papers (2022-06-03T23:55:45Z)
- A Spatio-Temporal Multilayer Perceptron for Gesture Recognition [70.34489104710366]
We propose a multilayer state-weighted perceptron for gesture recognition in the context of autonomous vehicles.
An evaluation on the TCG and Drive&Act datasets is provided to showcase the promising performance of our approach.
We deploy our model to our autonomous vehicle to show its real-time capability and stable execution.
arXiv Detail & Related papers (2022-04-25T08:42:47Z)
- Real Time Monocular Vehicle Velocity Estimation using Synthetic Data [78.85123603488664]
We look at the problem of estimating the velocity of road vehicles from a camera mounted on a moving car.
We propose a two-step approach where first an off-the-shelf tracker is used to extract vehicle bounding boxes and then a small neural network is used to regress the vehicle velocity.
arXiv Detail & Related papers (2021-09-16T13:10:27Z)
- SoDA: Multi-Object Tracking with Soft Data Association [75.39833486073597]
Multi-object tracking (MOT) is a prerequisite for a safe deployment of self-driving cars.
We propose a novel approach to MOT that uses attention to compute track embeddings that encode dependencies between observed objects.
arXiv Detail & Related papers (2020-08-18T03:40:25Z)
- High-Precision Digital Traffic Recording with Multi-LiDAR Infrastructure Sensor Setups [0.0]
We investigate the impact of fused LiDAR point clouds compared to single LiDAR point clouds.
The evaluation of the extracted trajectories shows that a fused infrastructure approach significantly improves the tracking results and reaches accuracies within a few centimeters.
arXiv Detail & Related papers (2020-06-22T10:57:52Z)
- End-to-end Learning for Inter-Vehicle Distance and Relative Velocity Estimation in ADAS with a Monocular Camera [81.66569124029313]
We propose a camera-based inter-vehicle distance and relative velocity estimation method based on end-to-end training of a deep neural network.
The key novelty of our method is the integration of multiple visual clues provided by any two time-consecutive monocular frames.
We also propose a vehicle-centric sampling mechanism to alleviate the effect of perspective distortion in the motion field.
arXiv Detail & Related papers (2020-06-07T08:18:31Z)
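As a companion to the entry above, the toy sketch below shows one way a two-frame, per-vehicle regression head could be wired up: two time-consecutive crops of the same vehicle are stacked and a small CNN regresses distance and relative velocity. The architecture, the simplified crop-based stand-in for vehicle-centric sampling, and the two-value output are assumptions made purely for illustration, not the cited paper's actual model.

```python
# Purely illustrative sketch of the two-frame, per-vehicle regression idea described
# above; the real network, sampling scheme, and training details are in the cited paper.
import torch
import torch.nn as nn


class TwoFrameVehicleRegressor(nn.Module):
    """Regresses [distance, relative velocity] from two time-consecutive crops of a vehicle."""

    def __init__(self):
        super().__init__()
        # Small CNN over the 6-channel stack of two RGB crops (vehicle-centric sampling
        # is simplified here to cropping the same box in both frames).
        self.backbone = nn.Sequential(
            nn.Conv2d(6, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, 2)

    def forward(self, crop_t0, crop_t1):
        x = torch.cat([crop_t0, crop_t1], dim=1)   # (B, 6, H, W) stacked frame pair
        feat = self.backbone(x).flatten(1)         # (B, 64) pooled features
        return self.head(feat)                     # (B, 2): assumed [distance, relative velocity]


if __name__ == "__main__":
    model = TwoFrameVehicleRegressor()
    out = model(torch.randn(2, 3, 64, 64), torch.randn(2, 3, 64, 64))
    print(out.shape)  # torch.Size([2, 2])
```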
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.