Gesture Recognition with Keypoint and Radar Stream Fusion for Automated
Vehicles
- URL: http://arxiv.org/abs/2302.09998v1
- Date: Mon, 20 Feb 2023 14:18:11 GMT
- Title: Gesture Recognition with Keypoint and Radar Stream Fusion for Automated
Vehicles
- Authors: Adrian Holzbock, Nicolai Kern, Christian Waldschmidt, Klaus Dietmayer,
Vasileios Belagiannis
- Abstract summary: We present a joint camera and radar approach to enable autonomous vehicles to understand and react to human gestures in everyday traffic.
We propose a fusion neural network for both modalities, including an auxiliary loss for each modality.
Motivated by adverse weather conditions, we also demonstrate promising performance when one of the sensors lacks functionality.
- Score: 13.652770928249447
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a joint camera and radar approach to enable autonomous vehicles to
understand and react to human gestures in everyday traffic. Initially, we
process the radar data with a PointNet followed by a spatio-temporal multilayer
perceptron (stMLP). Independently, the human body pose is extracted from the
camera frame and processed with a separate stMLP network. We propose a fusion
neural network for both modalities, including an auxiliary loss for each
modality. In our experiments with a collected dataset, we show the advantages
of gesture recognition with two modalities. Motivated by adverse weather
conditions, we also demonstrate promising performance when one of the sensors
lacks functionality.
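
The abstract describes a two-stream architecture: radar processed by a PointNet followed by an stMLP, body keypoints processed by a separate stMLP, and a fusion network trained with an auxiliary classification loss per modality. Below is a minimal PyTorch sketch of that overall structure; the class and branch names, layer sizes, and the plain MLP stand-ins for the PointNet and stMLP blocks are illustrative assumptions, not the authors' implementation.

```python
# Illustrative two-stream fusion with per-modality auxiliary losses (PyTorch).
# Only the overall structure (two branches, fused head, auxiliary heads,
# weighted sum of losses) follows the abstract; everything else is assumed.
import torch
import torch.nn as nn


class GestureFusionNet(nn.Module):
    """Two-stream gesture classifier with per-modality auxiliary heads."""

    def __init__(self, radar_dim=64, pose_dim=51, hidden=128, num_classes=6):
        super().__init__()
        # Stand-ins for the paper's PointNet + stMLP (radar) and stMLP (pose) branches.
        self.radar_branch = nn.Sequential(nn.Linear(radar_dim, hidden), nn.ReLU())
        self.pose_branch = nn.Sequential(nn.Linear(pose_dim, hidden), nn.ReLU())
        self.fusion_head = nn.Sequential(
            nn.Linear(2 * hidden, hidden), nn.ReLU(), nn.Linear(hidden, num_classes)
        )
        # Auxiliary heads supervise each modality on its own.
        self.radar_aux = nn.Linear(hidden, num_classes)
        self.pose_aux = nn.Linear(hidden, num_classes)

    def forward(self, radar_feat, pose_feat):
        r = self.radar_branch(radar_feat)
        p = self.pose_branch(pose_feat)
        fused = self.fusion_head(torch.cat([r, p], dim=-1))
        return fused, self.radar_aux(r), self.pose_aux(p)


def total_loss(outputs, target, aux_weight=0.3):
    """Main loss on the fused prediction plus weighted per-modality auxiliary losses."""
    ce = nn.CrossEntropyLoss()
    fused, radar_aux, pose_aux = outputs
    return ce(fused, target) + aux_weight * (ce(radar_aux, target) + ce(pose_aux, target))
```

One motivation for per-modality heads is that each branch learns to be discriminative on its own, which fits the reported robustness when one sensor lacks functionality.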
Related papers
- Cross-Domain Spatial Matching for Camera and Radar Sensor Data Fusion in Autonomous Vehicle Perception System [0.0]
We propose a novel approach to address the problem of camera and radar sensor fusion for 3D object detection in autonomous vehicle perception systems.
Our approach builds on recent advances in deep learning and leverages the strengths of both sensors to improve object detection performance.
Our results show that the proposed approach achieves superior performance over single-sensor solutions and could directly compete with other top-level fusion methods.
arXiv Detail & Related papers (2024-04-25T12:04:31Z)
- ASY-VRNet: Waterway Panoptic Driving Perception Model based on Asymmetric Fair Fusion of Vision and 4D mmWave Radar [7.2865477881451755]
Asymmetric Fair Fusion (AFF) modules are designed to efficiently interact with independent features from both the visual and radar modalities.
The ASY-VRNet model processes image and radar features based on irregular super-pixel point sets.
Compared to other lightweight models, ASY-VRNet achieves state-of-the-art performance in object detection, semantic segmentation, and drivable-area segmentation.
arXiv Detail & Related papers (2023-08-20T14:53:27Z)
- Semantic Segmentation of Radar Detections using Convolutions on Point Clouds [59.45414406974091]
We introduce a deep-learning based method for semantic segmentation of radar detections using convolutions on point clouds.
We adapt this algorithm to radar-specific properties through distance-dependent clustering and pre-processing of input point clouds.
Our network outperforms state-of-the-art approaches that are based on PointNet++ on the task of semantic segmentation of radar point clouds.
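
The distance-dependent clustering mentioned above is not specified in detail in this summary. One plausible reading, sketched below with NumPy and scikit-learn, is a DBSCAN pass whose neighborhood radius grows with range, since radar detections become sparser far from the sensor; the range binning and parameter values are assumptions for illustration only.

```python
# Hypothetical distance-dependent clustering of radar detections.
# Only the idea "cluster radius should grow with range" is taken from the summary;
# the range binning and parameter values below are illustrative assumptions.
import numpy as np
from sklearn.cluster import DBSCAN


def cluster_radar_points(points_xyz, base_eps=0.5, eps_per_meter=0.02, bin_size=10.0):
    """points_xyz: (N, 3) detections in vehicle coordinates. Returns labels, -1 = noise."""
    ranges = np.linalg.norm(points_xyz[:, :2], axis=1)
    labels = -np.ones(len(points_xyz), dtype=int)
    next_label = 0
    for lo in np.arange(0.0, ranges.max() + bin_size, bin_size):
        mask = (ranges >= lo) & (ranges < lo + bin_size)
        if mask.sum() < 2:
            continue
        # Looser radius for far-away, sparser detections.
        eps = base_eps + eps_per_meter * (lo + bin_size / 2.0)
        bin_labels = DBSCAN(eps=eps, min_samples=2).fit_predict(points_xyz[mask])
        bin_labels[bin_labels >= 0] += next_label  # keep labels unique across bins
        labels[mask] = bin_labels
        next_label = max(next_label, labels.max() + 1)
    return labels
```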
arXiv Detail & Related papers (2023-05-22T07:09:35Z)
- A Spatio-Temporal Multilayer Perceptron for Gesture Recognition [70.34489104710366]
We propose a multilayer state-weighted perceptron for gesture recognition in the context of autonomous vehicles.
An evaluation on the TCG and Drive&Act datasets is provided to showcase the promising performance of our approach.
We deploy our model on our autonomous vehicle to show its real-time capability and stable execution.
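
The summary describes the model only as a multilayer perceptron applied to gesture sequences. A rough MLP-Mixer-style sketch that alternates mixing over the time axis and over the flattened keypoint features is shown below; the block layout, normalization placement, and sizes are assumptions rather than the paper's actual architecture.

```python
# Rough spatio-temporal MLP block over pose keypoints, in the spirit of an MLP-Mixer.
# Only "alternate mixing over time and over the keypoint features" reflects the idea
# described in the listing; the concrete block design is assumed.
import torch
import torch.nn as nn


class SpatioTemporalMLPBlock(nn.Module):
    def __init__(self, num_frames=16, feat_dim=51):
        super().__init__()
        self.temporal_mix = nn.Sequential(
            nn.LayerNorm(num_frames), nn.Linear(num_frames, num_frames), nn.GELU()
        )
        self.spatial_mix = nn.Sequential(
            nn.LayerNorm(feat_dim), nn.Linear(feat_dim, feat_dim), nn.GELU()
        )

    def forward(self, x):  # x: (batch, frames, feat_dim)
        x = x + self.temporal_mix(x.transpose(1, 2)).transpose(1, 2)  # mix along time
        x = x + self.spatial_mix(x)                                   # mix along keypoints
        return x


poses = torch.randn(8, 16, 51)  # 8 clips, 16 frames, 17 joints x 3 coordinates
logits = nn.Linear(51, 6)(SpatioTemporalMLPBlock()(poses).mean(dim=1))  # pool time, classify
```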
arXiv Detail & Related papers (2022-04-25T08:42:47Z)
- A Quality Index Metric and Method for Online Self-Assessment of Autonomous Vehicles Sensory Perception [164.93739293097605]
We propose a novel evaluation metric, named the detection quality index (DQI), which assesses the performance of camera-based object detection algorithms.
We have developed a superpixel-based attention network (SPA-NET) that utilizes raw image pixels and superpixels as input to predict the proposed DQI evaluation metric.
arXiv Detail & Related papers (2022-03-04T22:16:50Z)
- CFTrack: Center-based Radar and Camera Fusion for 3D Multi-Object Tracking [9.62721286522053]
We propose an end-to-end network for joint object detection and tracking based on radar and camera sensor fusion.
Our proposed method uses a center-based radar-camera fusion algorithm for object detection and utilizes a greedy algorithm for object association.
We evaluate our method on the challenging nuScenes dataset, where it achieves 20.0 AMOTA and outperforms all vision-based 3D tracking methods in the benchmark.
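
The summary names a greedy algorithm for associating fused detections with existing tracks but gives no details. A small NumPy sketch of greedy center-distance association is shown below; representing objects by 2D centers and gating matches with a fixed distance threshold are assumptions, not necessarily CFTrack's exact rule.

```python
# Illustrative greedy center-distance association between tracks and detections (NumPy).
# The 2D-center representation and the distance gate are assumptions.
import numpy as np


def greedy_associate(track_centers, det_centers, max_dist=2.0):
    """Match each track to at most one detection, closest pairs first.

    track_centers: (T, 2), det_centers: (D, 2). Returns a list of (track_idx, det_idx).
    """
    if len(track_centers) == 0 or len(det_centers) == 0:
        return []
    dist = np.linalg.norm(track_centers[:, None, :] - det_centers[None, :, :], axis=-1)
    matches, used_t, used_d = [], set(), set()
    # Visit candidate pairs in order of increasing distance (greedy, no global optimum).
    for flat in np.argsort(dist, axis=None):
        t, d = np.unravel_index(flat, dist.shape)
        if t in used_t or d in used_d or dist[t, d] > max_dist:
            continue
        matches.append((int(t), int(d)))
        used_t.add(t)
        used_d.add(d)
    return matches
```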
arXiv Detail & Related papers (2021-07-11T23:56:53Z)
- TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild [77.59069361196404]
TRiPOD is a novel method for predicting body dynamics based on graph attentional networks.
To incorporate a real-world challenge, we learn an indicator representing whether an estimated body joint is visible/invisible at each frame.
Our evaluation shows that TRiPOD outperforms all prior work and state-of-the-art specifically designed for each of the trajectory and pose forecasting tasks.
arXiv Detail & Related papers (2021-04-08T20:01:00Z)
- YOdar: Uncertainty-based Sensor Fusion for Vehicle Detection with Camera and Radar Sensors [4.396860522241306]
We present an uncertainty-based method for sensor fusion with camera and radar data.
In our experiments we combine the YOLOv3 object detection network with a customized 1D radar segmentation network.
Our experiments show that this uncertainty-aware fusion approach yields a significant performance gain compared to single-sensor baselines.
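
The listing only states that camera and radar outputs are fused in an uncertainty-aware manner. One generic way to realize this, inverse-variance weighting of per-candidate detection scores, is sketched below for illustration; it is not necessarily YOdar's actual fusion rule.

```python
# Minimal sketch of uncertainty-aware score fusion (NumPy). The summary only says
# camera and radar outputs are fused with their uncertainties; the inverse-variance
# weighting shown here is one generic way to do that, assumed for illustration.
import numpy as np


def fuse_scores(cam_score, cam_var, radar_score, radar_var, eps=1e-6):
    """Weight each sensor's detection score by its inverse variance."""
    w_cam = 1.0 / (cam_var + eps)
    w_rad = 1.0 / (radar_var + eps)
    return (w_cam * cam_score + w_rad * radar_score) / (w_cam + w_rad)


# Example: a confident radar hit pulls up a low, uncertain camera score.
print(fuse_scores(cam_score=0.35, cam_var=0.20, radar_score=0.80, radar_var=0.02))
```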
arXiv Detail & Related papers (2020-10-07T10:40:02Z)
- Towards Autonomous Driving: a Multi-Modal 360$^{\circ}$ Perception Proposal [87.11988786121447]
This paper presents a framework for 3D object detection and tracking for autonomous vehicles.
The solution, based on a novel sensor fusion configuration, provides accurate and reliable road environment detection.
A variety of tests of the system, deployed in an autonomous vehicle, have confirmed the suitability of the proposed perception stack.
arXiv Detail & Related papers (2020-08-21T20:36:21Z)
- End-to-end Learning for Inter-Vehicle Distance and Relative Velocity Estimation in ADAS with a Monocular Camera [81.66569124029313]
We propose a camera-based inter-vehicle distance and relative velocity estimation method based on end-to-end training of a deep neural network.
The key novelty of our method is the integration of multiple visual cues provided by any two time-consecutive monocular frames.
We also propose a vehicle-centric sampling mechanism to alleviate the effect of perspective distortion in the motion field.
arXiv Detail & Related papers (2020-06-07T08:18:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this list (or of any information it contains) and is not responsible for any consequences of its use.