Cross-modal Learning of Graph Representations using Radar Point Cloud
for Long-Range Gesture Recognition
- URL: http://arxiv.org/abs/2203.17066v1
- Date: Thu, 31 Mar 2022 14:34:36 GMT
- Title: Cross-modal Learning of Graph Representations using Radar Point Cloud
for Long-Range Gesture Recognition
- Authors: Souvik Hazra, Hao Feng, Gamze Naz Kiprit, Michael Stephan, Lorenzo
Servadei, Robert Wille, Robert Weigel, Avik Santra
- Abstract summary: We propose a novel architecture for a long-range (1m - 2m) gesture recognition solution.
We use a point cloud-based cross-learning approach from camera point cloud to 60-GHz FMCW radar point cloud.
In the experimental results section, we demonstrate our model's overall accuracy of 98.4% for five gestures and its generalization capability.
- Score: 6.9545038359818445
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gesture recognition is one of the most intuitive ways of interaction and has
gathered particular attention for human-computer interaction. Radar sensors
possess multiple intrinsic properties, such as the ability to operate in low
illumination and harsh weather conditions while remaining low-cost and compact,
making them highly preferable for a gesture recognition solution. However, most
work in the literature focuses on solutions with a limited range of under one
meter. We propose a novel architecture for a long-range (1m - 2m) gesture
recognition solution that leverages a point cloud-based cross-learning approach
from camera point cloud to 60-GHz FMCW radar point cloud, which allows learning
better representations while suppressing noise. We use a variant of Dynamic
Graph CNN (DGCNN) for the cross-learning, enabling us to model relationships
between the points at both local and global levels; a Bi-LSTM network is
employed to model the temporal dynamics. In the experimental results section, we
demonstrate our model's overall accuracy of 98.4% for five gestures and its
generalization capability.
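The per-frame graph feature extraction described in the abstract can be sketched as follows. This is a minimal NumPy illustration of a DGCNN-style EdgeConv layer applied to each radar frame, whose pooled features would then feed a Bi-LSTM for temporal modeling (omitted here); the neighborhood size k, feature dimension, random weights, and pooling choices are illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

def edge_conv(points, k=4, out_dim=16, rng=None):
    """One EdgeConv layer (DGCNN-style), sketched in NumPy.

    points: (N, d) point cloud. For each point, take its k nearest
    neighbours, form edge features [x_i, x_j - x_i], apply a shared
    linear map + ReLU, and max-pool over the neighbourhood.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    n, d = points.shape
    # Pairwise distances -> k-NN graph (recomputed per layer = "dynamic" graph)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)                     # exclude self
    knn = np.argsort(dist, axis=1)[:, :k]              # (N, k) neighbour indices
    centre = np.repeat(points[:, None, :], k, axis=1)  # (N, k, d)
    edge = np.concatenate([centre, points[knn] - centre], axis=-1)  # (N, k, 2d)
    w = rng.standard_normal((2 * d, out_dim))          # shared "MLP" weights
    h = np.maximum(edge @ w, 0.0)                      # linear map + ReLU
    return h.max(axis=1)                               # max over neighbours -> (N, out_dim)

# A gesture is a sequence of T radar frames; pooling each frame's point
# features yields a (T, out_dim) sequence for a Bi-LSTM (not shown).
frames = [np.random.default_rng(t).standard_normal((32, 3)) for t in range(8)]
feats = np.stack([edge_conv(f).max(axis=0) for f in frames])
print(feats.shape)  # (8, 16)
```

Concatenating the centre point with the neighbour offsets lets the layer capture both global position and local geometry, which is the intuition behind the local/global modeling the abstract mentions.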
Related papers
- PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection [59.355022416218624]
The integration of point and voxel representations is becoming more common in LiDAR-based 3D object detection.
We propose a novel two-stage 3D object detector, called Point-Voxel Attention Fusion Network (PVAFN).
PVAFN uses a multi-pooling strategy to integrate both multi-scale and region-specific information effectively.
arXiv Detail & Related papers (2024-08-26T19:43:01Z) - LCPR: A Multi-Scale Attention-Based LiDAR-Camera Fusion Network for
Place Recognition [11.206532393178385]
We present a novel neural network named LCPR for robust multimodal place recognition.
Our method can effectively utilize multi-view camera and LiDAR data to improve the place recognition performance.
arXiv Detail & Related papers (2023-11-06T15:39:48Z) - Semantic Segmentation of Radar Detections using Convolutions on Point
Clouds [59.45414406974091]
We introduce a deep-learning based method to convolve radar detections into point clouds.
We adapt this algorithm to radar-specific properties through distance-dependent clustering and pre-processing of input point clouds.
Our network outperforms state-of-the-art approaches that are based on PointNet++ on the task of semantic segmentation of radar point clouds.
arXiv Detail & Related papers (2023-05-22T07:09:35Z) - Agile gesture recognition for capacitive sensing devices: adapting
on-the-job [55.40855017016652]
We demonstrate a hand gesture recognition system that uses signals from capacitive sensors embedded into the etee hand controller.
The controller generates real-time signals from each of the wearer's five fingers.
We use a machine learning technique to analyse the time series signals and identify three features that can represent the five fingers within 500 ms.
arXiv Detail & Related papers (2023-05-12T17:24:02Z) - Improved Static Hand Gesture Classification on Deep Convolutional Neural
Networks using Novel Sterile Training Technique [2.534406146337704]
Non-contact hand pose and static gesture recognition have received considerable attention in many applications.
This article presents an efficient data collection approach and a novel technique for deep CNN training by introducing "sterile" images.
Applying the proposed data collection and training methods yields an increase in the classification rate of static hand gestures from 85% to 93%.
arXiv Detail & Related papers (2023-05-03T11:10:50Z) - Rethinking Range View Representation for LiDAR Segmentation [66.73116059734788]
"Many-to-one" mapping, semantic incoherence, and shape deformation are possible impediments against effective learning from range view projections.
We present RangeFormer, a full-cycle framework comprising novel designs across network architecture, data augmentation, and post-processing.
We show that, for the first time, a range view method is able to surpass the point, voxel, and multi-view fusion counterparts in the competing LiDAR semantic and panoptic segmentation benchmarks.
arXiv Detail & Related papers (2023-03-09T16:13:27Z) - Gesture Recognition with Keypoint and Radar Stream Fusion for Automated
Vehicles [13.652770928249447]
We present a joint camera and radar approach to enable autonomous vehicles to understand and react to human gestures in everyday traffic.
We propose a fusion neural network for both modalities, including an auxiliary loss for each modality.
Motivated by adverse weather conditions, we also demonstrate promising performance when one of the sensors lacks functionality.
arXiv Detail & Related papers (2023-02-20T14:18:11Z) - HDNet: Hierarchical Dynamic Network for Gait Recognition using
Millimeter-Wave Radar [13.19744551082316]
We propose a Hierarchical Dynamic Network (HDNet) for gait recognition using mmWave radar.
To prove the superiority of our methods, we perform extensive experiments on two public mmWave radar-based gait recognition datasets.
arXiv Detail & Related papers (2022-11-01T07:34:22Z) - Differentiable Frequency-based Disentanglement for Aerial Video Action
Recognition [56.91538445510214]
We present a learning algorithm for human activity recognition in videos.
Our approach is designed for UAV videos, which are mainly acquired from obliquely placed dynamic cameras.
We conduct extensive experiments on the UAV Human dataset and the NEC Drone dataset.
arXiv Detail & Related papers (2022-09-15T22:16:52Z) - Towards Domain-Independent and Real-Time Gesture Recognition Using
mmWave Signal [11.76969975145963]
DI-Gesture is a domain-independent and real-time mmWave gesture recognition system.
In real-time scenarios, the accuracy of DI-Gesture reaches over 97% with an average inference time of 2.87 ms.
arXiv Detail & Related papers (2021-11-11T13:28:28Z) - Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scale pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.