Attention-based Dual-stream Vision Transformer for Radar Gait Recognition
- URL: http://arxiv.org/abs/2111.12290v1
- Date: Wed, 24 Nov 2021 06:16:53 GMT
- Title: Attention-based Dual-stream Vision Transformer for Radar Gait Recognition
- Authors: Shiliang Chen, Wentao He, Jianfeng Ren, Xudong Jiang
- Abstract summary: Radar gait recognition is robust to lighting variations and infringes less on privacy.
In this work, a dual-stream neural network with attention-based fusion is proposed to fully aggregate the discriminant information.
The proposed method is validated on a large benchmark dataset for radar gait recognition, which shows that it significantly outperforms state-of-the-art solutions.
- Score: 24.90100456414406
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Radar gait recognition is robust to lighting variations and infringes less on
privacy. Previous studies often utilize either spectrograms or cadence velocity
diagrams. While the former show the time-frequency patterns, the latter
encode the repetitive frequency patterns. In this work, a dual-stream neural
network with attention-based fusion is proposed to fully aggregate the
discriminant information from these two representations. Both streams are
designed based on the Vision Transformer, which captures the gait
characteristics embedded in these representations well. The proposed method is
validated on a large benchmark dataset for radar gait recognition, which shows
that it significantly outperforms state-of-the-art solutions.
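As a rough illustration of the two ideas in the abstract, the sketch below derives a cadence velocity diagram from a spectrogram and fuses two stream features with softmax attention weights. It is a minimal sketch under stated assumptions, not the authors' implementation: the cadence velocity diagram is computed in the commonly described way (magnitude of the Fourier transform along the time axis of each Doppler bin), the fusion is a generic softmax-weighted sum, and the names `cadence_velocity_diagram`, `attention_fuse`, and `score_weights` are hypothetical. The Vision Transformer streams themselves are not modelled.

```python
import cmath
import math


def cadence_velocity_diagram(spectrogram):
    """Magnitude of the DFT along time for each Doppler bin.

    spectrogram: list of rows, one per Doppler bin, each row holding
    magnitudes over time frames. Returns one row of cadence-frequency
    magnitudes per Doppler bin.
    """
    cvd = []
    for row in spectrogram:
        n = len(row)
        cvd.append([
            abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(row)))
            for k in range(n)
        ])
    return cvd


def attention_fuse(feat_a, feat_b, score_weights):
    """Softmax-weighted sum of two equal-length stream features.

    score_weights is a hypothetical learned vector (an assumption of this
    sketch): each stream is scored by its dot product with that vector.
    """
    scores = [sum(w * x for w, x in zip(score_weights, feat))
              for feat in (feat_a, feat_b)]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    fused = [alphas[0] * a + alphas[1] * b for a, b in zip(feat_a, feat_b)]
    return fused, alphas


# One Doppler bin whose energy repeats every two frames.
cvd = cadence_velocity_diagram([[1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0]])

# Equal scores give each stream the same attention weight.
fused, alphas = attention_fuse([1.0, 0.0], [0.0, 1.0], [0.0, 0.0])
```

In this toy input the repetition every two frames concentrates the energy in the zero-frequency bin and in cadence bin 4 of 8, which is the kind of repetitive pattern the cadence velocity diagram is meant to expose.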
Related papers
- Wavelet-based Bi-dimensional Aggregation Network for SAR Image Change Detection [53.842568573251214]
Experimental results on three SAR datasets demonstrate that our WBANet significantly outperforms contemporary state-of-the-art methods.
Our WBANet achieves a percentage of correct classification (PCC) of 98.33%, 96.65%, and 96.62% on the respective datasets.
arXiv Detail & Related papers (2024-07-18T04:36:10Z)
- G3R: Generating Rich and Fine-grained mmWave Radar Data from 2D Videos for Generalized Gesture Recognition [19.95047010486547]
We develop a software pipeline that exploits the wealth of available 2D videos to generate realistic radar data.
It addresses the challenge of simulating diversified and fine-grained reflection properties of user gestures.
We implement and evaluate G3R using 2D videos from public data sources and self-collected real-world radar data.
arXiv Detail & Related papers (2024-04-23T11:22:59Z)
- Neuromorphic Synergy for Video Binarization [54.195375576583864]
Bimodal objects serve as a visual form to embed information that can be easily recognized by vision systems.
Neuromorphic cameras offer new capabilities for alleviating motion blur, but it is non-trivial to first de-blur and then binarize the images in a real-time manner.
We propose an event-based binary reconstruction method that leverages the prior knowledge of the bimodal target's properties to perform inference independently in both event space and image space.
We also develop an efficient integration method to propagate this binary image to high frame rate binary video.
arXiv Detail & Related papers (2024-02-20T01:43:51Z)
- Distillation-guided Representation Learning for Unconstrained Gait Recognition [50.0533243584942]
We propose a framework, termed GAit DEtection and Recognition (GADER), for human authentication in challenging outdoor scenarios.
GADER builds discriminative features through a novel gait recognition method, where only frames containing gait information are used.
We evaluate our method on multiple state-of-the-art (SoTA) gait baselines and demonstrate consistent improvements on indoor and outdoor datasets.
arXiv Detail & Related papers (2023-07-27T01:53:57Z)
- Semantic Segmentation of Radar Detections using Convolutions on Point Clouds [59.45414406974091]
We introduce a deep-learning based method to convolve radar detections into point clouds.
We adapt this algorithm to radar-specific properties through distance-dependent clustering and pre-processing of input point clouds.
Our network outperforms state-of-the-art approaches that are based on PointNet++ on the task of semantic segmentation of radar point clouds.
arXiv Detail & Related papers (2023-05-22T07:09:35Z)
- T-FFTRadNet: Object Detection with Swin Vision Transformers from Raw ADC Radar Signals [0.0]
Object detection utilizing Frequency Modulated Continuous Wave radar is becoming increasingly popular in the field of autonomous systems.
Radar does not possess the same drawbacks seen by other emission-based sensors such as LiDAR, primarily the degradation or loss of return signals due to weather conditions such as rain or snow.
We introduce hierarchical Swin Vision transformers to the field of radar object detection and show their capability to operate on inputs varying in pre-processing, along with different radar configurations.
arXiv Detail & Related papers (2023-03-29T18:04:19Z)
- HDNet: Hierarchical Dynamic Network for Gait Recognition using Millimeter-Wave Radar [13.19744551082316]
We propose a Hierarchical Dynamic Network (HDNet) for gait recognition using mmWave radar.
To prove the superiority of our methods, we perform extensive experiments on two public mmWave radar-based gait recognition datasets.
arXiv Detail & Related papers (2022-11-01T07:34:22Z)
- Spatial-Temporal Frequency Forgery Clue for Video Forgery Detection in VIS and NIR Scenario [87.72258480670627]
Existing frequency-domain face forgery detection methods find that GAN-forged images show obvious grid-like visual artifacts in the frequency spectrum compared with real images.
This paper proposes a Cosine Transform-based Forgery Clue Augmentation Network (FCAN-DCT) to achieve a more comprehensive spatial-temporal feature representation.
arXiv Detail & Related papers (2022-07-05T09:27:53Z)
- Waveform Selection for Radar Tracking in Target Channels With Memory via Universal Learning [14.796960833031724]
Adapting the radar's waveform using partial information about the state of the scene has been shown to provide performance benefits in many practical scenarios.
This work examines a radar system which builds a compressed model of the radar-environment interface in the form of a context-tree.
The proposed approach is tested in a simulation study, and is shown to provide tracking performance improvements over two state-of-the-art waveform selection schemes.
arXiv Detail & Related papers (2021-08-02T21:27:56Z)
- Fake Visual Content Detection Using Two-Stream Convolutional Neural Networks [14.781702606707642]
We propose a two-stream convolutional neural network architecture called TwoStreamNet to complement frequency and spatial domain features.
The proposed detector has demonstrated significant performance improvement compared to the current state-of-the-art fake content detectors.
arXiv Detail & Related papers (2021-01-03T18:05:07Z)
- RadarNet: Exploiting Radar for Robust Perception of Dynamic Objects [73.80316195652493]
We tackle the problem of exploiting Radar for perception in the context of self-driving cars.
We propose a new solution that exploits both LiDAR and Radar sensors for perception.
Our approach, dubbed RadarNet, features a voxel-based early fusion and an attention-based late fusion.
arXiv Detail & Related papers (2020-07-28T17:15:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.