GaitSTR: Gait Recognition with Sequential Two-stream Refinement
- URL: http://arxiv.org/abs/2404.02345v1
- Date: Tue, 2 Apr 2024 22:39:35 GMT
- Title: GaitSTR: Gait Recognition with Sequential Two-stream Refinement
- Authors: Wanrong Zheng, Haidong Zhu, Zhaoheng Zheng, Ram Nevatia
- Abstract summary: Gait recognition aims to identify a person based on their walking sequences, serving as a useful biometric modality.
In representing a person's walking sequence, silhouettes and skeletons are the two primary modalities used.
We explore the use of a two-stream representation of skeletons for gait recognition, alongside silhouettes.
- Score: 12.256802601846749
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gait recognition aims to identify a person based on their walking sequences, serving as a useful biometric modality as it can be observed from long distances without requiring cooperation from the subject. In representing a person's walking sequence, silhouettes and skeletons are the two primary modalities used. Silhouette sequences lack detailed part information when overlapping occurs between different body segments and are affected by carried objects and clothing. Skeletons, comprising joints and bones connecting the joints, provide more accurate part information for different segments; however, they are sensitive to occlusions and low-quality images, causing inconsistencies in frame-wise results within a sequence. In this paper, we explore the use of a two-stream representation of skeletons for gait recognition, alongside silhouettes. By fusing the combined data of silhouettes and skeletons, we refine the two-stream skeletons, joints, and bones through self-correction in graph convolution, along with cross-modal correction with temporal consistency from silhouettes. We demonstrate that with refined skeletons, the performance of the gait recognition model can achieve further improvement on public gait recognition datasets compared with state-of-the-art methods without extra annotations.
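The two-stream skeleton input described in the abstract (joints, plus bones connecting the joints) can be illustrated with a minimal sketch. It is not the authors' implementation: the toy five-joint skeleton, the `TOY_EDGES` list, and the function names are hypothetical, and defining a bone as the child joint minus the parent joint is an assumed convention, not something the abstract specifies.

```python
from typing import List, Sequence, Tuple

Joint = Tuple[float, float]   # 2D joint coordinate (x, y)
Bone = Tuple[float, float]    # 2D bone vector (dx, dy)

# Hypothetical toy skeleton: (child, parent) joint index pairs.
# 0 = hip, 1 = neck, 2 = left hand, 3 = right hand, 4 = head.
TOY_EDGES: List[Tuple[int, int]] = [(1, 0), (2, 1), (3, 1), (4, 1)]


def joints_to_bones(joints: Sequence[Joint],
                    edges: Sequence[Tuple[int, int]]) -> List[Bone]:
    """Return one bone vector (child minus parent) per edge (assumed convention)."""
    return [
        (joints[child][0] - joints[parent][0],
         joints[child][1] - joints[parent][1])
        for child, parent in edges
    ]


def two_stream(sequence: Sequence[Sequence[Joint]],
               edges: Sequence[Tuple[int, int]]):
    """Build per-frame joint and bone streams for a walking sequence."""
    joint_stream = [list(frame) for frame in sequence]
    bone_stream = [joints_to_bones(frame, edges) for frame in sequence]
    return joint_stream, bone_stream


# One made-up frame; a real sequence would come from a pose estimator.
frame = [(0.0, 0.0), (0.0, 1.0), (-1.0, 1.5), (1.0, 1.5), (0.0, 2.0)]
joint_stream, bone_stream = two_stream([frame], TOY_EDGES)
print(bone_stream[0])  # [(0.0, 1.0), (-1.0, 0.5), (1.0, 0.5), (0.0, 1.0)]
```

The joint stream carries absolute positions and the bone stream carries relative offsets along the skeleton graph. The refinement and silhouette fusion steps the abstract describes are not sketched here.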
Related papers
- SkateFormer: Skeletal-Temporal Transformer for Human Action Recognition [25.341177384559174]
We propose a novel approach called Skeletal-Temporal Transformer (SkateFormer).
SkateFormer partitions joints and frames based on different types of skeletal-temporal relations.
It can selectively focus on key joints and frames crucial for action recognition in an action-adaptive manner.
arXiv Detail & Related papers (2024-03-14T15:55:53Z)
- SkeletonGait: Gait Recognition Using Skeleton Maps [7.335859292188816]
We introduce a novel skeletal gait representation named skeleton map, together with SkeletonGait, a skeleton-based method to exploit structural information from human skeleton maps.
Skeleton map represents the coordinates of human joints as a heatmap with Gaussian approximation, exhibiting a silhouette-like image devoid of exact body structure.
SkeletonGait++ outperforms existing state-of-the-art methods by a significant margin in various scenarios.
arXiv Detail & Related papers (2023-11-22T15:09:59Z)
- SkeleTR: Towards Skeleton-based Action Recognition in the Wild [86.03082891242698]
SkeleTR is a new framework for skeleton-based action recognition.
It first models the intra-person skeleton dynamics for each skeleton sequence with graph convolutions.
It then uses stacked Transformer encoders to capture person interactions that are important for action recognition in general scenarios.
arXiv Detail & Related papers (2023-09-20T16:22:33Z)
- One-Shot Action Recognition via Multi-Scale Spatial-Temporal Skeleton Matching [77.6989219290789]
One-shot skeleton action recognition aims to learn a skeleton action recognition model with a single training sample.
This paper presents a novel one-shot skeleton action recognition technique that handles skeleton action recognition via multi-scale spatial-temporal feature matching.
arXiv Detail & Related papers (2023-07-14T11:52:10Z)
- GaitRef: Gait Recognition with Refined Sequential Skeletons [20.778107966302116]
Two common modalities used for representing the walking sequence of a person are silhouettes and joint skeletons.
In this paper, we combine the silhouettes and skeletons and refine the framewise joint predictions for gait recognition.
With temporal information from the silhouette sequences, we show that the refined skeletons can improve gait recognition performance without extra annotations.
arXiv Detail & Related papers (2023-04-16T23:37:24Z)
- Skeleton Prototype Contrastive Learning with Multi-Level Graph Relation Modeling for Unsupervised Person Re-Identification [63.903237777588316]
Person re-identification (re-ID) via 3D skeletons is an important emerging topic with many merits.
Existing solutions rarely explore valuable body-component relations in skeletal structure or motion.
This paper proposes a generic unsupervised Prototype Contrastive learning paradigm with Multi-level Graph Relation learning.
arXiv Detail & Related papers (2022-08-25T00:59:32Z)
- Contrastive Learning from Spatio-Temporal Mixed Skeleton Sequences for Self-Supervised Skeleton-Based Action Recognition [21.546894064451898]
We show that directly extending contrastive pairs based on normal augmentations brings limited returns in terms of performance.
We propose SkeleMixCLR: a contrastive learning framework with a spatio-temporal skeleton mixing augmentation (SkeleMix) to complement current contrastive learning approaches.
arXiv Detail & Related papers (2022-07-07T03:18:09Z)
- Simultaneous Bone and Shadow Segmentation Network using Task Correspondence Consistency [60.378180265885945]
We propose a single end-to-end network with a shared transformer-based encoder and task-independent decoders for simultaneous bone and shadow segmentation.
We also introduce a correspondence consistency loss, which ensures that the network uses the inter-dependency between the bone surface and its corresponding shadow to refine the segmentation.
arXiv Detail & Related papers (2022-06-16T22:37:05Z)
- SimMC: Simple Masked Contrastive Learning of Skeleton Representations for Unsupervised Person Re-Identification [63.903237777588316]
We present a generic Simple Masked Contrastive learning (SimMC) framework to learn effective representations from unlabeled 3D skeletons for person re-ID.
Specifically, to fully exploit skeleton features within each skeleton sequence, we first devise a masked prototype contrastive learning (MPC) scheme.
Then, we propose the masked intra-sequence contrastive learning (MIC) to capture intra-sequence pattern consistency between subsequences.
arXiv Detail & Related papers (2022-04-21T00:19:38Z)
- Learning Rich Features for Gait Recognition by Integrating Skeletons and Silhouettes [20.766540020533803]
This paper proposes a simple yet effective bimodal fusion network, which mines the complementary clues of skeletons and silhouettes to learn rich features for gait identification.
Under the most challenging condition on CASIA-B, walking in different clothes, our method achieves a rank-1 accuracy of 92.1%.
arXiv Detail & Related papers (2021-10-26T04:42:24Z)
- Skeleton-Aware Networks for Deep Motion Retargeting [83.65593033474384]
We introduce a novel deep learning framework for data-driven motion retargeting between skeletons.
Our approach learns how to retarget without requiring any explicit pairing between the motions in the training set.
arXiv Detail & Related papers (2020-05-12T12:51:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.