Spatio-temporal Gait Feature with Adaptive Distance Alignment
- URL: http://arxiv.org/abs/2203.03376v1
- Date: Mon, 7 Mar 2022 13:34:00 GMT
- Title: Spatio-temporal Gait Feature with Adaptive Distance Alignment
- Authors: Xuelong Li, Yifan Chen, Jingran Su, Yang Zhao
- Abstract summary: We try to increase the difference between the gait features of different subjects from two aspects: the optimization of the network structure and the refinement of the extracted gait features.
Our proposed method consists of Spatio-temporal Feature Extraction (SFE) and Adaptive Distance Alignment (ADA).
ADA uses a large amount of unlabeled real-world gait data as a benchmark to refine the extracted spatio-temporal features so that they have low inter-class similarity and high intra-class similarity.
- Score: 90.5842782685509
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gait recognition is an important recognition technology because gait is
difficult to camouflage and recognition does not require the subject's
cooperation. However, gait recognition still faces a serious challenge:
people with similar walking postures are often recognized incorrectly. In
this paper, we try to increase the difference between the gait features of
different subjects from two aspects, the optimization of the network
structure and the refinement of the extracted gait features, so as to
improve recognition accuracy for subjects with similar walking postures.
Our proposed method consists of Spatio-temporal Feature Extraction (SFE)
and Adaptive Distance Alignment (ADA). SFE uses Temporal Feature Fusion
(TFF) and Fine-grained Feature Extraction (FFE) to effectively extract
spatio-temporal features from raw silhouettes, while ADA uses a large
amount of unlabeled real-world gait data as a benchmark to refine the
extracted spatio-temporal features so that they have low inter-class
similarity and high intra-class similarity. Extensive experiments on
mini-OUMVLP and CASIA-B show that our method outperforms several
state-of-the-art methods.
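The "low inter-class similarity, high intra-class similarity" objective that ADA targets can be made concrete with a small sketch. The feature values, labels, and helper names below are hypothetical illustrations, not the paper's implementation; the sketch only shows how the two similarity statistics would be measured on extracted embeddings.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def intra_inter_similarity(features, labels):
    """Mean intra-class and inter-class cosine similarity over all pairs."""
    intra, inter = [], []
    n = len(features)
    for i in range(n):
        for j in range(i + 1, n):
            s = cosine_sim(features[i], features[j])
            (intra if labels[i] == labels[j] else inter).append(s)
    return np.mean(intra), np.mean(inter)

# Toy embeddings: two subjects, two sequences each (hypothetical data).
feats = np.array([[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]])
labels = [0, 0, 1, 1]
intra, inter = intra_inter_similarity(feats, labels)
assert intra > inter  # well-refined features: same subject similar, different subjects not
```

A refinement step like ADA would push `intra` up and `inter` down over the unlabeled benchmark data.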
Related papers
- GSSF: Generalized Structural Sparse Function for Deep Cross-modal Metric Learning [51.677086019209554]
We propose a Generalized Structural Sparse function to capture powerful relationships across modalities for pair-wise similarity learning.
The distance metric delicately encapsulates two formats of diagonal and block-diagonal terms.
Experiments on cross-modal and two extra uni-modal retrieval tasks have validated its superiority and flexibility.
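A distance metric built from diagonal and block-diagonal terms, as the GSSF blurb describes, can be sketched generically. This is a simplified quadratic-form illustration under assumed weights, not the paper's actual formulation; `structured_distance`, `diag_w`, and `blocks` are all hypothetical names.

```python
import numpy as np

def structured_distance(x, y, diag_w, block_w, block_size):
    """d(x, y) = (x - y)^T (D + B) (x - y), with D diagonal and B
    block-diagonal: a toy combination of the two structural formats."""
    d = x - y
    quad = np.sum(diag_w * d * d)           # diagonal term
    for s in range(0, len(d), block_size):  # block-diagonal terms
        seg = d[s:s + block_size]
        W = block_w[s // block_size]
        quad += seg @ W @ seg
    return float(quad)

rng = np.random.default_rng(0)
x, y = rng.standard_normal(4), rng.standard_normal(4)
diag_w = np.ones(4)
blocks = [np.eye(2) * 0.5, np.eye(2) * 0.5]  # one PSD 2x2 block per segment
dist = structured_distance(x, y, diag_w, blocks, 2)
assert dist >= 0.0  # PSD weights keep the distance non-negative
```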
arXiv Detail & Related papers (2024-10-20T03:45:50Z) - Causality-inspired Discriminative Feature Learning in Triple Domains for Gait Recognition [36.55724380184354]
We propose CLTD, a discriminative feature learning module designed to eliminate the influence of confounders in triple domains, i.e., spatial, temporal, and spectral.
Specifically, we utilize the Cross Pixel-wise Attention Generator (CPAG) to generate attention distributions for factual and counterfactual features in spatial and temporal domains.
Then, we introduce the Fourier Projection Head (FPH) to project spatial features into the spectral space, which preserves essential information while reducing computational costs.
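Projecting spatial features into the spectral space, as the Fourier Projection Head is described as doing, can be illustrated with a plain real FFT. This is a generic sketch, not the paper's FPH design; the truncation to `k` low-frequency magnitudes is an assumed simplification.

```python
import numpy as np

def fourier_projection(feat, k):
    """Map a 1-D spatial feature to the spectral domain with the real FFT
    and keep the k lowest-frequency magnitudes as a compact descriptor."""
    spec = np.fft.rfft(feat)      # complex spectrum of length len(feat)//2 + 1
    return np.abs(spec)[:k]       # low-frequency magnitudes only

x = np.ones(16)                   # constant signal: all energy at DC
z = fourier_projection(x, 4)
assert z.shape == (4,)
```

Keeping only a few coefficients is one way such a projection can preserve dominant structure while cutting dimensionality, which matches the blurb's stated motivation of reducing computational cost.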
arXiv Detail & Related papers (2024-07-17T12:16:44Z) - D$^2$ST-Adapter: Disentangled-and-Deformable Spatio-Temporal Adapter for Few-shot Action Recognition [60.84084172829169]
Adapting large pre-trained image models to few-shot action recognition has proven to be an effective strategy for learning robust feature extractors.
We present the Disentangled-and-Deformable Spatio-Temporal Adapter (D$2$ST-Adapter), which is a novel tuning framework well-suited for few-shot action recognition.
arXiv Detail & Related papers (2023-12-03T15:40:10Z) - GLSFormer: Gated Long-Short Sequence Transformer for Step
Recognition in Surgical Videos [57.93194315839009]
We propose a vision transformer-based approach to learn temporal features directly from sequence-level patches.
We extensively evaluate our approach on two cataract surgery video datasets, Cataract-101 and D99, and demonstrate superior performance compared to various state-of-the-art methods.
arXiv Detail & Related papers (2023-07-20T17:57:04Z) - GaitMAST: Motion-Aware Spatio-Temporal Feature Learning Network for
Cross-View Gait Recognition [32.76653659564304]
We propose GaitMAST, which can unleash the potential of motion-aware features.
GaitMAST preserves the individual's unique walking patterns well.
Our model achieves an average rank-1 accuracy of 98.1%.
arXiv Detail & Related papers (2022-10-21T08:42:00Z) - Spatiotemporal Multi-scale Bilateral Motion Network for Gait Recognition [3.1240043488226967]
In this paper, motivated by optical flow, the bilateral motion-oriented features are proposed.
We develop a set of multi-scale temporal representations that force the motion context to be richly described at various levels of temporal resolution.
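Describing motion context at several temporal resolutions, as this blurb summarizes, is commonly done by pooling a feature sequence over windows of different sizes. The sketch below is a generic multi-scale temporal pooling illustration under assumed window sizes, not the paper's actual architecture.

```python
import numpy as np

def multiscale_temporal_pool(seq, scales=(1, 2, 4)):
    """Pool a (T, C) feature sequence with non-overlapping windows of
    several sizes and concatenate one summary vector per scale."""
    outs = []
    T = seq.shape[0]
    for w in scales:
        n = T // w
        pooled = seq[:n * w].reshape(n, w, -1).max(axis=1)  # max over each window
        outs.append(pooled.mean(axis=0))                    # summarize the scale
    return np.concatenate(outs)

seq = np.arange(24, dtype=float).reshape(8, 3)  # 8 frames, 3 channels (toy data)
feat = multiscale_temporal_pool(seq)
assert feat.shape == (9,)  # 3 scales x 3 channels
```

Finer windows keep frame-level motion detail, while coarser windows capture slower gait dynamics; concatenation exposes both to the downstream classifier.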
arXiv Detail & Related papers (2022-09-26T01:36:22Z) - Gait Recognition in the Wild with Multi-hop Temporal Switch [81.35245014397759]
Gait recognition in the wild is a more practical problem that has attracted the attention of the multimedia and computer vision communities.
This paper presents a novel multi-hop temporal switch method to achieve effective temporal modeling of gait patterns in real-world scenes.
arXiv Detail & Related papers (2022-09-01T10:46:09Z) - Efficient Modelling Across Time of Human Actions and Interactions [92.39082696657874]
We argue that current fixed-sized temporal kernels in 3D convolutional neural networks (CNNs) can be improved to better deal with temporal variations in the input.
We study how to better distinguish between classes of actions by enhancing their feature differences over different layers of the architecture.
The proposed approaches are evaluated on several benchmark action recognition datasets and show competitive results.
arXiv Detail & Related papers (2021-10-05T15:39:11Z) - ConCAD: Contrastive Learning-based Cross Attention for Sleep Apnea
Detection [16.938983046369263]
We propose a contrastive learning-based cross attention framework for sleep apnea detection (named ConCAD).
Our proposed framework can be easily integrated into standard deep learning models to utilize expert knowledge and contrastive learning to boost performance.
arXiv Detail & Related papers (2021-05-07T02:38:56Z) - SelfGait: A Spatiotemporal Representation Learning Method for
Self-supervised Gait Recognition [24.156710529672775]
Gait recognition plays a vital role in human identification since gait is a unique biometric feature that can be perceived at a distance.
Existing gait recognition methods can learn gait features from gait sequences in different ways, but their performance suffers from the scarcity of labeled data.
We propose a self-supervised gait recognition method, termed SelfGait, which takes advantage of the massive, diverse, unlabeled gait data as a pre-training process.
arXiv Detail & Related papers (2021-03-27T05:15:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.