GaitStrip: Gait Recognition via Effective Strip-based Feature
Representations and Multi-Level Framework
- URL: http://arxiv.org/abs/2203.03966v1
- Date: Tue, 8 Mar 2022 09:49:48 GMT
- Title: GaitStrip: Gait Recognition via Effective Strip-based Feature
Representations and Multi-Level Framework
- Authors: Ming Wang, Beibei Lin, Xianda Guo, Lincheng Li, Zheng Zhu, Jiande Sun,
Shunli Zhang and Xin Yu
- Abstract summary: We present a strip-based multi-level gait recognition network, named GaitStrip, to extract comprehensive gait information at different levels.
To be specific, our high-level branch explores the context of gait sequences and our low-level one focuses on detailed posture changes.
Our GaitStrip achieves state-of-the-art performance in both normal walking and complex conditions.
- Score: 34.397404430838286
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many gait recognition methods first partition the human gait into N-parts and
then combine them to establish part-based feature representations. Their gait
recognition performance is often affected by partitioning strategies, which are
empirically chosen in different datasets. However, we observe that strips, as
the basic component of parts, are agnostic to different partitioning
strategies. Motivated by this observation, we present a strip-based multi-level
gait recognition network, named GaitStrip, to extract comprehensive gait
information at different levels. To be specific, our high-level branch explores
the context of gait sequences and our low-level one focuses on detailed posture
changes. We introduce a novel StriP-Based feature extractor (SPB) to learn the
strip-based feature representations by directly taking each strip of the human
body as the basic unit. Moreover, we propose a novel multi-branch structure,
called Enhanced Convolution Module (ECM), to extract different representations
of gaits. ECM consists of the Spatial-Temporal feature extractor (ST), the
Frame-Level feature extractor (FL) and SPB, and has two obvious advantages:
First, each branch focuses on a specific representation, which can be used to
improve the robustness of the network. Specifically, ST aims to extract
spatial-temporal features of gait sequences, while FL is used to generate the
feature representation of each frame. Second, the parameters of the ECM can be
reduced at test time by introducing a structural re-parameterization technique.
Extensive experimental results demonstrate that our GaitStrip achieves
state-of-the-art performance in both normal walking and complex conditions.
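The abstract states that the ECM's parameters can be reduced at test time through structural re-parameterization. The sketch below illustrates only the general idea behind that technique, not the paper's ECM: because convolution is linear, parallel branches whose outputs are summed can be folded into a single kernel after training. The branch shapes (3x3, 1x3, 1x1), the single-channel 2D setting, and all function names are assumptions chosen for illustration.

```python
import random

def conv2d_same(x, k):
    """Single-channel 2D cross-correlation with zero padding (output size == input size)."""
    kh, kw = len(k), len(k[0])
    ph, pw = kh // 2, kw // 2
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    r, c = i + a - ph, j + b - pw
                    if 0 <= r < h and 0 <= c < w:  # out-of-range taps read zero padding
                        acc += x[r][c] * k[a][b]
            out[i][j] = acc
    return out

def pad_kernel(k, size=3):
    """Place an odd-sized kernel at the centre of a size x size kernel of zeros."""
    kh, kw = len(k), len(k[0])
    top, left = (size - kh) // 2, (size - kw) // 2
    out = [[0.0] * size for _ in range(size)]
    for a in range(kh):
        for b in range(kw):
            out[top + a][left + b] = k[a][b]
    return out

def merge_kernels(kernels, size=3):
    """Fold parallel branches into one kernel by summing their zero-padded kernels."""
    merged = [[0.0] * size for _ in range(size)]
    for k in kernels:
        padded = pad_kernel(k, size)
        for a in range(size):
            for b in range(size):
                merged[a][b] += padded[a][b]
    return merged

def add_maps(maps):
    """Element-wise sum of equally sized feature maps."""
    h, w = len(maps[0]), len(maps[0][0])
    return [[sum(m[i][j] for m in maps) for j in range(w)] for i in range(h)]

def max_abs_diff(a, b):
    """Largest absolute element-wise difference between two maps."""
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

def rand_matrix(rows, cols):
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

random.seed(0)
x = rand_matrix(8, 6)  # hypothetical feature map
kernels = [rand_matrix(3, 3), rand_matrix(1, 3), rand_matrix(1, 1)]  # hypothetical branches

multi_branch_out = add_maps([conv2d_same(x, k) for k in kernels])  # training-time form
single_kernel_out = conv2d_same(x, merge_kernels(kernels))         # test-time form
gap = max_abs_diff(multi_branch_out, single_kernel_out)
```

Here `gap` stays at floating-point rounding level, so the merged kernel reproduces the three-branch output while storing 9 weights instead of 13. Real re-parameterized networks typically also fold batch normalization and biases into the merged kernel, which this sketch omits.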
Related papers
- It Takes Two: Accurate Gait Recognition in the Wild via Cross-granularity Alignment [72.75844404617959]
This paper proposes a novel cross-granularity alignment gait recognition method, named XGait.
To achieve this goal, the XGait first contains two branches of backbone encoders to map the silhouette sequences and the parsing sequences into two latent spaces.
Comprehensive experiments on two large-scale gait datasets show that XGait achieves Rank-1 accuracy of 80.5% on Gait3D and 88.3% on CCPG.
arXiv Detail & Related papers (2024-11-16T08:54:27Z)
- A Refreshed Similarity-based Upsampler for Direct High-Ratio Feature Upsampling [54.05517338122698]
We propose an explicitly controllable query-key feature alignment from both semantic-aware and detail-aware perspectives.
We also develop a fine-grained neighbor selection strategy on HR features, which is simple yet effective for alleviating mosaic artifacts.
Our proposed ReSFU framework consistently achieves satisfactory performance on different segmentation applications.
arXiv Detail & Related papers (2024-07-02T14:12:21Z)
- DiffVein: A Unified Diffusion Network for Finger Vein Segmentation and Authentication [50.017055360261665]
We introduce DiffVein, a unified diffusion model-based framework which simultaneously addresses vein segmentation and authentication tasks.
For better feature interaction between these two branches, we introduce two specialized modules.
In this way, our framework allows for a dynamic interplay between diffusion and segmentation embeddings.
arXiv Detail & Related papers (2024-02-03T06:49:42Z)
- GaitFormer: Revisiting Intrinsic Periodicity for Gait Recognition [6.517046095186713]
Gait recognition aims to distinguish different walking patterns by analyzing video-level human silhouettes, rather than relying on appearance information.
Previous research has primarily focused on extracting local or global-temporal representations, while overlooking the intrinsic periodic features of gait sequences.
We propose a plug-and-play strategy, called Temporal Periodic Alignment (TPA), which leverages the periodic nature and fine-grained temporal dependencies of gait patterns.
arXiv Detail & Related papers (2023-07-25T05:05:07Z)
- Hierarchical Spatio-Temporal Representation Learning for Gait Recognition [6.877671230651998]
Gait recognition is a biometric technique that identifies individuals by their unique walking styles.
We propose a hierarchical spatio-temporal representation learning framework for extracting gait features from coarse to fine.
Our method outperforms the state-of-the-art while maintaining a reasonable balance between model accuracy and complexity.
arXiv Detail & Related papers (2023-07-19T09:30:00Z)
- Part-guided Relational Transformers for Fine-grained Visual Recognition [59.20531172172135]
We propose a framework to learn the discriminative part features and explore correlations with a feature transformation module.
Our proposed approach does not rely on additional part branches and reaches state-of-the-art performance on 3 fine-grained object recognition benchmarks.
arXiv Detail & Related papers (2022-12-28T03:45:56Z)
- GaitMM: Multi-Granularity Motion Sequence Learning for Gait Recognition [6.877671230651998]
Gait recognition aims to identify individual-specific walking patterns by observing the different periodic movements of each body part.
Most existing methods treat each part equally and fail to account for the data redundancy caused by the different step frequencies and sampling rates of gait.
In this study, we propose a multi-granularity motion representation (GaitMM) for gait sequence learning.
arXiv Detail & Related papers (2022-09-18T04:07:33Z)
- CFNet: Learning Correlation Functions for One-Stage Panoptic Segmentation [46.252118473248316]
We propose to first predict semantic-level and instance-level correlations among different locations that are utilized to enhance the backbone features.
We then feed the improved discriminative features into the corresponding segmentation heads, respectively.
We achieve state-of-the-art performance on MS COCO with 45.1% PQ and on ADE20k with 32.6% PQ.
arXiv Detail & Related papers (2022-01-13T05:31:14Z)
- Improving Video Instance Segmentation via Temporal Pyramid Routing [61.10753640148878]
Video Instance Segmentation (VIS) is a new and inherently multi-task problem, which aims to detect, segment and track each instance in a video sequence.
We propose a Temporal Pyramid Routing (TPR) strategy to conditionally align and conduct pixel-level aggregation from a feature pyramid pair of two adjacent frames.
Our approach is a plug-and-play module and can be easily applied to existing instance segmentation methods.
arXiv Detail & Related papers (2021-07-28T03:57:12Z)
- Spatio-Temporal Representation Factorization for Video-based Person Re-Identification [55.01276167336187]
We propose a Spatio-Temporal Representation Factorization (STRF) module for re-ID.
STRF is a flexible new computational unit that can be used in conjunction with most existing 3D convolutional neural network architectures for re-ID.
We empirically show that STRF improves performance of various existing baseline architectures while demonstrating new state-of-the-art results.
arXiv Detail & Related papers (2021-07-25T19:29:37Z)
- Sequential convolutional network for behavioral pattern extraction in gait recognition [0.7874708385247353]
We propose a sequential convolutional network (SCN) to learn the walking pattern of individuals.
In SCN, behavioral information extractors (BIE) are constructed to comprehend intermediate feature maps in time series.
A multi-frame aggregator in SCN integrates features over a sequence of variable length via a mobile 3D convolutional layer.
arXiv Detail & Related papers (2021-04-23T08:44:10Z)