Parsing is All You Need for Accurate Gait Recognition in the Wild
- URL: http://arxiv.org/abs/2308.16739v1
- Date: Thu, 31 Aug 2023 13:57:38 GMT
- Title: Parsing is All You Need for Accurate Gait Recognition in the Wild
- Authors: Jinkai Zheng, Xinchen Liu, Shuai Wang, Lihao Wang, Chenggang Yan, Wu Liu
- Abstract summary: This paper presents a novel gait representation, named Gait Parsing Sequence (GPS)
GPSs are sequences of fine-grained human segmentation (i.e., human parsing) extracted from video frames, so they carry much higher information entropy than binary silhouettes or skeletons.
We also propose a novel human parsing-based gait recognition framework, named ParsingGait.
The experimental results show a significant improvement in accuracy brought by the GPS representation and the superiority of ParsingGait.
- Score: 51.206166843375364
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Binary silhouettes and keypoint-based skeletons have dominated human gait
recognition studies for decades since they are easy to extract from video
frames. Despite their success in gait recognition for in-the-lab environments,
they usually fail in real-world scenarios due to their low information entropy
for gait representations. To achieve accurate gait recognition in the wild,
this paper presents a novel gait representation, named Gait Parsing Sequence
(GPS). GPSs are sequences of fine-grained human segmentation, i.e., human
parsing, extracted from video frames, so they have much higher information
entropy to encode the shapes and dynamics of fine-grained human parts during
walking. Moreover, to effectively explore the capability of the GPS
representation, we propose a novel human parsing-based gait recognition
framework, named ParsingGait. ParsingGait contains a Convolutional Neural
Network (CNN)-based backbone and two lightweight heads. The first head
extracts global semantic features from GPSs, while the other one learns mutual
information of part-level features through Graph Convolutional Networks to
model the detailed dynamics of human walking. Furthermore, due to the lack of
suitable datasets, we build the first parsing-based dataset for gait
recognition in the wild, named Gait3D-Parsing, by extending the large-scale and
challenging Gait3D dataset. Based on Gait3D-Parsing, we comprehensively
evaluate our method and existing gait recognition methods. The experimental
results show a significant improvement in accuracy brought by the GPS
representation and the superiority of ParsingGait. The code and dataset are
available at https://gait3d.github.io/gait3d-parsing-hp .
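The abstract describes the architecture only at a high level, so the following is a minimal, illustrative PyTorch sketch of a ParsingGait-style model based solely on that description: a small CNN backbone over parsing maps, a global head that pools semantic features, and a part head that relates per-part features through one graph-convolution step. The layer sizes, number of parsing labels, part adjacency, and pooling scheme are assumptions for illustration, not the authors' configuration.

```python
# Illustrative sketch of a ParsingGait-style model (assumptions, not the released code):
# CNN backbone over parsing frames + a global semantic head + a GCN-based part head.
import torch
import torch.nn as nn

NUM_PARTS = 11   # assumed number of human-parsing labels (background excluded)
FEAT_DIM = 128   # assumed embedding width


class SimpleGCNLayer(nn.Module):
    """One graph-convolution step: aggregate neighbor features, then project."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x, adj):
        # x: (B, P, D) part features, adj: (P, P) normalized adjacency
        return torch.relu(self.proj(adj @ x))


class ParsingGaitSketch(nn.Module):
    def __init__(self, num_parts=NUM_PARTS, dim=FEAT_DIM):
        super().__init__()
        # Backbone: treats each parsing frame (per-part channel maps) as a multi-channel image.
        self.backbone = nn.Sequential(
            nn.Conv2d(num_parts, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Head 1: global semantic features via spatial + temporal pooling.
        self.global_head = nn.Linear(dim, dim)
        # Head 2: part-level features related through a GCN over a part graph.
        self.gcn = SimpleGCNLayer(dim)
        self.part_head = nn.Linear(dim, dim)
        # Fully connected part graph as a stand-in adjacency (assumption).
        self.register_buffer("adj", torch.ones(num_parts, num_parts) / num_parts)

    def forward(self, parsing_seq):
        # parsing_seq: (B, T, P, H, W) Gait Parsing Sequence as per-part maps
        b, t, p, h, w = parsing_seq.shape
        feat = self.backbone(parsing_seq.view(b * t, p, h, w))   # (B*T, D, h', w')

        # Global branch: pool over space, then average over time.
        g = self.global_head(feat.mean(dim=(2, 3)).view(b, t, -1).mean(dim=1))

        # Part branch: mask-pool backbone features per part, then run the GCN.
        masks = nn.functional.interpolate(
            parsing_seq.view(b * t, p, h, w), size=feat.shape[2:], mode="nearest")
        part = torch.einsum("bdhw,bphw->bpd", feat, masks) / (
            masks.sum(dim=(2, 3)).unsqueeze(-1) + 1e-6)          # (B*T, P, D)
        part = self.gcn(part, self.adj)
        part = self.part_head(part.mean(dim=1)).view(b, t, -1).mean(dim=1)

        return torch.cat([g, part], dim=-1)                      # gait embedding


# Shape check: 2 sequences, 8 frames, 11 part channels, 64x44 parsing maps (random stand-in).
emb = ParsingGaitSketch()(torch.rand(2, 8, NUM_PARTS, 64, 44))
print(emb.shape)  # torch.Size([2, 256])
```

The actual backbone, heads, and part graph follow the paper and the released code at https://gait3d.github.io/gait3d-parsing-hp ; this sketch only mirrors the two-head structure named in the abstract.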
Related papers
- It Takes Two: Accurate Gait Recognition in the Wild via Cross-granularity Alignment [72.75844404617959]
This paper proposes a novel cross-granularity alignment gait recognition method, named XGait.
To achieve this goal, XGait first uses two branches of backbone encoders to map the silhouette sequences and the parsing sequences into two latent spaces.
Comprehensive experiments on two large-scale gait datasets show that XGait achieves Rank-1 accuracy of 80.5% on Gait3D and 88.3% on CCPG.
arXiv Detail & Related papers (2024-11-16T08:54:27Z)
- Part-aware Unified Representation of Language and Skeleton for Zero-shot Action Recognition [57.97930719585095]
We introduce Part-aware Unified Representation between Language and Skeleton (PURLS) to explore visual-semantic alignment at both local and global scales.
Our approach is evaluated on various skeleton/language backbones and three large-scale datasets.
The results showcase the universality and superior performance of PURLS, surpassing prior skeleton-based solutions and standard baselines from other domains.
arXiv Detail & Related papers (2024-06-19T08:22:32Z) - GaitContour: Efficient Gait Recognition based on a Contour-Pose Representation [38.39173742709181]
Gait recognition holds the promise of robustly identifying subjects based on walking patterns instead of appearance information.
In this work, we propose a novel, point-based Contour-Pose representation, which compactly expresses both body shape and body parts information.
We further propose a local-to-global architecture, called GaitContour, to leverage this novel representation.
arXiv Detail & Related papers (2023-11-27T17:06:25Z)
- Distillation-guided Representation Learning for Unconstrained Gait Recognition [50.0533243584942]
We propose a framework, termed GAit DEtection and Recognition (GADER), for human authentication in challenging outdoor scenarios.
GADER builds discriminative features through a novel gait recognition method, where only frames containing gait information are used.
We evaluate our method against multiple state-of-the-art (SoTA) gait baselines and demonstrate consistent improvements on indoor and outdoor datasets.
arXiv Detail & Related papers (2023-07-27T01:53:57Z)
- Integrating Human Parsing and Pose Network for Human Action Recognition [12.308394270240463]
We introduce human parsing feature maps as a novel modality for action recognition.
We propose Integrating Human Parsing and Pose Network (IPP-Net) for action recognition.
IPP-Net is the first to leverage both skeletons and human parsing feature maps in a dual-branch approach.
arXiv Detail & Related papers (2023-07-16T07:58:29Z)
- Towards a Deeper Understanding of Skeleton-based Gait Recognition [4.812321790984493]
In recent years, most gait recognition methods have used the person's silhouette to extract gait features, which makes them sensitive to the subject's appearance and to segmentation quality.
Model-based methods do not suffer from these problems and are able to represent the temporal motion of body joints.
In this work, we propose an approach based on Graph Convolutional Networks (GCNs) that combines higher-order inputs and residual networks.
arXiv Detail & Related papers (2022-04-16T18:23:37Z)
- Vision-based Behavioral Recognition of Novelty Preference in Pigs [1.837722971703011]
Behavioral scoring of research data is crucial for extracting domain-specific metrics, but it is bottlenecked by the need to analyze enormous volumes of information with human labor.
Deep learning is widely viewed as a key advancement to relieve this bottleneck.
We identify one such domain where deep learning can be leveraged to alleviate manual scoring.
arXiv Detail & Related papers (2021-06-23T06:10:34Z)
- GPRAR: Graph Convolutional Network based Pose Reconstruction and Action Recognition for Human Trajectory Prediction [1.2891210250935146]
Existing prediction models are prone to errors in real-world settings where observations are often noisy.
We introduce GPRAR, a graph convolutional network-based pose reconstruction and action recognition framework for human trajectory prediction.
We show that GPRAR improves prediction accuracy by up to 22% and 50% under noisy observations on the JAAD and TITAN datasets.
arXiv Detail & Related papers (2021-03-25T20:12:14Z)
- Self-supervised Video Representation Learning by Uncovering Spatio-temporal Statistics [74.6968179473212]
This paper proposes a novel pretext task to address the self-supervised learning problem.
We compute a series of spatio-temporal statistical summaries, such as the spatial location and dominant direction of the largest motion.
A neural network is built and trained to yield the statistical summaries given the video frames as inputs.
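As a rough illustration of the kind of motion statistic such a pretext task could regress (not this paper's exact procedure), the sketch below derives the grid cell with the largest motion and a coarse dominant direction from two consecutive frames via simple frame differencing; the grid size and the direction heuristic are assumptions for illustration.

```python
# Toy pretext-label computation (assumptions, not the paper's exact statistics):
# find the grid cell with the largest motion energy and a coarse direction.
import numpy as np


def largest_motion_statistics(frame_a, frame_b, grid=4):
    """Return the (row, col) grid cell with the largest motion energy between
    two grayscale frames and a coarse dominant direction inside that cell."""
    diff = np.abs(frame_b.astype(np.float32) - frame_a.astype(np.float32))
    h, w = diff.shape
    ch, cw = h // grid, w // grid
    # Motion energy per grid cell.
    energy = diff[:grid * ch, :grid * cw].reshape(grid, ch, grid, cw).sum(axis=(1, 3))
    r, c = np.unravel_index(np.argmax(energy), energy.shape)
    # Coarse direction: compare horizontal vs. vertical gradients of the difference
    # inside the winning cell (a crude stand-in for a flow-based estimate).
    cell = diff[r * ch:(r + 1) * ch, c * cw:(c + 1) * cw]
    gy, gx = np.gradient(cell)
    direction = "horizontal" if np.abs(gx).sum() > np.abs(gy).sum() else "vertical"
    return (r, c), direction


# Example with two random grayscale frames.
f0, f1 = np.random.rand(64, 64), np.random.rand(64, 64)
print(largest_motion_statistics(f0, f1))
```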
arXiv Detail & Related papers (2020-08-31T08:31:56Z)
- GPS-Net: Graph Property Sensing Network for Scene Graph Generation [91.60326359082408]
Scene graph generation (SGG) aims to detect objects in an image along with their pairwise relationships.
GPS-Net fully explores three properties for SGG: edge direction information, the difference in priority between nodes, and the long-tailed distribution of relationships.
GPS-Net achieves state-of-the-art performance on three popular databases (VG, OI, and VRD) with significant gains under various settings and metrics.
arXiv Detail & Related papers (2020-03-29T07:22:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.