MetaGait: Learning to Learn an Omni Sample Adaptive Representation for Gait Recognition
- URL: http://arxiv.org/abs/2306.03445v1
- Date: Tue, 6 Jun 2023 06:53:05 GMT
- Title: MetaGait: Learning to Learn an Omni Sample Adaptive Representation for Gait Recognition
- Authors: Huanzhang Dou, Pengyi Zhang, Wei Su, Yunlong Yu, and Xi Li
- Abstract summary: We develop a novel MetaGait that learns to learn an omni sample adaptive representation.
We leverage meta-knowledge across the entire process, introducing Meta Triple Attention and Meta Temporal Pooling.
Extensive experiments demonstrate the state-of-the-art performance of the proposed MetaGait.
- Score: 16.26377062742576
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gait recognition, which aims at identifying individuals by their walking
patterns, has recently drawn increasing research attention. However, gait
recognition still suffers from the conflict between the limited binary visual
clues of the silhouette and the numerous covariates with diverse scales, which
challenges the model's adaptiveness. In this paper, we address this
conflict by developing a novel MetaGait that learns to learn an omni sample
adaptive representation. Towards this goal, MetaGait injects meta-knowledge,
which could guide the model to perceive sample-specific properties, into the
calibration network of the attention mechanism to improve the adaptiveness from
the omni-scale, omni-dimension, and omni-process perspectives. Specifically, we
leverage the meta-knowledge across the entire process through two components:
Meta Triple Attention, which adaptively captures omni-scale dependencies across
the spatial, channel, and temporal dimensions simultaneously, and Meta Temporal
Pooling, which adaptively aggregates temporal information by integrating the
merits of three complementary temporal aggregation methods.
Extensive experiments demonstrate the state-of-the-art performance of the
proposed MetaGait. On CASIA-B, we achieve rank-1 accuracy of 98.7%, 96.0%, and
89.3% under three conditions, respectively. On OU-MVLP, we achieve rank-1
accuracy of 92.4%.
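
Reading the abstract alone, the two mechanisms can be pictured as (i) a small hyper-network that turns a per-sample descriptor into calibration weights for an attention branch, and (ii) a learned, sample-specific blend of several temporal aggregators. The following is a minimal PyTorch sketch under those assumptions; the layer shapes, the sigmoid calibration, and the choice of max/mean/GeM as the three aggregators are illustrative guesses, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MetaHyperNet(nn.Module):
    """Toy meta-knowledge branch: maps a per-sample descriptor to
    calibration weights for an attention branch (an assumption based
    on the abstract, not the authors' code)."""
    def __init__(self, channels, hidden=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, channels),
        )

    def forward(self, x):  # x: (N, C, T, H, W)
        desc = x.mean(dim=(2, 3, 4))          # sample descriptor (N, C)
        return torch.sigmoid(self.mlp(desc))  # calibration in (0, 1)

class ChannelAttentionWithMeta(nn.Module):
    """One attention branch (channel dimension) whose calibration is
    modulated by the meta branch; spatial and temporal branches of a
    triple attention would be analogous."""
    def __init__(self, channels):
        super().__init__()
        self.meta = MetaHyperNet(channels)
        self.fc = nn.Linear(channels, channels)

    def forward(self, x):                      # x: (N, C, T, H, W)
        squeeze = x.mean(dim=(2, 3, 4))        # (N, C)
        attn = torch.sigmoid(self.fc(squeeze)) # base channel attention
        attn = attn * self.meta(x)             # sample-adaptive calibration
        return x * attn[:, :, None, None, None]

def meta_temporal_pooling(x, weights):
    """Blend three complementary temporal aggregators (max, mean, GeM)
    with sample-specific weights; GeM here is an illustrative choice.
    x: (N, C, T); weights: (N, 3) summing to 1 per sample."""
    max_p = x.max(dim=2).values
    mean_p = x.mean(dim=2)
    gem_p = x.clamp(min=1e-6).pow(3).mean(dim=2).pow(1.0 / 3.0)
    stacked = torch.stack([max_p, mean_p, gem_p], dim=1)  # (N, 3, C)
    return (weights.unsqueeze(-1) * stacked).sum(dim=1)   # (N, C)

if __name__ == "__main__":
    feats = torch.randn(2, 32, 30, 16, 11)     # (batch, C, frames, H, W)
    out = ChannelAttentionWithMeta(32)(feats)  # recalibrated features
    seq = out.mean(dim=(3, 4))                 # collapse space -> (2, 32, 30)
    w = F.softmax(torch.randn(2, 3), dim=1)    # stand-in for learned weights
    print(meta_temporal_pooling(seq, w).shape) # torch.Size([2, 32])
```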
Related papers
- S^2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph Generation in OR [50.435592120607815]
Scene graph generation (SGG) of surgical procedures is crucial to enhancing holistic cognitive intelligence in the operating room (OR).
Previous works have primarily relied on multi-stage learning, where the generated semantic scene graphs depend on intermediate processes with pose estimation and object detection.
In this study, we introduce a novel single-stage bi-modal transformer framework for SGG in the OR, termed S^2Former-OR.
arXiv Detail & Related papers (2024-02-22T11:40:49Z)
- The Paradox of Motion: Evidence for Spurious Correlations in Skeleton-based Gait Recognition Models [4.089889918897877]
This study challenges the prevailing assumption that vision-based gait recognition relies primarily on motion patterns.
We show through a comparative analysis that removing height information leads to notable performance degradation.
We propose a spatial transformer model processing individual poses, disregarding any temporal information, which achieves unreasonably good accuracy (a toy version of this pose-only probe is sketched below).
arXiv Detail & Related papers (2024-02-13T09:33:12Z)
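
The pose-only probe above can be pictured as a tiny transformer that attends over the joints of a single pose and never sees frame order. Everything in this sketch (joint count, embedding size, identity count) is an assumed toy configuration, not the paper's model.

```python
import torch
import torch.nn as nn

class PoseOnlyTransformer(nn.Module):
    """Illustrative probe in the spirit of the paper's finding: classify
    identity from a single pose (joints as tokens), with no temporal
    modeling at all. All sizes are assumptions for the sketch."""
    def __init__(self, num_joints=17, num_ids=74, dim=64):
        super().__init__()
        self.embed = nn.Linear(2, dim)                    # (x, y) per joint
        self.joint_pos = nn.Parameter(torch.zeros(1, num_joints, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, num_ids)

    def forward(self, pose):                  # pose: (N, num_joints, 2)
        tok = self.embed(pose) + self.joint_pos
        tok = self.encoder(tok)               # attention over joints only
        return self.head(tok.mean(dim=1))     # identity logits

if __name__ == "__main__":
    single_poses = torch.randn(8, 17, 2)      # eight frames, treated independently
    logits = PoseOnlyTransformer()(single_poses)
    print(logits.shape)                       # torch.Size([8, 74])
```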
- GaitASMS: Gait Recognition by Adaptive Structured Spatial Representation and Multi-Scale Temporal Aggregation [2.0444600042188448]
Gait recognition is one of the most promising video-based biometric technologies.
We propose a novel gait recognition framework, denoted as GaitASMS.
It can effectively extract the adaptive structured spatial representations and naturally aggregate the multi-scale temporal information.
arXiv Detail & Related papers (2023-07-29T13:03:17Z)
- GaitMPL: Gait Recognition with Memory-Augmented Progressive Learning [10.427640929715668]
Gait recognition aims at identifying the pedestrians at a long distance by their biometric gait patterns.
In this work, we propose to solve the hard sample issue with a Memory-augmented Progressive Learning network (GaitMPL).
Specifically, DRPL reduces the learning difficulty of hard samples via easy-to-hard progressive learning (a generic version of this reweighting is sketched after this entry).
GSAM further augments DRPL with a structure-aligned memory mechanism, which maintains and models the feature distribution of each ID.
arXiv Detail & Related papers (2023-06-06T07:24:53Z)
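
The summary does not spell out DRPL's actual schedule, but a generic easy-to-hard reweighting conveys the idea: early in training, high-loss (hard) samples are down-weighted, and the weighting relaxes toward uniform as training progresses. The schedule below is an illustration only, not GaitMPL's DRPL.

```python
import torch

def easy_to_hard_weights(losses, progress, temperature=1.0):
    """Generic easy-to-hard curriculum: at progress ~ 0 down-weight
    high-loss ("hard") samples, and flatten toward uniform weights as
    progress -> 1. An illustrative schedule, not DRPL itself.
    losses: per-sample loss values, shape (N,); progress in [0, 1]."""
    hardness = (losses - losses.min()) / (losses.max() - losses.min() + 1e-8)
    # exp(-(1 - progress) * hardness / T): hard samples start small
    weights = torch.exp(-(1.0 - progress) * hardness / temperature)
    return weights / weights.sum()            # normalize to a distribution

if __name__ == "__main__":
    per_sample = torch.tensor([0.2, 0.5, 2.0, 4.0])        # two easy, two hard
    print(easy_to_hard_weights(per_sample, progress=0.0))  # hard ones damped
    print(easy_to_hard_weights(per_sample, progress=1.0))  # near-uniform
    total = (easy_to_hard_weights(per_sample, 0.3) * per_sample).sum()
    print(total)                               # weighted training loss
```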
- GaitGS: Temporal Feature Learning in Granularity and Span Dimension for Gait Recognition [34.07501669897291]
GaitGS is a framework that aggregates temporal features simultaneously in both granularity and span dimensions.
Our method demonstrates state-of-the-art performance on two datasets, achieving Rank-1 accuracies of 98.2%, 96.5%, and 89.7% under three conditions.
arXiv Detail & Related papers (2023-05-31T09:48:25Z)
- Cross-Attention is Not Enough: Incongruity-Aware Dynamic Hierarchical Fusion for Multimodal Affect Recognition [69.32305810128994]
Incongruity between modalities poses a challenge for multimodal fusion, especially in affect recognition.
We propose the Hierarchical Crossmodal Transformer with Dynamic Modality Gating (HCT-DMG), a lightweight incongruity-aware model.
HCT-DMG: 1) outperforms previous multimodal models with a reduced size of approximately 0.8M parameters; 2) recognizes hard samples where incongruity makes affect recognition difficult; 3) mitigates the incongruity at the latent level in crossmodal attention (the gating idea is sketched below).
arXiv Detail & Related papers (2023-05-23T01:24:15Z)
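
One way to picture dynamic modality gating is a small network that inspects both modality embeddings (plus an agreement score) and emits per-sample mixing weights, so incongruent pairs can lean on the more reliable modality. HCT-DMG's actual mechanism is hierarchical and operates inside crossmodal attention; the sketch below only shows the gating idea under that simplification.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicModalityGate(nn.Module):
    """Toy dynamic gating between two modality embeddings: an MLP looks
    at both views plus their cosine agreement and emits per-sample
    mixing weights. An illustration of the gating idea, not HCT-DMG."""
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(2 * dim + 1, dim),
            nn.ReLU(inplace=True),
            nn.Linear(dim, 2),
        )

    def forward(self, a, b):                  # a, b: (N, dim)
        agree = F.cosine_similarity(a, b, dim=-1).unsqueeze(-1)
        w = F.softmax(self.gate(torch.cat([a, b, agree], dim=-1)), dim=-1)
        # incongruent pairs (low agreement) can learn to favor one modality
        return w[:, :1] * a + w[:, 1:] * b     # fused (N, dim)

if __name__ == "__main__":
    audio, text = torch.randn(4, 32), torch.randn(4, 32)
    fused = DynamicModalityGate(32)(audio, text)
    print(fused.shape)                         # torch.Size([4, 32])
```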
- Adaptive Local-Component-aware Graph Convolutional Network for One-shot Skeleton-based Action Recognition [54.23513799338309]
We present an Adaptive Local-Component-aware Graph Convolutional Network for skeleton-based action recognition.
Our method provides a stronger representation than the global embedding and helps our model reach state-of-the-art performance.
arXiv Detail & Related papers (2022-09-21T02:33:07Z)
- Multi-scale Context-aware Network with Transformer for Gait Recognition [35.521073630044434]
We propose a multi-scale context-aware network with transformer (MCAT) for gait recognition.
MCAT generates temporal features across three scales, and adaptively aggregates them using contextual information from both local and global perspectives.
In order to remedy the spatial feature corruption resulting from temporal operations, MCAT incorporates a salient spatial feature learning (SSFL) module.
arXiv Detail & Related papers (2022-04-07T07:47:21Z)
- Few-Shot Fine-Grained Action Recognition via Bidirectional Attention and Contrastive Meta-Learning [51.03781020616402]
Fine-grained action recognition is attracting increasing attention due to the emerging demand of specific action understanding in real-world applications.
We propose a few-shot fine-grained action recognition problem, aiming to recognize novel fine-grained actions with only a few samples given for each class.
Although progress has been made on coarse-grained actions, existing few-shot recognition methods encounter two issues when handling fine-grained actions.
arXiv Detail & Related papers (2021-08-15T02:21:01Z)
- A Variational Information Bottleneck Approach to Multi-Omics Data Integration [98.6475134630792]
We propose a deep variational information bottleneck (IB) approach for incomplete multi-view observations.
Our method applies the IB framework to marginal and joint representations of the observed views, focusing on the intra-view and inter-view interactions that are relevant for the target (a single-view version of the IB objective is sketched below).
Experiments on real-world datasets show that our method consistently achieves gain from data integration and outperforms state-of-the-art benchmarks.
arXiv Detail & Related papers (2021-02-05T06:05:39Z)
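
The variational IB objective itself is standard: predict the target from a stochastic code z while penalizing the code's information about the input, which in practice is a cross-entropy term plus a beta-weighted KL to a simple prior. The single-view sketch below shows that objective; the paper's marginal/joint multi-view construction is omitted for brevity, and all sizes are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyVIB(nn.Module):
    """Single-view variational IB sketch: encode x into a Gaussian
    posterior q(z|x), predict y from a sampled z, and penalize
    KL(q(z|x) || N(0, I)). Sizes are illustrative assumptions."""
    def __init__(self, in_dim=20, z_dim=8, num_classes=2):
        super().__init__()
        self.enc = nn.Linear(in_dim, 2 * z_dim)   # outputs mu and log-variance
        self.dec = nn.Linear(z_dim, num_classes)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        return self.dec(z), mu, logvar

def vib_loss(logits, y, mu, logvar, beta=1e-3):
    """Cross-entropy plus beta-weighted KL to the standard normal prior."""
    ce = F.cross_entropy(logits, y)
    kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(dim=-1).mean()
    return ce + beta * kl

if __name__ == "__main__":
    model = TinyVIB()
    x, y = torch.randn(16, 20), torch.randint(0, 2, (16,))
    logits, mu, logvar = model(x)
    print(vib_loss(logits, y, mu, logvar))    # scalar training objective
```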
- HM4: Hidden Markov Model with Memory Management for Visual Place Recognition [54.051025148533554]
We develop a Hidden Markov Model approach for visual place recognition in autonomous driving.
Our algorithm, dubbed HM4, exploits temporal look-ahead to transfer promising candidate images between passive storage and active memory.
We show that this allows constant time and space inference for a fixed coverage area.
arXiv Detail & Related papers (2020-11-01T08:49:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.