View-Invariant Gait Recognition with Attentive Recurrent Learning of
Partial Representations
- URL: http://arxiv.org/abs/2010.09092v1
- Date: Sun, 18 Oct 2020 20:20:43 GMT
- Title: View-Invariant Gait Recognition with Attentive Recurrent Learning of
Partial Representations
- Authors: Alireza Sepas-Moghaddam, Ali Etemad
- Abstract summary: We propose a network that first learns to extract gait convolutional energy maps (GCEM) from frame-level convolutional features.
It then adopts a bidirectional recurrent neural network to learn from split bins of the GCEM, thus exploiting the relations between learned partial spatiotemporal representations.
Our proposed model has been extensively tested on the two large-scale CASIA-B and OU-MVLP gait datasets.
- Score: 27.33579145744285
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gait recognition refers to the identification of individuals based on
features acquired from their body movement during walking. Despite the recent
advances in gait recognition with deep learning, variations in data acquisition
and appearance, namely camera angles, subject pose, occlusions, and clothing,
are challenging factors that need to be considered for achieving accurate gait
recognition systems. In this paper, we propose a network that first learns to
extract gait convolutional energy maps (GCEM) from frame-level convolutional
features. It then adopts a bidirectional recurrent neural network to learn from
split bins of the GCEM, thus exploiting the relations between learned partial
spatiotemporal representations. We then use an attention mechanism to
selectively focus on important recurrently learned partial representations as
identity information in different scenarios may lie in different GCEM bins. Our
proposed model has been extensively tested on the two large-scale CASIA-B and
OU-MVLP gait datasets using four different test protocols and has been compared
to a number of state-of-the-art and baseline solutions. Additionally, a
comprehensive experiment has been performed to study the robustness of our
model in the presence of six different synthesized occlusions. The experimental
results show the superiority of our proposed method, outperforming the
state-of-the-art, especially in scenarios where different clothing and carrying
conditions are encountered. The results also reveal that our model is more
robust to different occlusions than the state-of-the-art methods.
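As a concrete illustration of the pipeline described above, here is a minimal PyTorch-style sketch: frame-level convolutional features are pooled into a GCEM, the map is split into horizontal bins, a bidirectional GRU runs across the bin sequence, and an attention layer weights the per-bin recurrent outputs. The temporal-average pooling, bin count, layer sizes, and per-bin vector pooling are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class GCEMBinAttention(nn.Module):
    """Sketch: GCEM pooling -> horizontal bin split -> BiGRU -> attention."""

    def __init__(self, channels: int, n_bins: int = 8, hidden: int = 128):
        super().__init__()
        self.n_bins = n_bins
        # Bidirectional recurrent model over the sequence of GCEM bins.
        self.rnn = nn.GRU(input_size=channels, hidden_size=hidden,
                          bidirectional=True, batch_first=True)
        # Scores each bin's recurrent output to form attention weights.
        self.score = nn.Linear(2 * hidden, 1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, time, channels, height, width) frame-level
        # convolutional features from a CNN backbone (not shown).
        gcem = feats.mean(dim=1)               # temporal pooling -> (B, C, H, W)
        bins = gcem.chunk(self.n_bins, dim=2)  # split the map along height
        # Pool each bin to a single C-dimensional descriptor.
        bins = torch.stack([b.mean(dim=(2, 3)) for b in bins], dim=1)
        out, _ = self.rnn(bins)                # (B, n_bins, 2 * hidden)
        attn = torch.softmax(self.score(out), dim=1)
        return (attn * out).sum(dim=1)         # attended embedding

# Dummy usage: 4 clips, 30 frames, 64-channel 64x44 feature maps.
model = GCEMBinAttention(channels=64)
print(model(torch.randn(4, 30, 64, 64, 44)).shape)  # torch.Size([4, 256])
```

Attending over per-bin outputs rather than the whole map matches the abstract's motivation: identity cues may concentrate in different bins under different clothing, carrying, or occlusion conditions.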
Related papers
- The Paradox of Motion: Evidence for Spurious Correlations in Skeleton-based Gait Recognition Models [4.089889918897877]
This study challenges the prevailing assumption that vision-based gait recognition relies primarily on motion patterns.
We show through a comparative analysis that removing height information leads to notable performance degradation.
We propose a spatial transformer model that processes individual poses, disregarding any temporal information, and yet achieves unreasonably good accuracy.
arXiv Detail & Related papers (2024-02-13T09:33:12Z)
- DCID: Deep Canonical Information Decomposition [84.59396326810085]
We consider the problem of identifying the signal shared between two one-dimensional target variables.
We propose ICM, an evaluation metric which can be used in the presence of ground-truth labels.
We also propose Deep Canonical Information Decomposition (DCID) - a simple, yet effective approach for learning the shared variables.
arXiv Detail & Related papers (2023-06-27T16:59:06Z)
- Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems [61.11799513362704]
We propose learning an additional screening mechanism to identify discriminative clues commonly seen across instances and classes.
We show that a common rationale detector can be learned by simply exploiting the GradCAM induced from the SSL objective.
arXiv Detail & Related papers (2023-03-03T02:07:40Z)
- Combining the Silhouette and Skeleton Data for Gait Recognition [13.345465199699]
Two dominant gait recognition works are appearance-based and model-based, which extract features from silhouettes and skeletons, respectively.
This paper proposes a two-branch network: a CNN-based branch taking silhouettes as input and a GCN-based branch taking skeletons as input.
For better gait representation in the GCN-based branch, we present a fully connected graph convolution operator to integrate multi-scale graph convolutions.
arXiv Detail & Related papers (2022-02-22T03:21:51Z)
- Dual Contrastive Learning for General Face Forgery Detection [64.41970626226221]
We propose a novel face forgery detection framework, named Dual Contrastive Learning (DCL), which constructs positive and negative paired data.
To explore the essential discrepancies, Intra-Instance Contrastive Learning (Intra-ICL) is introduced to focus on the local content inconsistencies prevalent in the forged faces.
arXiv Detail & Related papers (2021-12-27T05:44:40Z)
- Subject-Independent Drowsiness Recognition from Single-Channel EEG with an Interpretable CNN-LSTM model [0.8250892979520543]
We propose a novel Convolutional Neural Network (CNN)-Long Short-Term Memory (LSTM) model for subject-independent drowsiness recognition from single-channel EEG signals.
Results show that the model achieves an average accuracy of 72.97% on 11 subjects for leave-one-out subject-independent drowsiness recognition on a public dataset.
arXiv Detail & Related papers (2021-11-21T10:37:35Z)
- Deep Collaborative Multi-Modal Learning for Unsupervised Kinship Estimation [53.62256887837659]
Kinship verification is a long-standing research challenge in computer vision.
We propose a novel deep collaborative multi-modal learning (DCML) to integrate the underlying information presented in facial properties.
Our DCML method consistently outperforms state-of-the-art kinship verification methods.
arXiv Detail & Related papers (2021-09-07T01:34:51Z)
- Exploiting Emotional Dependencies with Graph Convolutional Networks for Facial Expression Recognition [31.40575057347465]
This paper proposes a novel multi-task learning framework to recognize facial expressions in-the-wild.
A shared feature representation is learned for both discrete and continuous recognition in an MTL setting.
The results of our experiments show that our method outperforms the current state-of-the-art methods on discrete FER.
arXiv Detail & Related papers (2021-06-07T10:20:05Z)
- Distribution Alignment: A Unified Framework for Long-tail Visual Recognition [52.36728157779307]
We propose a unified distribution alignment strategy for long-tail visual recognition.
We then introduce a generalized re-weight method in the two-stage learning to balance the class prior.
Our approach achieves the state-of-the-art results across all four recognition tasks with a simple and unified framework.
arXiv Detail & Related papers (2021-03-30T14:09:53Z)
- Gait Recognition using Multi-Scale Partial Representation Transformation with Capsules [22.99694601595627]
We propose a novel deep network that learns to transfer multi-scale partial gait representations using capsules.
Our network first obtains multi-scale partial representations using a state-of-the-art deep partial feature extractor.
It then recurrently learns the correlations and co-occurrences of the patterns among the partial features in forward and backward directions.
arXiv Detail & Related papers (2020-10-18T19:47:38Z)
- Modeling Shared Responses in Neuroimaging Studies through MultiView ICA [94.31804763196116]
Group studies involving large cohorts of subjects are important to draw general conclusions about brain functional organization.
We propose a novel MultiView Independent Component Analysis model for group studies, where data from each subject are modeled as a linear combination of shared independent sources plus noise (see the generative-model sketch following this list).
We demonstrate the usefulness of our approach first on fMRI data, where our model demonstrates improved sensitivity in identifying common sources among subjects.
arXiv Detail & Related papers (2020-06-11T17:29:53Z)
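The MultiView ICA entry above states its generative assumption explicitly: each subject's data is a linear combination of shared independent sources plus noise. The following is a minimal sketch of that forward model, with dimensions, noise scale, and source distribution chosen here purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_sources, n_samples = 4, 5, 1000

# Shared independent sources; Laplace draws give the non-Gaussianity ICA needs.
shared = rng.laplace(size=(n_sources, n_samples))

# Each subject i observes x_i = A_i @ s + n_i: a subject-specific linear
# mixing of the shared sources plus subject-specific noise.
views = []
for _ in range(n_subjects):
    A_i = rng.normal(size=(n_sources, n_sources))        # mixing matrix
    noise = 0.1 * rng.normal(size=(n_sources, n_samples))
    views.append(A_i @ shared + noise)
```

Recovering the unmixing matrices and the shared sources from such views is the model-fitting problem the paper addresses; no fitting procedure is reproduced here.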
This list is automatically generated from the titles and abstracts of the papers on this site.