The Paradox of Motion: Evidence for Spurious Correlations in
Skeleton-based Gait Recognition Models
- URL: http://arxiv.org/abs/2402.08320v1
- Date: Tue, 13 Feb 2024 09:33:12 GMT
- Title: The Paradox of Motion: Evidence for Spurious Correlations in
Skeleton-based Gait Recognition Models
- Authors: Andy Cătrună, Adrian Cosma, Emilian Rădoi
- Abstract summary: This study challenges the prevailing assumption that vision-based gait recognition relies primarily on motion patterns.
We show through a comparative analysis that removing height information leads to notable performance degradation.
We propose a spatial transformer model processing individual poses, disregarding any temporal information, which achieves unreasonably good accuracy.
- Score: 4.089889918897877
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Gait, an unobtrusive biometric, is valued for its capability to identify
individuals at a distance, across external outfits and environmental
conditions. This study challenges the prevailing assumption that vision-based
gait recognition, in particular skeleton-based gait recognition, relies
primarily on motion patterns, revealing a significant role of the implicit
anthropometric information encoded in the walking sequence. We show through a
comparative analysis that removing height information leads to notable
performance degradation across three models and two benchmarks (CASIA-B and
GREW). Furthermore, we propose a spatial transformer model processing
individual poses, disregarding any temporal information, which achieves
unreasonably good accuracy, emphasizing the bias towards appearance information
and indicating spurious correlations in existing benchmarks. These findings
underscore the need for a nuanced understanding of the interplay between motion
and appearance in vision-based gait recognition, prompting a reevaluation of
the methodological assumptions in this field. Our experiments indicate that
"in-the-wild" datasets are less prone to spurious correlations, prompting the
need for more diverse and large scale datasets for advancing the field.
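As a rough illustration of what "removing height information" from skeleton data can mean, the sketch below rescales each 2D pose so that its vertical extent is 1 and centers it on the mean joint position. This is a minimal sketch under stated assumptions, not the authors' procedure: the abstract does not specify how height was removed, and the names `remove_height` and `remove_height_sequence`, the per-frame scaling choice, and the toy three-joint poses are all hypothetical.

```python
from typing import List, Tuple

Joint = Tuple[float, float]   # (x, y) image coordinates of one joint
Pose = List[Joint]            # all joints of one frame


def remove_height(pose: Pose, eps: float = 1e-8) -> Pose:
    """Center a pose on its mean joint and rescale it to unit vertical extent.

    Assumption: "height" is approximated by the vertical span of the joints
    in the frame. Other choices (e.g. a fixed bone length) are possible.
    """
    xs = [x for x, _ in pose]
    ys = [y for _, y in pose]
    cx = sum(xs) / len(xs)
    cy = sum(ys) / len(ys)
    scale = max(max(ys) - min(ys), eps)  # guard against degenerate poses
    return [((x - cx) / scale, (y - cy) / scale) for x, y in pose]


def remove_height_sequence(sequence: List[Pose]) -> List[Pose]:
    """Apply the per-frame normalization to every pose in a walking sequence."""
    return [remove_height(pose) for pose in sequence]


# Two toy "skeletons" with the same proportions but different heights.
tall = [(0.0, 0.0), (0.0, 90.0), (0.0, 180.0)]
short = [(0.0, 0.0), (0.0, 75.0), (0.0, 150.0)]

tall_norm = remove_height(tall)
short_norm = remove_height(short)
```

After this normalization the two toy poses become identical, so a model can no longer separate them by absolute size. Body proportions such as relative limb lengths are still present, which is one reason scale normalization alone may not remove all anthropometric cues.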
Related papers
- Localized Gaussians as Self-Attention Weights for Point Clouds Correspondence [92.07601770031236]
We investigate semantically meaningful patterns in the attention heads of an encoder-only Transformer architecture.
We find that fixing the attention weights not only accelerates the training process but also enhances the stability of the optimization.
arXiv Detail & Related papers (2024-09-20T07:41:47Z)
- GenBench: A Benchmarking Suite for Systematic Evaluation of Genomic Foundation Models [56.63218531256961]
We introduce GenBench, a benchmarking suite specifically tailored for evaluating the efficacy of Genomic Foundation Models.
GenBench offers a modular and expandable framework that encapsulates a variety of state-of-the-art methodologies.
We provide a nuanced analysis of the interplay between model architecture and dataset characteristics on task-specific performance.
arXiv Detail & Related papers (2024-06-01T08:01:05Z)
- Corpus Considerations for Annotator Modeling and Scaling [9.263562546969695]
We show that the commonly used user token model consistently outperforms more complex models.
Our findings shed light on the relationship between corpus statistics and annotator modeling performance.
arXiv Detail & Related papers (2024-04-02T22:27:24Z)
- Cross-Database Liveness Detection: Insights from Comparative Biometric Analysis [20.821562115822182]
Liveness detection is the capability to differentiate between genuine and spoofed biometric samples.
This research presents a comprehensive evaluation of liveness detection models.
Our work offers a blueprint for navigating the evolving rhythms of biometric security.
arXiv Detail & Related papers (2024-01-29T15:32:18Z)
- General Identifiability and Achievability for Causal Representation Learning [33.80247458590611]
The paper establishes identifiability and achievability results using two hard uncoupled interventions per node in the latent causal graph.
For identifiability, the paper establishes that perfect recovery of the latent causal model and variables is guaranteed under uncoupled interventions.
The analysis, additionally, recovers the identifiability result for two hard coupled interventions, that is when metadata about the pair of environments that have the same node intervened is known.
arXiv Detail & Related papers (2023-10-24T01:47:44Z)
- Distillation-guided Representation Learning for Unconstrained Gait Recognition [50.0533243584942]
We propose a framework, termed GAit DEtection and Recognition (GADER), for human authentication in challenging outdoor scenarios.
GADER builds discriminative features through a novel gait recognition method, where only frames containing gait information are used.
We evaluate our method on multiple state-of-the-art (SoTA) gait baselines and demonstrate consistent improvements on indoor and outdoor datasets.
arXiv Detail & Related papers (2023-07-27T01:53:57Z)
- Robust Saliency-Aware Distillation for Few-shot Fine-grained Visual Recognition [57.08108545219043]
Recognizing novel sub-categories with scarce samples is an essential and challenging research topic in computer vision.
Existing literature addresses this challenge by employing local-based representation approaches.
This article proposes a novel model, Robust Saliency-aware Distillation (RSaD), for few-shot fine-grained visual recognition.
arXiv Detail & Related papers (2023-05-12T00:13:17Z)
- Temporal Relevance Analysis for Video Action Models [70.39411261685963]
We first propose a new approach to quantify the temporal relationships between frames captured by CNN-based action models.
We then conduct comprehensive experiments and in-depth analysis to provide a better understanding of how temporal modeling is affected.
arXiv Detail & Related papers (2022-04-25T19:06:48Z)
- An Enhanced Adversarial Network with Combined Latent Features for Spatio-Temporal Facial Affect Estimation in the Wild [1.3007851628964147]
This paper proposes a novel model that efficiently extracts both spatial and temporal features of the data by means of its enhanced temporal modelling based on latent features.
Our proposed model consists of three major networks, coined Generator, Discriminator, and Combiner, which are trained in an adversarial setting combined with curriculum learning to enable our adaptive attention modules.
arXiv Detail & Related papers (2021-02-18T04:10:12Z)
- A Variational Information Bottleneck Approach to Multi-Omics Data Integration [98.6475134630792]
We propose a deep variational information bottleneck (IB) approach for incomplete multi-view observations.
Our method applies the IB framework on marginal and joint representations of the observed views to focus on intra-view and inter-view interactions that are relevant for the target.
Experiments on real-world datasets show that our method consistently achieves gain from data integration and outperforms state-of-the-art benchmarks.
arXiv Detail & Related papers (2021-02-05T06:05:39Z)
- View-Invariant Gait Recognition with Attentive Recurrent Learning of Partial Representations [27.33579145744285]
We propose a network that first learns to extract gait convolutional energy maps (GCEM) from frame-level convolutional features.
It then adopts a bidirectional neural network to learn from split bins of the GCEM, thus exploiting the relations between learned partial recurrent representations.
Our proposed model has been extensively tested on two large-scale gait datasets, CASIA-B and OU-M.
arXiv Detail & Related papers (2020-10-18T20:20:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.