Related papers: BigGait: Learning Gait Representation You Want by Large Vision Models

URL: http://arxiv.org/abs/2402.19122v2
Date: Fri, 22 Mar 2024 07:03:54 GMT
Title: BigGait: Learning Gait Representation You Want by Large Vision Models
Authors: Dingqiang Ye, Chao Fan, Jingzhe Ma, Xiaoming Liu, Shiqi Yu
Abstract summary: Existing gait recognition methods rely on task-specific upstream driven by supervised learning to provide explicit gait representations. Escaping from this trend, this work proposes a simple yet efficient gait framework, termed BigGait. BigGait transforms all-purpose knowledge into implicit gait representations without requiring third-party supervision signals.
Score: 12.620774996969535
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Gait recognition stands as one of the most pivotal remote identification technologies and progressively expands across research and industry communities. However, existing gait recognition methods heavily rely on task-specific upstream driven by supervised learning to provide explicit gait representations like silhouette sequences, which inevitably introduce expensive annotation costs and potential error accumulation. Escaping from this trend, this work explores effective gait representations based on the all-purpose knowledge produced by task-agnostic Large Vision Models (LVMs) and proposes a simple yet efficient gait framework, termed BigGait. Specifically, the Gait Representation Extractor (GRE) within BigGait draws upon design principles from established gait representations, effectively transforming all-purpose knowledge into implicit gait representations without requiring third-party supervision signals. Experiments on CCPG, CASIA-B* and SUSTech1K indicate that BigGait significantly outperforms the previous methods in both within-domain and cross-domain tasks in most cases, and provides a more practical paradigm for learning the next-generation gait representation. Finally, we delve into prospective challenges and promising directions in LVMs-based gait recognition, aiming to inspire future work in this emerging topic. The source code is available at https://github.com/ShiqiYu/OpenGait.

Related papers

GaitAdapt: Continual Learning for Evolving Gait Recognition [8.11771678547237]
We present a continual gait recognition task, GaitAdapt, which supports the progressive enhancement of gait recognition capabilities over time. We also propose GaitAdapter, a non-replay continual learning approach for gait recognition. GaitAdapter effectively retains gait knowledge acquired from diverse tasks, exhibiting markedly superior discriminative capability compared to alternative methods.
arXiv Detail & Related papers (2025-08-05T12:26:52Z)
Learning Efficient and Generalizable Graph Retriever for Knowledge-Graph Question Answering [75.12322966980003]
Large Language Models (LLMs) have shown strong inductive reasoning ability across various domains. Most existing RAG pipelines rely on unstructured text, limiting interpretability and structured reasoning. Recent studies have explored integrating knowledge graphs with LLMs for knowledge graph question answering. We propose RAPL, a novel framework for efficient and effective graph retrieval in KGQA.
arXiv Detail & Related papers (2025-06-11T12:03:52Z)
BiggerGait: Unlocking Gait Recognition with Layer-wise Representations from Large Vision Models [16.21103558769559]
This work investigates the impact of layer-wise representations on downstream recognition tasks. We propose a simple and universal baseline for LVM-based gait recognition, termed BiggerGait. Comprehensive evaluations on CCPG, CASIA-B*, SUSTech1K, and CCGR_MINI validate the superiority of BiggerGait across both within- and cross-domain tasks.
arXiv Detail & Related papers (2025-05-23T17:41:54Z)
Retrieval Augmented Generation and Understanding in Vision: A Survey and New Outlook [85.43403500874889]
Retrieval-augmented generation (RAG) has emerged as a pivotal technique in artificial intelligence (AI). The survey also reviews recent advancements in RAG for embodied AI, with a particular focus on applications in planning, task execution, multimodal perception, interaction, and specialized domains.
arXiv Detail & Related papers (2025-03-23T10:33:28Z)
ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents [27.90338725230132]
ViDoSeek is a dataset designed to evaluate RAG performance on visually rich documents requiring complex reasoning. We propose ViDoRAG, a novel multi-agent RAG framework tailored for complex reasoning across visual documents. Notably, ViDoRAG outperforms existing methods by over 10% on the competitive ViDoSeek benchmark.
arXiv Detail & Related papers (2025-02-25T09:26:12Z)
Embodied-RAG: General Non-parametric Embodied Memory for Retrieval and Generation [65.23793829741014]
Embodied-RAG is a framework that enhances the model of an embodied agent with a non-parametric memory system. At its core, Embodied-RAG's memory is structured as a semantic forest, storing language descriptions at varying levels of detail. We demonstrate that Embodied-RAG effectively bridges RAG to the robotics domain, successfully handling over 200 explanation and navigation queries.
arXiv Detail & Related papers (2024-09-26T21:44:11Z)
From Linguistic Giants to Sensory Maestros: A Survey on Cross-Modal Reasoning with Large Language Models [56.9134620424985]
Cross-modal reasoning (CMR) is increasingly recognized as a crucial capability in the progression toward more sophisticated artificial intelligence systems. The recent trend of deploying Large Language Models (LLMs) to tackle CMR tasks has marked a new mainstream of approaches for enhancing their effectiveness. This survey offers a nuanced exposition of current methodologies applied in CMR using LLMs, classifying these into a detailed three-tiered taxonomy.
arXiv Detail & Related papers (2024-09-19T02:51:54Z)
Disentangled Generative Graph Representation Learning [51.59824683232925]
This paper introduces DiGGR (Disentangled Generative Graph Representation Learning), a self-supervised learning framework. It aims to learn latent disentangled factors and utilize them to guide graph mask modeling. Experiments on 11 public datasets for two different graph learning tasks demonstrate that DiGGR consistently outperforms many previous self-supervised methods.
arXiv Detail & Related papers (2024-08-24T05:13:02Z)
OpenGait: A Comprehensive Benchmark Study for Gait Recognition towards Better Practicality [11.64292241875791]
We first develop OpenGait, a flexible and efficient gait recognition platform. Using OpenGait as a foundation, we conduct in-depth ablation experiments to revisit recent developments in gait recognition. Inspired by these findings, we develop three structurally simple yet empirically powerful and practically robust baseline models.
arXiv Detail & Related papers (2024-05-15T07:11:12Z)
Exploring Deep Models for Practical Gait Recognition [11.185716724976414]
We present a unified perspective to explore how to construct deep models for state-of-the-art outdoor gait recognition. Specifically, we challenge the stereotype of shallow gait models and demonstrate the superiority of explicit temporal modeling. The proposed CNN-based DeepGaitV2 series and Transformer-based SwinGait series exhibit significant performance improvements on Gait3D and GREW.
arXiv Detail & Related papers (2023-03-06T17:19:28Z)
Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems [61.11799513362704]
We propose learning an additional screening mechanism to identify discriminative clues commonly seen across instances and classes. We show that a common rationale detector can be learned by simply exploiting the GradCAM induced from the SSL objective.
arXiv Detail & Related papers (2023-03-03T02:07:40Z)
Learning Gait Representation from Massive Unlabelled Walking Videos: A Benchmark [11.948554539954673]
This paper proposes a large-scale self-supervised benchmark for gait recognition with contrastive learning. We collect a large-scale unlabelled gait dataset GaitLU-1M consisting of 1.02M walking sequences. We evaluate the pre-trained model on four widely-used gait benchmarks, CASIA-B, OU-M, GREW and Gait3D with or without transfer learning.
arXiv Detail & Related papers (2022-06-28T12:33:42Z)
Gait Recognition in the Wild: A Large-scale Benchmark and NAS-based Baseline [95.88825497452716]
Gait benchmarks empower the research community to train and evaluate high-performance gait recognition systems. GREW is the first large-scale dataset for gait recognition in the wild. SPOSGait is the first NAS-based gait recognition model.
arXiv Detail & Related papers (2022-05-05T14:57:39Z)
HEATGait: Hop-Extracted Adjacency Technique in Graph Convolution based Gait Recognition [0.0]
HEATGait is a gait recognition system that improves the existing multi-scale convolution graph by efficient hop-extraction technique to alleviate the issue. We propose a powerful feature extractor that utilizes ResGCN to achieve state-of-the-art performance in model-based gait recognition on the CASIA-B gait dataset.
arXiv Detail & Related papers (2022-04-21T16:13:58Z)
Reinforcement Learning with Prototypical Representations [114.35801511501639]
Proto-RL is a self-supervised framework that ties representation learning with exploration through prototypical representations. These prototypes simultaneously serve as a summarization of the exploratory experience of an agent as well as a basis for representing observations. This enables state-of-the-art downstream policy learning on a set of difficult continuous control tasks.
arXiv Detail & Related papers (2021-02-22T18:56:34Z)

This list is automatically generated from the titles and abstracts of the papers on this site.