Fugu-MT Paper Translation (Abstract): Language-Guided and Motion-Aware Gait Representation for Generalizable Recognition

Paper overview: Language-Guided and Motion-Aware Gait Representation for Generalizable Recognition

arxiv url: http://arxiv.org/abs/2601.11931v2
Date: Fri, 23 Jan 2026 11:54:11 GMT
Status: Translation complete
System update date: 2026-01-26 14:27:27.287136
Title: Language-Guided and Motion-Aware Gait Representation for Generalizable Recognition
Title (reference translation): Language-Guided and Motion-Aware Gait Representation for Generalizable Recognition
Authors: Zhengxian Wu, Chuanrui Zhang, Shenao Jiang, Hangrui Xu, Zirui Liao, Luyuan Zhang, Huaqiu Li, Peng Jiao, Haoqian Wang
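Abstract summary: This paper proposes LMGait, a language-guided gait recognition framework. In particular, it uses designed gait-related language cues to capture key motion features in gait sequences. Extensive experiments across multiple datasets demonstrate the advantages of the proposed network.
Reference score (independently computed attention score): 21.772052273755808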
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Gait recognition is emerging as a promising technology and an innovative field within computer vision, with a wide range of applications in remote human identification. However, existing methods typically rely on complex architectures to directly extract features from images and apply pooling operations to obtain sequence-level representations. Such designs often lead to overfitting on static noise (e.g., clothing), while failing to effectively capture dynamic motion regions, such as the arms and legs. This bottleneck is particularly challenging in the presence of intra-class variation, where gait features of the same individual under different environmental conditions are significantly distant in the feature space. To address the above challenges, we present a Language-guided and Motion-aware gait recognition framework, named LMGait. To the best of our knowledge, LMGait is the first method to introduce natural language descriptions as explicit semantic priors into the gait recognition task. In particular, we utilize designed gait-related language cues to capture key motion features in gait sequences. To improve cross-modal alignment, we propose the Motion Awareness Module (MAM), which refines the language features by adaptively adjusting various levels of semantic information to ensure better alignment with the visual representations. Furthermore, we introduce the Motion Temporal Capture Module (MTCM) to enhance the discriminative capability of gait features and improve the model's motion tracking ability. We conducted extensive experiments across multiple datasets, and the results demonstrate the significant advantages of our proposed network. Specifically, our model achieved accuracies of 88.5%, 97.1%, and 97.5% on the CCPG, SUSTech1K, and CASIA-B datasets, respectively, achieving state-of-the-art performance. Homepage: https://dingwu1021.github.io/LMGait/
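Abstract (reference translation): Gait recognition has emerged as a promising technology and an innovative field within computer vision, with broad applications in remote human identification. However, existing methods usually rely on complex architectures that extract features directly from images and apply pooling operations to obtain sequence-level representations. Such designs often overfit to static noise (e.g., clothing) while failing to effectively capture dynamic motion regions such as the arms and legs. This bottleneck is especially difficult in the presence of intra-class variation, where the gait features of the same individual under different environmental conditions lie far apart in the feature space. To address these challenges, we propose LMGait, a language-guided and motion-aware gait recognition framework. To the best of our knowledge, LMGait is the first method to introduce natural language descriptions as explicit semantic priors into the gait recognition task. In particular, we use designed gait-related language cues to capture key motion features in gait sequences. To improve cross-modal alignment, we propose the Motion Awareness Module (MAM), which refines the language features by adaptively adjusting various levels of semantic information to improve alignment with the visual representations. Furthermore, we introduce the Motion Temporal Capture Module (MTCM) to enhance the discriminative capability of gait features and improve the model's motion tracking ability. We conducted extensive experiments across multiple datasets and demonstrated the advantages of the proposed network. Specifically, the model achieved accuracies of 88.5%, 97.1%, and 97.5% on the CCPG, SUSTech1K, and CASIA-B datasets, respectively. Homepage: https://dingwu1021.github.io/LMGait/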

Related papers