Fugu-MT Paper Translation (Abstract): Mesh-Gait: A Unified Framework for Gait Recognition Through Multi-Modal Representation Learning from 2D Silhouettes

Paper overview: Mesh-Gait: A Unified Framework for Gait Recognition Through Multi-Modal Representation Learning from 2D Silhouettes

arxiv url: http://arxiv.org/abs/2510.10406v1
Date: Sun, 12 Oct 2025 01:49:05 GMT
Status: Translation complete
Last updated in system: 2025-10-14 18:06:29.923465
Title: Mesh-Gait: A Unified Framework for Gait Recognition Through Multi-Modal Representation Learning from 2D Silhouettes
Title (reference translation): Mesh-Gait: A Unified Framework for Gait Recognition via Multi-Modal Representation Learning from 2D Silhouettes
Authors: Zhao-Yang Wang, Jieneng Chen, Jiang Liu, Yuxiang Guo, Rama Chellappa
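Abstract summary: We introduce Mesh-Gait, a novel end-to-end gait recognition framework. It reconstructs 3D representations directly from 2D silhouettes. Mesh-Gait achieves state-of-the-art accuracy.
Reference score (attention score computed independently by this site): 36.964703204465664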
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Gait recognition, a fundamental biometric technology, leverages unique walking patterns for individual identification, typically using 2D representations such as silhouettes or skeletons. However, these methods often struggle with viewpoint variations, occlusions, and noise. Multi-modal approaches that incorporate 3D body shape information offer improved robustness but are computationally expensive, limiting their feasibility for real-time applications. To address these challenges, we introduce Mesh-Gait, a novel end-to-end multi-modal gait recognition framework that directly reconstructs 3D representations from 2D silhouettes, effectively combining the strengths of both modalities. Compared to existing methods, directly learning 3D features from 3D joints or meshes is complex and difficult to fuse with silhouette-based gait features. To overcome this, Mesh-Gait reconstructs 3D heatmaps as an intermediate representation, enabling the model to effectively capture 3D geometric information while maintaining simplicity and computational efficiency. During training, the intermediate 3D heatmaps are gradually reconstructed and become increasingly accurate under supervised learning, where the loss is calculated between the reconstructed 3D joints, virtual markers, and 3D meshes and their corresponding ground truth, ensuring precise spatial alignment and consistent 3D structure. Mesh-Gait extracts discriminative features from both silhouettes and reconstructed 3D heatmaps in a computationally efficient manner. This design enables the model to capture spatial and structural gait characteristics while avoiding the heavy overhead of direct 3D reconstruction from RGB videos, allowing the network to focus on motion dynamics rather than irrelevant visual details. Extensive experiments demonstrate that Mesh-Gait achieves state-of-the-art accuracy. The code will be released upon acceptance of the paper.
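Abstract (reference translation): Gait recognition is a fundamental biometric technology that uses unique walking patterns to identify individuals, typically relying on 2D representations such as silhouettes or skeletons. However, these methods often struggle with viewpoint variations, occlusions, and noise. Multi-modal approaches that incorporate 3D body shape information offer improved robustness, but they are computationally expensive, which limits their feasibility for real-time applications. To address these challenges, we introduce Mesh-Gait, a novel end-to-end multi-modal gait recognition framework that reconstructs 3D representations directly from 2D silhouettes and effectively combines the strengths of both modalities. In existing methods, learning 3D features directly from 3D joints or meshes is complex, and such features are difficult to fuse with silhouette-based gait features. To overcome this, Mesh-Gait reconstructs 3D heatmaps as an intermediate representation, which lets the model capture 3D geometric information effectively while remaining simple and computationally efficient. During training, the intermediate 3D heatmaps are gradually reconstructed and become increasingly accurate under supervised learning: the loss is computed between the reconstructed 3D joints, virtual markers, and 3D meshes and their corresponding ground truth, ensuring precise spatial alignment and a consistent 3D structure. Mesh-Gait extracts discriminative features from both the silhouettes and the reconstructed 3D heatmaps in a computationally efficient manner. This design captures spatial and structural gait characteristics while avoiding the heavy overhead of direct 3D reconstruction from RGB videos, allowing the network to focus on motion dynamics rather than irrelevant visual details. Extensive experiments show that Mesh-Gait achieves state-of-the-art accuracy. The code will be released upon acceptance of the paper.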
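
Illustrative sketch (not the authors' code): the abstract says the loss is computed between the reconstructed 3D joints, virtual markers, and 3D meshes and their ground truth, with 3D heatmaps as the intermediate representation. The NumPy snippet below shows one way such supervision could look. Everything specific in it is an assumption made for illustration: the soft-argmax decoding of a heatmap into a coordinate, the mean-absolute-error terms, the equal default weights, and the names `soft_argmax_3d` and `reconstruction_loss`. The abstract does not state how heatmaps are decoded or how the loss terms are weighted.

```python
import numpy as np


def soft_argmax_3d(heatmap):
    """Return the expected (x, y, z) voxel coordinate of one 3D heatmap.

    `heatmap` has shape (D, H, W). Soft-argmax is an assumed decoding step,
    chosen here because it is differentiable; the paper may use another.
    """
    d, h, w = heatmap.shape
    flat = heatmap.reshape(-1)
    # Softmax over all voxels; subtracting the max keeps exp() stable.
    probs = np.exp(flat - flat.max())
    probs /= probs.sum()
    probs = probs.reshape(d, h, w)
    # Coordinate grids matching the (D, H, W) axis order.
    zs, ys, xs = np.meshgrid(
        np.arange(d), np.arange(h), np.arange(w), indexing="ij"
    )
    return np.array(
        [(probs * xs).sum(), (probs * ys).sum(), (probs * zs).sum()]
    )


def reconstruction_loss(pred, target, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of mean absolute errors for the three supervised outputs.

    `pred` and `target` are dicts with keys "joints", "markers", and "mesh",
    each an array of shape (N, 3). The weights are hypothetical.
    """
    keys = ("joints", "markers", "mesh")
    return sum(
        wt * np.abs(pred[k] - target[k]).mean() for wt, k in zip(weights, keys)
    )
```

A sharply peaked heatmap decodes to approximately the peak's voxel coordinate, and the loss is zero when every reconstructed output matches its ground truth.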


Related papers