Fugu-MT Paper Translation (Abstract): Leveraging RGB Images for Pre-Training of Event-Based Hand Pose Estimation

Paper overview: Leveraging RGB Images for Pre-Training of Event-Based Hand Pose Estimation

arxiv url: http://arxiv.org/abs/2509.16949v1
Date: Sun, 21 Sep 2025 07:07:49 GMT
Status: Translation complete
Last updated in system: 2025-09-23 18:58:16.049432
Title: Leveraging RGB Images for Pre-Training of Event-Based Hand Pose Estimation
Title (reference translation): Leveraging RGB Images for Pre-Training of Event-Based Hand Pose Estimation
Authors: Ruicong Liu, Takehiko Ohkawa, Tze Ho Elden Tse, Mingfang Zhang, Angela Yao, Yoichi Sato
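Abstract summary: RPEP is the first pre-training method for event-based 3D hand pose estimation using labeled RGB images and unlabeled event data. It achieves up to 24% improvement on EvRealHands, significantly outperforming state-of-the-art methods on real event data.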
Reference score (independently calculated attention score): 64.8814078041756
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: This paper presents RPEP, the first pre-training method for event-based 3D hand pose estimation using labeled RGB images and unpaired, unlabeled event data. Event data offer significant benefits such as high temporal resolution and low latency, but their application to hand pose estimation is still limited by the scarcity of labeled training data. To address this, we repurpose real RGB datasets to train event-based estimators. This is done by constructing pseudo-event-RGB pairs, where event data is generated and aligned with the ground-truth poses of RGB images. Unfortunately, existing pseudo-event generation techniques assume stationary objects, thus struggling to handle non-stationary, dynamically moving hands. To overcome this, RPEP introduces a novel generation strategy that decomposes hand movements into smaller, step-by-step motions. This decomposition allows our method to capture temporal changes in articulation, constructing more realistic event data for a moving hand. Additionally, RPEP imposes a motion reversal constraint, regularizing event generation using reversed motion. Extensive experiments show that our pre-trained model significantly outperforms state-of-the-art methods on real event data, achieving up to 24% improvement on EvRealHands. Moreover, it delivers strong performance with minimal labeled samples for fine-tuning, making it well-suited for practical deployment.
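Abstract (reference translation): This paper proposes RPEP, the first pre-training method for event-based 3D hand pose estimation using labeled RGB images and unpaired, unlabeled event data. Event data offer major benefits such as high temporal resolution and low latency, but their application to hand pose estimation remains limited by the scarcity of labeled training data. To address this, real RGB datasets are repurposed to train event-based estimators. This is achieved by constructing pseudo-event-RGB pairs, in which event data are generated and aligned with the ground-truth poses of the RGB images. Unfortunately, existing pseudo-event generation techniques assume stationary objects and therefore struggle with non-stationary, dynamically moving hands. To overcome this, RPEP introduces a new generation strategy that decomposes hand movements into smaller, step-by-step motions. This decomposition captures temporal changes in articulation, yielding more realistic event data for a moving hand. In addition, RPEP imposes a motion reversal constraint, which regularizes event generation using reversed motion. Extensive experiments show that the pre-trained model significantly outperforms state-of-the-art methods on real event data, achieving up to 24% improvement on EvRealHands. It also delivers strong performance when fine-tuned with minimal labeled samples, making it well suited for practical deployment.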

Related papers