Fugu-MT Paper Translation (Abstract): Elevating Skeleton-Based Action Recognition with Efficient Multi-Modality Self-Supervision

Paper abstract: Elevating Skeleton-Based Action Recognition with Efficient Multi-Modality Self-Supervision

arxiv url: http://arxiv.org/abs/2309.12009v2
Date: Thu, 11 Jan 2024 02:44:27 GMT
Status: Translation complete
Last updated in system: 2024-01-13 03:32:02.824432
Title: Elevating Skeleton-Based Action Recognition with Efficient Multi-Modality Self-Supervision
Title (reference translation): Skeleton-Based Action Recognition via Efficient Multi-Modality Self-Supervision
Authors: Yiping Wei, Kunyu Peng, Alina Roitberg, Jiaming Zhang, Junwei Zheng, Ruiping Liu, Yufan Chen, Kailun Yang, Rainer Stiefelhagen
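Abstract summary: Self-supervised representation learning for human action recognition has developed rapidly in recent years. Most existing works are based on skeleton data and use a multi-modality setup. This paper first proposes an Implicit Knowledge Exchange Module that alleviates the propagation of erroneous knowledge between low-performance modalities.
Reference score (independently computed attention score): 40.16465314639641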
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Self-supervised representation learning for human action recognition has developed rapidly in recent years. Most of the existing works are based on skeleton data while using a multi-modality setup. These works overlooked the differences in performance among modalities, which led to the propagation of erroneous knowledge between modalities while only three fundamental modalities, i.e., joints, bones, and motions are used, hence no additional modalities are explored. In this work, we first propose an Implicit Knowledge Exchange Module (IKEM) which alleviates the propagation of erroneous knowledge between low-performance modalities. Then, we further propose three new modalities to enrich the complementary information between modalities. Finally, to maintain efficiency when introducing new modalities, we propose a novel teacher-student framework to distill the knowledge from the secondary modalities into the mandatory modalities considering the relationship constrained by anchors, positives, and negatives, named relational cross-modality knowledge distillation. The experimental results demonstrate the effectiveness of our approach, unlocking the efficient use of skeleton-based multi-modality data. Source code will be made publicly available at https://github.com/desehuileng0o0/IKEM.
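Abstract (reference translation): Self-supervised representation learning for human action recognition has developed rapidly in recent years. Most existing works are based on skeleton data and use a multi-modality setup. These works overlooked the performance differences among modalities, which led to the propagation of erroneous knowledge between modalities; moreover, only three fundamental modalities (joints, bones, and motions) are used, and no additional modalities have been explored. In this work, we first propose an Implicit Knowledge Exchange Module (IKEM), which alleviates the propagation of erroneous knowledge between low-performance modalities. We then propose three new modalities to enrich the complementary information between modalities. Finally, to maintain efficiency when introducing new modalities, we propose a novel teacher-student framework, named relational cross-modality knowledge distillation, that distills knowledge from the secondary modalities into the mandatory modalities while considering the relationships constrained by anchors, positives, and negatives. The experimental results demonstrate the effectiveness of the approach and unlock the efficient use of skeleton-based multi-modality data. Source code will be made publicly available at https://github.com/desehuileng0o0/IKEM.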
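
The abstract names joints, bones, and motions as the three fundamental skeleton modalities but does not define them. The sketch below follows a convention that is common in skeleton-based action recognition and is an assumption here, not the authors' stated definition: a bone is a joint's position minus its parent joint's position, and motion is a joint's position in the next frame minus its position in the current frame. The function names `bones` and `motions`, the `parents` list, and the toy three-joint skeleton are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Joint = Tuple[float, float, float]   # (x, y, z) coordinate of one joint
Frame = List[Joint]                  # all joints at one time step
Sequence = List[Frame]               # all frames of one action clip


def bones(seq: Sequence, parents: List[int]) -> Sequence:
    """Bone modality (assumed convention): joint minus its parent joint.

    parents[j] is the index of joint j's parent; a root joint that lists
    itself as parent yields a zero vector.
    """
    return [
        [
            tuple(c - p for c, p in zip(frame[j], frame[parents[j]]))
            for j in range(len(frame))
        ]
        for frame in seq
    ]


def motions(seq: Sequence) -> Sequence:
    """Motion modality (assumed convention): next frame minus current frame.

    The last frame has no successor, so it is filled with zeros to keep
    the output the same length as the input.
    """
    out: Sequence = []
    for t, frame in enumerate(seq):
        if t + 1 < len(seq):
            nxt = seq[t + 1]
            out.append([
                tuple(n - c for n, c in zip(nxt[j], frame[j]))
                for j in range(len(frame))
            ])
        else:
            out.append([(0.0, 0.0, 0.0) for _ in frame])
    return out


# Hypothetical toy skeleton: 3 joints in a chain (0 -> 1 -> 2), 2 frames.
toy_parents = [0, 0, 1]
toy_seq: Sequence = [
    [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 2.0, 0.0)],
    [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 3.0, 0.0)],
]
toy_bones = bones(toy_seq, toy_parents)
toy_motions = motions(toy_seq)
```

Both derived modalities keep the shape of the joint sequence, so under this assumed convention the same encoder architecture can consume any of the three streams.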

Related papers

Harmony: A Unified Framework for Modality Incremental Learning [81.13765007314781]
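This paper investigates the feasibility of a unified model capable of incremental learning across continuously evolving modality sequences. To achieve adaptive alignment and knowledge retention, we propose a novel framework named Harmony. The proposed method introduces adaptive feature modulation and cumulative modal bridging.
Paper reference translation (metadata) (2025-04-17T06:35:01Z)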
Knowledge Distillation for Multimodal Egocentric Action Recognition Robust to Missing Modalities [43.15852057358654]
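We propose an efficient multimodal knowledge distillation approach for egocentric action recognition. The method focuses on resource efficiency by leveraging pre-trained models as unimodal feature extractors in the teacher model.
Paper reference translation (metadata) (2025-04-11T14:30:42Z)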
ReconBoost: Boosting Can Achieve Modality Reconcilement [89.4377895465204]
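We study a modality-alternating learning paradigm to achieve reconcilement. We propose a new method called ReconBoost, which updates a fixed modality each time. The proposed method resembles Friedman's Gradient-Boosting (GB) algorithm, and we show that the updated learner can correct the errors made by the others.
Paper reference translation (metadata) (2024-05-15T13:22:39Z)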
Unified Multi-modal Unsupervised Representation Learning for Skeleton-based Action Understanding [62.70450216120704]
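Unsupervised pre-training has achieved great success in skeleton-based action understanding. We propose a unified multi-modal unsupervised representation learning framework called UmURL. UmURL exploits an efficient early-fusion strategy to jointly encode multi-modal features in a single stream.
Paper reference translation (metadata) (2023-11-06T13:56:57Z)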
Learning Unseen Modality Interaction [54.23533023883659]
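Multimodal learning assumes that all modality combinations are available during training in order to learn cross-modal correspondences. We pose the problem of unseen modality interaction and introduce a first solution. It exploits a module that projects the multidimensional features of different modalities into a common space with rich information preserved.
Paper reference translation (metadata) (2023-06-22T10:53:10Z)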
CMD: Self-supervised 3D Action Representation Learning with Cross-modal Mutual Distillation [130.08432609780374]
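In 3D action recognition, rich complementary information exists between skeleton modalities. This paper proposes the Cross-modal Mutual Distillation (CMD) framework. The proposed approach outperforms existing self-supervised methods and sets a series of new records.
Paper reference translation (metadata) (2022-08-26T06:06:09Z)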
Contrastive Learning with Cross-Modal Knowledge Mining for Multimodal Human Activity Recognition [1.869225486385596]
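We explore the hypothesis that leveraging multiple modalities leads to better recognition. We extend a number of recent contrastive self-supervised approaches to the task of human activity recognition. We propose a flexible, general-purpose framework for multimodal self-supervised learning.
Paper reference translation (metadata) (2022-05-20T10:39:16Z)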
On Modality Bias Recognition and Reduction [70.69194431713825]
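We study the modality bias problem in the context of multi-modal classification. This paper proposes a plug-and-play loss function method that adaptively learns the feature space for each label. The method achieves remarkable performance improvements over the baselines.
Paper reference translation (metadata) (2022-02-25T13:47:09Z)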

The related papers list is generated automatically from the titles and abstracts of papers on this site.

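This page shows information for the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from its use (including all information and translations).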