Fugu-MT paper translation (abstract): SportSkills: Physical Skill Learning from Sports Instructional Videos

Paper abstract: SportSkills: Physical Skill Learning from Sports Instructional Videos

arxiv url: http://arxiv.org/abs/2603.25163v1
Date: Thu, 26 Mar 2026 08:29:35 GMT
Status: Translation complete
Last updated in system: 2026-03-27 20:52:48.182831
Title: SportSkills: Physical Skill Learning from Sports Instructional Videos
Title (reference translation): SportSkills: Learning Physical Skills from Sports Instructional Videos
Authors: Kumar Ashutosh, Chi Hsuan Wu, Kristen Grauman
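Abstract summary: SportSkills is the first large-scale sports dataset geared towards physical skill learning with in-the-wild video. It unlocks the ability to understand fine-grained differences between physical actions. The paper introduces the first large-scale task formulation of mistake-conditioned instructional video retrieval.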
Reference score (independently computed attention score): 51.16409727318035
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Current large-scale video datasets focus on general human activity, but lack depth of coverage on fine-grained activities needed to address physical skill learning. We introduce SportSkills, the first large-scale sports dataset geared towards physical skill learning with in-the-wild video. SportSkills has more than 360k instructional videos containing more than 630k visual demonstrations paired with instructional narrations explaining the know-how behind the actions from 55 varied sports. Through a suite of experiments, we show that SportSkills unlocks the ability to understand fine-grained differences between physical actions. Our representation achieves gains of up to 4x with the same model trained on traditional activity-centric datasets. Crucially, building on SportSkills, we introduce the first large-scale task formulation of mistake-conditioned instructional video retrieval, bridging representation learning and actionable feedback generation (e.g., "here's my execution of a skill; which video clip should I watch to improve it?"). Formal evaluations by professional coaches show our retrieval approach significantly advances the ability of video models to personalize visual instructions for a user query.
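Abstract (reference translation): Current large-scale video datasets focus on general human activity, but lack the depth of coverage of fine-grained activities needed to address physical skill learning. We introduce SportSkills, the first large-scale sports dataset geared towards physical skill learning with in-the-wild video. SportSkills has more than 360k instructional videos containing more than 630k visual demonstrations, paired with instructional narrations that explain the know-how behind the actions, from 55 varied sports. Through a suite of experiments, we show that SportSkills unlocks the ability to understand fine-grained differences between physical actions. Our representation achieves gains of up to 4x with the same model trained on traditional activity-centric datasets. Crucially, building on SportSkills, we introduce the first large-scale task formulation of mistake-conditioned instructional video retrieval, which bridges representation learning and actionable feedback generation (e.g., "here's my execution of a skill; which video clip should I watch to improve it?"). Formal evaluations by professional coaches show that our retrieval approach significantly advances the ability of video models to personalize visual instructions for a user query.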

List of related papers