Fugu-MT Paper Translation (Abstract): Robot Learning from Human Videos: A Survey

Paper overview: Robot Learning from Human Videos: A Survey

arxiv url: http://arxiv.org/abs/2604.27621v1
Date: Thu, 30 Apr 2026 09:11:25 GMT
Status: translation complete
Last updated in system: 2026-05-01 16:31:54.016263
Title: Robot Learning from Human Videos: A Survey
Authors: Junyi Ma, Erhang Zhang, Haoran Yang, Ditao Li, Chenyang Xu, Guangming Wang, Hesheng Wang
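Abstract summary: A critical bottleneck hindering further progress in embodied AI and robotics is the scaling of robot data. In recent years, the field of learning robot manipulation skills from human video data has rapidly attracted attention. This paper presents a comprehensive review of human-video-based learning techniques in robotics.
Reference score (independently computed attention measure): 30.494143344658227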
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: A critical bottleneck hindering further advancement in embodied AI and robotics is the challenge of scaling robot data. To address this, the field of learning robot manipulation skills from human video data has attracted rapidly growing attention in recent years, driven by the abundance of human activity videos and advances in computer vision. This line of research promises to enable robots to acquire skills passively from the vast and readily available resource of human demonstrations, substantially favoring scalable learning for generalist robotic systems. Therefore, we present this survey to provide a comprehensive and up-to-date review of human-video-based learning techniques in robotics, focusing on both human-robot skill transfer and data foundations. We first review the policy learning foundations in robotics, and then describe the fundamental interfaces to incorporate human videos. Subsequently, we introduce a hierarchical taxonomy of transferring human videos to robot skills, covering task-, observation-, and action-oriented pathways, along with a cross-family analysis of their couplings with different data configurations and learning paradigms. In addition, we investigate the data foundations including widely-used human video datasets and video generation schemes, and provide large-scale statistical trends in dataset development and utilization. Ultimately, we emphasize the challenges and limitations intrinsic to this field, and delineate potential avenues for future research. The paper list of our survey is available at https://github.com/IRMVLab/awesome-robot-learning-from-human-videos.
