Fugu-MT Paper Translation (Abstract): ManipulationNet: An Infrastructure for Benchmarking Real-World Robot Manipulation with Physical Skill Challenges and Embodied Multimodal Reasoning

Paper overview: ManipulationNet: An Infrastructure for Benchmarking Real-World Robot Manipulation with Physical Skill Challenges and Embodied Multimodal Reasoning

arxiv url: http://arxiv.org/abs/2603.04363v1
Date: Wed, 04 Mar 2026 18:29:28 GMT
Status: Translation complete
Last updated in system: 2026-03-23 08:17:41.905339
Title: ManipulationNet: An Infrastructure for Benchmarking Real-World Robot Manipulation with Physical Skill Challenges and Embodied Multimodal Reasoning
Title (reference translation): ManipulationNet: A Foundation for Benchmarking Real-World Robot Manipulation through Physical Skill Challenges and Multimodal Reasoning
Authors: Yiting Chen, Kenneth Kimble, Edward H. Adelson, Tamim Asfour, Podshara Chanrungmaneekul, Sachin Chitta, Yash Chitambar, Ziyang Chen, Ken Goldberg, Danica Kragic, Hui Li, Xiang Li, Yunzhu Li, Aaron Prather, Nancy Pollard, Maximo A. Roa-Garzon, Robert Seney, Shuo Sha, Shihefeng Wang, Yu Xiang, Kaifeng Zhang, Yuke Zhu, Kaiyu Hang
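Abstract summary: Dexterous manipulation enables robots to purposefully alter the physical world, transforming them from passive observers into active agents in unstructured environments. Despite decades of advances in hardware, perception, control, and learning, progress toward general manipulation systems remains fragmented. The paper introduces ManipulationNet, a global infrastructure that hosts real-world benchmark tasks for robotic manipulation.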
Reference score (independently calculated attention level): 61.35327888597012
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Dexterous manipulation enables robots to purposefully alter the physical world, transforming them from passive observers into active agents in unstructured environments. This capability is the cornerstone of physical artificial intelligence. Despite decades of advances in hardware, perception, control, and learning, progress toward general manipulation systems remains fragmented due to the absence of widely adopted standard benchmarks. The central challenge lies in reconciling the variability of the real world with the reproducibility and authenticity required for rigorous scientific evaluation. To address this, we introduce ManipulationNet, a global infrastructure that hosts real-world benchmark tasks for robotic manipulation. ManipulationNet delivers reproducible task setups through standardized hardware kits, and enables distributed performance evaluation via a unified software client that delivers real-time task instructions and collects benchmarking results. As a persistent and scalable infrastructure, ManipulationNet organizes benchmark tasks into two complementary tracks: 1) the Physical Skills Track, which evaluates low-level physical interaction skills, and 2) the Embodied Reasoning Track, which tests high-level reasoning and multimodal grounding abilities. This design fosters the systematic growth of an interconnected network of real-world abilities and skills, paving the path toward general robotic manipulation. By enabling comparable manipulation research in the real world at scale, this infrastructure establishes a sustainable foundation for measuring long-term scientific progress and identifying capabilities ready for real-world deployment.
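Abstract (reference translation): Dexterous manipulation enables robots to purposefully change the physical world, turning them from passive observers into active agents in unstructured environments. This capability is the foundation of physical artificial intelligence. Despite decades of progress in hardware, perception, control, and learning, progress toward general manipulation systems remains fragmented because widely adopted standard benchmarks are lacking. The central challenge is to reconcile the variability of the real world with the reproducibility and authenticity required for rigorous scientific evaluation. To address this, we introduce ManipulationNet, a global infrastructure that hosts real-world benchmark tasks for robotic manipulation. ManipulationNet provides reproducible task setups through standardized hardware kits and enables distributed performance evaluation through a unified software client that delivers real-time task instructions and collects benchmarking results. As a persistent and scalable infrastructure, ManipulationNet organizes benchmark tasks into two complementary tracks: 1) the Physical Skills Track, which evaluates low-level physical interaction skills, and 2) the Embodied Reasoning Track, which tests high-level reasoning and multimodal grounding abilities. This design promotes the systematic growth of an interconnected network of real-world abilities and skills, paving the way toward general robotic manipulation. By enabling comparable manipulation research in the real world at scale, this infrastructure establishes a sustainable foundation for measuring long-term scientific progress and identifying capabilities ready for real-world deployment.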
