Fugu-MT Paper Translation (Abstract): Building Explicit World Model for Zero-Shot Open-World Object Manipulation

Paper overview: Building Explicit World Model for Zero-Shot Open-World Object Manipulation

arxiv url: http://arxiv.org/abs/2603.13825v1
Date: Sat, 14 Mar 2026 08:13:32 GMT
Status: Translation complete
Last updated in system: 2026-03-21 18:33:56.790768
Title: Building Explicit World Model for Zero-Shot Open-World Object Manipulation
Title (reference translation): Building an Explicit World Model for Zero-Shot Open-World Object Manipulation
Authors: Xiaotong Li, Gang Chen, Javier Alonso-Mora
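Abstract summary: We propose an explicit-world-model-based framework for open-world manipulation. The framework integrates open-set perception, digital-twin reconstruction, and the sampling and evaluation of interaction strategies. The proposed framework can perform multiple open-set manipulation tasks without task-specific action demonstrations.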
Reference score (independently computed attention score): 30.004607772330473
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Open-world object manipulation remains a fundamental challenge in robotics. While Vision-Language-Action (VLA) models have demonstrated promising results, they rely heavily on large-scale robot action demonstrations, which are costly to collect and can hinder out-of-distribution generalization. In this paper, we propose an explicit-world-model-based framework for open-world manipulation that achieves zero-shot generalization by constructing a physically grounded digital twin of the environment. The framework integrates open-set perception, digital-twin reconstruction, and the sampling and evaluation of interaction strategies. By constructing a digital twin of the environment, our approach efficiently explores and evaluates manipulation strategies in a physics-enabled simulator and reliably deploys the chosen strategy to the real world. Experimentally, the proposed framework performs multiple open-set manipulation tasks without any task-specific action demonstrations, demonstrating strong zero-shot generalization at both the task and object levels. Project Page: https://bojack-bj.github.io/projects/thesis/
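Abstract (reference translation): Open-world object manipulation is a fundamental challenge in robotics. Vision-Language-Action (VLA) models have shown promising results, but they rely heavily on large-scale robot action demonstrations. In this paper, we propose an explicit-world-model-based framework for open-world manipulation that builds a physically grounded digital twin of the environment and achieves zero-shot generalization. The framework integrates open-set perception, digital-twin reconstruction, and the sampling and evaluation of interaction strategies. By building a digital twin of the environment, the approach efficiently explores and evaluates manipulation strategies in a physics-enabled simulator and reliably deploys the selected strategy to the real world. The proposed framework can perform multiple open-set manipulation tasks without any task-specific action demonstrations, achieving strong zero-shot generalization at both the task and object levels. Project Page: https://bojack-bj.github.io/projects/thesis/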
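
Illustrative sketch (not from the paper): the abstract describes sampling candidate interaction strategies, evaluating them in a simulated digital twin, and deploying the best one. The Python below is a minimal toy version of that sample-and-evaluate loop under loud assumptions. The names `Strategy`, `sample_strategies`, `simulate_push`, `score`, and `select_best_strategy` are hypothetical, and the "simulator" is a one-line kinematic stand-in rather than a physics engine. It should not be read as the authors' implementation.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Strategy:
    """Hypothetical interaction strategy: a planar push (direction, distance)."""
    angle_rad: float
    distance_m: float


def sample_strategies(n: int, rng: random.Random) -> List[Strategy]:
    """Draw n random push strategies (assumed ranges, for illustration only)."""
    return [
        Strategy(rng.uniform(0.0, 2.0 * math.pi), rng.uniform(0.0, 0.5))
        for _ in range(n)
    ]


def simulate_push(start: Point, strategy: Strategy) -> Point:
    """Toy stand-in for a physics rollout: the object moves exactly along the push."""
    x, y = start
    return (
        x + strategy.distance_m * math.cos(strategy.angle_rad),
        y + strategy.distance_m * math.sin(strategy.angle_rad),
    )


def score(final: Point, goal: Point) -> float:
    """Higher is better: negative Euclidean distance from the goal."""
    return -math.dist(final, goal)


def select_best_strategy(
    start: Point, goal: Point, n_samples: int = 256, seed: int = 0
) -> Tuple[Strategy, float]:
    """Sample strategies, evaluate each in the toy simulator, return the best."""
    rng = random.Random(seed)
    candidates = sample_strategies(n_samples, rng)
    scored = [(score(simulate_push(start, s), goal), s) for s in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


if __name__ == "__main__":
    best, best_score = select_best_strategy((0.0, 0.0), (0.3, 0.0))
    print(best, best_score)
```

In a real system of the kind the abstract outlines, `simulate_push` would be replaced by a rollout in a physics-enabled simulator over the reconstructed scene, and the selected strategy would then be executed on the robot.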

Related papers