Fugu-MT Paper Translation (Abstract): UniBYD: A Unified Framework for Learning Robotic Manipulation Across Embodiments Beyond Imitation of Human Demonstrations

Paper overview: UniBYD: A Unified Framework for Learning Robotic Manipulation Across Embodiments Beyond Imitation of Human Demonstrations

arxiv url: http://arxiv.org/abs/2512.11609v2
Date: Tue, 10 Mar 2026 15:25:46 GMT
Status: Translation complete
Last updated in system: 2026-03-23 08:17:40.303274
Title: UniBYD: A Unified Framework for Learning Robotic Manipulation Across Embodiments Beyond Imitation of Human Demonstrations
Title (reference translation): UniBYD: A unified framework for learning robotic manipulation across embodiments, beyond imitation of human demonstrations
Authors: Tingyu Yuan, Biaoliang Guan, Wen Ye, Ziyan Tian, Yi Yang, Weijie Zhou, Zhaowen Li, Yan Huang, Peng Wang, Chaoyang Zhao, Jinqiao Wang
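Abstract summary: In embodied intelligence, the embodiment gap between robotic and human hands poses significant challenges for learning from human demonstrations. The paper proposes UniBYD, a unified framework that uses a dynamic reinforcement learning algorithm to discover manipulation policies aligned with the robot's physical characteristics. To evaluate UniBYD, it also proposes UniManip, the first benchmark for cross-embodiment manipulation spanning diverse robotic morphologies.
Reference score (independently computed attention score): 35.77665515297785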
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In embodied intelligence, the embodiment gap between robotic and human hands brings significant challenges for learning from human demonstrations. Although some studies have attempted to bridge this gap using reinforcement learning, they remain confined to merely reproducing human manipulation, resulting in limited task performance. Moreover, current methods struggle to support diverse robotic hand configurations. In this paper, we propose UniBYD, a unified framework that uses a dynamic reinforcement learning algorithm to discover manipulation policies aligned with the robot's physical characteristics. To enable consistent modeling across diverse robotic hand morphologies, UniBYD incorporates a unified morphological representation (UMR). Building on UMR, we design a dynamic PPO with an annealed reward schedule, enabling reinforcement learning to transition from offline-informed imitation of human demonstrations to online-adaptive exploration of policies better adapted to diverse robotic morphologies, thereby going beyond mere imitation of human hands. To address the severe state drift caused by the incapacity of early-stage policies, we design a hybrid Markov-based shadow engine that provides fine-grained guidance to anchor the imitation within the expert's manifold. To evaluate UniBYD, we propose UniManip, the first benchmark for cross-embodiment manipulation spanning diverse robotic morphologies. Experiments demonstrate a 44.08% average improvement in success rate over the current state-of-the-art. Upon acceptance, we will release our code and benchmark.
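Abstract (reference translation): In embodied intelligence, the embodiment gap between robotic and human hands poses significant challenges for learning from human demonstrations. Some studies have tried to bridge this gap with reinforcement learning, but they remain limited to reproducing human manipulation, which restricts task performance. In addition, current methods struggle to support diverse robotic hand configurations. This paper proposes UniBYD, a unified framework that uses a dynamic reinforcement learning algorithm to discover manipulation policies suited to the robot's physical characteristics. To enable consistent modeling across diverse robotic hand morphologies, UniBYD incorporates a unified morphological representation (UMR). Building on UMR, the authors design a dynamic PPO with an annealed reward schedule, so that reinforcement learning moves from offline-informed imitation of human demonstrations to online-adaptive exploration of policies better suited to diverse robotic morphologies, going beyond imitation of human hands. To address the severe state drift caused by the limited capability of early-stage policies, they design a hybrid Markov-based shadow engine that provides fine-grained guidance to anchor the imitation within the expert's manifold. To evaluate UniBYD, they propose UniManip, the first benchmark for cross-embodiment manipulation spanning diverse robotic morphologies. Experiments show a 44.08% average improvement in success rate over the current state of the art. The code and benchmark are to be released upon acceptance.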
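
The abstract mentions an annealed reward schedule that shifts training from imitation toward exploration, but gives no formula. The following is only a minimal sketch of one common way such a schedule could look, assuming a linear anneal and a simple convex blend of two scalar rewards. The function names, the linear shape, and the parameters (`anneal_steps`, `w_start`, `w_end`) are hypothetical and are not taken from the paper.

```python
def imitation_weight(step: int, anneal_steps: int,
                     w_start: float = 1.0, w_end: float = 0.0) -> float:
    """Linearly anneal the imitation weight from w_start to w_end.

    Assumption: a linear schedule; the paper's actual schedule is not
    described in the abstract.
    """
    if anneal_steps <= 0:
        return w_end
    # Clamp training progress to [0, 1] so the weight stays within bounds.
    frac = min(max(step / anneal_steps, 0.0), 1.0)
    return w_start + (w_end - w_start) * frac


def blended_reward(r_imitation: float, r_task: float,
                   step: int, anneal_steps: int) -> float:
    """Blend an imitation reward with a task reward (hypothetical form).

    Early in training the imitation term dominates; once annealing is
    finished only the task reward remains.
    """
    w = imitation_weight(step, anneal_steps)
    return w * r_imitation + (1.0 - w) * r_task
```

Under these assumptions, a policy trained with `blended_reward` is rewarded for tracking the demonstration at first and for task success alone after `anneal_steps` updates.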

Related papers