Fugu-MT paper translation (abstract): One-Shot Real-World Demonstration Synthesis for Scalable Bimanual Manipulation

Paper abstract: One-Shot Real-World Demonstration Synthesis for Scalable Bimanual Manipulation

arxiv url: http://arxiv.org/abs/2512.09297v2
Date: Sun, 01 Feb 2026 11:23:55 GMT
Status: translation complete
Last updated in system: 2026-03-23 08:17:40.248105
Title: One-Shot Real-World Demonstration Synthesis for Scalable Bimanual Manipulation
Title (reference translation): One-Shot Real-World Demonstration Synthesis for Scalable Bimanual Manipulation
Authors: Huayi Zhou, Kui Jia
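Abstract summary: BiDemoSyn is a framework that synthesizes contact-rich, physically feasible bimanual demonstrations from a single real-world example. Policies trained on BiDemoSyn data are shown to generalize robustly to novel object poses and shapes. They also exhibit zero-shot cross-embodiment transfer to new robotic platforms.
Reference score (independently computed attention level): 45.00986521352502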
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Learning dexterous bimanual manipulation policies critically depends on large-scale, high-quality demonstrations, yet current paradigms face inherent trade-offs: teleoperation provides physically grounded data but is prohibitively labor-intensive, while simulation-based synthesis scales efficiently but suffers from sim-to-real gaps. We present BiDemoSyn, a framework that synthesizes contact-rich, physically feasible bimanual demonstrations from a single real-world example. The key idea is to decompose tasks into invariant coordination blocks and variable, object-dependent adjustments, then adapt them through vision-guided alignment and lightweight trajectory optimization. This enables the generation of thousands of diverse and feasible demonstrations within several hours, without repeated teleoperation or reliance on imperfect simulation. Across six dual-arm tasks, we show that policies trained on BiDemoSyn data generalize robustly to novel object poses and shapes, significantly outperforming recent strong baselines. Beyond the one-shot setting, BiDemoSyn naturally extends to few-shot-based synthesis, improving object-level diversity and out-of-distribution generalization while maintaining strong data efficiency. Moreover, policies trained on BiDemoSyn data exhibit zero-shot cross-embodiment transfer to new robotic platforms, enabled by object-centric observations and a simplified 6-DoF end-effector action representation that decouples policies from embodiment-specific dynamics. By bridging the gap between efficiency and real-world fidelity, BiDemoSyn provides a scalable path toward practical imitation learning for complex bimanual manipulation without compromising physical grounding.
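Abstract (reference translation): Teleoperation provides physically grounded data but is labor-intensive, while simulation-based synthesis scales efficiently but suffers from the sim-to-real gap. We propose BiDemoSyn, a framework that synthesizes contact-rich, physically feasible bimanual demonstrations from a single real-world example. The key idea is to decompose a task into invariant coordination blocks and variable, object-dependent adjustments, and to adapt them through vision-guided alignment and lightweight trajectory optimization. This makes it possible to generate thousands of diverse, feasible demonstrations within a few hours, without repeated teleoperation or reliance on imperfect simulation. Across six dual-arm tasks, we show that policies trained on BiDemoSyn data generalize robustly to novel object poses and shapes and significantly outperform recent strong baselines. Beyond the one-shot setting, BiDemoSyn extends naturally to few-shot-based synthesis, improving object-level diversity and out-of-distribution generalization while maintaining strong data efficiency. In addition, policies trained on BiDemoSyn data exhibit zero-shot cross-embodiment transfer to new robotic platforms, enabled by object-centric observations and a 6-DoF end-effector action representation that decouples policies from embodiment-specific dynamics. By bridging the gap between efficiency and real-world fidelity, BiDemoSyn provides a scalable path toward practical imitation learning for complex bimanual manipulation without compromising physical grounding.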
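
The abstract describes adapting a single demonstration to new object configurations through alignment, with actions expressed as 6-DoF end-effector poses. The following is a minimal sketch of that general idea, not the authors' implementation: it re-anchors demonstrated end-effector waypoints to a new object pose with a rigid transform. All names (`pose`, `retarget`, and so on) are hypothetical, the new object pose is assumed to be given (in practice it would come from a vision module), and the trajectory optimization and contact handling mentioned in the abstract are omitted.

```python
import math

def pose(yaw, x, y, z):
    """4x4 homogeneous transform: rotation about z by `yaw`, translation (x, y, z)."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0, x],
            [s, c, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def inverse(t):
    """Invert a rigid transform: [R p]^-1 = [R^T  -R^T p]."""
    rt = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [t[i][3] for i in range(3)]
    q = [-sum(rt[i][k] * p[k] for k in range(3)) for i in range(3)]
    return [rt[0] + [q[0]], rt[1] + [q[1]], rt[2] + [q[2]],
            [0.0, 0.0, 0.0, 1.0]]

def retarget(demo_waypoints, demo_obj_pose, new_obj_pose):
    """Move world-frame end-effector waypoints so that each keeps the same
    pose relative to the object after the object moves (hypothetical helper)."""
    delta = matmul(new_obj_pose, inverse(demo_obj_pose))
    return [matmul(delta, w) for w in demo_waypoints]

# Demonstration: object at the origin, one gripper waypoint 0.1 m ahead, 0.2 m up.
demo_obj = pose(0.0, 0.0, 0.0, 0.0)
demo_traj = [pose(0.0, 0.1, 0.0, 0.2)]

# New scene (assumed known): object shifted and rotated 90 degrees about z.
new_obj = pose(math.pi / 2, 0.5, 0.3, 0.0)
new_traj = retarget(demo_traj, demo_obj, new_obj)
```

Because the waypoints keep their pose relative to the object, the same routine applies to each arm's trajectory independently, which is one reason an object-centric representation is convenient for replaying a coordinated bimanual motion at a new object pose.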