Fugu-MT Paper Translation (Abstract): One-Policy-Fits-All: Geometry-Aware Action Latents for Cross-Embodiment Manipulation

Paper abstract: One-Policy-Fits-All: Geometry-Aware Action Latents for Cross-Embodiment Manipulation

arxiv url: http://arxiv.org/abs/2603.14522v1
Date: Sun, 15 Mar 2026 17:59:08 GMT
Status: Translation complete
Last updated in system: 2026-03-21 18:33:56.830691
Title: One-Policy-Fits-All: Geometry-Aware Action Latents for Cross-Embodiment Manipulation
Title (reference translation): One-Policy-Fits-All: Geometry-Aware Action Latents for Cross-Embodiment Manipulation
Authors: Juncheng Mu, Sizhe Yang, Hojin Bae, Feiyu Jia, Qingwei Ben, Boyi Li, Huazhe Xu, Jiangmiao Pang
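Abstract summary: Cross-embodiment manipulation is essential for improving the scalability of robot manipulation. We propose One-Policy-Fits-All (OPFA), a framework that enables learning a versatile policy across multiple embodiments.
Reference score (independently computed attention score): 51.66470249744105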
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Cross-embodiment manipulation is crucial for enhancing the scalability of robot manipulation and reducing the high cost of data collection. However, the significant differences between embodiments, such as variations in action spaces and structural disparities, pose challenges for joint training across multiple sources of data. To address this, we propose One-Policy-Fits-All (OPFA), a framework that enables learning a single, versatile policy across multiple embodiments. We first learn a Geometry-Aware Latent Representation (GaLR), which leverages 3D convolution networks and transformers to build a shared latent action space across different embodiments. Then we design a unified latent retargeting decoder that extracts embodiment-specific actions from the latent representations, without any embodiment-specific decoder tuning. OPFA enables end-to-end co-training of data from diverse embodiments, including various grippers and dexterous hands with arbitrary degrees of freedom, significantly improving data efficiency and reducing the cost of skill transfer. We conduct extensive experiments across 11 different end-effectors. The results demonstrate that OPFA significantly improves policy performance in diverse settings by leveraging heterogeneous embodiment data. For instance, cross-embodiment co-training can improve success rates by more than 50% compared to single-source training. Moreover, by adding only a few demonstrations from a new embodiment (e.g., eight), OPFA can achieve performance comparable to that of a well-trained model with 72 demonstrations.
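Abstract (reference translation): Cross-embodiment manipulation is essential for increasing the scalability of robot manipulation and lowering the cost of data collection. However, differences between embodiments, such as variations in action spaces and structural disparities, pose challenges for joint training across multiple data sources. To address this problem, we propose One-Policy-Fits-All (OPFA), a framework that enables learning a single, versatile policy across multiple embodiments. We first learn a Geometry-Aware Latent Representation (GaLR), which leverages 3D convolution networks and transformers to build a shared latent action space across different embodiments. We then design a unified latent retargeting decoder that extracts embodiment-specific actions from the latent representations without any embodiment-specific decoder tuning. OPFA enables end-to-end co-training of data from diverse embodiments, including various grippers and dexterous hands with arbitrary degrees of freedom, significantly improving data efficiency and reducing the cost of skill transfer. We conduct extensive experiments across 11 different end-effectors. The results show that OPFA significantly improves policy performance in diverse settings by leveraging heterogeneous embodiment data. For example, cross-embodiment co-training can improve success rates by more than 50% compared to single-source training. Furthermore, by adding only a few demonstrations from a new embodiment (e.g., eight), OPFA can achieve performance comparable to that of a well-trained model with 72 demonstrations.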
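
The abstract gives no implementation details, so the following is only a toy illustration of one idea it relies on: a geometric, volumetric input has the same shape no matter how many degrees of freedom the end-effector has, which is the kind of fixed-size tensor a 3D convolution network can consume. Everything here is an assumption made for illustration and is not the paper's GaLR: the `voxelize` helper, the `grid_size` and `bounds` parameters, and the keypoint coordinates are all hypothetical.

```python
import numpy as np


def voxelize(points, grid_size=8, bounds=(-0.1, 0.1)):
    """Map an (N, 3) array of 3D points to a fixed (G, G, G) occupancy grid.

    Hypothetical helper: the output shape depends only on grid_size,
    not on the number of input points.
    """
    points = np.asarray(points, dtype=float).reshape(-1, 3)
    lo, hi = bounds
    # Scale coordinates into [0, grid_size) and clip points outside the bounds.
    idx = np.floor((points - lo) / (hi - lo) * grid_size).astype(int)
    idx = np.clip(idx, 0, grid_size - 1)
    grid = np.zeros((grid_size,) * 3, dtype=np.float32)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return grid


# Made-up fingertip keypoints in metres (not taken from the paper):
# a 2-finger gripper and a 5-finger dexterous hand.
gripper_points = np.array([
    [-0.04, 0.00, 0.05],
    [0.04, 0.00, 0.05],
])
hand_points = np.array([
    [-0.06, 0.00, 0.02],
    [-0.03, 0.01, 0.08],
    [0.00, 0.01, 0.09],
    [0.03, 0.01, 0.08],
    [0.06, 0.00, 0.06],
])

gripper_grid = voxelize(gripper_points)
hand_grid = voxelize(hand_points)

# Both embodiments yield tensors of identical shape.
print(gripper_grid.shape, hand_grid.shape)
```

Under these assumptions, both end-effectors map to an 8×8×8 grid, so a single network could take either as input without an embodiment-specific input layer. How OPFA actually encodes geometry and decodes embodiment-specific actions is not stated in the abstract.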
