Fugu-MT Paper Translation (Abstract): TwinRL-VLA: Digital Twin-Driven Reinforcement Learning for Real-World Robotic Manipulation

Paper overview: TwinRL-VLA: Digital Twin-Driven Reinforcement Learning for Real-World Robotic Manipulation

arxiv url: http://arxiv.org/abs/2602.09023v3
Date: Thu, 19 Mar 2026 06:49:17 GMT
Status: Translation complete
Last updated in system: 2026-03-23 08:17:41.298395
Title: TwinRL-VLA: Digital Twin-Driven Reinforcement Learning for Real-World Robotic Manipulation
Title (reference translation): TwinRL-VLA: Digital Twin-Driven Reinforcement Learning for Real-World Robotic Manipulation
Authors: Qinwen Xu, Jiaming Liu, Rui Zhou, Shaojun Shi, Nuowei Han, Zhuoyang Liu, Chenyang Gu, Shuo Gu, Yang Yue, Gao Huang, Wenzhao Zheng, Sirui Han, Peng Jia, Shanghang Zhang
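Abstract summary: This paper proposes TwinRL, a digital twin-real-world collaborative RL framework designed to scale and guide exploration for VLA models. First, a high-fidelity digital twin is efficiently reconstructed from smartphone-captured scenes, enabling realistic bidirectional transfer between real and simulated environments. In the experiments, TwinRL approaches 100% success in both in-distribution regions covered by real-world demonstrations and out-of-distribution regions, delivering at least a 30% speedup over prior real-world RL methods.
Reference score (independently computed attention score): 65.45588646626426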
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Despite strong generalization capabilities, Vision-Language-Action (VLA) models remain constrained by the high cost of expert demonstrations and insufficient real-world interaction. While online reinforcement learning (RL) has shown promise in improving general foundation models, applying RL to VLA manipulation in real-world settings is still hindered by low exploration efficiency and a restricted exploration space. Through systematic real-world experiments, we observe that the effective exploration space of online RL is closely tied to the data distribution of supervised fine-tuning (SFT). Motivated by this observation, we propose TwinRL, a digital twin-real-world collaborative RL framework designed to scale and guide exploration for VLA models. First, a high-fidelity digital twin is efficiently reconstructed from smartphone-captured scenes, enabling realistic bidirectional transfer between real and simulated environments. During the SFT warm-up stage, we introduce an exploration space expansion strategy using digital twins to broaden the support of the data trajectory distribution. Building on this enhanced initialization, we propose a sim-to-real guided exploration strategy to further accelerate online RL. Specifically, TwinRL performs efficient and parallel online RL in the digital twin prior to deployment, effectively bridging the gap between offline and online training stages. Subsequently, we exploit efficient digital twin sampling to identify failure-prone yet informative configurations, which are used to guide targeted human-in-the-loop rollouts on the real robot. In our experiments, TwinRL approaches 100% success in both in-distribution regions covered by real-world demonstrations and out-of-distribution regions, delivering at least a 30% speedup over prior real-world RL methods and requiring only about 20 minutes on average across four tasks.
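Abstract (reference translation): Despite strong generalization capabilities, Vision-Language-Action (VLA) models remain constrained by costly expert demonstrations and insufficient real-world interaction. Online reinforcement learning (RL) has shown promise for improving foundation models, but applying RL to VLA manipulation in real-world settings is still hindered by low exploration efficiency and a restricted exploration space. Through systematic real-world experiments, the authors observe that the effective exploration space of online RL is closely tied to the data distribution of supervised fine-tuning (SFT). Motivated by this observation, they propose TwinRL, a digital twin-real-world collaborative RL framework designed to scale and guide exploration for VLA models. First, a high-fidelity digital twin is efficiently reconstructed from smartphone-captured scenes, enabling realistic bidirectional transfer between real and simulated environments. During the SFT warm-up stage, an exploration space expansion strategy using digital twins broadens the support of the data trajectory distribution. Building on this enhanced initialization, a sim-to-real guided exploration strategy further accelerates online RL. Specifically, TwinRL performs efficient, parallel online RL in the digital twin prior to deployment, effectively bridging the gap between the offline and online training stages. Efficient digital twin sampling is then used to identify failure-prone yet informative configurations, which guide targeted human-in-the-loop rollouts on the real robot. In the experiments, TwinRL approaches 100% success in both in-distribution regions covered by real-world demonstrations and out-of-distribution regions, delivering at least a 30% speedup over prior real-world RL methods and requiring only about 20 minutes on average across four tasks.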

Related papers