Fugu-MT Paper Translation (Abstract): VE2VF: Vision-Enabled to Vision-Free Distillation via Real-world Reinforcement Learning for Robust Contact-Rich Manipulation

Paper overview: VE2VF: Vision-Enabled to Vision-Free Distillation via Real-world Reinforcement Learning for Robust Contact-Rich Manipulation

arxiv url: http://arxiv.org/abs/2605.29564v1
Date: Thu, 28 May 2026 08:15:59 GMT
Status: Translation complete
Last updated in system: 2026-05-30 02:45:56.056277
Title: VE2VF: Vision-Enabled to Vision-Free Distillation via Real-world Reinforcement Learning for Robust Contact-Rich Manipulation
Title (reference translation): VE2VF: Vision-Enabled to Vision-Free Distillation via Real-world Reinforcement Learning for Robust Contact-Rich Manipulation
Authors: Victor Kowalski, Chengxi Li, Dongheui Lee
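Abstract summary: Vision-enabled policies tend to overfit to the visual conditions seen during training, which limits their robustness and transferability. This paper presents a human-in-the-loop RL framework that uses teacher-student distillation to achieve robust performance. A vision-enabled teacher distills its knowledge into a vision-free student that relies solely on pose, twist, and wrench sensing.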
Reference score (independently computed attention score): 7.817721592278407
License: http://creativecommons.org/licenses/by/4.0/
Abstract: When using reinforcement learning (RL) for contact-rich robotic manipulation, vision can provide task-relevant information that accelerates learning beyond what proprioception alone can achieve. However, vision-enabled policies tend to overfit to the visual conditions seen during training, limiting their robustness and transferability. We present a human-in-the-loop RL framework that employs teacher-student distillation to achieve robust performance across multiple task variants, trained entirely in the real world without requiring domain randomization or data augmentation. A vision-enabled teacher distills its knowledge into a vision-free student that relies solely on pose, twist, and wrench sensing, combining fast training with strong task generalization. On the real-world NIST assembly benchmark board, our approach achieves 95% overall success after approximately 50 minutes of training on 3 representative tasks, including robust generalization to 8 unseen task variants. Fine-tuning with distillation achieves full success on the most challenging task. We demonstrate that the resulting policies outperform baselines in both robustness and adaptability.
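Abstract (reference translation): When reinforcement learning (RL) is used for contact-rich robotic manipulation, vision can provide task-relevant information that accelerates learning beyond what proprioception alone can achieve. However, vision-enabled policies tend to overfit to the visual conditions seen during training, which limits their robustness and transferability. This paper proposes a human-in-the-loop RL framework that uses teacher-student distillation to achieve robust performance across multiple task variants and is trained entirely in the real world, with no need for domain randomization or data augmentation. A vision-enabled teacher distills its knowledge into a vision-free student that relies only on pose, twist, and wrench sensing, combining fast training with strong task generalization. On the real-world NIST assembly benchmark board, the approach reaches 95% overall success after about 50 minutes of training on 3 representative tasks, including robust generalization to 8 unseen task variants. Fine-tuning with distillation achieves full success on the most challenging task. The resulting policies are shown to outperform baselines in both robustness and adaptability.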
