Fugu-MT Paper Translation (Abstract): Robotic Policy Adaptation via Weight-Space Meta-Learning

Paper summary: Robotic Policy Adaptation via Weight-Space Meta-Learning

arxiv url: http://arxiv.org/abs/2606.07217v1
Date: Fri, 05 Jun 2026 12:29:28 GMT
Status: Translation complete
Last updated in system: 2026-06-08 14:33:29.734336
Title: Robotic Policy Adaptation via Weight-Space Meta-Learning
Authors: Christian Bianchi, Siamak Yousefi, Alessio Sampieri, Andrea Roberti, Luca Rigazio, Fabio Galasso, Luca Franco
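Abstract summary: WIZARD sidesteps task-specific fine-tuning by generating task-specific LoRA parameters for a frozen VLA policy. During meta-training, WIZARD learns to map task evidence directly to expert LoRA updates. Experiments show that WIZARD improves performance by up to ~2x on unseen dataset collections and up to ~14x on unseen tasks.
Reference score (independently computed attention score): 10.867247751424891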
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Vision-Language-Action (VLA) models are emerging as a promising paradigm for robotic manipulation, enabling general-purpose policies trained from large corpora of demonstrations and action labels. However, adapting these models to new tasks still typically requires task-specific demonstrations, action annotations, and additional fine-tuning, making deployment costly and difficult to scale. We propose WIZARD, a weight-space meta-learning framework that sidesteps task-specific fine-tuning by generating task-specific LoRA parameters for a frozen VLA policy. Given only a language instruction and a short demonstration video, WIZARD predicts the corresponding adaptation weights in a single forward pass, without target-task action labels or test-time optimization. During meta-training, WIZARD learns to map task evidence directly to expert LoRA updates, capturing relationships between tasks in weight space. Experiments on LIBERO show that WIZARD improves performance by up to ~2x on unseen dataset collections and up to ~14x on unseen tasks. On a Franka Emika Panda, WIZARD consistently improves over a real-domain adapted baseline, showing that generated adapters provide task-level specialization beyond simulation.
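Illustrative sketch (not from the paper): the abstract describes predicting LoRA parameters for a frozen policy from task evidence in a single forward pass. The toy example below shows that idea for one weight matrix, using the standard LoRA form W + B·A. It is only a sketch under stated assumptions: the linear generator, the `task_embedding` vector (standing in for an encoded instruction and demonstration video), and all function names and dimensions are hypothetical, since the abstract does not specify WIZARD's architecture.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def generate_lora(task_embedding, gen_a, gen_b, d_out, d_in, rank):
    """Hypothetical generator: linear maps from a task embedding to LoRA factors.

    Returns A (rank x d_in) and B (d_out x rank). A real generator would be a
    learned network; a single linear map is assumed here for brevity.
    """
    flat_a = matmul([task_embedding], gen_a)[0]  # length rank * d_in
    flat_b = matmul([task_embedding], gen_b)[0]  # length d_out * rank
    a = [flat_a[i * d_in:(i + 1) * d_in] for i in range(rank)]
    b = [flat_b[i * rank:(i + 1) * rank] for i in range(d_out)]
    return a, b

def adapted_forward(w_frozen, a, b, x, scale=1.0):
    """Compute (W + scale * B A) x while leaving the frozen W untouched."""
    x_col = [[v] for v in x]
    base = matmul(w_frozen, x_col)          # frozen policy layer output
    low = matmul(b, matmul(a, x_col))       # low-rank task-specific correction
    return [bv[0] + scale * lv[0] for bv, lv in zip(base, low)]

# Toy dimensions and random values, chosen only for illustration.
rng = random.Random(0)
d_in, d_out, rank, d_task = 4, 3, 2, 5
w_frozen = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
gen_a = [[rng.gauss(0, 0.1) for _ in range(rank * d_in)] for _ in range(d_task)]
gen_b = [[rng.gauss(0, 0.1) for _ in range(d_out * rank)] for _ in range(d_task)]
task_embedding = [rng.gauss(0, 1) for _ in range(d_task)]
x = [1.0, 0.5, -0.5, 2.0]

# One pass through the generator yields the adapter; no gradient steps are taken.
a, b = generate_lora(task_embedding, gen_a, gen_b, d_out, d_in, rank)
y = adapted_forward(w_frozen, a, b, x)
print(y)
```

In this sketch only the generator matrices (`gen_a`, `gen_b`) would be trained; the base weights stay fixed, which mirrors the frozen-policy setting the abstract describes.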
