Fugu-MT Paper Translation (Abstract): Fast-dVLA: Accelerating Discrete Diffusion VLA to Real-Time Performance

Paper summary: Fast-dVLA: Accelerating Discrete Diffusion VLA to Real-Time Performance

arxiv url: http://arxiv.org/abs/2603.25661v2
Date: Fri, 27 Mar 2026 11:46:16 GMT
Status: Translation complete
Last updated in system: 2026-03-30 21:49:48.17791
Title: Fast-dVLA: Accelerating Discrete Diffusion VLA to Real-Time Performance
Title (reference translation): Fast-dVLA: Improving the Real-Time Performance of Discrete Diffusion VLA
Authors: Wenxuan Song, Jiayi Chen, Shuai Chen, Jingbo Wang, Pengxiang Ding, Han Zhao, Yikai Qin, Xinhu Zheng, Donglin Wang, Yan Wang, Haoang Li
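Abstract summary: Advanced finetuning methods with auxiliary training objectives can improve performance and reduce the number of convergence steps. This paper proposes a new approach for cases where pretrained VLA models fail to improve performance or reduce adaptation costs under standard supervised finetuning.
Reference score (independently computed attention score): 47.605498477489306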
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper proposes a novel approach to address the challenge that pretrained VLA models often fail to effectively improve performance and reduce adaptation costs during standard supervised finetuning (SFT). Some advanced finetuning methods with auxiliary training objectives can improve performance and reduce the number of convergence steps. However, they typically incur significant computational overhead due to the additional losses from auxiliary tasks. To simultaneously achieve the enhanced capabilities of auxiliary training with the simplicity of standard SFT, we decouple the two objectives of auxiliary task training within the parameter space, namely, enhancing general capabilities and fitting task-specific action distributions. To deliver this goal, we only need to train the model to converge on a small-scale task set using two distinct training strategies. The difference between the resulting model parameters can then be interpreted as capability vectors provided by auxiliary tasks. These vectors are then merged with pretrained parameters to form a capability-enhanced meta model. Moreover, when standard SFT is augmented with a lightweight orthogonal regularization loss, the merged model attains performance comparable to auxiliary finetuned baselines with reduced computational overhead. Experimental results demonstrate that this approach is highly effective across diverse robot tasks. Project page: https://chris1220313648.github.io/Fast-dVLA/
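Abstract (reference translation): This paper proposes a new approach for cases where pretrained VLA models fail to improve performance or reduce adaptation costs under standard supervised finetuning (SFT). Advanced finetuning methods with auxiliary training objectives can improve performance and reduce the number of convergence steps. However, the additional losses from auxiliary tasks usually increase computational overhead substantially. To obtain the capability gains of auxiliary training together with the simplicity of standard SFT, the two objectives of auxiliary training, namely enhancing general capabilities and fitting task-specific action distributions, are decoupled within the parameter space. Achieving this only requires training the model to convergence on a small task set using two different training strategies. The difference between the resulting model parameters can be interpreted as capability vectors provided by the auxiliary tasks. These vectors are merged with the pretrained parameters to form a capability-enhanced meta model. Furthermore, when standard SFT is augmented with a lightweight orthogonal regularization loss, the merged model reaches performance comparable to auxiliary-finetuned baselines at lower computational overhead. Experimental results show that the approach is highly effective across a wide variety of robot tasks. Project page: https://chris1220313648.github.io/Fast-dVLA/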
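
The parameter arithmetic described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes checkpoints are plain dictionaries of NumPy arrays, that the capability vector is the auxiliary-finetuned parameters minus the standard-SFT parameters, and that merging is a simple addition with a hypothetical `scale` factor. The abstract specifies none of these details, and the orthogonal regularization loss is omitted because its form is not given.

```python
import numpy as np

def capability_vector(theta_aux, theta_sft):
    """Per-parameter difference between two finetuned checkpoints.

    Assumption: the vector is (auxiliary-finetuned) minus (standard SFT).
    """
    return {name: theta_aux[name] - theta_sft[name] for name in theta_aux}

def merge(theta_pre, vector, scale=1.0):
    """Add the capability vector to the pretrained parameters.

    `scale` is a hypothetical knob, not something the abstract mentions.
    """
    return {name: theta_pre[name] + scale * vector[name] for name in theta_pre}

# Toy checkpoints standing in for real model weights.
theta_pre = {"w": np.array([1.0, 2.0]), "b": np.array([0.5])}
theta_sft = {"w": np.array([1.5, 2.0]), "b": np.array([0.5])}
theta_aux = {"w": np.array([2.0, 1.0]), "b": np.array([1.0])}

vec = capability_vector(theta_aux, theta_sft)
theta_meta = merge(theta_pre, vec)
print(theta_meta)
```

In this toy setup the merged "meta model" would then serve as the starting point for ordinary SFT on downstream tasks, which is where the abstract places the lightweight regularizer.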

Related papers