Fugu-MT Paper Translation (Abstract): EVE: A Generator-Verifier System for Generative Policies

Paper summary: EVE: A Generator-Verifier System for Generative Policies

arxiv url: http://arxiv.org/abs/2512.21430v1
Date: Wed, 24 Dec 2025 21:36:34 GMT
Status: Translation complete
Last updated in system: 2026-03-23 08:17:40.515059
Title: EVE: A Generator-Verifier System for Generative Policies
Title (reference translation): EVE: A Generator-Verifier System for Generative Policies
Authors: Yusuf Ali, Gryphon Patlin, Karthik Kothuri, Muhammad Zubair Irshad, Wuwei Liang, Zsolt Kira
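Abstract summary: Visuomotor policies based on generative architectures perform strongly but degrade under distribution shift. EVE is a modular generator-verifier interaction framework that boosts the performance of pretrained generative policies at test time.
Reference score (independently computed attention score): 27.92559083553638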
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Visuomotor policies based on generative architectures such as diffusion and flow-based matching have shown strong performance but degrade under distribution shifts, demonstrating limited recovery capabilities without costly finetuning. In the language modeling domain, test-time compute scaling has revolutionized reasoning capabilities of modern LLMs by leveraging additional inference-time compute for candidate solution refinement. These methods typically leverage foundation models as verification modules in a zero-shot manner to synthesize improved candidate solutions. In this work, we hypothesize that generative policies can similarly benefit from additional inference-time compute that employs zero-shot VLM-based verifiers. A systematic analysis of improving policy performance through the generation-verification framework remains relatively underexplored in the current literature. To this end, we introduce EVE - a modular, generator-verifier interaction framework - that boosts the performance of pretrained generative policies at test time, with no additional training. EVE wraps a frozen base policy with multiple zero-shot, VLM-based verifier agents. Each verifier proposes action refinements to the base policy candidate actions, while an action incorporator fuses the aggregated verifier output into the base policy action prediction to produce the final executed action. We study design choices for generator-verifier information interfacing across a system of verifiers with distinct capabilities. Across a diverse suite of manipulation tasks, EVE consistently improves task success rates without any additional policy training. Through extensive ablations, we isolate the contribution of verifier capabilities and action incorporator strategies, offering practical guidelines to build scalable, modular generator-verifier systems for embodied control.
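Abstract (reference translation): Visuomotor policies based on generative architectures such as diffusion and flow-based matching perform strongly but degrade under distribution shift, and they show only limited ability to recover without costly fine-tuning. In language modeling, test-time compute scaling has transformed the reasoning capabilities of modern LLMs by spending additional inference-time compute on refining candidate solutions. These methods typically use foundation models as zero-shot verification modules to synthesize improved candidate solutions. This work hypothesizes that generative policies can benefit in the same way from additional inference-time compute that employs zero-shot, VLM-based verifiers. Systematic analysis of how the generation-verification framework improves policy performance remains relatively underexplored in the current literature. To this end, the authors introduce EVE, a modular generator-verifier interaction framework that improves the performance of pretrained generative policies at test time with no additional training. EVE wraps a frozen base policy with multiple zero-shot, VLM-based verifier agents. Each verifier proposes refinements to the base policy's candidate actions, and an action incorporator fuses the aggregated verifier output into the base policy's action prediction to produce the final executed action. The paper studies design choices for the information interface between generator and verifiers across a system of verifiers with distinct capabilities. Across a diverse suite of manipulation tasks, EVE consistently improves task success rates without any additional policy training. Through extensive ablations, the authors isolate the contributions of verifier capabilities and action incorporator strategies, offering practical guidelines for building scalable, modular generator-verifier systems for embodied control.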
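
The generator-verifier loop described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how verifier outputs are aggregated or fused, so the confidence-weighted average, the additive correction, the `alpha` blending factor, and every name below (`Refinement`, `aggregate`, `incorporate`, `eve_step`) are hypothetical. The verifiers here are plain callables standing in for zero-shot VLM-based agents.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Action = List[float]
Observation = dict


@dataclass
class Refinement:
    """Hypothetical verifier output: an additive correction plus a weight."""
    delta: Action
    confidence: float  # assumed to lie in [0, 1]


Policy = Callable[[Observation], Action]
Verifier = Callable[[Observation, Action], Refinement]


def aggregate(refinements: Sequence[Refinement], dim: int) -> Action:
    """Confidence-weighted mean of the proposed corrections (an assumption)."""
    total = sum(r.confidence for r in refinements)
    if total == 0:
        # No verifier, or no confident verifier: apply no correction.
        return [0.0] * dim
    return [
        sum(r.confidence * r.delta[i] for r in refinements) / total
        for i in range(dim)
    ]


def incorporate(candidate: Action, correction: Action, alpha: float = 0.5) -> Action:
    """Assumed action incorporator: add a scaled correction to the candidate."""
    return [a + alpha * c for a, c in zip(candidate, correction)]


def eve_step(policy: Policy, verifiers: Sequence[Verifier],
             obs: Observation, alpha: float = 0.5) -> Action:
    """One control step: frozen policy proposes, verifiers refine, result is fused."""
    candidate = policy(obs)  # the base policy is only queried, never updated
    refinements = [v(obs, candidate) for v in verifiers]
    correction = aggregate(refinements, len(candidate))
    return incorporate(candidate, correction, alpha)


# Toy stand-ins for a pretrained policy and two verifier agents.
def toy_policy(obs: Observation) -> Action:
    return [1.0, 0.0]


def verifier_a(obs: Observation, action: Action) -> Refinement:
    return Refinement(delta=[1.0, 0.0], confidence=1.0)


def verifier_b(obs: Observation, action: Action) -> Refinement:
    return Refinement(delta=[0.0, 2.0], confidence=1.0)


final_action = eve_step(toy_policy, [verifier_a, verifier_b], obs={}, alpha=0.5)
print(final_action)  # [1.25, 0.5]
```

Because the base policy is only called and never modified, verifiers can be added or removed without retraining, which matches the modularity the abstract emphasizes. With no verifiers the sketch falls back to the unmodified base action.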

List of related papers