Fugu-MT Paper Translation (Abstract): Flow-Direct: Feedback-Efficient and Reusable Guidance for Flow Models via Non-Parametric Guidance Field

Paper abstract: Flow-Direct: Feedback-Efficient and Reusable Guidance for Flow Models via Non-Parametric Guidance Field

arxiv url: http://arxiv.org/abs/2605.16348v1
Date: Fri, 08 May 2026 04:03:46 GMT
Status: Translation complete
Last updated in system: 2026-05-25 12:34:33.904868
Title: Flow-Direct: Feedback-Efficient and Reusable Guidance for Flow Models via Non-Parametric Guidance Field
Authors: Kim Yong Tan, Yueming Lyu, Ivor Tsang, Yew-Soon Ong
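Abstract summary: Training-free guidance enables pre-trained diffusion and flow models to optimize application-specific objectives. This paper proposes Flow-Direct, a framework that guides the generation process via a persistent guidance field.
Reference score (independently computed attention score): 42.27623008321844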
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Training-free guidance enables pre-trained diffusion and flow models to optimize application-specific objectives using feedback from external black-box reward functions. However, existing methods are feedback-inefficient because reward feedback is used only transiently to inform a localized gradient approximation or a discrete search decision, and is subsequently discarded. To address this limitation, we propose Flow-Direct, a framework that guides the generation process via a persistent guidance field. Theoretically, this guidance field is analytically derived from the log-density ratio between the base and reward-weighted target distributions; it transports the pre-trained distribution to the target distribution. In practice, the field is implemented as a non-parametric estimator constructed from all accumulated reward-evaluated samples. As more samples are collected during optimization, this empirical guidance field becomes increasingly accurate. This persistent formulation yields two major advantages. First, Flow-Direct is highly feedback-efficient: because every evaluated sample is used to refine the global guidance field, no reward information is wasted. Second, the framework is naturally reusable: once optimization is complete, the collected dataset defines a reusable guidance field for generating novel target samples without additional reward evaluations, and distinct guidance fields can be combined to generate samples that simultaneously satisfy multiple objectives.
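The abstract describes the guidance field as the gradient of a log-density ratio, estimated non-parametrically from all reward-evaluated samples. The sketch below illustrates one way such an estimator could look; it is not the authors' method. It assumes a target density proportional to the base density times exp(reward / beta), and smooths both densities with the same Gaussian kernel. The function name `guidance_field` and the parameters `beta` and `bandwidth` are hypothetical. Under these assumptions, the gradient of the log-ratio reduces to the difference between the reward-weighted and the unweighted kernel means of the samples, divided by the squared bandwidth.

```python
import math


def guidance_field(x, samples, rewards, beta=1.0, bandwidth=0.5):
    """Kernel estimate of grad_x log(target(x) / base(x)).

    Illustrative assumptions (not taken from the paper):
    - target density is proportional to base * exp(reward / beta)
    - base and target are both smoothed with one Gaussian kernel
    """
    h2 = bandwidth ** 2
    # Log kernel weights log K_h(x - x_i), up to a shared constant.
    log_k = [
        -sum((a - b) ** 2 for a, b in zip(x, s)) / (2.0 * h2)
        for s in samples
    ]
    # Tilt each kernel weight by its sample's reward for the target estimate.
    log_kw = [lk + r / beta for lk, r in zip(log_k, rewards)]

    def kernel_mean(log_w):
        # Normalised weighted mean of the samples (max shift for stability).
        m = max(log_w)
        w = [math.exp(v - m) for v in log_w]
        z = sum(w)
        return [
            sum(wi * s[d] for wi, s in zip(w, samples)) / z
            for d in range(len(x))
        ]

    base_mean = kernel_mean(log_k)
    target_mean = kernel_mean(log_kw)
    # grad log q_h(x) - grad log p_h(x) = (target_mean - base_mean) / h^2
    return [(t - b) / h2 for t, b in zip(target_mean, base_mean)]


# Usage: three 1-D samples whose reward grows with x.
samples = [[-1.0], [0.0], [1.0]]
rewards = [-1.0, 0.0, 1.0]
print(guidance_field([0.0], samples, rewards))
```

Because the estimator only reads the stored `samples` and `rewards`, appending newly evaluated samples refines the field, and the same stored data can be queried later without further reward evaluations, which mirrors the feedback-efficiency and reusability properties the abstract claims.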

Related papers