Fugu-MT Paper Translation (Abstract): Warm-Start Flow Matching for Guaranteed Fast Text/Image Generation

Paper summary: Warm-Start Flow Matching for Guaranteed Fast Text/Image Generation

arxiv url: http://arxiv.org/abs/2603.19360v1
Date: Thu, 19 Mar 2026 18:00:19 GMT
Status: Translation complete
Last updated in system: 2026-03-23 19:48:38.826753
Title: Warm-Start Flow Matching for Guaranteed Fast Text/Image Generation
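Authors: Minyoung Kim
Abstract summary: This paper proposes a new method that reduces the sample generation time of flow matching algorithms by a guaranteed speed-up factor. The idea can essentially be seen as a learning-to-refine generative model that maps low-quality draft samples to high-quality samples.
Reference score (independently computed attention measure): 7.945670209718052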
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Current auto-regressive (AR) LLMs, diffusion-based text/image generative models, and recent flow matching (FM) algorithms are capable of generating high-quality text and image samples. However, inference (sample generation) in these models is often very time-consuming and computationally demanding, mainly because of the large number of function evaluations, which corresponds to the number of tokens or the number of diffusion steps. This also necessitates heavy GPU resources, time, and electricity. In this work we propose a novel solution that reduces the sample generation time of flow matching algorithms by a guaranteed speed-up factor, without sacrificing the quality of the generated samples. Our key idea is to utilize computationally lightweight generative models whose generation time is negligible compared to that of the target AR/FM models. The draft samples from a lightweight model, which are fast to generate but of unsatisfactory quality, are regarded as the initial distribution for an FM algorithm. Conventional FM starts from a pure-noise (e.g., Gaussian or uniform) initial distribution. Because the draft samples are already of decent quality, we can set the starting time closer to the end time instead of 0, as in the pure-noise FM case. This significantly reduces the number of time steps needed to reach the target data distribution, and the speed-up factor is guaranteed. Our idea, dubbed Warm-Start FM (WS-FM), can essentially be seen as a learning-to-refine generative model that maps low-quality draft samples to high-quality samples. As a proof of concept, we demonstrate the idea on synthetic toy data as well as real-world text and image generation tasks, illustrating that it offers a guaranteed speed-up in sample generation without sacrificing the quality of the generated samples.
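The step-count argument in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation, and everything in it is an assumption made for illustration: a fixed-step Euler sampler, a toy one-dimensional velocity field `toy_velocity` that pulls every sample toward a single point `TARGET`, a hypothetical starting time `t_start = 0.8`, and a reading of the "guaranteed speed-up factor" as the ratio of cold-start to warm-start function evaluations at equal step size, i.e. 1 / (1 - t_start).

```python
import math

# Hypothetical toy target: all probability mass sits at one point.
TARGET = 2.0


def toy_velocity(x: float, t: float) -> float:
    """Velocity of the straight-line path toward TARGET (toy assumption).

    Stands in for a learned FM velocity network; valid for t < 1.
    """
    return (TARGET - x) / (1.0 - t)


def warm_start_steps(cold_steps: int, t_start: float) -> int:
    """Euler steps needed to cover [t_start, 1] at step size 1 / cold_steps."""
    # The small epsilon guards against floating-point noise before rounding up.
    return math.ceil(cold_steps * (1.0 - t_start) - 1e-9)


def euler_integrate(velocity, x: float, t_start: float, n_steps: int) -> float:
    """Integrate dx/dt = velocity(x, t) from t_start to 1 with n_steps Euler steps."""
    h = (1.0 - t_start) / n_steps
    for k in range(n_steps):
        t = t_start + k * h  # the velocity is never evaluated at t = 1
        x = x + h * velocity(x, t)
    return x


cold_steps = 100   # function evaluations when starting from pure noise at t = 0
t_start = 0.8      # assumed warm-start time for a draft of decent quality
warm_steps = warm_start_steps(cold_steps, t_start)
speedup = cold_steps / warm_steps

x_cold = euler_integrate(toy_velocity, 0.0, 0.0, cold_steps)      # "noise" start
x_warm = euler_integrate(toy_velocity, 1.7, t_start, warm_steps)  # "draft" start

print(f"cold steps: {cold_steps}, warm steps: {warm_steps}, speed-up: {speedup:.1f}x")
print(f"cold result: {x_cold:.6f}, warm result: {x_warm:.6f}")
```

Under these assumptions the number of function evaluations depends only on `t_start` and the step size, not on the data, which is one way to read the word "guaranteed". Both runs end at the same target in this toy setting, which says nothing about sample quality with a real model.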

Related papers