Fugu-MT paper translation (abstract): Fine-Tuning Flow Matching via Maximum Likelihood Estimation of Reconstructions

Paper abstract: Fine-Tuning Flow Matching via Maximum Likelihood Estimation of Reconstructions

arxiv url: http://arxiv.org/abs/2510.02081v1
Date: Thu, 02 Oct 2025 14:49:47 GMT
Status: translation complete
System update date: 2025-10-03 16:59:21.169878
Title: Fine-Tuning Flow Matching via Maximum Likelihood Estimation of Reconstructions
Title (reference translation): Fine-Tuning Flow Matching via Maximum Likelihood Estimation of Reconstructions
Authors: Zhaoyi Li, Jingtao Ding, Yong Li, Shihua Li
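Abstract summary: The Flow Matching (FM) algorithm achieves remarkable results in generative tasks, especially in robotic manipulation. This paper theoretically analyzes the relation between training loss and inference error in FM. It then proposes a method of fine-tuning FM via maximum likelihood estimation of reconstructions.
Reference score (independently computed attention score): 20.26227575771028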
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The Flow Matching (FM) algorithm achieves remarkable results in generative tasks, especially in robotic manipulation. Building upon the foundations of diffusion models, the simulation-free paradigm of FM enables simple and efficient training, but inherently introduces a train-inference gap: the model's output cannot be assessed during the training phase. In contrast, other generative models, including Variational Autoencoders (VAEs), Normalizing Flows, and Generative Adversarial Networks (GANs), directly optimize the reconstruction loss. This gap is particularly evident in scenarios that demand high precision, such as robotic manipulation. Moreover, we show that FM's over-pursuit of straight predefined paths may introduce serious problems, such as stiffness, into the system. These observations motivate us to fine-tune FM via Maximum Likelihood Estimation of reconstructions, an approach made feasible by FM's underlying smooth ODE formulation, in contrast to the stochastic differential equations (SDEs) used in diffusion models. This paper first theoretically analyzes the relation between training loss and inference error in FM. We then propose a method of fine-tuning FM via Maximum Likelihood Estimation of reconstructions, which includes both straightforward fine-tuning and residual-based fine-tuning approaches. Furthermore, through specifically designed architectures, the residual-based fine-tuning can incorporate the contraction property into the model, which is crucial for the model's robustness and interpretability. Experimental results in image generation and robotic manipulation verify that our method reliably improves the inference performance of FM.
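Abstract (reference translation): The Flow Matching (FM) algorithm delivers remarkable results in generative tasks, particularly in robotic manipulation. Built on the foundations of diffusion models, FM's simulation-free paradigm allows simple and efficient training, but it inherently introduces a train-inference gap. Specifically, the model's output cannot be evaluated during the training phase. By contrast, other generative models such as the Variational Autoencoder (VAE), Normalizing Flow, and Generative Adversarial Networks (GANs) optimize the reconstruction loss directly. The gap is especially pronounced in scenarios that require high precision, such as robotic manipulation. The paper further shows that FM's excessive pursuit of straight predefined paths can introduce serious problems, such as stiffness, into the system. These points motivate fine-tuning FM via maximum likelihood estimation of reconstructions, an approach made feasible by FM's underlying smooth ODE formulation, in contrast to the stochastic differential equations (SDEs) used in diffusion models. The paper first analyzes theoretically the relation between training loss and inference error in FM. It then proposes a method of fine-tuning FM via maximum likelihood estimation of reconstructions, covering both straightforward fine-tuning and residual-based fine-tuning. In addition, with specifically designed architectures, residual-based fine-tuning can build the contraction property into the model, which is important for the model's robustness and interpretability. Experimental results in image generation and robotic manipulation confirm that the proposed method reliably improves the inference performance of FM.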
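
The train-inference gap described in the abstract can be illustrated with a small sketch. It is not the paper's fine-tuning method. It only contrasts the two quantities the abstract mentions: a simulation-free training loss evaluated on points of a straight path, and a reconstruction error that is only available after integrating the ODE. The names `fm_loss`, `reconstruct`, `toy_velocity`, `SHIFT`, and `BIAS` are hypothetical, the straight path x_t = (1 - t) x0 + t x1 is assumed, and explicit Euler is assumed as the solver.

```python
import numpy as np

def fm_loss(velocity, x0, x1, t):
    """Simulation-free loss on the straight path x_t = (1 - t) * x0 + t * x1."""
    x_t = (1.0 - t) * x0 + t * x1
    target = x1 - x0  # velocity of the straight path
    return float(np.mean((velocity(x_t, t) - target) ** 2))

def reconstruct(velocity, x0, n_steps=10):
    """Integrate dx/dt = velocity(x, t) from t = 0 to t = 1 with explicit Euler."""
    x = np.array(x0, dtype=float)
    h = 1.0 / n_steps
    for k in range(n_steps):
        x = x + h * velocity(x, k * h)
    return x

# Toy setup (assumption, not from the paper): each target is its source
# shifted by SHIFT, and the "model" is the exact velocity plus a constant
# error BIAS.
SHIFT, BIAS = 2.0, 0.1

def toy_velocity(x, t):
    return np.full_like(x, SHIFT + BIAS)

rng = np.random.default_rng(0)
x0 = rng.standard_normal(1000)
x1 = x0 + SHIFT
t = rng.uniform(0.0, 1.0, size=x0.shape)

# Training-time quantity: no ODE solve is needed.
train_loss = fm_loss(toy_velocity, x0, x1, t)
# Inference-time quantity: requires running the solver end to end.
recon_mse = float(np.mean((reconstruct(toy_velocity, x0) - x1) ** 2))
print(train_loss, recon_mse)
```

In this deliberately simple case the two numbers coincide, because the velocity error is constant in both state and time. For a learned, state-dependent velocity field they generally differ, and only the second one measures what the model actually outputs at inference.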

Related papers