Fugu-MT paper translation (abstract): SRC-Flow: Compact Semantic Representations Enable Normalizing Flows for Image Generation

Paper overview: SRC-Flow: Compact Semantic Representations Enable Normalizing Flows for Image Generation

arxiv url: http://arxiv.org/abs/2605.18267v2
Date: Sat, 23 May 2026 08:45:59 GMT
Status: translation complete
Last updated in system: 2026-05-26 16:32:37.66578
Title: SRC-Flow: Compact Semantic Representations Enable Normalizing Flows for Image Generation
Title (reference translation): SRC-Flow: Compact semantic representations that enable normalizing flows for image generation
Authors: Longtao Jiang, Jianmin Bao, Zhendong Wang, Xin Tao, Pengfei Wan, Zhihui Li, Xiaojun Chang
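Abstract summary: Normalizing flows (NFs) provide exact likelihoods and deterministic invertible sampling, but lag behind diffusion models for large-scale image generation. We propose SRC-Flow, which introduces a Semantic Representation Compressor (SRC) to compress high-dimensional RAE features into a low-dimensional semantic space. SRC-Flow achieves state-of-the-art generation quality among normalizing flow methods, with gFID scores of 1.65 and 2.07 under classifier-free guidance.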
Reference score (independently computed attention score): 73.51436199324066
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Normalizing flows (NFs) provide exact likelihoods and deterministic invertible sampling, but have historically lagged behind diffusion models for large-scale image generation. We identify a key obstacle: NFs are required to learn a single invertible transport over the full ambient space, making them highly sensitive to high-dimensional representations. This leads to a semantic-capacity mismatch in modern visual representation spaces, where semantic information is compact but encoded in overcomplete features. We propose SRC-Flow, which introduces a Semantic Representation Compressor (SRC) to compact high-dimensional RAE features into a low-dimensional semantic space before flow modeling and preserve reconstruction through the frozen RAE decoder. This compact space reduces the modeling burden of NFs and enables effective likelihood-based generation in semantic representation space. We further adopt constant noise regularization tailored to the fixed unconditional bijection learned by flows. On ImageNet $256 \times 256$ and $512 \times 512$, SRC-Flow achieves state-of-the-art generation quality among normalizing flow methods, with gFID scores of 1.65 and 2.07 under classifier-free guidance, while retaining exact likelihood computation in the compact semantic representation space and deterministic invertible sampling at the flow level. Codes and models will be available at https://github.com/longtaojiang/SRC-Flow.
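Abstract (reference translation): Normalizing flows (NFs) provide exact likelihoods and deterministic invertible sampling, but have historically lagged behind diffusion models for large-scale image generation. NFs must learn a single invertible transport over the full ambient space, which makes them highly sensitive to high-dimensional representations. This leads to a semantic-capacity mismatch in modern visual representation spaces, where semantic information is compact but encoded in overcomplete features. We propose SRC-Flow, which introduces a Semantic Representation Compressor (SRC) that compresses high-dimensional RAE features into a low-dimensional semantic space before flow modeling, while preserving reconstruction through the frozen RAE decoder. This compact space reduces the modeling burden on NFs and enables effective likelihood-based generation in the semantic representation space. We further adopt constant noise regularization tailored to the fixed unconditional bijection learned by flows. On ImageNet $256 \times 256$ and $512 \times 512$, SRC-Flow achieves state-of-the-art generation quality among normalizing flow methods, with gFID scores of 1.65 and 2.07 under classifier-free guidance, while retaining exact likelihood computation in the compact semantic representation space and deterministic invertible sampling at the flow level. Code and models will be available at https://github.com/longtaojiang/SRC-Flow.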
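
Illustrative note: the abstract's statement that NFs provide exact likelihoods and deterministic invertible sampling follows from the change-of-variables formula, log p(x) = log p_Z(f(x)) + log |det df/dx|. The following is a minimal sketch of that property only, assuming a toy elementwise affine flow with a standard normal base distribution. It is not the SRC-Flow architecture, and the names `AffineFlow`, `forward`, `inverse`, and `log_prob` are hypothetical.

```python
import math
import random


class AffineFlow:
    """Toy elementwise affine flow: z = (x - shift) * exp(-log_scale).

    Hypothetical illustration only; not the model described in the paper.
    """

    def __init__(self, shift, log_scale):
        self.shift = list(shift)
        self.log_scale = list(log_scale)

    def forward(self, x):
        # Map a data vector x to its latent z and return log|det dz/dx|.
        z = [(xi - b) * math.exp(-s)
             for xi, b, s in zip(x, self.shift, self.log_scale)]
        log_det = -sum(self.log_scale)
        return z, log_det

    def inverse(self, z):
        # Deterministic sampling direction: latent z back to data x.
        return [zi * math.exp(s) + b
                for zi, b, s in zip(z, self.shift, self.log_scale)]

    def log_prob(self, x):
        # Exact log-likelihood: standard normal base density plus log-det.
        z, log_det = self.forward(x)
        log_base = sum(-0.5 * zi * zi - 0.5 * math.log(2 * math.pi)
                       for zi in z)
        return log_base + log_det


flow = AffineFlow(shift=[1.0, -2.0], log_scale=[0.5, -0.3])
random.seed(0)
z = [random.gauss(0.0, 1.0) for _ in range(2)]  # draw a latent sample
x = flow.inverse(z)                              # sample in data space
z_back, _ = flow.forward(x)                      # invert the sample
```

Because the map is a bijection, `forward(inverse(z))` recovers `z` up to floating-point error, and `log_prob` is an exact density rather than a bound. The cost of learning such a bijection grows with the dimension of `x`, which is the burden the abstract says the compact semantic space is meant to reduce.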

Related papers