Fugu-MT Paper Translation (Abstract): Test-Time Compositional Generalization in Diffusion Models via Concept Discovery

Paper overview: Test-Time Compositional Generalization in Diffusion Models via Concept Discovery

arxiv url: http://arxiv.org/abs/2605.07078v1
Date: Fri, 08 May 2026 00:53:53 GMT
Status: Translation complete
System update date: 2026-05-11 19:43:38.704086
Title: Test-Time Compositional Generalization in Diffusion Models via Concept Discovery
Title (reference translation): Test-Time Compositional Generalization in Diffusion Models via Concept Discovery
Authors: Zekun Wang, Anant Gupta, Tianyi Zhu, Christopher J. MacLellan
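Abstract summary: The paper shows that a pretrained diffusion model can discover query-specific concepts from the time-indexed scores it has learned. Both the analytic PoE sampler and the low-rank adapted model outperform query-only and nearest trained-class baselines.
Reference score (independently computed attention metric): 6.379257030501549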
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Compositional generalization requires models to produce novel configurations from familiar parts. In diffusion models, prior compositional generation methods typically assume that the relevant concepts or conditioning signals are already available. We instead ask whether a pretrained diffusion model can discover query-specific concepts from the time-indexed scores it learns for the noisy marginals $p_t(x_t)$ and compose them at test time. Given a single out-of-distribution query, our method performs gradient ascent on $s_θ(x_t,t) \approx \nabla_{x_t}\log p_t(x_t)$ at multiple noising timesteps to recover local density modes, maps these modes into clean-space Gaussians, greedily selects relevant prototypes with a submodular likelihood objective, and combines them into a product-of-experts (PoE) teacher model with an analytic score. This teacher model can be sampled directly through classifier-free guidance or used to generate a sample pool for training a new class embedding and low-rank adapter. On held-out composition benchmarks built from ColorMNIST and CelebA, both the analytic PoE sampler and the low-rank adapted model outperform query-only and nearest trained-class baselines. These results suggest that the time-indexed score geometry of the diffusion model contains reusable density-mode concepts that support test-time compositional generation without a predefined concept library.
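Abstract (reference translation): Compositional generalization requires models that generate novel configurations from familiar parts. In diffusion models, prior compositional generation methods generally assume that the relevant concepts or conditioning signals are already available. Instead, the authors ask whether a pretrained diffusion model can discover query-specific concepts from the time-indexed scores it learns for the noisy marginals $p_t(x_t)$ and compose them at test time. Given a single out-of-distribution query, the method performs gradient ascent on $s_θ(x_t,t) \approx \nabla_{x_t}\log p_t(x_t)$ at multiple noising timesteps to recover local density modes, maps these modes into clean-space Gaussians, greedily selects relevant prototypes with a submodular likelihood objective, and combines them into a product-of-experts (PoE) teacher model with an analytic score. This teacher model can be sampled directly through classifier-free guidance or used to generate a sample pool for training a new class embedding and low-rank adapter. On held-out composition benchmarks built from ColorMNIST and CelebA, both the analytic PoE sampler and the low-rank adapted model outperform the query-only and nearest trained-class baselines. These results suggest that the time-indexed score geometry of the diffusion model contains reusable density-mode concepts that support test-time compositional generation without a predefined concept library.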
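
Two steps named in the abstract, gradient ascent on a score function to recover a local density mode and combining Gaussians into a product of experts with an analytic score, can be illustrated on a toy problem. The sketch below is not the authors' implementation. It assumes isotropic Gaussians, uses an analytic mixture score as a stand-in for the learned $s_θ(x_t,t)$, and omits the timestep handling, prototype selection, and classifier-free guidance. All function names and parameters are hypothetical.

```python
import numpy as np


def gaussian_score(x, mean, var):
    """Score (gradient of the log density) of an isotropic Gaussian N(mean, var * I)."""
    return (np.asarray(mean, dtype=float) - x) / var


def mixture_score(x, means, var):
    """Score of an equal-weight isotropic Gaussian mixture.

    Assumption: this is only a toy stand-in for a learned score network.
    """
    diffs = means - x                                   # shape (K, D)
    logits = -np.sum(diffs ** 2, axis=1) / (2.0 * var)  # log responsibilities
    weights = np.exp(logits - logits.max())             # stabilized softmax
    weights /= weights.sum()
    return weights @ diffs / var


def ascend_to_mode(x0, score_fn, step=0.05, n_steps=500):
    """Plain gradient ascent on log density: follow the score from x0."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = x + step * score_fn(x)
    return x


def poe_gaussian(means, variances):
    """Product of isotropic Gaussians: precisions add, mean is precision-weighted."""
    means = np.asarray(means, dtype=float)
    precisions = 1.0 / np.asarray(variances, dtype=float)
    var = 1.0 / precisions.sum()
    mean = var * (precisions[:, None] * means).sum(axis=0)
    return mean, var


def poe_score(x, means, variances):
    """Analytic score of the product: the sum of the experts' scores."""
    return sum(gaussian_score(x, m, v) for m, v in zip(means, variances))


# Mode recovery on a two-component toy density.
toy_means = np.array([[-3.0, 0.0], [3.0, 0.0]])
mode = ascend_to_mode([2.0, 0.5], lambda x: mixture_score(x, toy_means, 0.25))

# Product of two experts with different variances.
expert_means = [[0.0, 0.0], [4.0, 0.0]]
expert_vars = [1.0, 3.0]
poe_mean, poe_var = poe_gaussian(expert_means, expert_vars)
```

In this toy setting the ascent started near the right-hand component settles at that component's mean, and the summed expert scores vanish at the precision-weighted mean of the product. This is the sense in which a product of Gaussians has a closed-form score.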

Related papers