Fugu-MT Paper Translation (Abstract): ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models

Paper overview: ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models

arxiv url: http://arxiv.org/abs/2305.16225v3
Date: Thu, 7 Dec 2023 07:56:52 GMT
Status: Translation complete
Last updated in system: 2023-12-08 18:47:08.949130
Title: ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models
Title (reference translation): Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models
Authors: Yuxin Zhang, Weiming Dong, Fan Tang, Nisha Huang, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Oliver Deussen, Changsheng Xu
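Abstract summary: Current personalization methods can invert an object or concept into the textual conditioning space and compose new natural sentences for text-to-image diffusion models. This paper proposes a novel approach that leverages the step-by-step generation process of diffusion models, which generate images from low- to high-frequency information. ProSpect is applied to personalized attribute-aware image generation applications, such as image-guided or text-driven manipulation of materials, style, and layout.
Reference score (attention score computed independently by the site): 77.03361270726944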
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Personalizing generative models offers a way to guide image generation with user-provided references. Current personalization methods can invert an object or concept into the textual conditioning space and compose new natural sentences for text-to-image diffusion models. However, representing and editing specific visual attributes such as material, style, and layout remains a challenge, leading to a lack of disentanglement and editability. To address this problem, we propose a novel approach that leverages the step-by-step generation process of diffusion models, which generate images from low to high frequency information, providing a new perspective on representing, generating, and editing images. We develop the Prompt Spectrum Space P*, an expanded textual conditioning space, and a new image representation method called ProSpect. ProSpect represents an image as a collection of inverted textual token embeddings encoded from per-stage prompts, where each prompt corresponds to a specific generation stage (i.e., a group of consecutive steps) of the diffusion model. Experimental results demonstrate that P* and ProSpect offer better disentanglement and controllability compared to existing methods. We apply ProSpect in various personalized attribute-aware image generation applications, such as image-guided or text-driven manipulations of materials, style, and layout, achieving previously unattainable results from a single image input without fine-tuning the diffusion models. Our source code is available at https://github.com/zyxElsa/ProSpect.
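Abstract (reference translation): Personalizing generative models provides a way to guide image generation with user-provided references. Current personalization methods can invert an object or concept into the textual conditioning space and compose new natural sentences for text-to-image diffusion models. However, representing and editing specific visual attributes such as material, style, and layout is still a challenge, and disentanglement and editability are lacking. This work therefore uses the step-by-step generation process of diffusion models, which generate images from low- to high-frequency information, to offer a new perspective on representing, generating, and editing images. The paper develops the Prompt Spectrum Space P*, an expanded textual conditioning space, and ProSpect, a new image representation method. ProSpect represents an image as a collection of inverted textual token embeddings encoded from per-stage prompts, where each prompt corresponds to a specific generation stage (that is, a group of consecutive steps) of the diffusion model. Experimental results show that P* and ProSpect offer better disentanglement and controllability than existing methods. ProSpect is applied to personalized attribute-aware image generation applications, such as image-guided or text-driven manipulation of materials, style, and layout, and obtains previously unattainable results from a single image input without fine-tuning the diffusion models. The source code is available at https://github.com/zyxElsa/ProSpect.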
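
The abstract describes each prompt as belonging to one generation stage, meaning a group of consecutive diffusion steps. The Python sketch below illustrates only that bookkeeping; it is not the authors' implementation. The function names, the 50-step and 10-stage counts, the even split of steps into stages, and the use of plain strings in place of token embeddings are all assumptions made for illustration. Step 0 is assumed to be the first step in sampling order.

```python
from typing import List, Sequence, Set


def stage_for_step(step: int, num_steps: int, num_stages: int) -> int:
    """Map a sampling step to the index of the stage that contains it.

    Assumption: stages are near-equal groups of consecutive steps.
    """
    if not 0 <= step < num_steps:
        raise ValueError("step must satisfy 0 <= step < num_steps")
    if not 0 < num_stages <= num_steps:
        raise ValueError("num_stages must satisfy 0 < num_stages <= num_steps")
    return step * num_stages // num_steps


def prompt_for_step(step: int, num_steps: int, prompts: Sequence[str]) -> str:
    """Pick the per-stage prompt that conditions the given step."""
    return prompts[stage_for_step(step, num_steps, len(prompts))]


def swap_stages(content: Sequence[str], reference: Sequence[str],
                stages: Set[int]) -> List[str]:
    """Build a mixed prompt list: reference prompts in `stages`, content elsewhere."""
    if len(content) != len(reference):
        raise ValueError("both prompt lists must have one entry per stage")
    return [reference[i] if i in stages else content[i]
            for i in range(len(content))]


if __name__ == "__main__":
    # Hypothetical placeholders standing in for learned token embeddings.
    content = [f"content_{i}" for i in range(10)]
    reference = [f"reference_{i}" for i in range(10)]
    # Replacing the last two stages is an arbitrary choice for the demo.
    mixed = swap_stages(content, reference, {8, 9})
    for step in (0, 24, 49):
        print(step, prompt_for_step(step, 50, mixed))
```

Which stages relate to which attribute (material, style, or layout) is not stated in the abstract, so the stage indices swapped above carry no particular meaning.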

Related papers