Fugu-MT paper translation (summary): Identity Encoder for Personalized Diffusion

Paper summary: Identity Encoder for Personalized Diffusion

arxiv url: http://arxiv.org/abs/2304.07429v1
Date: Fri, 14 Apr 2023 23:32:24 GMT
Status: translation complete
Last updated in system: 2023-04-18 19:11:56.916397
Title: Identity Encoder for Personalized Diffusion
Authors: Yu-Chuan Su, Kelvin C.K. Chan, Yandong Li, Yang Zhao, Han Zhang, Boqing Gong, Huisheng Wang, Xuhui Jia
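Abstract summary: We propose an encoder-based approach to personalization. We learn an identity encoder that can extract an identity representation from a set of reference images of a subject. We show that the proposed approach consistently outperforms existing fine-tuning-based approaches in both image generation and reconstruction.
Reference score (independently computed attention measure): 57.1198884486401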
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Many applications can benefit from personalized image generation models, including image enhancement and video conferencing, to name a few. Existing works achieved personalization by fine-tuning one model for each person. While successful, this approach incurs additional computation and storage overhead for each new identity. Furthermore, it usually needs tens or hundreds of examples per identity to achieve the best performance. To overcome these challenges, we propose an encoder-based approach for personalization. We learn an identity encoder that can extract an identity representation from a set of reference images of a subject, together with a diffusion generator that can generate new images of the subject conditioned on the identity representation. Once trained, the model can generate images of arbitrary identities given a few examples, even if it has not been trained on those identities. Our approach greatly reduces the overhead of personalized image generation and is more applicable in many potential applications. Empirical results show that our approach consistently outperforms existing fine-tuning-based approaches in both image generation and reconstruction, and its outputs are preferred by users more than 95% of the time compared with the best-performing baseline.
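The abstract describes an identity encoder that maps a set of reference images to a single identity representation. The sketch below illustrates only that set-to-vector idea, and it is not the authors' architecture: the per-image embedding (a single linear projection), the mean pooling, and the names `make_weights`, `embed_image`, and `identity_representation` are all assumptions chosen for illustration. Images are stood in for by flat lists of floats.

```python
import random
from typing import List, Sequence

Vector = List[float]


def make_weights(out_dim: int, in_dim: int, seed: int = 0) -> List[Vector]:
    """Hypothetical stand-in for learned encoder weights (random here)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def embed_image(pixels: Sequence[float], weights: Sequence[Sequence[float]]) -> Vector:
    """Assumed per-image embedding: one linear projection of the flattened image."""
    return [sum(w * p for w, p in zip(row, pixels)) for row in weights]


def identity_representation(
    images: Sequence[Sequence[float]], weights: Sequence[Sequence[float]]
) -> Vector:
    """Pool per-image embeddings into one vector for the whole reference set.

    Mean pooling is an assumption of this sketch. It accepts any number of
    reference images and does not depend on the order they are given in.
    """
    if not images:
        raise ValueError("at least one reference image is required")
    embeddings = [embed_image(img, weights) for img in images]
    count = len(embeddings)
    # zip(*embeddings) iterates over embedding dimensions across all images.
    return [sum(column) / count for column in zip(*embeddings)]
```

In a full system, a vector like this would be passed to the diffusion generator as its conditioning signal. How the paper's generator consumes the representation is not stated in the abstract, so that step is left out here.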

