Fugu-MT Paper Translation (Summary): InstGenIE: Generative Image Editing Made Efficient with Mask-aware Caching and Scheduling

Paper summary: InstGenIE: Generative Image Editing Made Efficient with Mask-aware Caching and Scheduling

arxiv url: http://arxiv.org/abs/2505.20600v1
Date: Tue, 27 May 2025 00:36:56 GMT
Status: Translation complete
Last updated in system: 2025-05-28 17:05:58.327971
Title: InstGenIE: Generative Image Editing Made Efficient with Mask-aware Caching and Scheduling
Title (reference translation): InstGenIE: Efficient Generative Image Editing with Mask-aware Caching and Scheduling
Authors: Xiaoxiao Jiang, Suyi Li, Lingyun Yang, Tianyu Feng, Zhipeng Di, Weiyi Lu, Guoxuan Zhu, Xiu Lin, Kan Liu, Yinghao Yu, Tao Lan, Guodong Yang, Lin Qu, Liping Zhang, Wei Wang
Abstract summary: InstGenIE is a system that efficiently serves image editing requests. It achieves up to 3x higher throughput and reduces average request latency by up to 14.7x.
Reference score (attention score computed by this site): 8.098417193586748
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Generative image editing using diffusion models has become a prevalent application in today's AI cloud services. In production environments, image editing typically involves a mask that specifies the regions of an image template to be edited. The use of masks provides direct control over the editing process and introduces sparsity in the model inference. In this paper, we present InstGenIE, a system that efficiently serves image editing requests. The key insight behind InstGenIE is that image editing only modifies the masked regions of image templates while preserving the original content in the unmasked areas. Driven by this insight, InstGenIE judiciously skips redundant computations associated with the unmasked areas by reusing cached intermediate activations from previous inferences. To mitigate the high cache loading overhead, InstGenIE employs a bubble-free pipeline scheme that overlaps computation with cache loading. Additionally, to reduce queuing latency in online serving while improving the GPU utilization, InstGenIE proposes a novel continuous batching strategy for diffusion model serving, allowing newly arrived requests to join the running batch in just one step of denoising computation, without waiting for the entire batch to complete. As heterogeneous masks induce imbalanced loads, InstGenIE also develops a load balancing strategy that takes into account the loads of both computation and cache loading. Collectively, InstGenIE outperforms state-of-the-art diffusion serving systems for image editing, achieving up to 3x higher throughput and reducing average request latency by up to 14.7x while ensuring image quality.
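Abstract (reference translation): Generative image editing with diffusion models has become a widely used application in today's AI cloud services. In production environments, image editing typically involves a mask that specifies the regions of an image template to be edited. Using a mask gives direct control over the editing process and introduces sparsity into model inference. This paper presents InstGenIE, a system that serves image editing requests efficiently. The key insight behind InstGenIE is that image editing modifies only the masked regions of an image template while preserving the original content in the unmasked regions. Building on this insight, InstGenIE judiciously skips the redundant computation associated with unmasked regions by reusing intermediate activations cached from previous inferences. To mitigate the high overhead of cache loading, InstGenIE adopts a bubble-free pipeline scheme that overlaps computation with cache loading. In addition, to reduce queuing latency in online serving and improve GPU utilization, InstGenIE proposes a new continuous batching strategy for diffusion model serving, in which newly arrived requests join the running batch after just one denoising step instead of waiting for the whole batch to finish. Because heterogeneous masks produce imbalanced loads, InstGenIE also develops a load balancing strategy that accounts for both computation load and cache loading load. Overall, InstGenIE outperforms state-of-the-art diffusion serving systems for image editing, achieving up to 3x higher throughput and reducing average request latency by up to 14.7x while preserving image quality.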
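The mask-aware reuse described in the abstract can be illustrated with a small sketch. This is not InstGenIE's implementation: `ActivationCache`, `masked_forward`, `tokenwise_layer`, and the template id `"tmpl-1"` are hypothetical names introduced here, and the sketch assumes a purely token-wise layer, so an unmasked token's output cannot depend on the edited tokens. Real diffusion models contain attention layers that mix tokens, which this toy example does not model.

```python
from typing import Callable, Dict, List, Optional, Tuple

Token = List[float]


def tokenwise_layer(token: Token) -> Token:
    """Hypothetical stand-in for an expensive layer that acts on each token independently."""
    return [2.0 * x + 1.0 for x in token]


class ActivationCache:
    """Keeps one layer's output per image template (hypothetical helper)."""

    def __init__(self) -> None:
        self._store: Dict[str, List[Token]] = {}

    def get(self, template_id: str) -> Optional[List[Token]]:
        return self._store.get(template_id)

    def put(self, template_id: str, activations: List[Token]) -> None:
        # Copy so later edits to the caller's lists do not corrupt the cache.
        self._store[template_id] = [list(t) for t in activations]


def masked_forward(
    template_id: str,
    tokens: List[Token],
    mask: List[bool],
    cache: ActivationCache,
    layer: Callable[[Token], Token] = tokenwise_layer,
) -> Tuple[List[Token], int]:
    """Return (outputs, number of tokens actually computed)."""
    cached = cache.get(template_id)
    if cached is None:
        # Cold start: compute every token once and remember the result.
        outputs = [layer(t) for t in tokens]
        cache.put(template_id, outputs)
        return outputs, len(tokens)

    outputs: List[Token] = []
    computed = 0
    for token, is_masked, old in zip(tokens, mask, cached):
        if is_masked:
            outputs.append(layer(token))  # edited region: recompute
            computed += 1
        else:
            outputs.append(list(old))     # untouched region: reuse the cache
    return outputs, computed


cache = ActivationCache()
template = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]]

# First request on this template fills the cache.
_, n_first = masked_forward("tmpl-1", template, [False] * 4, cache)

# Second request edits only token 1, so only that token is recomputed.
edited = [list(t) for t in template]
edited[1] = [9.0, 9.0]
out, n_second = masked_forward("tmpl-1", edited, [False, True, False, False], cache)

print(n_first, n_second)  # 4 1
print(out[1])             # [19.0, 19.0]
```

Under the token-wise assumption, the reused outputs are identical to a full recomputation, and the work per request scales with the size of the masked region rather than the size of the image.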

Related papers

AttentionDrag: Exploiting Latent Correlation Knowledge in Pre-trained Diffusion Models for Image Editing [33.74477787349966]
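We propose AttentionDrag, a one-step point-based image editing method. The framework enables semantically consistent, high-quality manipulation without extensive re-optimization or retraining. Our results show that it outperforms state-of-the-art methods while running much faster.
Reference translation of paper (metadata) (2025-06-16T09:42:38Z)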
ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation [108.69315278353932]
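We introduce the Anonymous Region Transformer (ART), which facilitates the direct generation of variable multi-layer transparent images. By enabling precise control and scalable layer generation, ART establishes a new paradigm for interactive content creation.
Reference translation of paper (metadata) (2025-02-25T16:57:04Z)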
Mask Factory: Towards High-quality Synthetic Data Generation for Dichotomous Image Segmentation [70.95380821618711]
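Dichotomous Image Segmentation (DIS) tasks require highly precise annotations. Current generative models and techniques struggle with issues such as scene deviations, noise-induced errors, and limited variability in training samples. The proposed approach provides a scalable solution for generating diverse and precise datasets.
Reference translation of paper (metadata) (2024-12-26T06:37:25Z)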
DreamCache: Finetuning-Free Lightweight Personalized Image Generation via Feature Caching [38.46235896192237]
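We introduce DreamCache, a scalable approach for efficient, high-quality personalized image generation. DreamCache achieves state-of-the-art image and text alignment while using an order of magnitude fewer extra parameters.
Reference translation of paper (metadata) (2024-11-26T15:03:14Z)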
BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion [61.90969199199739]
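BrushNet is a novel plug-and-play dual-branch model designed to embed pixel-level masked image features into a pre-trained diffusion model (DM). BrushNet outperforms existing models on seven key metrics, including image quality, mask region preservation, and textual coherence.
Reference translation of paper (metadata) (2024-03-11T17:59:31Z)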
Towards Efficient Diffusion-Based Image Editing with Instant Attention Masks [43.079272743475435]
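This paper proposes a novel and efficient image editing method for text-to-image (T2I) diffusion models, called Instant Diffusion Editing (InstDiffEdit). In particular, InstDiffEdit aims to use the cross-modal attention of existing diffusion models to provide instant mask guidance during the diffusion steps. To complement existing evaluations of diffusion-based image editing (DIE), we propose a new benchmark called Editing-Mask that examines the mask accuracy and local editing ability of existing methods.
Reference translation of paper (metadata) (2024-01-15T14:25:54Z)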
Cache Me if You Can: Accelerating Diffusion Models through Block Caching [67.54820800003375]
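A large image-to-image network must be applied many times to iteratively refine an image from random noise. Investigating the behavior of the layers within the network, we find that 1) layer outputs change smoothly over time, 2) layers show distinct patterns of change, and 3) the change from step to step is very small. We propose a method that automatically determines a caching schedule based on how each block changes over time.
Reference translation of paper (metadata) (2023-12-06T00:51:38Z)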
Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation [78.13793505707952]
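Existing autoregressive models follow a two-stage generation paradigm: they first learn a codebook in the latent space for image reconstruction and then complete image generation autoregressively based on the learned codebook. We propose a two-stage framework built on the Masked Quantization VAE (MQ-VAE) and a Stack model.
Reference translation of paper (metadata) (2023-05-23T02:15:53Z)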
DiffEdit: Diffusion-based semantic image editing with mask guidance [64.555930158319]
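DiffEdit is a method that uses text-conditioned diffusion models for the task of semantic image editing. Our main contribution is the ability to automatically generate a mask that highlights the regions of the input image that need to be edited.
Reference translation of paper (metadata) (2022-10-20T17:16:37Z)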
Blended Latent Diffusion [18.043090347648157]
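We present an accelerated solution to the task of local text-driven editing of generic images, in which the desired edit is confined to a user-provided mask. Our method builds on a recent text-to-image latent diffusion model (LDM), which speeds up diffusion by operating in a lower-dimensional latent space.
Reference translation of paper (metadata) (2022-06-06T17:58:04Z)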
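
Several entries above, such as Blended Latent Diffusion, confine an edit to a user-provided mask in latent space. The sketch below shows one common way to do this, assuming the edited latent is blended with the original latent under the mask at every denoising step. The summaries above do not spell out the mechanism, so treat this as an illustration only. `denoise_step` and `noise_to_level` are hypothetical stand-ins for a real denoiser and noise scheduler, and the toy versions used here are deterministic so the result is easy to follow.

```python
from typing import Callable, List

Latent = List[float]


def blend(foreground: Latent, background: Latent, mask: List[float]) -> Latent:
    """Keep the foreground where mask is 1 and the background where mask is 0."""
    return [m * f + (1.0 - m) * b for f, b, m in zip(foreground, background, mask)]


def masked_edit(
    original: Latent,
    mask: List[float],
    steps: int,
    denoise_step: Callable[[Latent, int], Latent],
    noise_to_level: Callable[[Latent, int], Latent],
) -> Latent:
    """Run `steps` denoising steps, re-imposing the original content outside the mask each step."""
    latent = noise_to_level(original, steps)
    for t in range(steps, 0, -1):
        foreground = denoise_step(latent, t)           # edit proposed for the whole latent
        background = noise_to_level(original, t - 1)   # original content at the matching noise level
        latent = blend(foreground, background, mask)
    return latent


# Toy stand-ins (assumptions for illustration): the "denoiser" adds 1.0 to every
# element, and the "noise scheduler" returns the original latent unchanged.
def toy_denoise(z: Latent, t: int) -> Latent:
    return [x + 1.0 for x in z]


def toy_noise(z: Latent, t: int) -> Latent:
    return list(z)


result = masked_edit(
    original=[0.0, 0.0, 0.0, 0.0],
    mask=[0.0, 1.0, 1.0, 0.0],
    steps=3,
    denoise_step=toy_denoise,
    noise_to_level=toy_noise,
)
print(result)  # [0.0, 3.0, 3.0, 0.0]
```

Only the masked positions accumulate the edit, while the unmasked positions stay equal to the original latent.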

The related papers list is generated automatically from the titles and abstracts of the papers on this site.

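This page shows information about the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from its use.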