Fugu-MT Paper Translation (Summary): HierEdit: Region-Aware Hierarchical Diffusion for Efficient High-Resolution Editing

Paper summary: HierEdit: Region-Aware Hierarchical Diffusion for Efficient High-Resolution Editing

arxiv url: http://arxiv.org/abs/2605.17294v1
Date: Sun, 17 May 2026 07:14:15 GMT
Status: Translation complete
Last updated in system: 2026-05-19 17:57:47.836369
Title: HierEdit: Region-Aware Hierarchical Diffusion for Efficient High-Resolution Editing
Authors: Yuyao Zhang, Alexander Huang-Menders, Yu-Wing Tai
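Abstract summary: High-resolution image editing is essential for professional and creative applications. Current approaches either redundantly process the entire image canvas or rely on large-scale high-resolution datasets. We introduce HierEdit, a region-aware hierarchical diffusion framework for fast and scalable high-resolution image editing.
Reference score (attention score computed independently by this site): 83.1290629939693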
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: High-resolution image editing is essential for professional and creative applications, yet existing multimodal diffusion-based editors remain computationally inefficient and constrained to relatively low resolutions. Current approaches redundantly process the entire image canvas or rely on large-scale high-resolution datasets, resulting in substantial training and inference costs. We introduce HierEdit, a region-aware hierarchical diffusion framework designed for efficient and scalable high-resolution image editing. Our method first performs edits on a low-resolution proxy using an off-the-shelf editing model to generate a reference and to localize the modified regions. A hierarchical local-window diffusion model (Local-Window MMDiT) then refines only the edited regions within the original high-resolution image, while reusing the unaltered regions as conditioning inputs. The low-resolution proxy further provides structural guidance and intermediate denoising supervision (Inference Acceleration), ensuring consistent global semantics and stable generation without the need for full-resolution attention computation. This targeted and hierarchical design enables fast, high-fidelity editing of images up to 4K resolution without any specialized high-resolution training data. Extensive experiments demonstrate that HierEdit achieves competitive visual quality on commodity-resolution datasets while significantly accelerating inference and extending seamlessly to ultra-high-resolution 4K editing. Project page: https://peteryyzhang.github.io/HierEdit-page/
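The region-localization idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how modified regions are detected, so the thresholded per-pixel difference below, the function name `edited_windows`, and the parameters `scale`, `window`, and `threshold` are all assumptions made for illustration. The sketch compares a low-resolution proxy before and after an edit, then reports which fixed-size windows of the high-resolution image overlap a changed proxy pixel. Only those windows would need refinement; the rest could be reused unchanged.

```python
from typing import List, Set, Tuple

Image = List[List[float]]  # grayscale image as rows of intensities (assumed format)


def edited_windows(before: Image, after: Image, scale: int,
                   window: int, threshold: float = 0.1) -> List[Tuple[int, int]]:
    """Return (row, col) indices of high-res windows that overlap an edited area.

    Hypothetical helper: `before` and `after` are the low-res proxy before and
    after editing, `scale` is the high-res / low-res size ratio, and `window`
    is the side length of a high-res window in pixels.
    """
    hits: Set[Tuple[int, int]] = set()
    for y, (row_b, row_a) in enumerate(zip(before, after)):
        for x, (b, a) in enumerate(zip(row_b, row_a)):
            if abs(a - b) <= threshold:
                continue  # proxy pixel unchanged: its high-res area is reused as-is
            # Low-res pixel (y, x) covers high-res rows y*scale .. (y+1)*scale - 1
            # and the corresponding range of columns.
            top, bottom = y * scale, (y + 1) * scale - 1
            left, right = x * scale, (x + 1) * scale - 1
            # Mark every window that this high-res block touches.
            for wr in range(top // window, bottom // window + 1):
                for wc in range(left // window, right // window + 1):
                    hits.add((wr, wc))
    return sorted(hits)


# Toy 4x4 proxy of a 16x16 image (scale 4), tiled into 8x8 windows.
before = [[0.0] * 4 for _ in range(4)]
after = [row[:] for row in before]
after[0][3] = 1.0  # a single edited proxy pixel in the top-right corner
windows = edited_windows(before, after, scale=4, window=8)
print(windows)  # [(0, 1)]
```

In this toy case one of the four 8x8 windows is selected, so a refinement stage would process a quarter of the canvas. A real system would likely smooth or dilate the change mask before selecting windows, which this sketch omits.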

Related papers