Fugu-MT paper translation (abstract): CADC: Content Adaptive Diffusion-Based Generative Image Compression

Paper summary: CADC: Content Adaptive Diffusion-Based Generative Image Compression

arxiv url: http://arxiv.org/abs/2602.21591v1
Date: Wed, 25 Feb 2026 05:35:45 GMT
Status: translation complete
Last updated in system: 2026-02-26 18:19:16.714259
Title: CADC: Content Adaptive Diffusion-Based Generative Image Compression
Authors: Xihua Sheng, Lingyu Zhu, Tianyu Zhang, Dong Liu, Shiqi Wang, Jing Wang,
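Abstract summary: This paper proposes a content-adaptive diffusion-based image codec with three technical innovations: 1) an Uncertainty-Guided Adaptive Quantization method that learns spatial uncertainty maps to adaptively align quantization distortion with content characteristics; 2) an Auxiliary Decoder-Guided Information Concentration method that uses a lightweight auxiliary decoder to enforce content-aware information preservation in the primary latent channels.
Reference score (independently computed attention score): 22.243145406394927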
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Diffusion-based generative image compression has demonstrated remarkable potential for achieving realistic reconstruction at ultra-low bitrates. The key to unlocking this potential lies in making the entire compression process content-adaptive, ensuring that the encoder's representation and the decoder's generative prior are dynamically aligned with the semantic and structural characteristics of the input image. However, existing methods suffer from three critical limitations that prevent effective content adaptation. First, isotropic quantization applies a uniform quantization step, failing to adapt to the spatially varying complexity of image content and creating a misalignment with the diffusion model's noise-dependent prior. Second, the information concentration bottleneck -- arising from the dimensional mismatch between the high-dimensional noisy latent and the diffusion decoder's fixed input -- prevents the model from adaptively preserving essential semantic information in the primary channels. Third, existing textual conditioning strategies either need significant textual bitrate overhead or rely on generic, content-agnostic textual prompts, thereby failing to provide adaptive semantic guidance efficiently. To overcome these limitations, we propose a content-adaptive diffusion-based image codec with three technical innovations: 1) an Uncertainty-Guided Adaptive Quantization method that learns spatial uncertainty maps to adaptively align quantization distortion with content characteristics; 2) an Auxiliary Decoder-Guided Information Concentration method that uses a lightweight auxiliary decoder to enforce content-aware information preservation in the primary latent channels; and 3) a Bitrate-Free Adaptive Textual Conditioning method that derives content-aware textual descriptions from the auxiliary reconstructed image, enabling semantic guidance without bitrate cost.
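The contrast the abstract draws between isotropic quantization (one uniform step everywhere) and spatially adaptive quantization can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not say how the learned uncertainty map is turned into a quantization step, so the linear mapping below, the function names `adaptive_steps` and `quantize`, and the parameters `base_step` and `gain` are all hypothetical choices made for illustration only.

```python
from typing import List


def adaptive_steps(uncertainty: List[float],
                   base_step: float = 1.0,
                   gain: float = 1.0) -> List[float]:
    """Map an uncertainty value in [0, 1] to a per-position step size.

    ASSUMPTION: a linear mapping, step = base_step * (1 + gain * u).
    The source does not specify the real mapping.
    """
    return [base_step * (1.0 + gain * u) for u in uncertainty]


def quantize(latent: List[float], steps: List[float]) -> List[float]:
    """Round each latent value to the nearest multiple of its own step."""
    if len(latent) != len(steps):
        raise ValueError("latent and steps must have the same length")
    return [round(v / s) * s for v, s in zip(latent, steps)]


# Toy 1-D "latent" and a hand-written (not learned) uncertainty map.
latent = [0.3, 1.2, -0.7, 2.6]
uncertainty = [0.0, 0.0, 1.0, 1.0]

# Isotropic quantization: the same step at every position.
uniform = quantize(latent, [1.0] * len(latent))

# Adaptive quantization: the step grows where uncertainty is high.
steps = adaptive_steps(uncertainty)
adaptive = quantize(latent, steps)
```

In this toy setup the quantization error at each position is bounded by half of that position's step, so the distortion budget follows the uncertainty map instead of being uniform across the image.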

Related papers

Bridging Fidelity-Reality with Controllable One-Step Diffusion for Image Super-Resolution [59.71803719801537]
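CODSR is a controllable one-step diffusion network for image super-resolution. We propose an LQ-guided feature modulation module that provides high-fidelity conditioning to the diffusion process. We then develop a region-adaptive prior activation method to effectively enhance perceptual richness.
Paper reference translation (metadata) (2025-12-16T03:56:02Z)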
Content Adaptive based Motion Alignment Framework for Learned Video Compression [72.13599533975413]
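This paper proposes a content-adaptive motion alignment framework. First, we introduce a two-stage flow-guided deformable warping mechanism that refines motion compensation through coarse offset prediction and mask modulation. Second, we propose a multi-reference quality-aware strategy that adjusts distortion weights based on reference quality. Third, we integrate a training-free module that downsamples frames according to motion magnitude and resolution to obtain smooth motion estimation.
Paper reference translation (metadata) (2025-12-15T02:51:47Z)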
PosDiffAE: Position-aware Diffusion Auto-encoder For High-Resolution Brain Tissue Classification Incorporating Artifact Restoration [0.5442686600296733]
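We devise a mechanism that structures the latent space of a diffusion auto-encoder model so that it recognizes region-specific cellular patterns in brain images. We also devise an unsupervised tear-artifact restoration technique that exploits the latent representations and the constrained generation capability of the diffusion model at inference time.
Paper reference translation (metadata) (2025-07-03T07:58:53Z)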
Efficient Semantic Communication Through Transformer-Aided Compression [31.285983939625098]
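We propose a channel-aware adaptive framework for semantic communication. Using vision transformers, we interpret the attention mask as a measure of the semantic content of the patches. Our method improves communication efficiency by adapting the encoding resolution to content relevance.
Paper reference translation (metadata) (2024-12-02T18:57:28Z)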
Training-free Composite Scene Generation for Layout-to-Image Synthesis [29.186425845897947]
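This paper proposes a novel training-free approach to overcome adversarial semantic intersections during the diffusion conditioning phase. We introduce two innovative constraints: (1) an inter-token constraint that resolves token conflicts to ensure accurate concept synthesis, and (2) a self-attention constraint that improves pixel-to-pixel relationships. Our evaluation confirms the effectiveness of using layout information to guide the diffusion process, generating content-rich images with improved fidelity and complexity.
Paper reference translation (metadata) (2024-07-18T15:48:07Z)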
Compression-Realized Deep Structural Network for Video Quality Enhancement [78.13020206633524]
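This paper focuses on the task of quality enhancement for compressed videos. Most existing methods lack a structured design for optimally exploiting the priors within compression codecs. A new paradigm is urgently needed for a more conscious quality enhancement process.
Paper reference translation (metadata) (2024-05-10T09:18:17Z)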
Towards Extreme Image Compression with Latent Feature Guidance and Diffusion Prior [8.772652777234315]
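This paper proposes a novel two-stage extreme image compression framework that exploits the powerful generative capability of pre-trained diffusion models. The method outperforms state-of-the-art approaches in visual performance at extremely low bitrates.
Paper reference translation (metadata) (2024-04-29T16:02:38Z)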
HybridFlow: Infusing Continuity into Masked Codebook for Extreme Low-Bitrate Image Compression [51.04820313355164]
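HybridFlow combines a continuous feature-based stream with a codebook-based stream to achieve both high perceptual quality and high fidelity at extremely low bitrates. Experimental results show superior performance across multiple datasets at ultra-low bitrates.
Paper reference translation (metadata) (2024-04-20T13:19:08Z)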
Content Adaptive Latents and Decoder for Neural Image Compression [31.03018315147814]
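This paper proposes a new NIC framework that improves content adaptability in both the latents and the decoder. Specifically, to remove redundancy in the latents, we propose a Content Adaptive Channel Dropping (CACD) method that automatically selects the optimal quality levels for the latents. We also propose a Content Adaptive Feature Transformation (CAFT) method to improve content adaptability on the decoder side.
Paper reference translation (metadata) (2022-12-20T10:01:23Z)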
Learned Video Compression via Heterogeneous Deformable Compensation Network [78.72508633457392]
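To address the problem of unstable compression performance, we propose a learned video compression framework with a heterogeneous deformable compensation strategy (HDCVC). More specifically, the proposed algorithm extracts features from two adjacent frames and estimates content-neighborhood heterogeneous deformable (HetDeform) kernel offsets. Experimental results show that HDCVC outperforms recent state-of-the-art learned video compression approaches.
Paper reference translation (metadata) (2022-07-11T02:31:31Z)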
Modeling Lost Information in Lossy Image Compression [72.69327382643549]
Lossy image compression is one of the most commonly used operators for digital images. We propose a novel invertible framework called Invertible Lossy Compression (ILC).
Paper reference translation (metadata) (2020-06-22T04:04:56Z)

The related papers list is generated automatically from the titles and abstracts of the papers on this site.
