Fugu-MT Paper Translation (Summary): Conditional Latent Coding with Learnable Synthesized Reference for Deep Image Compression

Paper summary: Conditional Latent Coding with Learnable Synthesized Reference for Deep Image Compression

arxiv url: http://arxiv.org/abs/2502.09971v1
Date: Fri, 14 Feb 2025 07:56:21 GMT
Status: Translation complete
Last updated in system: 2025-02-17 19:47:35.664716
Title: Conditional Latent Coding with Learnable Synthesized Reference for Deep Image Compression
Authors: Siqi Wu, Yinda Chen, Dong Liu, Zhihai He
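Abstract summary: This paper studies how to synthesize a dynamic reference from an external dictionary in order to perform conditional coding of the input image in the latent domain. For each input image, the method learns to synthesize a conditioning latent by selecting and synthesizing relevant features from the dictionary. The synthesized latent is then used to guide the encoding process, so that the correlation between the input image and the reference dictionary can be exploited more efficiently.
Reference score (independently computed attention score): 22.972154311937768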
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this paper, we study how to synthesize a dynamic reference from an external dictionary to perform conditional coding of the input image in the latent domain and how to learn the conditional latent synthesis and coding modules in an end-to-end manner. Our approach begins by constructing a universal image feature dictionary using a multi-stage approach involving modified spatial pyramid pooling, dimension reduction, and multi-scale feature clustering. For each input image, we learn to synthesize a conditioning latent by selecting and synthesizing relevant features from the dictionary, which significantly enhances the model's capability in capturing and exploring image source correlation. This conditional latent synthesis involves a correlation-based feature matching and alignment strategy, comprising a Conditional Latent Matching (CLM) module and a Conditional Latent Synthesis (CLS) module. The synthesized latent is then used to guide the encoding process, allowing for more efficient compression by exploiting the correlation between the input image and the reference dictionary. According to our theoretical analysis, the proposed conditional latent coding (CLC) method is robust to perturbations in the external dictionary samples and the selected conditioning latent, with an error bound that scales logarithmically with the dictionary size, ensuring stability even with large and diverse dictionaries. Experimental results on benchmark datasets show that our new method improves the coding performance by a large margin (up to 1.2 dB) with a very small overhead of approximately 0.5% bits per pixel. Our code is publicly available at https://github.com/ydchen0806/CLC.
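
The dictionary lookup described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (that is in the repository linked above). It assumes features are plain vectors and swaps in simple stand-ins: plain k-means in place of the multi-scale feature clustering, cosine similarity in place of the correlation-based matching, and a softmax-weighted average in place of the learned synthesis module. All function names and parameters here are hypothetical.

```python
import numpy as np


def build_dictionary(features, num_entries, num_iters=20, seed=0):
    """Cluster feature vectors with plain k-means; centroids act as dictionary entries.

    Assumption: k-means is only a stand-in for the paper's multi-scale clustering.
    """
    rng = np.random.default_rng(seed)
    # Initialise centroids from randomly chosen distinct feature vectors.
    idx = rng.choice(len(features), size=num_entries, replace=False)
    centroids = features[idx].copy()
    for _ in range(num_iters):
        # Squared Euclidean distance from every feature to every centroid.
        dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
        assign = dists.argmin(axis=1)
        for k in range(num_entries):
            members = features[assign == k]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[k] = members.mean(axis=0)
    return centroids


def match_top_k(query, dictionary, k=3):
    """Return indices and cosine similarities of the k best-matching entries."""
    q = query / (np.linalg.norm(query) + 1e-12)
    d = dictionary / (np.linalg.norm(dictionary, axis=1, keepdims=True) + 1e-12)
    sims = d @ q
    top = np.argsort(-sims)[:k]  # highest similarity first
    return top, sims[top]


def synthesize_reference(query, dictionary, k=3, temperature=0.1):
    """Blend the top-k entries with softmax weights over their similarities.

    Assumption: a fixed weighted average replaces the learned synthesis step.
    """
    top, sims = match_top_k(query, dictionary, k)
    logits = sims / temperature
    weights = np.exp(logits - logits.max())  # subtract max for numerical stability
    weights /= weights.sum()
    return weights @ dictionary[top]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=(200, 16))  # synthetic stand-in for image features
    dictionary = build_dictionary(features, num_entries=8)
    reference = synthesize_reference(features[0], dictionary)
    print(dictionary.shape, reference.shape)
```

In this sketch the synthesized reference is a convex combination of dictionary entries, so it always stays within the span of what the dictionary contains. A lower `temperature` makes the blend approach a hard nearest-entry selection.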
