Fugu-MT Paper Translation (Abstract): X2HDR: HDR Image Generation in a Perceptually Uniform Space

Paper overview: X2HDR: HDR Image Generation in a Perceptually Uniform Space

arxiv url: http://arxiv.org/abs/2602.04814v1
Date: Wed, 04 Feb 2026 17:59:51 GMT
Status: Translation complete
Last updated in system: 2026-02-05 19:45:11.671801
Title: X2HDR: HDR Image Generation in a Perceptually Uniform Space
Title (reference translation): X2HDR: HDR Image Generation in a Perceptually Uniform Space
Authors: Ronghuan Wu, Wanchao Su, Kede Ma, Jing Liao, Rafał K. Mantiuk
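Abstract summary: High-dynamic-range formats and displays are becoming increasingly prevalent, yet state-of-the-art image generators remain limited to low-dynamic-range (LDR) output. The paper shows that existing pretrained diffusion models can be easily adapted to HDR generation without retraining from scratch.
Reference score (independently computed attention metric): 37.83280929526874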
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: High-dynamic-range (HDR) formats and displays are becoming increasingly prevalent, yet state-of-the-art image generators (e.g., Stable Diffusion and FLUX) typically remain limited to low-dynamic-range (LDR) output due to the lack of large-scale HDR training data. In this work, we show that existing pretrained diffusion models can be easily adapted to HDR generation without retraining from scratch. A key challenge is that HDR images are natively represented in linear RGB, whose intensity and color statistics differ substantially from those of sRGB-encoded LDR images. This gap, however, can be effectively bridged by converting HDR inputs into perceptually uniform encodings (e.g., using PU21 or PQ). Empirically, we find that LDR-pretrained variational autoencoders (VAEs) reconstruct PU21-encoded HDR inputs with fidelity comparable to LDR data, whereas linear RGB inputs cause severe degradations. Motivated by this finding, we describe an efficient adaptation strategy that freezes the VAE and finetunes only the denoiser via low-rank adaptation in a perceptually uniform space. This results in a unified computational method that supports both text-to-HDR synthesis and single-image RAW-to-HDR reconstruction. Experiments demonstrate that our perceptually encoded adaptation consistently improves perceptual fidelity, text-image alignment, and effective dynamic range, relative to previous techniques.
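Abstract (reference translation): High-dynamic-range (HDR) formats and displays are becoming increasingly common, yet state-of-the-art image generators are generally limited to low-dynamic-range (LDR) output because large-scale HDR training data is lacking. This work shows that existing pretrained diffusion models can be readily adapted to HDR generation without retraining from scratch. The key challenge is that HDR images are natively represented in linear RGB, whose intensity and color statistics differ substantially from those of sRGB-encoded LDR images. This gap can be bridged effectively by converting HDR inputs into perceptually uniform encodings (e.g., PU21 or PQ). Empirically, LDR-pretrained variational autoencoders (VAEs) reconstruct PU21-encoded HDR inputs with fidelity comparable to LDR data, whereas linear RGB inputs cause severe degradation. Motivated by this finding, the authors describe an efficient adaptation strategy that freezes the VAE and finetunes only the denoiser via low-rank adaptation in a perceptually uniform space. The result is a unified computational method that supports both text-to-HDR synthesis and single-image RAW-to-HDR reconstruction. Experiments show that the perceptually encoded adaptation consistently improves perceptual fidelity, text-image alignment, and effective dynamic range relative to previous techniques.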
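The abstract names PQ as one perceptually uniform encoding that can bridge the gap between linear HDR values and the sRGB-like statistics an LDR-pretrained VAE expects. As a rough illustration only, the sketch below implements the standard PQ transfer function (SMPTE ST 2084) for a single luminance value. The function names `pq_encode` and `pq_decode` are hypothetical, and this is not the authors' pipeline: the abstract does not say how inputs are scaled, how color channels are handled, or how PU21 is applied.

```python
# SMPTE ST 2084 (PQ) constants, as defined in the standard.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32
PEAK_NITS = 10000.0  # PQ covers absolute luminance up to 10,000 cd/m^2


def pq_encode(luminance_nits: float) -> float:
    """Map absolute linear luminance (cd/m^2) to a PQ code value in [0, 1]."""
    # Clamp to the range PQ is defined on, then normalize to [0, 1].
    y = min(max(luminance_nits, 0.0), PEAK_NITS) / PEAK_NITS
    y_m1 = y ** M1
    return ((C1 + C2 * y_m1) / (1.0 + C3 * y_m1)) ** M2


def pq_decode(code: float) -> float:
    """Invert pq_encode: map a PQ code value in [0, 1] back to cd/m^2."""
    e = min(max(code, 0.0), 1.0) ** (1.0 / M2)
    y = (max(e - C1, 0.0) / (C2 - C3 * e)) ** (1.0 / M1)
    return y * PEAK_NITS


if __name__ == "__main__":
    # Linear luminances spanning six orders of magnitude land in [0, 1].
    for nits in (0.01, 1.0, 100.0, 1000.0, 10000.0):
        print(f"{nits:>8} cd/m^2 -> PQ code {pq_encode(nits):.4f}")
```

The point of the example is the compression of range: luminances from 0.01 to 10,000 cd/m^2 all map into [0, 1], which is closer to the value distribution of display-encoded LDR images than raw linear RGB is.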


Related papers