Fugu-MT Paper Translation (Abstract): DP$^2$O-SR: Direct Perceptual Preference Optimization for Real-World Image Super-Resolution

Paper overview: DP$^2$O-SR: Direct Perceptual Preference Optimization for Real-World Image Super-Resolution

arxiv url: http://arxiv.org/abs/2510.18851v1
Date: Tue, 21 Oct 2025 17:43:23 GMT
Status: Translation complete
Last updated in system: 2025-10-25 03:08:14.045386
Title: DP$^2$O-SR: Direct Perceptual Preference Optimization for Real-World Image Super-Resolution
Authors: Rongyuan Wu, Lingchen Sun, Zhengqiang Zhang, Shihao Wang, Tianhe Wu, Qiaosi Yi, Shuai Li, Lei Zhang
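Abstract summary: We introduce a framework that aligns generative models with perceptual preferences without requiring costly human annotations. We show that DP$^2$O-SR significantly improves perceptual quality and generalizes well to real-world benchmarks.
Reference score (independently computed attention score): 31.6824458800392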
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Benefiting from pre-trained text-to-image (T2I) diffusion models, real-world image super-resolution (Real-ISR) methods can synthesize rich and realistic details. However, due to the inherent stochasticity of T2I models, different noise inputs often lead to outputs with varying perceptual quality. Although this randomness is sometimes seen as a limitation, it also introduces a wider perceptual quality range, which can be exploited to improve Real-ISR performance. To this end, we introduce Direct Perceptual Preference Optimization for Real-ISR (DP$^2$O-SR), a framework that aligns generative models with perceptual preferences without requiring costly human annotations. We construct a hybrid reward signal by combining full-reference and no-reference image quality assessment (IQA) models trained on large-scale human preference datasets. This reward encourages both structural fidelity and natural appearance. To better utilize perceptual diversity, we move beyond the standard best-vs-worst selection and construct multiple preference pairs from outputs of the same model. Our analysis reveals that the optimal selection ratio depends on model capacity: smaller models benefit from broader coverage, while larger models respond better to stronger contrast in supervision. Furthermore, we propose hierarchical preference optimization, which adaptively weights training pairs based on intra-group reward gaps and inter-group diversity, enabling more efficient and stable learning. Extensive experiments across both diffusion- and flow-based T2I backbones demonstrate that DP$^2$O-SR significantly improves perceptual quality and generalizes well to real-world benchmarks.
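The abstract describes three mechanisms: a hybrid reward built from full-reference and no-reference IQA scores, multiple preference pairs drawn from the outputs of one model, and pair weights that depend on reward gaps. The following is a minimal Python sketch of how those pieces could fit together. It is not the authors' implementation: the function names, the per-group z-normalization, the `fr_weight` and `select_ratio` parameters, and the gap-proportional weighting are all assumptions made for illustration, and the inter-group diversity term of hierarchical preference optimization is not modeled.

```python
from itertools import product
from statistics import mean, pstdev


def hybrid_reward(fr_scores, nr_scores, fr_weight=0.5):
    """Combine full-reference (FR) and no-reference (NR) scores per candidate.

    Assumption of this sketch: both score lists are "higher is better" and
    are z-normalized within the group so the two metrics share a scale.
    """
    def znorm(xs):
        mu, sigma = mean(xs), pstdev(xs)
        return [(x - mu) / sigma if sigma > 0 else 0.0 for x in xs]

    fr, nr = znorm(fr_scores), znorm(nr_scores)
    return [fr_weight * f + (1.0 - fr_weight) * n for f, n in zip(fr, nr)]


def build_preference_pairs(rewards, select_ratio=0.25):
    """Pair each top-ranked candidate with each bottom-ranked candidate.

    select_ratio (hypothetical parameter) is the fraction of the group taken
    from each end of the ranking; when only one candidate is taken per end,
    this reduces to best-vs-worst selection.
    Returns a list of (winner_index, loser_index, reward_gap) tuples.
    """
    n = len(rewards)
    if n < 2:
        return []
    order = sorted(range(n), key=lambda i: rewards[i], reverse=True)
    k = min(max(1, int(n * select_ratio)), n // 2)
    winners, losers = order[:k], order[-k:]
    return [(w, l, rewards[w] - rewards[l]) for w, l in product(winners, losers)]


def pair_weights(pairs):
    """Weight each pair in proportion to its reward gap (mean weight is 1).

    A simplified stand-in for gap-aware weighting, not the paper's scheme.
    """
    gaps = [gap for _, _, gap in pairs]
    total = sum(gaps)
    if total <= 0:
        return [1.0] * len(pairs)
    return [gap * len(gaps) / total for gap in gaps]


if __name__ == "__main__":
    # Made-up scores for four restorations of one low-quality input.
    fr = [0.9, 0.7, 0.8, 0.6]
    nr = [0.5, 0.6, 0.9, 0.2]
    rewards = hybrid_reward(fr, nr)
    pairs = build_preference_pairs(rewards, select_ratio=0.5)
    for (winner, loser, gap), weight in zip(pairs, pair_weights(pairs)):
        print(f"prefer {winner} over {loser}: gap={gap:.3f}, weight={weight:.3f}")
```

In this sketch a larger `select_ratio` gives broader coverage of the candidate set, while a smaller one keeps only the pairs with the strongest contrast, which is the trade-off the abstract ties to model capacity.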
