Fugu-MT Paper Translation (Abstract): LucidNFT: LR-Anchored Multi-Reward Preference Optimization for Generative Real-World Super-Resolution

Paper overview: LucidNFT: LR-Anchored Multi-Reward Preference Optimization for Generative Real-World Super-Resolution

arxiv url: http://arxiv.org/abs/2603.05947v1
Date: Fri, 06 Mar 2026 06:30:34 GMT
Status: Translation complete
Last updated in system: 2026-03-09 13:17:45.184441
Title: LucidNFT: LR-Anchored Multi-Reward Preference Optimization for Generative Real-World Super-Resolution
Title (reference translation): LucidNFT: LR-Anchored Multi-Reward Preference Optimization for Generative Real-World Super-Resolution
Authors: Song Fei, Tian Ye, Sixiang Chen, Zhaohu Xing, Jianyu Lai, Lei Zhu
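Abstract summary: Preference-based reinforcement learning (RL) is a natural fit because each LR input yields a rollout group of candidates to compare. We propose LucidNFT, a multi-reward RL framework for flow-matching Real-ISR. LucidNFT consistently improves strong flow-based Real-ISR baselines.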
Reference score (independently computed attention score): 21.290660354883595
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Generative real-world image super-resolution (Real-ISR) can synthesize visually convincing details from severely degraded low-resolution (LR) inputs, yet its stochastic sampling makes a critical failure mode hard to avoid: outputs may look sharp but be unfaithful to the LR evidence (semantic and structural hallucination), while such LR-anchored faithfulness is difficult to assess without HR ground truth. Preference-based reinforcement learning (RL) is a natural fit because each LR input yields a rollout group of candidates to compare. However, effective alignment in Real-ISR is hindered by (i) the lack of a degradation-robust LR-referenced faithfulness signal, and (ii) a rollout-group optimization bottleneck where naive multi-reward scalarization followed by normalization compresses objective-wise contrasts, causing advantage collapse and weakening the reward-weighted updates in DiffusionNFT-style forward fine-tuning. Moreover, (iii) limited coverage of real degradations restricts rollout diversity and preference signal quality. We propose LucidNFT, a multi-reward RL framework for flow-matching Real-ISR. LucidNFT introduces LucidConsistency, a degradation-robust semantic evaluator that makes LR-anchored faithfulness measurable and optimizable; a decoupled advantage normalization strategy that preserves objective-wise contrasts within each LR-conditioned rollout group before fusion, preventing advantage collapse; and LucidLR, a large-scale collection of real-world degraded images to support robust RL fine-tuning. Experiments show that LucidNFT consistently improves strong flow-based Real-ISR baselines, achieving better perceptual-faithfulness trade-offs with stable optimization dynamics across diverse real-world scenarios.
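Abstract (reference translation): Generative real-world image super-resolution (Real-ISR) can synthesize visually convincing details from severely degraded low-resolution (LR) inputs, but its stochastic sampling makes one critical failure mode hard to avoid: outputs may look sharp yet be unfaithful to the LR evidence (semantic and structural hallucination), and this LR-anchored faithfulness is difficult to assess without HR ground truth. Preference-based reinforcement learning (RL) is a natural fit because each LR input yields a rollout group of candidates to compare. However, effective alignment in Real-ISR is hindered by (i) the lack of a degradation-robust, LR-referenced faithfulness signal, and (ii) a rollout-group optimization bottleneck in which naive multi-reward scalarization followed by normalization compresses objective-wise contrasts, causing advantage collapse and weakening the reward-weighted updates in DiffusionNFT-style forward fine-tuning. Moreover, (iii) limited coverage of real degradations restricts rollout diversity and the quality of the preference signal. We propose LucidNFT, a multi-reward RL framework for flow-matching Real-ISR. LucidNFT introduces LucidConsistency, a degradation-robust semantic evaluator that makes LR-anchored faithfulness measurable and optimizable; a decoupled advantage normalization strategy that preserves objective-wise contrasts within each LR-conditioned rollout group before fusion, preventing advantage collapse; and LucidLR, a large-scale collection of real-world degraded images that supports robust RL fine-tuning. Experiments show that LucidNFT consistently improves strong flow-based Real-ISR baselines, achieving better perceptual-faithfulness trade-offs with stable optimization dynamics across diverse real-world scenarios.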
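
The contrast the abstract draws between "scalarization followed by normalization" and "decoupled advantage normalization ... before fusion" can be illustrated with a small sketch. This is not the paper's implementation: the abstract gives no formulas, so the z-score normalization, the weighted-sum fusion, the function names, and the toy reward values below are all assumptions chosen for illustration. The example shows only one way contrasts can be compressed, namely a scale mismatch between two rewards.

```python
from statistics import mean, pstdev


def zscore(values, eps=1e-8):
    """Standardize values within one rollout group (zero mean, unit std)."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / (sigma + eps) for v in values]


def scalarize_then_normalize(rewards, weights):
    """Naive baseline (assumed form): weighted sum of raw rewards, then one z-score."""
    fused = [sum(w * r for w, r in zip(weights, row)) for row in rewards]
    return zscore(fused)


def normalize_then_fuse(rewards, weights):
    """Decoupled variant (assumed form): z-score each objective, then weighted sum."""
    columns = list(zip(*rewards))              # one tuple of values per objective
    normed = [zscore(col) for col in columns]  # normalize each objective separately
    return [
        sum(w * col[i] for w, col in zip(weights, normed))
        for i in range(len(rewards))
    ]


# Hypothetical rollout group for a single LR input: four candidates, each scored
# as (perceptual quality on a 0-100 scale, LR faithfulness on a 0-1 scale).
group = [(70.0, 0.90), (72.0, 0.20), (71.0, 0.85), (69.0, 0.30)]
weights = (0.5, 0.5)

naive = scalarize_then_normalize(group, weights)
decoupled = normalize_then_fuse(group, weights)

best_naive = max(range(len(group)), key=lambda i: naive[i])
best_decoupled = max(range(len(group)), key=lambda i: decoupled[i])
print("naive advantages:    ", [round(a, 3) for a in naive])
print("decoupled advantages:", [round(a, 3) for a in decoupled])
print("top candidate (naive, decoupled):", best_naive, best_decoupled)
```

With these toy numbers, the naive variant ranks candidate 1 highest (the sharpest but least faithful output), because the raw perceptual score dominates the weighted sum. The decoupled variant ranks candidate 2 highest, since the faithfulness contrast survives normalization before the two objectives are fused.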
