Fugu-MT Paper Translation (Summary): Retinex Meets Language: A Physics-Semantics-Guided Underwater Image Enhancement Network

Paper summary: Retinex Meets Language: A Physics-Semantics-Guided Underwater Image Enhancement Network

arxiv url: http://arxiv.org/abs/2603.07076v1
Date: Sat, 07 Mar 2026 07:10:20 GMT
Status: Translation complete
Last updated in system: 2026-03-10 15:13:13.790859
Title: Retinex Meets Language: A Physics-Semantics-Guided Underwater Image Enhancement Network
Authors: Shixuan Xu, Yabo Liu, Junyu Dong, Xinghui Dong
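Abstract summary: The authors propose a Physics-Semantics-Guided Underwater Image Enhancement Network (PSG-UIENet). The network comprises a Prior-Free Illumination Estimator, a Cross-Modal Text Aligner and a Semantics-Guided Image Restorer. The proposed PSG-UIENet achieves superior or comparable performance against fifteen state-of-the-art methods.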
Reference score (independently computed attention score): 44.83389527499136
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Underwater images often suffer from severe degradation caused by light absorption and scattering, leading to color distortion, low contrast and reduced visibility. Existing Underwater Image Enhancement (UIE) methods can be divided into two categories, i.e., prior-based and learning-based methods. The former rely on rigid physical assumptions that limit the adaptability, while the latter often face data scarcity and weak generalization. To address these issues, we propose a Physics-Semantics-Guided Underwater Image Enhancement Network (PSG-UIENet), which couples the Retinex-grounded illumination correction with the language-informed guidance. This network comprises a Prior-Free Illumination Estimator, a Cross-Modal Text Aligner and a Semantics-Guided Image Restorer. In particular, the restorer leverages the textual descriptions generated by the Contrastive Language-Image Pre-training (CLIP) model to inject high-level semantics for perceptually meaningful guidance. Since multimodal UIE data sets are not publicly available, we also construct a large-scale image-text UIE data set, namely, LUIQD-TD, which contains 6,418 image-reference-text triplets. To explicitly measure and optimize semantic consistency between textual descriptions and images, we further design an Image-Text Semantic Similarity (ITSS) loss function. To our knowledge, this study makes the first effort to introduce both textual guidance and the multimodal data set into UIE tasks. Extensive experiments on our data set and four publicly available data sets demonstrate that the proposed PSG-UIENet achieves superior or comparable performance against fifteen state-of-the-art methods.
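The abstract names two ingredients without giving formulas: a Retinex-based illumination correction and an image-text similarity loss. The sketch below illustrates the general ideas only and is not the authors' method. It assumes the classical Retinex model I = R * L (image = reflectance times illumination), substitutes a simple max-over-channels heuristic for the paper's learned Prior-Free Illumination Estimator, and uses "1 minus cosine similarity" as a stand-in for the ITSS loss. All function names are hypothetical, and the embedding vectors in the usage note are hand-written placeholders rather than real CLIP outputs.

```python
import math


def estimate_illumination(image):
    """Estimate per-pixel illumination as the max over the RGB channels.

    Assumption: a simple hand-crafted heuristic, standing in for the
    learned Prior-Free Illumination Estimator named in the abstract.
    `image` is a list of rows, each row a list of (r, g, b) tuples in [0, 1].
    """
    return [[max(pixel) for pixel in row] for row in image]


def retinex_reflectance(image, illumination, eps=1e-6):
    """Recover reflectance R from I = R * L by element-wise division.

    `eps` guards against division by zero in fully dark pixels, and the
    result is clipped to at most 1.0.
    """
    return [
        [tuple(min(c / max(l, eps), 1.0) for c in pixel)
         for pixel, l in zip(row, illum_row)]
        for row, illum_row in zip(image, illumination)
    ]


def cosine_similarity(u, v, eps=1e-12):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / max(norm_u * norm_v, eps)


def similarity_loss(image_embedding, text_embedding):
    """Assumed stand-in for an image-text similarity loss.

    Returns 0 when the embeddings point the same way and grows toward 2
    as they point in opposite directions.
    """
    return 1.0 - cosine_similarity(image_embedding, text_embedding)
```

As a usage example, a one-row image `[[(0.1, 0.4, 0.5), (0.0, 0.2, 0.2)]]` can be passed to `estimate_illumination` and then to `retinex_reflectance` to brighten each pixel relative to its strongest channel. In a real pipeline the two arguments of `similarity_loss` would come from an image encoder and a text encoder such as CLIP's, which this sketch does not load.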


Related papers