Fugu-MT Paper Translation (Abstract): Revisiting Generative Infrared and Visible Image Fusion Based on Human Cognitive Laws

Paper abstract: Revisiting Generative Infrared and Visible Image Fusion Based on Human Cognitive Laws

arxiv url: http://arxiv.org/abs/2510.26268v1
Date: Thu, 30 Oct 2025 08:53:13 GMT
Status: Translation complete
System update date: 2025-10-31 16:05:09.716535
Title: Revisiting Generative Infrared and Visible Image Fusion Based on Human Cognitive Laws
Title (reference translation): Re-examining Generative Infrared and Visible Image Fusion Based on Human Cognitive Laws
Authors: Lin Guo, Xiaoqing Luo, Wei Xie, Zhancheng Zhang, Hui Li, Rui Wang, Zhenhua Feng, Xiaoning Song
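Abstract summary: Existing infrared and visible image fusion methods often face the dilemma of balancing modal information. Inspired by human cognitive laws, this manuscript revisits the essence of generative image fusion and proposes a novel infrared and visible image fusion method, termed HCLFuse.
Reference score (independently computed attention score): 25.147481162887136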
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Existing infrared and visible image fusion methods often face the dilemma of balancing modal information. Generative fusion methods reconstruct fused images by learning from data distributions, but their generative capabilities remain limited. Moreover, the lack of interpretability in modal information selection further affects the reliability and consistency of fusion results in complex scenarios. This manuscript revisits the essence of generative image fusion under the inspiration of human cognitive laws and proposes a novel infrared and visible image fusion method, termed HCLFuse. First, HCLFuse investigates the quantification theory of information mapping in unsupervised fusion networks, which leads to the design of a multi-scale mask-regulated variational bottleneck encoder. This encoder applies posterior probability modeling and information decomposition to extract accurate and concise low-level modal information, thereby supporting the generation of high-fidelity structural details. Furthermore, the probabilistic generative capability of the diffusion model is integrated with physical laws, forming a time-varying physical guidance mechanism that adaptively regulates the generation process at different stages, thereby enhancing the ability of the model to perceive the intrinsic structure of data and reducing dependence on data quality. Experimental results show that the proposed method achieves state-of-the-art fusion performance in qualitative and quantitative evaluations across multiple datasets and significantly improves semantic segmentation metrics. This fully demonstrates the advantages of this generative image fusion method, drawing inspiration from human cognition, in enhancing structural consistency and detail quality.
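Abstract (reference translation): Existing infrared and visible image fusion methods often face the dilemma of balancing modal information. Generative fusion methods reconstruct the fused image by learning from data distributions, but their generative capability is limited. In addition, the lack of interpretability in modal information selection further affects the reliability and consistency of fusion results in complex scenarios. Inspired by human cognitive laws, this work reconsiders the essence of generative image fusion and proposes a new infrared and visible image fusion method, HCLFuse. First, HCLFuse investigates the quantification theory of information mapping in unsupervised fusion networks, which leads to the design of a multi-scale mask-regulated variational bottleneck encoder. This encoder applies posterior probability modeling and information decomposition to extract accurate and concise low-level modal information, supporting the generation of high-fidelity structural details. Furthermore, the probabilistic generative capability of the diffusion model is integrated with physical laws to form a time-varying physical guidance mechanism that adaptively regulates the generation process at different stages, which strengthens the model's ability to perceive the intrinsic structure of the data and reduces its dependence on data quality. Experimental results show that the proposed method achieves state-of-the-art fusion performance in qualitative and quantitative evaluations on multiple datasets and significantly improves semantic segmentation metrics. This demonstrates the advantages of this generative image fusion method, inspired by human cognition, in improving structural consistency and detail quality.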

Related papers