Fugu-MT Paper Translation (Abstract): EvaNet: Towards More Efficient and Consistent Infrared and Visible Image Fusion Assessment

Paper overview: EvaNet: Towards More Efficient and Consistent Infrared and Visible Image Fusion Assessment

arxiv url: http://arxiv.org/abs/2604.02896v1
Date: Fri, 03 Apr 2026 09:12:16 GMT
Status: Translation complete
Last updated in system: 2026-04-06 17:20:24.422438
Title: EvaNet: Towards More Efficient and Consistent Infrared and Visible Image Fusion Assessment
Authors: Chunyang Cheng, Tianyang Xu, Xiao-Jun Wu, Tao Zhou, Hui Li, Zhangyong Tang, Josef Kittler
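Abstract summary: Evaluation is essential in image fusion research, yet most existing metrics are borrowed directly from other vision tasks without proper adaptation. The authors propose a unified evaluation framework tailored for image fusion. Their learning-based evaluation paradigm delivers superior efficiency (up to 1,000 times faster) and greater consistency across a range of standard image fusion benchmarks.
Reference score (attention score computed independently by this site): 63.853717062482815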
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Evaluation is essential in image fusion research, yet most existing metrics are directly borrowed from other vision tasks without proper adaptation. These traditional metrics, often based on complex image transformations, not only fail to capture the true quality of the fusion results but are also computationally demanding. To address these issues, we propose a unified evaluation framework specifically tailored for image fusion. At its core is a lightweight network designed to efficiently approximate widely used metrics, following a divide-and-conquer strategy. Unlike conventional approaches that directly assess similarity between fused and source images, we first decompose the fusion result into infrared and visible components. The evaluation model is then used to measure the degree of information preservation in these separated components, effectively disentangling the fusion evaluation process. During training, we incorporate a contrastive learning strategy and inform our evaluation model by perceptual scene assessment provided by a large language model. Finally, we propose the first consistency evaluation framework, which measures the alignment between image fusion metrics and human visual perception, using both independent no-reference scores and downstream task performance as objective references. Extensive experiments show that our learning-based evaluation paradigm delivers both superior efficiency (up to 1,000 times faster) and greater consistency across a range of standard image fusion benchmarks. Our code will be publicly available at https://github.com/AWCXV/EvaNet.
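The abstract describes a consistency evaluation that measures how well a fusion metric aligns with objective references such as no-reference scores. It does not say how that alignment is quantified, so the sketch below is only an illustration under one assumption: alignment is taken to be Spearman rank correlation between a metric's scores and a reference score over the same set of fused images. The function names and the sample numbers are hypothetical and are not taken from the paper or its repository.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(metric_scores: Sequence[float], reference_scores: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    n = len(metric_scores)
    if n != len(reference_scores) or n < 2:
        raise ValueError("need two score lists of equal length >= 2")
    rx = average_ranks(metric_scores)
    ry = average_ranks(reference_scores)
    mean_x = sum(rx) / n
    mean_y = sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Hypothetical scores for five fused images (illustrative values only).
    reference = [0.62, 0.48, 0.91, 0.33, 0.75]   # e.g. a no-reference quality score
    metric_a = [0.58, 0.51, 0.88, 0.30, 0.70]    # ranks the images like the reference
    metric_b = [0.40, 0.90, 0.35, 0.80, 0.50]    # ranks the images quite differently
    print("metric_a consistency:", spearman(metric_a, reference))
    print("metric_b consistency:", spearman(metric_b, reference))
```

Under this assumption, a metric whose ranking of fused images matches the reference ranking scores close to 1, and one that ranks them in the opposite order scores close to -1. A rank-based measure is used here because it only compares orderings, so the metric and the reference do not need to share a scale.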


Related papers