Rethinking the Evaluation of Visible and Infrared Image Fusion
- URL: http://arxiv.org/abs/2410.06811v1
- Date: Wed, 9 Oct 2024 12:12:08 GMT
- Title: Rethinking the Evaluation of Visible and Infrared Image Fusion
- Authors: Dayan Guan, Yixuan Wu, Tianzhu Liu, Alex C. Kot, Yanfeng Gu
- Abstract summary: Visible and Infrared Image Fusion (VIF) has garnered significant interest across a wide range of high-level vision tasks.
This paper proposes a Segmentation-oriented Evaluation Approach (SEA) to assess VIF methods.
- Score: 39.53356881392218
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Visible and Infrared Image Fusion (VIF) has garnered significant interest across a wide range of high-level vision tasks, such as object detection and semantic segmentation. However, the evaluation of VIF methods remains challenging due to the absence of ground truth. This paper proposes a Segmentation-oriented Evaluation Approach (SEA) to assess VIF methods by incorporating the semantic segmentation task and leveraging segmentation labels available in latest VIF datasets. Specifically, SEA utilizes universal segmentation models, capable of handling diverse images and classes, to predict segmentation outputs from fused images and compare these outputs with segmentation labels. Our evaluation of recent VIF methods using SEA reveals that their performance is comparable or even inferior to using visible images only, despite nearly half of the infrared images demonstrating better performance than visible images. Further analysis indicates that the two metrics most correlated to our SEA are the gradient-based fusion metric $Q_{\text{ABF}}$ and the visual information fidelity metric $Q_{\text{VIFF}}$ in conventional VIF evaluation metrics, which can serve as proxies when segmentation labels are unavailable. We hope that our evaluation will guide the development of novel and practical VIF methods. The code has been released in \url{https://github.com/Yixuan-2002/SEA/}.
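The SEA protocol scores a fusion method by running a universal segmentation model on the fused image and comparing its prediction against the dataset's segmentation labels. Below is a minimal sketch of the comparison step only; the segmentation model is assumed to exist upstream, and `mean_iou` is an illustrative helper, not the authors' released code.

```python
import numpy as np

def mean_iou(pred: np.ndarray, label: np.ndarray, num_classes: int) -> float:
    """Mean IoU between a predicted segmentation map and ground-truth labels."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        label_c = label == c
        union = np.logical_or(pred_c, label_c).sum()
        if union == 0:
            continue  # class absent from both maps; skip it
        inter = np.logical_and(pred_c, label_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious)) if ious else 0.0

# Toy 2x2 maps: prediction and label agree on 3 of 4 pixels.
pred = np.array([[0, 1], [1, 1]])
label = np.array([[0, 1], [0, 1]])
print(mean_iou(pred, label, num_classes=2))  # class IoUs 1/2 and 2/3, mean 7/12
```

In the actual SEA setting, `pred` would come from a universal segmentation model applied to the fused image, and the same score computed on visible-only and infrared-only inputs gives the baselines the paper compares against.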
Related papers
- Leveraging Out-of-Distribution Unlabeled Images: Semi-Supervised Semantic Segmentation with an Open-Vocabulary Model [5.4808525950169695]
In real-world scenarios, abundant unlabeled images are often available from online sources or large-scale datasets. Using these images as unlabeled data in semi-supervised learning can lead to inaccurate pseudo-labels. We propose a new semi-supervised semantic segmentation framework with an open-vocabulary segmentation model (SemiOVS) to effectively utilize unlabeled OOD images.
arXiv Detail & Related papers (2025-07-04T05:12:37Z) - Cross-Spectral Body Recognition with Side Information Embedding: Benchmarks on LLCM and Analyzing Range-Induced Occlusions on IJB-MDF [51.36007967653781]
Vision Transformers (ViTs) have demonstrated impressive performance across a wide range of biometric tasks, including face and body recognition. In this work, we adapt a ViT model pretrained on visible (VIS) imagery to the challenging problem of cross-spectral body recognition. Building on this idea, we integrate Side Information Embedding (SIE) and examine the impact of encoding domain and camera information to enhance cross-spectral matching. Surprisingly, our results show that encoding only camera information, without explicitly incorporating domain information, achieves state-of-the-art performance on the LLCM dataset.
arXiv Detail & Related papers (2025-06-10T16:20:52Z) - Intuitionistic Fuzzy Cognitive Maps for Interpretable Image Classification [2.130156029408832]
This paper introduces a novel framework, named Interpretable Intuitionistic FCM (I2FCM) which is domain-independent, simple to implement, and can be applied on CNN models.
To the best of our knowledge, this is the first time iFCMs are applied to image classification.
arXiv Detail & Related papers (2024-08-07T12:58:39Z) - CrossScore: Towards Multi-View Image Evaluation and Scoring [24.853612457257697]
Our cross-reference image quality assessment method fills a gap in the image assessment landscape.
Our method enables accurate image quality assessment without requiring ground truth references.
arXiv Detail & Related papers (2024-04-22T17:59:36Z) - SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation [69.42764583465508]
We explore the potential of generative image diffusion to address the scarcity of annotated data in earth observation tasks.
To the best of our knowledge, we are the first to generate both images and corresponding masks for satellite segmentation.
arXiv Detail & Related papers (2024-03-25T10:30:22Z) - Estimating Physical Information Consistency of Channel Data Augmentation for Remote Sensing Images [3.063197102484114]
We propose an approach to estimate whether a channel augmentation technique affects the physical information of RS images.
We compare the scores associated with original and augmented pixel signatures to evaluate the physical consistency.
arXiv Detail & Related papers (2024-03-21T16:48:45Z) - Enhancing Point Annotations with Superpixel and Confidence Learning Guided for Improving Semi-Supervised OCT Fluid Segmentation [17.85298271262749]
We propose the Superpixel and Confident Learning Guided Point Annotations Network (SCLGPA-Net), based on the teacher-student architecture. The Superpixel-Guided Pseudo-Label Generation (SGPLG) module generates pseudo-labels and pixel-level label trust maps. The Confident Learning Guided Label Refinement (CLGLR) module identifies errors in the pseudo-labels and refines them further.
arXiv Detail & Related papers (2023-06-05T04:21:00Z) - Reflection Invariance Learning for Few-shot Semantic Segmentation [53.20466630330429]
Few-shot semantic segmentation (FSS) aims to segment objects of unseen classes in query images with only a few annotated support images.
This paper proposes a fresh few-shot segmentation framework to mine the reflection invariance in a multi-view matching manner.
Experiments on both PASCAL-$5^i$ and COCO-$20^i$ datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-01T15:14:58Z) - Inverse Image Frequency for Long-tailed Image Recognition [59.40098825416675]
We propose a novel de-biasing method named Inverse Image Frequency (IIF).
IIF is a multiplicative margin adjustment transformation of the logits in the classification layer of a convolutional neural network.
Our experiments show that IIF surpasses the state of the art on many long-tailed benchmarks.
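The summary above describes IIF as a multiplicative margin adjustment of the classifier logits. One plausible form, sketched below under the assumption that per-class weights come from the log inverse image frequency, is to scale each class logit by that weight so tail classes receive the largest margins; the function names and the exact weighting are illustrative, not the paper's released implementation.

```python
import numpy as np

def iif_weights(class_freq: np.ndarray) -> np.ndarray:
    """Per-class multiplicative weights from inverse image frequency."""
    freq = np.asarray(class_freq, dtype=float)
    return np.log(1.0 / freq)

def adjust_logits(logits, weights) -> np.ndarray:
    """Multiplicative margin adjustment applied to classifier logits."""
    return np.asarray(logits, dtype=float) * weights

freq = np.array([0.7, 0.25, 0.05])  # head, mid, and tail class frequencies
w = iif_weights(freq)
print(adjust_logits([2.0, 2.0, 2.0], w))  # tail class gains the largest margin
```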
arXiv Detail & Related papers (2022-09-11T13:31:43Z) - A-FMI: Learning Attributions from Deep Networks via Feature Map Importance [58.708607977437794]
Gradient-based attribution methods can aid in the understanding of convolutional neural networks (CNNs).
The redundancy of attribution features and the gradient saturation problem are challenges that attribution methods still face.
We propose a new concept, feature map importance (FMI), to refine the contribution of each feature map, and a novel attribution method via FMI, to address the gradient saturation problem.
arXiv Detail & Related papers (2021-04-12T14:54:44Z) - Inter-class Discrepancy Alignment for Face Recognition [55.578063356210144]
We propose a unified framework called Inter-class Discrepancy Alignment (IDA).
IDA-DAO is used to align the similarity scores considering the discrepancy between the images and its neighbors.
IDA-SSE can provide convincing inter-class neighbors by introducing virtual candidate images generated with GAN.
arXiv Detail & Related papers (2021-03-02T08:20:08Z) - CryoNuSeg: A Dataset for Nuclei Instance Segmentation of Cryosectioned H&E-Stained Histological Images [2.809445852388983]
We introduce CryoNuSeg, the first fully annotated FS-derived cryosectioned and H&E-stained nuclei instance segmentation dataset.
The dataset contains images from 10 human organs that were not exploited in other publicly available datasets.
arXiv Detail & Related papers (2021-01-02T12:34:06Z) - Learning Discriminative Feature with CRF for Unsupervised Video Object Segmentation [34.1031534327244]
We introduce discriminative feature network (DFNet) to address the unsupervised video object segmentation task.
DFNet outperforms state-of-the-art methods by a large margin with a mean IoU score of 83.4%.
DFNet is also applied to the image object co-segmentation task.
arXiv Detail & Related papers (2020-08-04T01:53:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.