Variational Probabilistic Fusion Network for RGB-T Semantic Segmentation
- URL: http://arxiv.org/abs/2307.08536v1
- Date: Mon, 17 Jul 2023 14:53:09 GMT
- Title: Variational Probabilistic Fusion Network for RGB-T Semantic Segmentation
- Authors: Baihong Lin, Zengrong Lin, Yulan Guo, Yulan Zhang, Jianxiao Zou, Shicai Fan
- Abstract summary: RGB-T semantic segmentation has been widely adopted to handle hard scenes with poor lighting conditions.
Existing methods try to find an optimal fusion feature for segmentation, resulting in sensitivity to modality noise, class-imbalance, and modality bias.
This paper proposes a novel Variational Probabilistic Fusion Network (VPFNet), which regards fusion features as random variables.
- Score: 22.977168376864494
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: RGB-T semantic segmentation has been widely adopted to handle hard scenes
with poor lighting conditions by fusing different modality features of RGB and
thermal images. Existing methods try to find an optimal fusion feature for
segmentation, resulting in sensitivity to modality noise, class-imbalance, and
modality bias. To overcome the problems, this paper proposes a novel
Variational Probabilistic Fusion Network (VPFNet), which regards fusion
features as random variables and obtains robust segmentation by averaging
segmentation results under multiple samples of fusion features. The random
samples generation of fusion features in VPFNet is realized by a novel
Variational Feature Fusion Module (VFFM) designed based on variation attention.
To further avoid class-imbalance and modality bias, we employ the weighted
cross-entropy loss and introduce prior information of illumination and category
to control the proposed VFFM. Experimental results on MFNet and PST900 datasets
demonstrate that the proposed VPFNet can achieve state-of-the-art segmentation
performance.
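The abstract's two central ideas, averaging segmentation results over several random samples of the fusion feature and training with a class-weighted cross-entropy loss, can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the paper's VFFM: the Gaussian latent passed through a sigmoid gate, the linear `head` standing in for a decoder, and all function and variable names are hypothetical.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax over the class axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def sample_fusion_weights(mu, log_var, rng):
    # Assumption: a per-pixel Gaussian latent, sampled with the
    # reparameterisation trick and squashed to (0, 1) by a sigmoid.
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * log_var) * eps
    return 1.0 / (1.0 + np.exp(-z))

def averaged_segmentation(rgb_feat, thermal_feat, mu, log_var, head, n_samples, rng):
    # Draw several fusion features and average the class probabilities.
    probs = []
    for _ in range(n_samples):
        w = sample_fusion_weights(mu, log_var, rng)[..., None]  # (H, W, 1)
        fused = w * rgb_feat + (1.0 - w) * thermal_feat          # (H, W, C)
        probs.append(softmax(fused @ head))                      # (H, W, K)
    return np.mean(probs, axis=0)

def weighted_cross_entropy(probs, labels, class_weights):
    # Cross-entropy where each pixel is weighted by the weight of its class.
    h, w = labels.shape
    picked = probs[np.arange(h)[:, None], np.arange(w)[None, :], labels]
    wts = class_weights[labels]
    return float(-(wts * np.log(picked + 1e-12)).sum() / wts.sum())

# Toy usage with random stand-ins for encoder features (hypothetical sizes).
rng = np.random.default_rng(0)
H, W, C, K = 4, 5, 8, 3
rgb_feat = rng.standard_normal((H, W, C))
thermal_feat = rng.standard_normal((H, W, C))
mu = np.zeros((H, W))
log_var = np.zeros((H, W))
head = rng.standard_normal((C, K))

probs = averaged_segmentation(rgb_feat, thermal_feat, mu, log_var, head, 8, rng)
labels = rng.integers(0, K, size=(H, W))
class_weights = np.array([0.2, 1.0, 2.0])  # rarer classes get larger weights
loss = weighted_cross_entropy(probs, labels, class_weights)
```

Averaging probabilities rather than hard labels keeps the output a valid per-pixel distribution, so the same loss can be applied to the averaged prediction.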
Related papers
- Modality Prompts for Arbitrary Modality Salient Object Detection [57.610000247519196]
This paper delves into the task of arbitrary modality salient object detection (AM SOD).
It aims to detect salient objects from arbitrary modalities, e.g., RGB images, RGB-D images, and RGB-D-T images.
A novel modality-adaptive Transformer (MAT) is proposed to investigate two fundamental challenges of AM SOD.
arXiv Detail & Related papers (2024-05-06T11:02:02Z)
- Residual Spatial Fusion Network for RGB-Thermal Semantic Segmentation [19.41334573257174]
Traditional methods mostly use RGB images, which are heavily affected by lighting conditions, e.g., darkness.
Recent studies show thermal images are robust to the night scenario as a compensating modality for segmentation.
This work proposes a Residual Spatial Fusion Network (RSFNet) for RGB-T semantic segmentation.
arXiv Detail & Related papers (2023-06-17T14:28:08Z)
- Equivariant Multi-Modality Image Fusion [124.11300001864579]
We propose the Equivariant Multi-Modality imAge fusion paradigm for end-to-end self-supervised learning.
Our approach is rooted in the prior knowledge that natural imaging responses are equivariant to certain transformations.
Experiments confirm that EMMA yields high-quality fusion results for infrared-visible and medical images.
arXiv Detail & Related papers (2023-05-19T05:50:24Z)
- DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion [144.9653045465908]
We propose a novel fusion algorithm based on the denoising diffusion probabilistic model (DDPM).
Our approach yields promising fusion results in infrared-visible image fusion and medical image fusion.
arXiv Detail & Related papers (2023-03-13T04:06:42Z)
- CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion [138.40422469153145]
We propose a novel Correlation-Driven feature Decomposition Fusion (CDDFuse) network.
We show that CDDFuse achieves promising results in multiple fusion tasks, including infrared-visible image fusion and medical image fusion.
arXiv Detail & Related papers (2022-11-26T02:40:28Z)
- Mirror Complementary Transformer Network for RGB-thermal Salient Object Detection [16.64781797503128]
RGB-thermal salient object detection (RGB-T SOD) aims to locate the common prominent objects of an aligned visible and thermal infrared image pair.
In this paper, we propose a novel mirror complementary Transformer network (MCNet) for RGB-T SOD.
Experiments on benchmark and VT723 datasets show that the proposed method outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2022-07-07T20:26:09Z)
- Transformer-based Network for RGB-D Saliency Detection [82.6665619584628]
Key to RGB-D saliency detection is to fully mine and fuse information at multiple scales across the two modalities.
We show that transformer is a uniform operation which presents great efficacy in both feature fusion and feature enhancement.
Our proposed network performs favorably against state-of-the-art RGB-D saliency detection methods.
arXiv Detail & Related papers (2021-12-01T15:53:58Z)
- Anomaly Detection of Defect using Energy of Point Pattern Features within Random Finite Set Framework [5.7564383437854625]
We propose an efficient approach for industrial defect detection that is modeled based on anomaly detection using point pattern data.
We are the first to propose using transfer learning of local/point pattern features to overcome these limitations.
We evaluate the proposed approach on the MVTec AD dataset.
arXiv Detail & Related papers (2021-08-27T08:06:37Z)
- Learning Selective Mutual Attention and Contrast for RGB-D Saliency Detection [145.4919781325014]
How to effectively fuse cross-modal information is the key problem for RGB-D salient object detection.
Many models use the feature fusion strategy but are limited by the low-order point-to-point fusion methods.
We propose a novel mutual attention model by fusing attention and contexts from different modalities.
arXiv Detail & Related papers (2020-10-12T08:50:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.