Graph Representation Learning for Infrared and Visible Image Fusion
- URL: http://arxiv.org/abs/2311.00291v1
- Date: Wed, 1 Nov 2023 04:46:20 GMT
- Title: Graph Representation Learning for Infrared and Visible Image Fusion
- Authors: Jing Li, Lu Bai, Bin Yang, Chang Li, Lingfei Ma, and Edwin R. Hancock
- Abstract summary: Infrared and visible image fusion aims to extract complementary features to synthesize a single fused image.
Many methods employ convolutional neural networks (CNNs) to extract local features.
CNNs fail to consider the image's non-local self-similarity (NLss).
- Score: 19.756524363404534
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Infrared and visible image fusion aims to extract complementary features to
synthesize a single fused image. Many methods employ convolutional neural
networks (CNNs) to extract local features due to their translation invariance
and locality. However, CNNs fail to consider the image's non-local
self-similarity (NLss); although pooling operations can expand the receptive
field, they still inevitably lead to information loss. In addition, the
transformer structure extracts long-range dependencies by considering the
correlations among all image patches, which introduces information redundancy
into transformer-based methods. In contrast, graph representation is more
flexible than grid (CNN) or sequence (transformer) representation for
addressing irregular objects, and a graph can also construct relationships
among spatially repeatable details or textures separated by large spatial
distances. Therefore, to address the above
issues, it is significant to convert images into the graph space and thus adopt
graph convolutional networks (GCNs) to extract NLss. This is because the graph
can provide a fine structure to aggregate features and propagate information
across the nearest vertices without introducing redundant information.
Concretely, we implement a cascaded NLss extraction pattern to extract intra-
and inter-modal NLss by exploring interactions among image pixels at intra- and
inter-image positional distances. We commence by performing GCNs on each
intra-modal input to aggregate features and propagate information, extracting
independent intra-modal NLss. Then, GCNs are performed on the concatenated
intra-modal NLss features of the infrared and visible images, which explores
the cross-domain NLss of the inter-modal pair to reconstruct the fused image.
Ablation studies and extensive experiments illustrate the effectiveness and
superiority of the proposed method on three datasets.
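The paper's implementation is not included on this page; the following is a minimal NumPy sketch of the cascaded pattern the abstract describes: a k-nearest-neighbour graph over patch features, one intra-modal graph convolution per modality, then an inter-modal graph convolution on the concatenated features. The function names, the k-NN graph construction, and the random weights are illustrative assumptions, not the authors' architecture.

```python
import numpy as np

def knn_adjacency(feats, k=4):
    """Build a symmetric k-NN adjacency matrix (with self-loops) from
    patch features. feats: (N, D) array, one feature vector per patch."""
    n = len(feats)
    # Pairwise squared Euclidean distances between patch features
    d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)          # exclude self from the k-NN step
    idx = np.argsort(d2, axis=1)[:, :k]   # k nearest vertices per node
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    adj = np.maximum(adj, adj.T)          # symmetrise the graph
    adj += np.eye(n)                      # self-loops for the GCN
    return adj

def gcn_layer(feats, adj, weight):
    """One graph-convolution step: symmetric-normalised aggregation over
    the nearest vertices, a linear map, then ReLU."""
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(1))
    norm_adj = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm_adj @ feats @ weight, 0.0)

def cascaded_fusion(ir_feats, vis_feats, k=4, seed=0):
    """Cascaded NLss sketch: intra-modal GCNs on each modality's own
    graph, then an inter-modal GCN on the concatenated NLss features."""
    rng = np.random.default_rng(seed)     # random weights, illustration only
    d = ir_feats.shape[1]
    w_intra = rng.standard_normal((d, d)) * 0.1
    # Stage 1: independent intra-modal NLss, one graph per modality
    ir_nlss = gcn_layer(ir_feats, knn_adjacency(ir_feats, k), w_intra)
    vis_nlss = gcn_layer(vis_feats, knn_adjacency(vis_feats, k), w_intra)
    # Stage 2: cross-domain NLss on the concatenated features
    fused_in = np.concatenate([ir_nlss, vis_nlss], axis=1)
    w_inter = rng.standard_normal((2 * d, d)) * 0.1
    return gcn_layer(fused_in, knn_adjacency(fused_in, k), w_inter)

# Toy usage: 16 patches per modality, 8-dimensional features each
rng = np.random.default_rng(1)
fused = cascaded_fusion(rng.standard_normal((16, 8)),
                        rng.standard_normal((16, 8)))
```

The graph route restricts aggregation to each vertex's nearest neighbours, which is the abstract's argument for avoiding the all-pairs redundancy of transformer attention.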
Related papers
- ASCNet: Asymmetric Sampling Correction Network for Infrared Image Destriping [26.460122241870696]
We propose a novel infrared image destriping method called Asymmetric Sampling Correction Network (ASCNet)
Our ASCNet consists of three core elements: Residual Haar Discrete Wavelet Transform (RHDWT), Pixel Shuffle (PS), and Column Non-uniformity Correction Module (CNCM)
arXiv Detail & Related papers (2024-01-28T06:23:55Z) - Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z) - Mutual-Guided Dynamic Network for Image Fusion [51.615598671899335]
We propose a novel mutual-guided dynamic network (MGDN) for image fusion, which allows for effective information utilization across different locations and inputs.
Experimental results on five benchmark datasets demonstrate that our proposed method outperforms existing methods on four image fusion tasks.
arXiv Detail & Related papers (2023-08-24T03:50:37Z) - Learning a Graph Neural Network with Cross Modality Interaction for Image Fusion [23.296468921842948]
Infrared and visible image fusion has gradually proved to be a vital branch of multi-modality imaging technologies.
We propose an interactive graph neural network (GNN)-based architecture between cross modality for fusion, called IGNet.
Our IGNet can generate visually appealing fused images while scoring, on average, 2.59% higher mAP@.5 in detection and 7.77% higher mIoU in segmentation.
arXiv Detail & Related papers (2023-08-07T02:25:06Z) - Multi-view Graph Convolutional Networks with Differentiable Node Selection [29.575611350389444]
We propose a framework dubbed Multi-view Graph Convolutional Network with Differentiable Node Selection (MGCN-DNS)
MGCN-DNS accepts multi-channel graph-structural data as inputs and aims to learn more robust graph fusion through a differentiable neural network.
The effectiveness of the proposed method is verified by rigorous comparisons with a broad range of state-of-the-art approaches.
arXiv Detail & Related papers (2022-12-09T21:48:36Z) - Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z) - Feature transforms for image data augmentation [74.12025519234153]
In image classification, many augmentation approaches utilize simple image manipulation algorithms.
In this work, we build ensembles on the data level by adding images generated by combining fourteen augmentation approaches.
Pretrained ResNet50 networks are finetuned on training sets that include images derived from each augmentation method.
arXiv Detail & Related papers (2022-01-24T14:12:29Z) - An attention-driven hierarchical multi-scale representation for visual recognition [3.3302293148249125]
Convolutional Neural Networks (CNNs) have revolutionized the understanding of visual content.
We propose a method to capture high-level long-range dependencies by exploring Graph Convolutional Networks (GCNs)
Our approach is simple yet extremely effective in solving both the fine-grained and generic visual classification problems.
arXiv Detail & Related papers (2021-10-23T09:22:22Z) - RSI-Net: Two-Stream Deep Neural Network Integrating GCN and Atrous CNN for Semantic Segmentation of High-resolution Remote Sensing Images [3.468780866037609]
A two-stream deep neural network for semantic segmentation of remote sensing images (RSI-Net) is proposed in this paper.
Experiments are implemented on the Vaihingen, Potsdam and Gaofen RSI datasets.
Results demonstrate the superior performance of RSI-Net in terms of overall accuracy, F1 score and kappa coefficient when compared with six state-of-the-art RSI semantic segmentation methods.
arXiv Detail & Related papers (2021-09-19T15:57:20Z) - Spectral-Spatial Global Graph Reasoning for Hyperspectral Image Classification [50.899576891296235]
Convolutional neural networks have been widely applied to hyperspectral image classification.
Recent methods attempt to address this issue by performing graph convolutions on spatial topologies.
arXiv Detail & Related papers (2021-06-26T06:24:51Z) - Multi-Level Graph Convolutional Network with Automatic Graph Learning for Hyperspectral Image Classification [63.56018768401328]
We propose a Multi-level Graph Convolutional Network (GCN) with Automatic Graph Learning method (MGCN-AGL) for HSI classification.
By employing an attention mechanism to characterize the importance of spatially neighboring regions, the most relevant information can be adaptively incorporated into decisions.
Our MGCN-AGL encodes the long range dependencies among image regions based on the expressive representations that have been produced at local level.
arXiv Detail & Related papers (2020-09-19T09:26:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.