Graph Representation Learning for Infrared and Visible Image Fusion
- URL: http://arxiv.org/abs/2311.00291v1
- Date: Wed, 1 Nov 2023 04:46:20 GMT
- Title: Graph Representation Learning for Infrared and Visible Image Fusion
- Authors: Jing Li, Lu Bai, Bin Yang, Chang Li, Lingfei Ma, and Edwin R. Hancock
- Abstract summary: Infrared and visible image fusion aims to extract complementary features to synthesize a single fused image.
Many methods employ convolutional neural networks (CNNs) to extract local features.
CNNs fail to consider the image's non-local self-similarity (NLss).
- Score: 19.756524363404534
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Infrared and visible image fusion aims to extract complementary features to
synthesize a single fused image. Many methods employ convolutional neural
networks (CNNs) to extract local features due to their translation invariance
and locality. However, CNNs fail to consider the image's non-local
self-similarity (NLss); although pooling operations can expand the receptive
field, they still inevitably lead to information loss. In addition, the
transformer structure extracts long-range dependencies by considering the
correlations among all image patches, which introduces information redundancy
into transformer-based methods. In contrast, graph representation is more
flexible than grid (CNN) or sequence (transformer) representation for
addressing irregular objects, and a graph can also construct relationships
among spatially repeatable details or textures separated by large spatial
distances. Therefore, to address the above
issues, it is significant to convert images into the graph space and thus adopt
graph convolutional networks (GCNs) to extract NLss. This is because the graph
can provide a fine structure to aggregate features and propagate information
across the nearest vertices without introducing redundant information.
Concretely, we implement a cascaded NLss extraction pattern to extract intra-
and inter-modal NLss by exploring interactions among image pixels at intra- and
inter-image positional distances. We commence by performing GCNs on each
intra-modal input to aggregate features and propagate information, extracting
independent intra-modal NLss. Then, GCNs are performed on the concatenated
intra-modal NLss features of the infrared and visible images, which explores
the cross-domain NLss of the inter-modal pair to reconstruct the fused image.
Ablation studies and extensive experiments illustrate the effectiveness and
superiority of the proposed method on three datasets.
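The paper's implementation is not included on this page; the following is a minimal NumPy sketch of the cascaded pattern the abstract describes: a k-nearest-neighbour graph over patch features, one intra-modal graph convolution per modality, then an inter-modal graph convolution on the concatenated features. The function names, the k-NN graph construction, and the random weights are illustrative assumptions, not the authors' architecture.

```python
import numpy as np

def knn_adjacency(feats, k=4):
    """Build a symmetric k-NN adjacency matrix (with self-loops) from
    patch features. feats: (N, D) array, one feature vector per patch."""
    n = len(feats)
    # Pairwise squared Euclidean distances between patch features
    d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)          # exclude self from the k-NN step
    idx = np.argsort(d2, axis=1)[:, :k]   # k nearest vertices per node
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    adj = np.maximum(adj, adj.T)          # symmetrise the graph
    adj += np.eye(n)                      # self-loops for the GCN
    return adj

def gcn_layer(feats, adj, weight):
    """One graph-convolution step: symmetric-normalised aggregation over
    the nearest vertices, a linear map, then ReLU."""
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(1))
    norm_adj = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm_adj @ feats @ weight, 0.0)

def cascaded_fusion(ir_feats, vis_feats, k=4, seed=0):
    """Cascaded NLss sketch: intra-modal GCNs on each modality's own
    graph, then an inter-modal GCN on the concatenated NLss features."""
    rng = np.random.default_rng(seed)     # random weights, illustration only
    d = ir_feats.shape[1]
    w_intra = rng.standard_normal((d, d)) * 0.1
    # Stage 1: independent intra-modal NLss, one graph per modality
    ir_nlss = gcn_layer(ir_feats, knn_adjacency(ir_feats, k), w_intra)
    vis_nlss = gcn_layer(vis_feats, knn_adjacency(vis_feats, k), w_intra)
    # Stage 2: cross-domain NLss on the concatenated features
    fused_in = np.concatenate([ir_nlss, vis_nlss], axis=1)
    w_inter = rng.standard_normal((2 * d, d)) * 0.1
    return gcn_layer(fused_in, knn_adjacency(fused_in, k), w_inter)

# Toy usage: 16 patches per modality, 8-dimensional features each
rng = np.random.default_rng(1)
fused = cascaded_fusion(rng.standard_normal((16, 8)),
                        rng.standard_normal((16, 8)))
```

The graph route restricts aggregation to each vertex's nearest neighbours, which is the abstract's argument for avoiding the all-pairs redundancy of transformer attention.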
Related papers
- ASCNet: Asymmetric Sampling Correction Network for Infrared Image Destriping [26.460122241870696]
We propose a novel infrared image destriping method called Asymmetric Sampling Correction Network (ASCNet)
Our ASCNet consists of three core elements: Residual Haar Discrete Wavelet Transform (RHDWT), Pixel Shuffle (PS), and Column Non-uniformity Correction Module (CNCM)
arXiv Detail & Related papers (2024-01-28T06:23:55Z) - Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z) - Mutual-Guided Dynamic Network for Image Fusion [51.615598671899335]
We propose a novel mutual-guided dynamic network (MGDN) for image fusion, which allows for effective information utilization across different locations and inputs.
Experimental results on five benchmark datasets demonstrate that our proposed method outperforms existing methods on four image fusion tasks.
arXiv Detail & Related papers (2023-08-24T03:50:37Z) - Learning a Graph Neural Network with Cross Modality Interaction for Image Fusion [23.296468921842948]
Infrared and visible image fusion has gradually proved to be a vital branch of multi-modality imaging technologies.
We propose an interactive graph neural network (GNN)-based architecture between cross modality for fusion, called IGNet.
Our IGNet can generate visually appealing fused images while scoring, on average, 2.59% higher mAP@.5 in detection and 7.77% higher mIoU in segmentation.
arXiv Detail & Related papers (2023-08-07T02:25:06Z) - Multi-view Graph Convolutional Networks with Differentiable Node Selection [29.575611350389444]
We propose a framework dubbed Multi-view Graph Convolutional Network with Differentiable Node Selection (MGCN-DNS)
MGCN-DNS accepts multi-channel graph-structural data as inputs and aims to learn more robust graph fusion through a differentiable neural network.
The effectiveness of the proposed method is verified by rigorous comparisons with a broad range of state-of-the-art approaches.
arXiv Detail & Related papers (2022-12-09T21:48:36Z) - Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z) - Feature transforms for image data augmentation [74.12025519234153]
In image classification, many augmentation approaches utilize simple image manipulation algorithms.
In this work, we build ensembles on the data level by adding images generated by combining fourteen augmentation approaches.
Pretrained ResNet50 networks are finetuned on training sets that include images derived from each augmentation method.
arXiv Detail & Related papers (2022-01-24T14:12:29Z) - An attention-driven hierarchical multi-scale representation for visual recognition [3.3302293148249125]
Convolutional Neural Networks (CNNs) have revolutionized the understanding of visual content.
We propose a method to capture high-level long-range dependencies by exploring Graph Convolutional Networks (GCNs)
Our approach is simple yet extremely effective in solving both the fine-grained and generic visual classification problems.
arXiv Detail & Related papers (2021-10-23T09:22:22Z) - RSI-Net: Two-Stream Deep Neural Network Integrating GCN and Atrous CNN for Semantic Segmentation of High-resolution Remote Sensing Images [3.468780866037609]
A two-stream deep neural network for semantic segmentation of remote sensing images (RSI-Net) is proposed in this paper.
Experiments are implemented on the Vaihingen, Potsdam and Gaofen RSI datasets.
Results demonstrate the superior performance of RSI-Net in terms of overall accuracy, F1 score and kappa coefficient when compared with six state-of-the-art RSI semantic segmentation methods.
arXiv Detail & Related papers (2021-09-19T15:57:20Z) - Spectral-Spatial Global Graph Reasoning for Hyperspectral Image Classification [50.899576891296235]
Convolutional neural networks have been widely applied to hyperspectral image classification.
Recent methods attempt to address this issue by performing graph convolutions on spatial topologies.
arXiv Detail & Related papers (2021-06-26T06:24:51Z) - Multi-Level Graph Convolutional Network with Automatic Graph Learning for Hyperspectral Image Classification [63.56018768401328]
We propose a Multi-level Graph Convolutional Network (GCN) with Automatic Graph Learning method (MGCN-AGL) for HSI classification.
By employing an attention mechanism to characterize the importance of spatially neighboring regions, the most relevant information can be adaptively incorporated into decisions.
Our MGCN-AGL encodes the long range dependencies among image regions based on the expressive representations that have been produced at local level.
arXiv Detail & Related papers (2020-09-19T09:26:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.