TransMEF: A Transformer-Based Multi-Exposure Image Fusion Framework
using Self-Supervised Multi-Task Learning
- URL: http://arxiv.org/abs/2112.01030v1
- Date: Thu, 2 Dec 2021 07:43:42 GMT
- Title: TransMEF: A Transformer-Based Multi-Exposure Image Fusion Framework
using Self-Supervised Multi-Task Learning
- Authors: Linhao Qu, Shaolei Liu, Manning Wang, Zhijian Song
- Abstract summary: We propose TransMEF, a transformer-based multi-exposure image fusion framework.
The framework is based on an encoder-decoder network, which can be trained on large natural image datasets.
- Score: 5.926203312586108
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose TransMEF, a transformer-based multi-exposure image
fusion framework that uses self-supervised multi-task learning. The framework
is based on an encoder-decoder network, which can be trained on large natural
image datasets and does not require ground truth fusion images. We design three
self-supervised reconstruction tasks according to the characteristics of
multi-exposure images and conduct these tasks simultaneously using multi-task
learning; through this process, the network can learn the characteristics of
multi-exposure images and extract more generalized features. In addition, to
compensate for the defect in establishing long-range dependencies in CNN-based
architectures, we design an encoder that combines a CNN module with a
transformer module. This combination enables the network to focus on both local
and global information. We evaluated our method and compared it to 11
competitive traditional and deep learning-based methods on the most recently released
multi-exposure image fusion benchmark dataset, and our method achieved the best
performance in both subjective and objective evaluations.
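To make the recipe concrete, here is a minimal PyTorch sketch of the two ideas the abstract describes: an encoder that combines a CNN branch (local features) with a Transformer branch (long-range dependencies), trained with a self-supervised multi-task objective that reconstructs corrupted natural images. The module sizes and the three corruption functions are illustrative assumptions, not the paper's exact configuration.
```python
# Sketch of a CNN+Transformer encoder trained by multi-task self-supervised
# reconstruction. Corruptions and hyperparameters are assumed for illustration.
import torch
import torch.nn as nn

class CNNTransformerEncoder(nn.Module):
    def __init__(self, dim=64, patch=8):
        super().__init__()
        self.cnn = nn.Sequential(                       # local-feature branch
            nn.Conv2d(1, dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU())
        self.patch = nn.Conv2d(1, dim, patch, stride=patch)   # patch embedding
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.up = nn.Upsample(scale_factor=patch, mode="nearest")
        self.fuse = nn.Conv2d(2 * dim, dim, 1)          # merge the two branches

    def forward(self, x):
        local = self.cnn(x)
        tokens = self.patch(x)                          # B, C, H/p, W/p
        b, c, h, w = tokens.shape
        glob = self.transformer(tokens.flatten(2).transpose(1, 2))
        glob = self.up(glob.transpose(1, 2).reshape(b, c, h, w))
        return self.fuse(torch.cat([local, glob], dim=1))

encoder = CNNTransformerEncoder()
decoder = nn.Sequential(nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(64, 1, 3, padding=1))

# Three illustrative corruptions standing in for the paper's three
# self-supervised reconstruction tasks.
corruptions = [
    lambda x: x.clamp(1e-3, 1) ** 2.2,                       # non-linear intensity
    lambda x: (x + 0.3).clamp(0, 1),                         # brightness shift
    lambda x: (x + 0.1 * torch.randn_like(x)).clamp(0, 1),   # additive noise
]

opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), 1e-4)
clean = torch.rand(4, 1, 64, 64)                      # stand-in natural images
loss = sum(nn.functional.mse_loss(decoder(encoder(c(clean))), clean)
           for c in corruptions)                      # multi-task objective
opt.zero_grad(); loss.backward(); opt.step()
```
Because training only requires reconstructing corrupted natural images, no ground-truth fused images are needed, which is what lets the framework train on large natural image datasets.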
Related papers
- Fusion from Decomposition: A Self-Supervised Approach for Image Fusion and Beyond [74.96466744512992]
The essence of image fusion is to integrate complementary information from source images.
DeFusion++ produces versatile fused representations that can enhance the quality of image fusion and the effectiveness of downstream high-level vision tasks.
arXiv Detail & Related papers (2024-10-16T06:28:49Z)
- Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model [80.61157097223058]
A prevalent strategy for bolstering image classification performance is to augment the training set with synthetic images generated by T2I models.
In this study, we scrutinize the shortcomings of both current generative and conventional data augmentation techniques.
We introduce an innovative inter-class data augmentation method known as Diff-Mix, which enriches the dataset by performing image translations between classes.
arXiv Detail & Related papers (2024-03-28T17:23:45Z)
- PROMPT-IML: Image Manipulation Localization with Pre-trained Foundation Models Through Prompt Tuning [35.39822183728463]
We present a novel Prompt-IML framework for detecting tampered images.
Humans tend to discern the authenticity of an image based on semantic and high-frequency information.
Our model can achieve better performance on eight typical fake image datasets.
arXiv Detail & Related papers (2024-01-01T03:45:07Z)
- CtxMIM: Context-Enhanced Masked Image Modeling for Remote Sensing Image Understanding [38.53988682814626]
We propose a context-enhanced masked image modeling method (CtxMIM) for remote sensing image understanding.
CtxMIM formulates original image patches as a reconstructive template and employs a Siamese framework to operate on two sets of image patches.
With this simple and elegant design, CtxMIM encourages the pre-training model to learn object-level or pixel-level features on a large-scale dataset.
arXiv Detail & Related papers (2023-09-28T18:04:43Z)
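As a rough illustration of the masked-image-modeling pattern this summary describes, the sketch below masks patches, passes two complementary patch sets through a weight-shared (Siamese) encoder, and reconstructs the image; the backbone, mask ratio, and loss are assumptions, not CtxMIM's actual design.
```python
# Hedged sketch of Siamese masked image modeling: complementary patch sets go
# through a weight-shared encoder, and the image is reconstructed from their
# combined features. All design choices here are illustrative assumptions.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(32, 32, 3, padding=1))
decoder = nn.Conv2d(32, 3, 3, padding=1)

def random_patch_mask(x, patch=16, ratio=0.5):
    b, _, h, w = x.shape
    m = (torch.rand(b, 1, h // patch, w // patch) < ratio).float()
    return m.repeat_interleave(patch, 2).repeat_interleave(patch, 3)

img = torch.rand(2, 3, 64, 64)                      # stand-in image batch
m = random_patch_mask(img)
feats = encoder(img * m) + encoder(img * (1 - m))   # two Siamese branches
loss = nn.functional.mse_loss(decoder(feats), img)  # reconstructive objective
```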
- Equivariant Multi-Modality Image Fusion [124.11300001864579]
We propose the Equivariant Multi-Modality imAge fusion (EMMA) paradigm for end-to-end self-supervised learning.
Our approach is rooted in the prior knowledge that natural imaging responses are equivariant to certain transformations.
Experiments confirm that EMMA yields high-quality fusion results for infrared-visible and medical images.
arXiv Detail & Related papers (2023-05-19T05:50:24Z)
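The equivariance prior fits in a few lines: fusing transformed sources should match transforming the fused result, i.e. fuse(T(x), T(y)) ≈ T(fuse(x, y)). In the sketch below, the fusion network and the rotation transform are placeholders, not EMMA's actual components.
```python
# Sketch of an equivariance-style self-supervised constraint for image fusion.
# The fusion net and the choice of transform are illustrative placeholders.
import torch
import torch.nn as nn

fuse = nn.Sequential(nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
                     nn.Conv2d(32, 1, 3, padding=1))

def T(img):                                  # example transform: 90-degree rotation
    return torch.rot90(img, 1, dims=(-2, -1))

x = torch.rand(2, 1, 64, 64)                 # e.g. infrared source
y = torch.rand(2, 1, 64, 64)                 # e.g. visible source
fused = fuse(torch.cat([x, y], dim=1))
fused_t = fuse(torch.cat([T(x), T(y)], dim=1))
equiv_loss = nn.functional.mse_loss(fused_t, T(fused))   # equivariance penalty
```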
- Multimodal Image Fusion based on Hybrid CNN-Transformer and Non-local Cross-modal Attention [12.167049432063132]
We present a hybrid model consisting of a convolutional encoder and a Transformer-based decoder to fuse multimodal images.
A branch fusion module is designed to adaptively fuse the features of the two branches.
arXiv Detail & Related papers (2022-10-18T13:30:52Z)
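One plausible reading of an "adaptively fused" branch module is a learned per-pixel gate that blends the features of the two branches; the sketch below is an assumed design for illustration, not the paper's actual module.
```python
# Illustrative branch-fusion module: per-pixel weights predicted from both
# branches decide how much of each branch to keep. An assumed design.
import torch
import torch.nn as nn

class BranchFusion(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(2 * dim, dim, 1), nn.ReLU(),
                                  nn.Conv2d(dim, 1, 1), nn.Sigmoid())

    def forward(self, feat_a, feat_b):
        w = self.gate(torch.cat([feat_a, feat_b], dim=1))  # B,1,H,W in [0,1]
        return w * feat_a + (1 - w) * feat_b               # adaptive blend

out = BranchFusion()(torch.rand(2, 64, 32, 32), torch.rand(2, 64, 32, 32))
```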
- Cross-receptive Focused Inference Network for Lightweight Image Super-Resolution [64.25751738088015]
Transformer-based methods have shown impressive performance in single image super-resolution (SISR) tasks.
However, Transformer designs that incorporate contextual information to extract features dynamically have been neglected.
We propose a lightweight Cross-receptive Focused Inference Network (CFIN) that consists of a cascade of CT Blocks mixed with CNN and Transformer.
arXiv Detail & Related papers (2022-07-06T16:32:29Z)
- Learning Enriched Features for Fast Image Restoration and Enhancement [166.17296369600774]
This paper pursues the holistic goal of maintaining spatially precise, high-resolution representations throughout the entire network.
We learn an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
Our approach achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
arXiv Detail & Related papers (2022-04-19T17:59:45Z)
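A minimal sketch of the multi-scale idea: keep a full-resolution stream for spatial detail while a downsampled stream gathers wider context, then merge the two. The topology below is an assumption for illustration.
```python
# Sketch of multi-scale feature enrichment: a high-resolution stream preserves
# detail while a pooled stream adds context. The exact topology is assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleBlock(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.full = nn.Conv2d(dim, dim, 3, padding=1)   # high-res stream
        self.half = nn.Conv2d(dim, dim, 3, padding=1)   # low-res context stream
        self.merge = nn.Conv2d(2 * dim, dim, 1)

    def forward(self, x):
        hi = self.full(x)                               # preserves spatial detail
        lo = self.half(F.avg_pool2d(x, 2))              # wider effective context
        lo = F.interpolate(lo, size=hi.shape[-2:], mode="bilinear",
                           align_corners=False)
        return self.merge(torch.cat([hi, lo], dim=1))   # enriched features

y = MultiScaleBlock()(torch.rand(1, 32, 64, 64))
```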
- TransFuse: A Unified Transformer-based Image Fusion Framework using Self-supervised Learning [5.849513679510834]
Image fusion is a technique that integrates complementary information from multiple source images to improve the richness of a single image.
Two-stage methods avoid the need for large amounts of task-specific training data by training an encoder-decoder network on large natural image datasets.
We propose a destruction-reconstruction based self-supervised training scheme to encourage the network to learn task-specific features.
arXiv Detail & Related papers (2022-01-19T07:30:44Z)
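The two-stage pattern this summary describes can be sketched briefly: pre-train an encoder-decoder to reconstruct deliberately "destroyed" natural images, then fuse at test time by merging encoder features of the sources with a hand-crafted rule. The corruption and the max-based rule below are illustrative assumptions.
```python
# Sketch of two-stage destruction-reconstruction training plus feature-level
# fusion at test time. Corruption and fusion rule are illustrative assumptions.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(32, 32, 3, padding=1))
decoder = nn.Sequential(nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(32, 1, 3, padding=1))

# Stage 1: destruction-reconstruction pre-training on natural images.
clean = torch.rand(4, 1, 64, 64)
destroyed = (clean + 0.2 * torch.randn_like(clean)).clamp(0, 1)  # assumed corruption
loss = nn.functional.mse_loss(decoder(encoder(destroyed)), clean)

# Stage 2: no fusion ground truth needed; merge encoder features of the
# source images with a hand-crafted rule, then decode.
src_a, src_b = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
with torch.no_grad():
    fused = decoder(torch.max(encoder(src_a), encoder(src_b)))
```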
- Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)