CNN Injected Transformer for Image Exposure Correction
- URL: http://arxiv.org/abs/2309.04366v1
- Date: Fri, 8 Sep 2023 14:53:00 GMT
- Title: CNN Injected Transformer for Image Exposure Correction
- Authors: Shuning Xu, Xiangyu Chen, Binbin Song, Jiantao Zhou
- Abstract summary: Previous exposure correction methods based on convolutions often produce exposure deviation in images.
We propose a CNN Injected Transformer (CIT) to harness the individual strengths of CNN and Transformer simultaneously.
In addition to the hybrid architecture design for exposure correction, we apply a set of carefully formulated loss functions to improve the spatial coherence and rectify potential color deviations.
- Score: 20.282217209520006
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Capturing images with incorrect exposure settings fails to deliver a
satisfactory visual experience. Only when the exposure is properly set can the
color and details of the images be appropriately preserved. Previous exposure
correction methods based on convolutions often produce exposure deviation in
images as a consequence of the restricted receptive field of convolutional
kernels. This issue arises because convolutions are not capable of capturing
long-range dependencies in images accurately. To overcome this challenge, we
can apply the Transformer to address the exposure correction problem,
leveraging its capability in modeling long-range dependencies to capture global
representation. However, solely relying on the window-based Transformer leads
to visually disturbing blocking artifacts due to the application of
self-attention in small patches. In this paper, we propose a CNN Injected
Transformer (CIT) to harness the individual strengths of CNN and Transformer
simultaneously. Specifically, we construct the CIT by utilizing a window-based
Transformer to exploit the long-range interactions among different regions in
the entire image. Within each CIT block, we incorporate a channel attention
block (CAB) and a half-instance normalization block (HINB) to assist the
window-based self-attention to acquire the global statistics and refine local
features. In addition to the hybrid architecture design for exposure
correction, we apply a set of carefully formulated loss functions to improve
the spatial coherence and rectify potential color deviations. Extensive
experiments demonstrate that our image exposure correction method outperforms
state-of-the-art approaches in terms of both quantitative and qualitative
metrics.
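The two building blocks the abstract names, window-based processing and half-instance normalization, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `window_partition` and `half_instance_norm`, the nested-list feature layout, and the omission of learnable affine parameters are all assumptions made for illustration.

```python
from statistics import fmean, pvariance


def window_partition(grid, win):
    """Split a 2-D grid (list of rows) into non-overlapping win x win windows.

    Window-based self-attention would then run inside each window separately,
    so values in different windows never interact at this step.
    """
    h, w = len(grid), len(grid[0])
    if h % win or w % win:
        raise ValueError("this sketch assumes height and width divisible by win")
    return [
        [row[left:left + win] for row in grid[top:top + win]]
        for top in range(0, h, win)
        for left in range(0, w, win)
    ]


def half_instance_norm(channels, eps=1e-5):
    """Normalize the first half of the channels, pass the second half through.

    Each normalized channel is shifted and scaled by its own spatial mean and
    standard deviation (instance normalization without learnable parameters,
    an assumption of this sketch).
    """
    half = len(channels) // 2
    out = []
    for idx, ch in enumerate(channels):
        if idx < half:
            flat = [v for row in ch for v in row]
            mean = fmean(flat)
            std = (pvariance(flat, mu=mean) + eps) ** 0.5
            ch = [[(v - mean) / std for v in row] for row in ch]
        out.append(ch)
    return out


if __name__ == "__main__":
    # A 4x4 grid split into four 2x2 windows.
    grid = [[r * 4 + c for c in range(4)] for r in range(4)]
    for window in window_partition(grid, 2):
        print(window)

    # Two identical channels: only the first one gets normalized.
    feat = [[[1.0, 3.0], [5.0, 7.0]], [[1.0, 3.0], [5.0, 7.0]]]
    print(half_instance_norm(feat))
```

Because each window is processed in isolation, neighboring windows can end up with inconsistent corrections, which is one way to read the blocking artifacts mentioned above. Per-channel statistics computed over the whole feature map, as in the normalization step, are a global signal that does not depend on the window grid.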
Related papers
- Look-Around Before You Leap: High-Frequency Injected Transformer for Image Restoration [46.96362010335177]
In this paper, we propose HIT, a simple yet effective High-frequency Injected Transformer for image restoration.
Specifically, we design a window-wise injection module (WIM), which incorporates abundant high-frequency details into the feature map, to provide reliable references for restoring high-quality images.
In addition, we introduce a spatial enhancement unit (SEU) to preserve essential spatial relationships that may be lost due to the computations carried out across channel dimensions in the bidirectional interaction module (BIM).
arXiv Detail & Related papers (2024-03-30T08:05:00Z)
- A Non-Uniform Low-Light Image Enhancement Method with Multi-Scale Attention Transformer and Luminance Consistency Loss [11.585269110131659]
Low-light image enhancement aims to improve the perception of images collected in dim environments.
Existing methods cannot adaptively extract the differentiated luminance information, which will easily cause over-exposure and under-exposure.
We propose a multi-scale attention Transformer named MSATr, which sufficiently extracts local and global features for light balance to improve the visual quality.
arXiv Detail & Related papers (2023-12-27T10:07:11Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- Fearless Luminance Adaptation: A Macro-Micro-Hierarchical Transformer for Exposure Correction [65.5397271106534]
It is difficult for a single neural network to handle all exposure problems.
In particular, convolutions hinder the ability to restore faithful color or details in extremely over-/under-exposed regions.
We propose a Macro-Micro-Hierarchical transformer, which consists of a macro attention to capture long-range dependencies, a micro attention to extract local features, and a hierarchical structure for coarse-to-fine correction.
arXiv Detail & Related papers (2023-09-02T09:07:36Z)
- Image Deblurring by Exploring In-depth Properties of Transformer [86.7039249037193]
We leverage deep features extracted from a pretrained vision transformer (ViT) to encourage recovered images to be sharp without sacrificing the performance measured by the quantitative metrics.
By comparing the transformer features between recovered image and target one, the pretrained transformer provides high-resolution blur-sensitive semantic information.
One regards the features as vectors and computes the discrepancy between representations extracted from recovered image and target one in Euclidean space.
arXiv Detail & Related papers (2023-03-24T14:14:25Z)
- CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that integrates the advantages of leveraging detailed spatial information from CNN and the global context provided by transformer for enhanced representation learning.
The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery.
The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z)
- Diverse Image Inpainting with Bidirectional and Autoregressive Transformers [55.21000775547243]
We propose BAT-Fill, an image inpainting framework with a novel bidirectional autoregressive transformer (BAT).
BAT-Fill inherits the merits of transformers and CNNs in a two-stage manner, which allows it to generate high-resolution content without being constrained by the quadratic complexity of attention in transformers.
arXiv Detail & Related papers (2021-04-26T03:52:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.