SGDFormer: One-stage Transformer-based Architecture for Cross-Spectral Stereo Image Guided Denoising
- URL: http://arxiv.org/abs/2404.00349v1
- Date: Sat, 30 Mar 2024 12:55:19 GMT
- Title: SGDFormer: One-stage Transformer-based Architecture for Cross-Spectral Stereo Image Guided Denoising
- Authors: Runmin Zhang, Zhu Yu, Zehua Sheng, Jiacheng Ying, Si-Yuan Cao, Shu-Jie Chen, Bailin Yang, Junwei Li, Hui-Liang Shen
- Abstract summary: We propose a one-stage transformer-based architecture, named SGDFormer, for cross-spectral Stereo image Guided Denoising.
Our transformer block contains a noise-robust cross-attention (NRCA) module and a spatially variant feature fusion (SVFF) module.
Thanks to the above design, our SGDFormer can restore artifact-free images with fine structures, and achieves state-of-the-art performance on various datasets.
- Score: 11.776198596143931
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cross-spectral image guided denoising has shown its great potential in recovering clean images with rich details, such as using the near-infrared image to guide the denoising process of the visible one. To obtain such image pairs, a feasible and economical way is to employ a stereo system, which is widely used on mobile devices. Current works attempt to generate an aligned guidance image to handle the disparity between two images. However, due to occlusion, spectral differences and noise degradation, the aligned guidance image generally contains ghosting and artifacts, leading to an unsatisfactory denoised result. To address this issue, we propose a one-stage transformer-based architecture, named SGDFormer, for cross-spectral Stereo image Guided Denoising. The architecture integrates the correspondence modeling and feature fusion of stereo images into a unified network. Our transformer block contains a noise-robust cross-attention (NRCA) module and a spatially variant feature fusion (SVFF) module. The NRCA module captures the long-range correspondence of two images in a coarse-to-fine manner to alleviate the interference of noise. The SVFF module further enhances salient structures and suppresses harmful artifacts through dynamically selecting useful information. Thanks to the above design, our SGDFormer can restore artifact-free images with fine structures, and achieves state-of-the-art performance on various datasets. Additionally, our SGDFormer can be extended to handle other unaligned cross-modal guided restoration tasks such as guided depth super-resolution.
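The abstract describes two mechanisms: cross-attention that matches the noisy view against the guidance view at a coarse scale to stay robust to noise, and a spatially variant fusion that keeps guidance information only where it is useful. Below is a minimal, illustrative PyTorch sketch of those two ideas; it is not the authors' SGDFormer implementation, and the module names, shapes and the single-scale attention are simplifying assumptions rather than the paper's actual coarse-to-fine design.

```python
# Illustrative sketch (not the authors' code): coarse cross-attention for
# noise-robust correspondence, followed by per-pixel weighted fusion.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoarseCrossAttention(nn.Module):
    """Cross-attention computed on downsampled features, so matching is less
    sensitive to per-pixel noise (a stand-in for the coarse stage of NRCA)."""
    def __init__(self, dim, heads=4, scale=4):
        super().__init__()
        self.scale = scale
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, noisy_feat, guide_feat):
        b, c, h, w = noisy_feat.shape
        # Downsample both views before matching.
        q = F.avg_pool2d(noisy_feat, self.scale)
        kv = F.avg_pool2d(guide_feat, self.scale)
        hq, wq = q.shape[-2:]
        q = q.flatten(2).transpose(1, 2)    # (b, hq*wq, c)
        kv = kv.flatten(2).transpose(1, 2)
        aligned, _ = self.attn(q, kv, kv)   # guidance gathered toward the noisy view
        aligned = aligned.transpose(1, 2).reshape(b, c, hq, wq)
        # Bring the coarsely aligned guidance back to full resolution.
        return F.interpolate(aligned, size=(h, w), mode="bilinear", align_corners=False)

class SpatiallyVariantFusion(nn.Module):
    """Predicts a per-pixel weight map so guidance is injected only where it
    helps, suppressing misaligned or artifact-prone regions (SVFF-style idea)."""
    def __init__(self, dim):
        super().__init__()
        self.weight = nn.Sequential(
            nn.Conv2d(2 * dim, dim, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(dim, 1, 1), nn.Sigmoid())

    def forward(self, noisy_feat, aligned_guide):
        w = self.weight(torch.cat([noisy_feat, aligned_guide], dim=1))
        return noisy_feat + w * aligned_guide

# Usage with dummy features from a NIR/RGB stereo pair:
noisy = torch.randn(1, 64, 128, 160)
guide = torch.randn(1, 64, 128, 160)
aligned = CoarseCrossAttention(64)(noisy, guide)
fused = SpatiallyVariantFusion(64)(noisy, aligned)
```

The point of the sketch is the division of labor: correspondence is estimated where noise has been averaged out, and fusion is gated per pixel instead of blending the warped guidance everywhere.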
Related papers
- Segmentation Guided Sparse Transformer for Under-Display Camera Image Restoration [91.65248635837145]
Under-Display Camera (UDC) is an emerging technology that achieves full-screen display by hiding the camera under the display panel.
In this paper, we observe that when using the Vision Transformer for UDC degraded image restoration, the global attention samples a large amount of redundant information and noise.
We propose a Segmentation Guided Sparse Transformer method (SGSFormer) for the task of restoring high-quality images from UDC degraded images.
arXiv Detail & Related papers (2024-03-09T13:11:59Z)
- DestripeCycleGAN: Stripe Simulation CycleGAN for Unsupervised Infrared Image Destriping [15.797480466799222]
CycleGAN has proven to be an effective approach for unsupervised image restoration.
We present a novel framework for single-frame infrared image destriping, named DestripeCycleGAN.
arXiv Detail & Related papers (2024-02-14T11:22:20Z)
- Low-light Stereo Image Enhancement and De-noising in the Low-frequency Information Enhanced Image Space [5.1569866461097185]
Methods are proposed to perform enhancement and de-noising simultaneously.
A low-frequency information enhanced module (IEM) is proposed to suppress noise and produce a new image space.
A cross-channel and spatial context information mining module (CSM) is proposed to encode long-range spatial dependencies.
An encoder-decoder structure is constructed, incorporating cross-view and cross-scale feature interactions.
arXiv Detail & Related papers (2024-01-15T15:03:32Z)
- A cross Transformer for image denoising [83.68175077524111]
We propose a cross Transformer denoising CNN (CTNet) with a serial block (SB), a parallel block (PB), and a residual block (RB).
CTNet is superior to some popular denoising methods in terms of real and synthetic image denoising.
arXiv Detail & Related papers (2023-10-16T13:53:19Z)
- Unified Frequency-Assisted Transformer Framework for Detecting and Grounding Multi-Modal Manipulation [109.1912721224697]
We present the Unified Frequency-Assisted transFormer framework, named UFAFormer, to address the DGM4 problem.
By leveraging the discrete wavelet transform, we decompose images into several frequency sub-bands, capturing rich face forgery artifacts.
Our proposed frequency encoder, incorporating intra-band and inter-band self-attentions, explicitly aggregates forgery features within and across diverse sub-bands.
arXiv Detail & Related papers (2023-09-18T11:06:42Z)
- DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior [70.46245698746874]
We present DiffBIR, a general restoration pipeline that could handle different blind image restoration tasks.
DiffBIR decouples the blind image restoration problem into two stages: 1) degradation removal: removing image-independent content; 2) information regeneration: generating the lost image content.
In the first stage, we use restoration modules to remove degradations and obtain high-fidelity restored results.
For the second stage, we propose IRControlNet that leverages the generative ability of latent diffusion models to generate realistic details.
arXiv Detail & Related papers (2023-08-29T07:11:52Z)
- EDICT: Exact Diffusion Inversion via Coupled Transformations [13.996171129586731]
Finding an initial noise vector that produces an input image when fed into the diffusion process (known as inversion) is an important problem.
We propose Exact Diffusion Inversion via Coupled Transformations (EDICT), an inversion method that draws inspiration from affine coupling layers.
EDICT enables mathematically exact inversion of real and model-generated images by maintaining two coupled noise vectors.
arXiv Detail & Related papers (2022-11-22T18:02:49Z)
- Learning Parallax Transformer Network for Stereo Image JPEG Artifacts Removal [17.289890973937318]
Under stereo settings, the performance of image JPEG artifacts removal can be further improved by exploiting the additional information provided by a second view.
We propose a novel parallax transformer network (PTNet) to integrate the information from stereo image pairs for stereo image JPEG artifacts removal.
arXiv Detail & Related papers (2022-07-15T08:21:53Z)
- Unsupervised Misaligned Infrared and Visible Image Fusion via Cross-Modality Image Generation and Registration [59.02821429555375]
We present a robust cross-modality generation-registration paradigm for unsupervised misaligned infrared and visible image fusion.
To better fuse the registered infrared images and visible images, we present a feature interaction fusion module (IFM).
arXiv Detail & Related papers (2022-05-24T07:51:57Z)
- Reconstructing the Noise Manifold for Image Denoising [56.562855317536396]
We introduce the idea of a cGAN which explicitly leverages structure in the image noise space.
By directly learning a low-dimensional manifold of the image noise, the generator promotes the removal from the noisy image of only the information that spans this manifold.
Based on our experiments, our model substantially outperforms existing state-of-the-art architectures.
arXiv Detail & Related papers (2020-02-11T00:31:31Z)