SwinFSR: Stereo Image Super-Resolution using SwinIR and Frequency Domain Knowledge
- URL: http://arxiv.org/abs/2304.12556v1
- Date: Tue, 25 Apr 2023 03:54:58 GMT
- Title: SwinFSR: Stereo Image Super-Resolution using SwinIR and Frequency Domain Knowledge
- Authors: Ke Chen, Liangyan Li, Huan Liu, Yunzhe Li, Congling Tang and Jun Chen
- Abstract summary: We propose a new StereoSR method, named SwinFSR, based on an extension of SwinIR, originally designed for single image restoration.
For the efficient and accurate fusion of stereo views, we propose a new cross-attention module referred to as RCAM.
- Score: 27.344004897917515
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stereo Image Super-Resolution (stereoSR) has attracted significant attention
in recent years due to the extensive deployment of dual cameras in mobile
phones, autonomous vehicles and robots. In this work, we propose a new StereoSR
method, named SwinFSR, based on an extension of SwinIR, originally designed for
single image restoration, and the frequency domain knowledge obtained by the
Fast Fourier Convolution (FFC). Specifically, to effectively gather global
information, we modify the Residual Swin Transformer blocks (RSTBs) in SwinIR
by explicitly incorporating the frequency domain knowledge using the FFC and
employing the resulting residual Swin Fourier Transformer blocks (RSFTBs) for
feature extraction. Besides, for the efficient and accurate fusion of stereo
views, we propose a new cross-attention module referred to as RCAM, which
achieves highly competitive performance while requiring less computational cost
than the state-of-the-art cross-attention modules. Extensive experimental
results and ablation studies demonstrate the effectiveness and efficiency of
our proposed SwinFSR.
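
The abstract names two architectural ingredients: injecting frequency-domain knowledge into the Swin blocks via Fast Fourier Convolution (FFC), and fusing the left/right views with a cross-attention module (RCAM). The snippet below is a minimal PyTorch sketch of both ideas under stated assumptions; the module names (SpectralBranch, CrossViewAttention), channel counts and the row-wise attention layout are illustrative, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): an FFC-style spectral branch with a global
# receptive field, and a bidirectional-style cross-view attention applied row by row,
# which is how stereo SR methods typically attend along the epipolar (width) axis.
import torch
import torch.nn as nn


class SpectralBranch(nn.Module):
    """Global receptive field via a real 2-D FFT over the spatial dims (FFC-style)."""

    def __init__(self, channels: int):
        super().__init__()
        # 1x1 convolution applied to concatenated real/imaginary parts in frequency space.
        self.freq_conv = nn.Sequential(
            nn.Conv2d(2 * channels, 2 * channels, kernel_size=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, C, H, W)
        _, _, h, w = x.shape
        freq = torch.fft.rfft2(x, norm="ortho")              # (B, C, H, W//2+1), complex
        freq = torch.cat([freq.real, freq.imag], dim=1)      # (B, 2C, H, W//2+1)
        freq = self.freq_conv(freq)
        real, imag = freq.chunk(2, dim=1)
        return torch.fft.irfft2(torch.complex(real, imag), s=(h, w), norm="ortho")


class CrossViewAttention(nn.Module):
    """Attention between left/right views along the width (epipolar) dimension."""

    def __init__(self, channels: int):
        super().__init__()
        self.q = nn.Conv2d(channels, channels, 1)
        self.k = nn.Conv2d(channels, channels, 1)
        self.v = nn.Conv2d(channels, channels, 1)
        self.scale = channels ** -0.5

    def forward(self, left: torch.Tensor, right: torch.Tensor) -> torch.Tensor:
        # (B, C, H, W) -> (B*H, W, C): each image row attends across the other view's row.
        def rows(t: torch.Tensor) -> torch.Tensor:
            b, c, h, w = t.shape
            return t.permute(0, 2, 3, 1).reshape(b * h, w, c)

        ql, kr, vr = rows(self.q(left)), rows(self.k(right)), rows(self.v(right))
        attn = torch.softmax(ql @ kr.transpose(1, 2) * self.scale, dim=-1)  # (B*H, W, W)
        fused = attn @ vr                                                   # (B*H, W, C)
        b, c, h, w = left.shape
        fused = fused.reshape(b, h, w, c).permute(0, 3, 1, 2)
        return left + fused  # residual fusion of right-view information into the left view


if __name__ == "__main__":
    left = torch.randn(1, 32, 30, 90)
    right = torch.randn(1, 32, 30, 90)
    print(SpectralBranch(32)(left).shape)            # torch.Size([1, 32, 30, 90])
    print(CrossViewAttention(32)(left, right).shape)  # torch.Size([1, 32, 30, 90])
```

In SwinFSR the spectral path would sit alongside the window-attention path inside each RSFTB rather than replace it, and the row-wise attention mirrors the common stereo SR practice of restricting cross-view matching to the horizontal epipolar direction.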
Related papers
- Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution [49.902047563260496] (2024-05-08)
We develop the first attempt to integrate the Vision State Space Model (Mamba) for remote sensing image (RSI) super-resolution.
To achieve better SR reconstruction, building upon Mamba, we devise a Frequency-assisted Mamba framework, dubbed FMSR.
Our FMSR features a multi-level fusion architecture equipped with the Frequency Selection Module (FSM), Vision State Space Module (VSSM), and Hybrid Gate Module (HGM).
- HAT: Hybrid Attention Transformer for Image Restoration [61.74223315807691] (2023-09-11)
Transformer-based methods have shown impressive performance in image restoration tasks, such as image super-resolution and denoising.
We propose a new Hybrid Attention Transformer (HAT) to activate more input pixels for better restoration.
Our HAT achieves state-of-the-art performance both quantitatively and qualitatively.
- RFR-WWANet: Weighted Window Attention-Based Recovery Feature Resolution Network for Unsupervised Image Registration [7.446209993071451] (2023-05-07)
The Swin Transformer has attracted attention in medical image analysis due to its computational efficiency and long-range modeling capability.
Registration models based on Transformers merge multiple voxels into a single semantic token, which limits them to modeling and generating coarse-grained spatial information.
We propose the Recovery Feature Resolution Network (RFRNet), which allows the Transformer to contribute fine-grained spatial information.
- Contextual Learning in Fourier Complex Field for VHR Remote Sensing Images [64.84260544255477] (2022-10-28)
Transformer-based models have demonstrated outstanding potential for learning high-order contextual relationships from natural images of general resolution (224x224 pixels).
We propose a complex self-attention (CSA) mechanism that models high-order contextual information with less than half the computation of naive self-attention.
By stacking layers of CSA blocks, we propose the Fourier Complex Transformer (FCT) model to learn global contextual information from VHR aerial images.
- ITSRN++: Stronger and Better Implicit Transformer Network for Continuous Screen Content Image Super-Resolution [32.441761727608856] (2022-10-17)
The proposed method achieves state-of-the-art performance for SCI SR (outperforming SwinIR by 0.74 dB for x3 SR) and also works well for natural image SR.
We construct a large-scale SCI2K dataset to facilitate research on SCI SR.
- SwinFIR: Revisiting the SwinIR with Fast Fourier Convolution and Improved Training for Image Super-Resolution [1.305100137416611] (2022-08-24)
We propose SwinFIR, which extends SwinIR with Fast Fourier Convolution (FFC) components.
Our algorithm achieves a PSNR of 32.83 dB on the Manga109 dataset, which is 0.8 dB higher than the state-of-the-art SwinIR method.
- Residual Swin Transformer Channel Attention Network for Image Demosaicing [3.8073142980733] (2022-04-14)
Deep neural networks have been widely used in image restoration, and in particular in demosaicing, attaining significant performance improvement.
Inspired by the success of SwinIR, we propose a novel Swin Transformer-based network for image demosaicing, called RSTCANet.
- Transformer-based SAR Image Despeckling [53.99620005035804] (2022-01-23)
We introduce a Transformer-based network for SAR image despeckling.
The proposed despeckling network comprises a Transformer-based encoder that allows the network to learn global dependencies between different image regions.
Experiments show that the proposed method achieves significant improvements over traditional and convolutional neural network-based despeckling methods.
- Deep Burst Super-Resolution [165.90445859851448] (2021-01-26)
We propose a novel architecture for the burst super-resolution task.
Our network takes multiple noisy RAW images as input and generates a denoised, super-resolved RGB image as output.
To enable training and evaluation on real-world data, we additionally introduce the BurstSR dataset.
- Frequency Consistent Adaptation for Real World Super Resolution [64.91914552787668] (2020-12-18)
We propose a novel Frequency Consistent Adaptation (FCA) that ensures frequency-domain consistency when applying Super-Resolution (SR) methods to real scenes.
We estimate degradation kernels from unsupervised images and generate the corresponding Low-Resolution (LR) images.
Based on the domain-consistent LR-HR pairs, we train easily implemented Convolutional Neural Network (CNN) SR models.
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.