Exploring Real&Synthetic Dataset and Linear Attention in Image Restoration
- URL: http://arxiv.org/abs/2412.03814v2
- Date: Wed, 11 Dec 2024 07:50:40 GMT
- Title: Exploring Real&Synthetic Dataset and Linear Attention in Image Restoration
- Authors: Yuzhen Du, Teng Hu, Jiangning Zhang, Ran Yi, Chengming Xu, Xiaobin Hu, Kai Wu, Donghao Luo, Yabiao Wang, Lizhuang Ma
- Abstract summary: Image restoration aims to recover high-quality images from degraded inputs.
Existing methods lack a unified training benchmark for iterations and configurations.
We introduce a large-scale IR dataset called ReSyn, which employs a novel image filtering method based on image complexity.
- Score: 47.26304397935705
- License:
- Abstract: Image restoration (IR) aims to recover high-quality images from degraded inputs, with recent deep learning advancements significantly enhancing performance. However, existing methods lack a unified training benchmark for iterations and configurations. We also identify a bias in image complexity distributions between commonly used IR training and testing datasets, resulting in suboptimal restoration outcomes. To address this, we introduce a large-scale IR dataset called ReSyn, which employs a novel image filtering method based on image complexity to ensure a balanced distribution and includes both real and AIGC synthetic images. We establish a unified training standard that specifies iterations and configurations for image restoration models, focusing on measuring model convergence and restoration capability. Additionally, we enhance transformer-based image restoration models using linear attention mechanisms by proposing RWKV-IR, which integrates linear complexity RWKV into the transformer structure, allowing for both global and local receptive fields. Instead of directly using Vision-RWKV, we replace the original Q-Shift in RWKV with a Depth-wise Convolution shift to better model local dependencies, combined with Bi-directional attention for comprehensive linear attention. We also introduce a Cross-Bi-WKV module that merges two Bi-WKV modules with different scanning orders for balanced horizontal and vertical attention. Extensive experiments validate the effectiveness of our RWKV-IR model.
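The abstract's architectural claims (a depth-wise convolution token shift in place of Vision-RWKV's Q-Shift, bidirectional WKV attention, and a Cross-Bi-WKV module that merges two scanning orders) can be illustrated with a minimal PyTorch sketch. The module names, shapes, and the simplified cumulative-sum stand-in for the Bi-WKV recurrence below are assumptions for illustration, not the authors' released implementation.

```python
# Sketch of the ideas in the abstract: a depth-wise convolution token shift
# (replacing the original Q-Shift) and a Cross-Bi-WKV block that merges two
# bidirectional scans with different scanning orders. The Bi-WKV core is
# simplified to a cumulative linear-attention-style aggregation; names and
# shapes are illustrative assumptions.
import torch
import torch.nn as nn


class DWConvShift(nn.Module):
    """Depth-wise 3x3 convolution used as the token-shift operator."""
    def __init__(self, dim: int):
        super().__init__()
        self.dwconv = nn.Conv2d(dim, dim, kernel_size=3, padding=1, groups=dim)

    def forward(self, x, h, w):
        # x: (B, N, C) tokens of an h x w feature map
        b, n, c = x.shape
        feat = x.transpose(1, 2).reshape(b, c, h, w)
        feat = self.dwconv(feat)
        return feat.flatten(2).transpose(1, 2)


def bi_wkv(k, v):
    """Simplified bidirectional aggregation standing in for Bi-WKV:
    every token gathers forward and backward cumulative context."""
    weights = torch.softmax(k, dim=-1)                     # (B, N, C)
    fwd = torch.cumsum(weights * v, dim=1)                 # left-to-right
    bwd = torch.flip(torch.cumsum(torch.flip(weights * v, [1]), dim=1), [1])
    return fwd + bwd - weights * v                         # don't count self twice


class CrossBiWKV(nn.Module):
    """Merge two Bi-WKV passes with different scanning orders
    (row-major and column-major) for balanced horizontal/vertical attention."""
    def __init__(self, dim: int):
        super().__init__()
        self.shift = DWConvShift(dim)
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)
        self.out = nn.Linear(dim, dim)

    def forward(self, x, h, w):
        x = self.shift(x, h, w)
        k, v = self.key(x), self.value(x)
        row = bi_wkv(k, v)                                 # row-major scan
        b, n, c = x.shape
        def to_cols(t):                                    # reorder to column-major
            return t.reshape(b, h, w, c).transpose(1, 2).reshape(b, n, c)
        def from_cols(t):
            return t.reshape(b, w, h, c).transpose(1, 2).reshape(b, n, c)
        col = from_cols(bi_wkv(to_cols(k), to_cols(v)))    # column-major scan
        return self.out(0.5 * (row + col))


tokens = torch.randn(1, 16 * 16, 48)                        # 16x16 map, 48 channels
print(CrossBiWKV(48)(tokens, 16, 16).shape)                 # torch.Size([1, 256, 48])
```

In this sketch the two scans traverse the same tokens in row-major and column-major order, so horizontal and vertical context contribute symmetrically before the output projection, which is the balance the Cross-Bi-WKV description aims at.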
Related papers
- Enhanced Super-Resolution Training via Mimicked Alignment for Real-World Scenes [51.92255321684027]
We propose a novel plug-and-play module designed to mitigate misalignment issues by aligning LR inputs with HR images during training.
Specifically, our approach involves mimicking a novel LR sample that aligns with HR while preserving the characteristics of the original LR samples.
We comprehensively evaluate our method on synthetic and real-world datasets, demonstrating its effectiveness across a spectrum of SR models.
arXiv Detail & Related papers (2024-10-07T18:18:54Z) - Low-Res Leads the Way: Improving Generalization for Super-Resolution by Self-Supervised Learning [45.13580581290495]
This work introduces a novel "Low-Res Leads the Way" (LWay) training framework to enhance the adaptability of SR models to real-world images.
Our approach utilizes a low-resolution (LR) reconstruction network to extract degradation embeddings from LR images, merging them with super-resolved outputs for LR reconstruction.
Our training regime is universally compatible, requiring no network architecture modifications, making it a practical solution for real-world SR applications.
arXiv Detail & Related papers (2024-03-05T02:29:18Z) - Deep Equilibrium Diffusion Restoration with Parallel Sampling [120.15039525209106]
Diffusion model-based image restoration (IR) aims to use diffusion models to recover high-quality (HQ) images from degraded images, achieving promising performance.
Most existing methods need long serial sampling chains to restore HQ images step-by-step, resulting in expensive sampling time and high computation costs.
In this work, we rethink diffusion model-based IR from a different perspective, i.e., as a deep equilibrium (DEQ) fixed-point system, called DeqIR; a generic fixed-point sketch follows this list.
arXiv Detail & Related papers (2023-11-20T08:27:56Z) - Spectral Graphormer: Spectral Graph-based Transformer for Egocentric Two-Hand Reconstruction using Multi-View Color Images [33.70056950818641]
We propose a novel transformer-based framework that reconstructs two high-fidelity hands from multi-view RGB images.
We show that our framework is able to produce realistic two-hand reconstructions and demonstrate the generalisation of synthetic-trained models to real data.
arXiv Detail & Related papers (2023-08-21T20:07:02Z) - Recursive Generalization Transformer for Image Super-Resolution [108.67898547357127]
We propose the Recursive Generalization Transformer (RGT) for image SR, which can capture global spatial information and is suitable for high-resolution images.
We combine the RG-SA with local self-attention to enhance the exploitation of the global context.
Our RGT outperforms recent state-of-the-art methods quantitatively and qualitatively.
arXiv Detail & Related papers (2023-03-11T10:44:44Z) - Effective Invertible Arbitrary Image Rescaling [77.46732646918936]
Invertible Neural Networks (INN) are able to increase upscaling accuracy significantly by optimizing the downscaling and upscaling cycle jointly.
In this work, a simple and effective invertible arbitrary rescaling network (IARN) is proposed to achieve arbitrary image rescaling by training only one model.
It is shown to achieve a state-of-the-art (SOTA) performance in bidirectional arbitrary rescaling without compromising perceptual quality in LR outputs.
arXiv Detail & Related papers (2022-09-26T22:22:30Z) - Self-Supervised Coordinate Projection Network for Sparse-View Computed Tomography [31.774432128324385]
We propose a Self-supervised COordinate Projection nEtwork (SCOPE) to reconstruct the artifacts-free CT image from a single SV sinogram.
Compared with recent related works that solve similar problems using implicit neural representation network (INR), our essential contribution is an effective and simple re-projection strategy.
arXiv Detail & Related papers (2022-09-12T06:14:04Z) - Universal Generative Modeling for Calibration-free Parallel MR Imaging [13.875986147033002]
We present an unsupervised deep learning framework for calibration-free parallel MRI.
We make use of the merits of both the wavelet transform and an adaptive iteration strategy in a unified framework.
We train a powerful noise conditional score network by forming wavelet tensor as the network input.
arXiv Detail & Related papers (2022-01-25T10:05:39Z) - Spectral Compressive Imaging Reconstruction Using Convolution and Contextual Transformer [6.929652454131988]
We propose a hybrid network module, the convolution and contextual Transformer (CCoT) block, which acquires the inductive bias of convolution and the modeling ability of the Transformer simultaneously.
We integrate the proposed CCoT block into a deep unfolding framework based on the generalized alternating projection algorithm, and further propose the GAP-CT network.
arXiv Detail & Related papers (2022-01-15T06:30:03Z) - Single-Image HDR Reconstruction by Learning to Reverse the Camera Pipeline [100.5353614588565]
We propose to incorporate the domain knowledge of the LDR image formation pipeline into our model.
We model the HDR-to-LDR image formation pipeline as (1) dynamic range clipping, (2) non-linear mapping from a camera response function, and (3) quantization.
We demonstrate that the proposed method performs favorably against state-of-the-art single-image HDR reconstruction algorithms.
arXiv Detail & Related papers (2020-04-02T17:59:04Z)
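The DeqIR entry above describes image restoration as a deep equilibrium (DEQ) fixed-point system solved without a long serial sampling chain. A heavily simplified, generic sketch of that idea follows; the RestorationOperator, iteration count, and tolerance are illustrative assumptions and not DeqIR's actual network or solver.

```python
# Generic DEQ-style restoration sketch: the restored image is defined as the
# fixed point of a learned update operator and recovered by plain fixed-point
# iteration. The operator below is a toy stand-in, not DeqIR's network.
import torch
import torch.nn as nn


class RestorationOperator(nn.Module):
    """Toy update operator f(z, x): refines the estimate z conditioned on the
    degraded input x. Any image-to-image network could take its place."""
    def __init__(self, channels: int = 3, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2 * channels, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, channels, 3, padding=1),
        )

    def forward(self, z, x):
        return z + self.net(torch.cat([z, x], dim=1))  # residual update


@torch.no_grad()
def deq_restore(f, x, max_iter: int = 50, tol: float = 1e-4):
    """Solve z* = f(z*, x) by fixed-point iteration, starting from the input."""
    z = x.clone()
    for _ in range(max_iter):
        z_next = f(z, x)
        if (z_next - z).abs().mean() < tol:  # converged to the equilibrium
            return z_next
        z = z_next
    return z


degraded = torch.rand(1, 3, 64, 64)
restored = deq_restore(RestorationOperator(), degraded)
print(restored.shape)  # torch.Size([1, 3, 64, 64])
```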
This list is automatically generated from the titles and abstracts of the papers in this site.