Resolution Enhancement Processing on Low Quality Images Using Swin
Transformer Based on Interval Dense Connection Strategy
- URL: http://arxiv.org/abs/2303.09190v2
- Date: Sat, 13 May 2023 06:54:14 GMT
- Title: Resolution Enhancement Processing on Low Quality Images Using Swin
Transformer Based on Interval Dense Connection Strategy
- Authors: Rui-Yang Ju, Chih-Chia Chen, Jen-Shiun Chiang, Yu-Shian Lin, Wei-Han
Chen, Chun-Tse Chien
- Abstract summary: Transformer-based methods have demonstrated remarkable performance for image super-resolution compared to methods based on convolutional neural networks (CNNs).
This research work proposes the Interval Dense Connection Strategy, which connects different blocks according to a newly designed algorithm.
For real-life application, this work applies the latest version of the You Only Look Once (YOLOv8) model and the proposed model to perform object detection and real-life image super-resolution on low-quality images.
- Score: 1.5705307898493193
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformer-based methods have demonstrated remarkable
performance for image super-resolution in comparison to methods based on
convolutional neural networks (CNNs). However, using a self-attention
mechanism, as SwinIR (Image Restoration Using Swin Transformer) does, to
extract feature information from images requires a significant amount of
computational resources, which limits its application on low computing
power platforms. To improve model feature reuse, this research work
proposes the Interval Dense Connection Strategy, which connects different
blocks according to a newly designed algorithm. We apply this strategy to
SwinIR and present a new model, named SwinOIR (Object Image Restoration
Using Swin Transformer). For image super-resolution, an ablation study is
conducted to demonstrate the positive effect of the Interval Dense
Connection Strategy on model performance. Furthermore, we evaluate our
model on various popular benchmark datasets and compare it with other
state-of-the-art (SOTA) lightweight models. For example, SwinOIR obtains a
PSNR of 26.62 dB for x4 upscaling image super-resolution on the Urban100
dataset, which is 0.15 dB higher than the SOTA model SwinIR. For real-life
application, this work applies the latest version of the You Only Look
Once (YOLOv8) model and the proposed model to perform object detection and
real-life image super-resolution on low-quality images. The implementation
code is publicly available at https://github.com/Rubbbbbbbbby/SwinOIR.
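The abstract does not spell out the connection algorithm, so the following
toy PyTorch sketch shows only one plausible reading of an "interval dense
connection": each block additionally receives the fused outputs of earlier
blocks spaced a fixed interval apart. The interval rule, the 1x1 fusion
convolutions, and the plain convolutional stand-in for a Swin Transformer
block are all assumptions for illustration, not the authors' implementation
(see their repository for that).

```python
# Hypothetical interval-spaced skip wiring (NOT the paper's exact algorithm):
# block i also sees the outputs of blocks i-2, i-4, ... for interval=2.
import torch
import torch.nn as nn

class IntervalDenseBackbone(nn.Module):
    def __init__(self, dim=64, num_blocks=8, interval=2):
        super().__init__()
        self.interval = interval
        self.blocks = nn.ModuleList([
            nn.Sequential(                      # stand-in for a Swin block
                nn.Conv2d(dim, dim, 3, padding=1),
                nn.GELU(),
                nn.Conv2d(dim, dim, 3, padding=1),
            )
            for _ in range(num_blocks)
        ])
        # 1x1 convs merge the interval-spaced skip features before each block
        self.fuse = nn.ModuleList([nn.Conv2d(dim, dim, 1) for _ in range(num_blocks)])

    def forward(self, x):
        outputs = []
        for i, (block, fuse) in enumerate(zip(self.blocks, self.fuse)):
            donors = [outputs[j] for j in range(i - self.interval, -1, -self.interval)]
            if donors:                          # reuse earlier features
                x = x + fuse(sum(donors))
            x = block(x)
            outputs.append(x)
        return x

print(IntervalDenseBackbone()(torch.randn(1, 64, 32, 32)).shape)
# torch.Size([1, 64, 32, 32])
```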
Related papers
- Efficient Visual State Space Model for Image Deblurring [83.57239834238035]
Convolutional neural networks (CNNs) and Vision Transformers (ViTs) have achieved excellent performance in image restoration.
We propose a simple yet effective visual state space model (EVSSM) for image deblurring.
arXiv Detail & Related papers (2024-05-23T09:13:36Z)
- DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image Enhancement [77.0360085530701]
Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments.
Previous methods often idealize the degradation process and neglect the impact of medium noise and object motion on the distribution of image features.
Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space.
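As a rough illustration of "using predicted images to dynamically update
pseudo-labels", one common realization of that idea is an exponential
moving average of the network's predictions; the EMA form and the momentum
value below are assumptions, not DGNet's actual update rule.

```python
# Hedged sketch: refresh pseudo-labels toward the current prediction so the
# supervision target tracks the model as training progresses.
import torch

@torch.no_grad()
def update_pseudo_labels(pseudo: torch.Tensor,
                         prediction: torch.Tensor,
                         momentum: float = 0.9) -> torch.Tensor:
    """pseudo and prediction are image tensors of identical shape."""
    return momentum * pseudo + (1.0 - momentum) * prediction
```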
arXiv Detail & Related papers (2023-12-12T06:07:21Z)
- Degradation-Aware Self-Attention Based Transformer for Blind Image Super-Resolution [23.336576280389608]
We propose a degradation-aware self-attention-based Transformer model for learning the degradation representations of input images with unknown noise.
We apply our proposed model to several popular large-scale benchmark datasets for testing and achieve state-of-the-art performance.
Our method yields a PSNR of 32.43 dB on the Urban100 dataset at $\times 2$ scale, 0.94 dB higher than DASR, and 26.62 dB on the Urban100 dataset at $\times 4$ scale, a 0.26 dB improvement over KDSR.
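For reference, PSNR figures like those above are derived from the mean
squared error between restored and ground-truth images; a minimal sketch
follows. Note that published SR benchmarks usually evaluate on the
luminance channel with a cropped border, which this sketch omits.

```python
import torch

def psnr(x: torch.Tensor, y: torch.Tensor, max_val: float = 1.0) -> torch.Tensor:
    """Peak signal-to-noise ratio in dB between two images in [0, max_val]."""
    mse = torch.mean((x - y) ** 2)
    return 10.0 * torch.log10(max_val ** 2 / mse)
```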
arXiv Detail & Related papers (2023-10-06T11:52:31Z)
- HAT: Hybrid Attention Transformer for Image Restoration [61.74223315807691]
Transformer-based methods have shown impressive performance in image restoration tasks, such as image super-resolution and denoising.
We propose a new Hybrid Attention Transformer (HAT) to activate more input pixels for better restoration.
Our HAT achieves state-of-the-art performance both quantitatively and qualitatively.
arXiv Detail & Related papers (2023-09-11T05:17:55Z)
- ConvNeXt-ChARM: ConvNeXt-based Transform for Efficient Neural Image Compression [18.05997169440533]
We propose ConvNeXt-ChARM, an efficient ConvNeXt-based transform coding framework, paired with a compute-efficient channel-wise auto-regressive prior.
We show that ConvNeXt-ChARM brings consistent and significant BD-rate (PSNR) reductions of, on average, 5.24% and 1.22% over the versatile video coding (VVC) reference encoder (VTM-18.0) and the state-of-the-art learned image compression method SwinT-ChARM, respectively.
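The BD-rate numbers above come from the Bjontegaard delta measure, which
compares two rate-distortion curves. Below is a minimal sketch of the
classical computation (cubic fit of log-rate against PSNR, integrated over
the overlapping quality range); the exact variant used in the paper may
differ, and at least four rate points per curve are needed for the fit.

```python
import numpy as np

def bd_rate(rate_ref, psnr_ref, rate_test, psnr_test):
    """Average bitrate change of 'test' vs. 'ref' in percent (negative = savings)."""
    p_ref = np.polyfit(psnr_ref, np.log(rate_ref), 3)
    p_test = np.polyfit(psnr_test, np.log(rate_test), 3)
    lo = max(min(psnr_ref), min(psnr_test))      # overlapping PSNR range
    hi = min(max(psnr_ref), max(psnr_test))
    int_ref = np.polyval(np.polyint(p_ref), hi) - np.polyval(np.polyint(p_ref), lo)
    int_test = np.polyval(np.polyint(p_test), hi) - np.polyval(np.polyint(p_test), lo)
    avg_log_diff = (int_test - int_ref) / (hi - lo)
    return (np.exp(avg_log_diff) - 1.0) * 100.0
```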
arXiv Detail & Related papers (2023-07-12T11:45:54Z)
- Cascaded Cross-Attention Networks for Data-Efficient Whole-Slide Image Classification Using Transformers [0.11219061154635457]
Whole-Slide Imaging allows for the capturing and digitization of high-resolution images of histological specimens.
The transformer architecture has been proposed as a possible candidate for effectively leveraging the high-resolution information.
We propose a novel cascaded cross-attention network (CCAN) based on the cross-attention mechanism that scales linearly with the number of extracted patches.
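To see why cross-attention can scale linearly with the number of patches,
note that a small fixed set of learned queries attending over N patch
embeddings costs O(N x num_queries) rather than the O(N^2) of full
self-attention. The sketch below shows this generic pattern only; CCAN's
cascaded design is more elaborate, and all sizes here are illustrative.

```python
import torch
import torch.nn as nn

class LinearCrossAttentionPool(nn.Module):
    """A fixed query set attends over arbitrarily many patch embeddings."""
    def __init__(self, dim=256, num_queries=16, num_heads=8):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(1, num_queries, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, patches):                  # patches: (B, N, dim)
        q = self.queries.expand(patches.size(0), -1, -1)
        pooled, _ = self.attn(q, patches, patches)
        return pooled                            # (B, num_queries, dim)
```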
arXiv Detail & Related papers (2023-05-11T16:42:24Z)
- Combining Attention Module and Pixel Shuffle for License Plate Super-Resolution [3.8831062015253055]
This work focuses on license plate (LP) reconstruction in low-resolution and low-quality images.
We present a Single-Image Super-Resolution (SISR) approach that extends the attention/transformer module concept.
In our experiments, the proposed method outperformed the baselines both quantitatively and qualitatively.
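The pixel-shuffle component named in the title is a standard sub-pixel
upsampling step: a convolution expands the channel count by r^2, then
PixelShuffle rearranges those channels into an r-times larger spatial
grid. A minimal sketch with illustrative (not the paper's) channel sizes:

```python
import torch
import torch.nn as nn

upscale = nn.Sequential(
    nn.Conv2d(64, 64 * 4, kernel_size=3, padding=1),  # r=2 -> 4x channels
    nn.PixelShuffle(2),                               # (B,256,H,W) -> (B,64,2H,2W)
)
print(upscale(torch.randn(1, 64, 24, 24)).shape)      # torch.Size([1, 64, 48, 48])
```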
arXiv Detail & Related papers (2022-10-30T13:05:07Z)
- Learning Transformer Features for Image Quality Assessment [53.51379676690971]
We propose a unified IQA framework that utilizes a CNN backbone and a transformer encoder to extract features.
The proposed framework is compatible with both FR and NR modes and allows for a joint training scheme.
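A hedged sketch of the generic "CNN backbone + transformer encoder"
pattern this entry describes: a CNN produces a feature map, its spatial
positions become tokens for a transformer encoder, and a pooled token
regresses the quality score. The ResNet-18 backbone and layer sizes are
assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class IQANet(nn.Module):
    def __init__(self, dim=512, depth=2, heads=8):
        super().__init__()
        backbone = resnet18(weights=None)
        self.cnn = nn.Sequential(*list(backbone.children())[:-2])  # (B,512,h,w)
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(dim, 1)

    def forward(self, img):                                  # (B, 3, H, W)
        tokens = self.cnn(img).flatten(2).transpose(1, 2)    # (B, h*w, 512)
        tokens = self.encoder(tokens)
        return self.head(tokens.mean(dim=1)).squeeze(-1)     # quality score
```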
arXiv Detail & Related papers (2021-12-01T13:23:00Z)
- Image Restoration by Deep Projected GSURE [115.57142046076164]
Ill-posed inverse problems appear in many image processing applications, such as deblurring and super-resolution.
We propose a new image restoration framework that is based on minimizing a loss function that includes a "projected version" of the Generalized Stein Unbiased Risk Estimator (GSURE) and a parameterization of the latent image by a CNN.
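For orientation, the classical Stein Unbiased Risk Estimator that GSURE
generalizes applies to Gaussian denoising $y = x + n$ with $n \sim
\mathcal{N}(0, \sigma^2 I_N)$: it estimates the mean squared error without
access to the clean image $x$. (GSURE extends this to ill-posed operators
by working in the range of the measurement operator's adjoint.)

```latex
% Classical SURE identity; the divergence term is typically estimated by
% a Monte Carlo trace approximation in deep-learning settings.
\mathrm{SURE}(f) = \frac{1}{N}\,\lVert f(y) - y \rVert^2 - \sigma^2
  + \frac{2\sigma^2}{N}\,\operatorname{div}_y f(y),
\qquad
\mathbb{E}\big[\mathrm{SURE}(f)\big] = \frac{1}{N}\,\mathbb{E}\,\lVert f(y) - x \rVert^2 .
```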
arXiv Detail & Related papers (2021-02-04T08:52:46Z)
- Deep Variational Network Toward Blind Image Restoration [60.45350399661175]
Blind image restoration is a common yet challenging problem in computer vision.
We propose a novel blind image restoration method that aims to integrate the advantages of both.
Experiments on two typical blind IR tasks, namely image denoising and super-resolution, demonstrate that the proposed method achieves superior performance over current state-of-the-art methods.
arXiv Detail & Related papers (2020-08-25T03:30:53Z)
- Radon cumulative distribution transform subspace modeling for image classification [18.709734704950804]
We present a new supervised image classification method applicable to a broad class of image deformation models.
The method makes use of the previously described Radon Cumulative Distribution Transform (R-CDT) for image data.
In addition to the test accuracy performances, we show improvements in terms of computational efficiency.
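For a sense of the mechanics, the R-CDT applies a one-dimensional
cumulative distribution transform (CDT) to each Radon projection of the
image. A minimal 1D sketch under simplifying assumptions (a strictly
positive signal on a uniform grid, so its CDF is invertible):

```python
import numpy as np

def cdt(signal, x, reference_cdf):
    """Map reference CDF values through the signal's inverse CDF (F_s^{-1} o F_r)."""
    density = signal / np.trapz(signal, x)       # normalize to unit mass
    cdf = np.cumsum(density) * (x[1] - x[0])     # approximate CDF on grid x
    return np.interp(reference_cdf, cdf, x)      # invert F_s at F_r values
```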
arXiv Detail & Related papers (2020-04-07T19:47:26Z)