WaveMixSR-V2: Enhancing Super-resolution with Higher Efficiency
- URL: http://arxiv.org/abs/2409.10582v3
- Date: Wed, 30 Oct 2024 15:16:43 GMT
- Title: WaveMixSR-V2: Enhancing Super-resolution with Higher Efficiency
- Authors: Pranav Jeevan, Neeraj Nixon, Amit Sethi
- Abstract summary: We present an enhanced version of the WaveMixSR architecture by replacing the traditional transpose convolution layer with a pixel shuffle operation.
Our experiments demonstrate that our enhanced model -- WaveMixSR-V2 -- outperforms other architectures in multiple super-resolution tasks.
- Score: 4.093503153499691
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in single image super-resolution have been predominantly driven by token mixers and transformer architectures. WaveMixSR utilized the WaveMix architecture, employing a two-dimensional discrete wavelet transform for spatial token mixing, achieving superior performance in super-resolution tasks with remarkable resource efficiency. In this work, we present an enhanced version of the WaveMixSR architecture by (1) replacing the traditional transpose convolution layer with a pixel shuffle operation and (2) implementing a multistage design for higher resolution tasks ($4\times$). Our experiments demonstrate that our enhanced model -- WaveMixSR-V2 -- outperforms other architectures in multiple super-resolution tasks, achieving state-of-the-art for the BSD100 dataset, while also consuming fewer resources, exhibits higher parameter efficiency, lower latency and higher throughput. Our code is available at https://github.com/pranavphoenix/WaveMixSR.
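The core efficiency change described in the abstract, swapping a transpose convolution for a pixel shuffle operation, can be sketched in isolation. The following is a minimal NumPy illustration of the pixel shuffle (sub-pixel) rearrangement, not the authors' implementation; it assumes the common (C·r², H, W) → (C, H·r, W·r) channel-to-space convention.

```python
import numpy as np

def pixel_shuffle(x: np.ndarray, r: int) -> np.ndarray:
    """Rearrange a (C*r*r, H, W) array into (C, H*r, W*r).

    Unlike a transpose convolution, this upsampling step has no
    learnable parameters and no overlapping-kernel artifacts: it only
    moves values from the channel dimension into the spatial ones.
    """
    c_in, h, w = x.shape
    c = c_in // (r * r)
    # Split channels into (c, r, r), then interleave the r-blocks
    # into the spatial dimensions.
    x = x.reshape(c, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)      # (c, h, r, w, r)
    return x.reshape(c, h * r, w * r)

# A 2x upscale: 4 channels of a 1x1 map become one 2x2 map.
x = np.arange(4, dtype=np.float32).reshape(4, 1, 1)
print(pixel_shuffle(x, 2))  # [[[0. 1.] [2. 3.]]]
```

In the setting the abstract describes, a convolution would first expand the channel count by r² and pixel shuffle would then perform the upsampling, which is cheaper than a learned transpose convolution producing the same output size.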
Related papers
- WaveMixSR: A Resource-efficient Neural Network for Image Super-resolution [2.0477182014909205]
We propose a new neural network -- WaveMixSR -- for image super-resolution based on WaveMix architecture.
WaveMixSR achieves competitive performance in all datasets and reaches state-of-the-art performance in the BSD100 dataset on multiple super-resolution tasks.
arXiv Detail & Related papers (2023-07-01T21:25:03Z) - Spatially-Adaptive Feature Modulation for Efficient Image Super-Resolution [90.16462805389943]
We develop a spatially-adaptive feature modulation (SAFM) mechanism upon a vision transformer (ViT)-like block.
The proposed method is $3\times$ smaller than state-of-the-art efficient SR methods.
arXiv Detail & Related papers (2023-02-27T14:19:31Z) - ShuffleMixer: An Efficient ConvNet for Image Super-Resolution [88.86376017828773]
We propose ShuffleMixer, for lightweight image super-resolution that explores large convolution and channel split-shuffle operation.
Specifically, we develop a large depth-wise convolution and two projection layers based on channel splitting and shuffling as the basic component to mix features efficiently.
Experimental results demonstrate that the proposed ShuffleMixer is about 6x smaller than the state-of-the-art methods in terms of model parameters and FLOPs.
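The channel split-shuffle operation that ShuffleMixer builds on can be illustrated on its own. This NumPy sketch is my own and assumes the standard group-wise channel shuffle from ShuffleNet-style designs; it shows how channels from different groups are interleaved so a following group-wise layer sees a mix of features.

```python
import numpy as np

def channel_shuffle(x: np.ndarray, groups: int) -> np.ndarray:
    """Interleave channels across groups, (C, H, W) -> (C, H, W).

    Splitting C channels into `groups` groups and transposing the
    (groups, C // groups) axes mixes information between groups at
    zero parameter cost.
    """
    c, h, w = x.shape
    x = x.reshape(groups, c // groups, h, w)
    x = x.transpose(1, 0, 2, 3)
    return x.reshape(c, h, w)

# Channels [0, 1, 2, 3] with 2 groups become [0, 2, 1, 3].
x = np.arange(4, dtype=np.float32).reshape(4, 1, 1)
print(channel_shuffle(x, 2).ravel())  # [0. 2. 1. 3.]
```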
arXiv Detail & Related papers (2022-05-30T15:26:52Z) - WaveMix: A Resource-efficient Neural Network for Image Analysis [3.4927288761640565]
WaveMix is resource-efficient and yet generalizable and scalable.
Networks achieve comparable or better accuracy than the state-of-the-art convolutional neural networks.
WaveMix establishes new benchmarks for segmentation on Cityscapes.
arXiv Detail & Related papers (2022-05-28T09:08:50Z) - WaveMix: Resource-efficient Token Mixing for Images [2.7188347260210466]
We present WaveMix as an alternative neural architecture that uses a multi-scale 2D discrete wavelet transform (DWT) for spatial token mixing.
WaveMix has achieved State-of-the-art (SOTA) results in EMNIST Byclass and EMNIST Balanced datasets.
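The 2D discrete wavelet transform that WaveMix uses for spatial token mixing can be sketched with a single-level Haar decomposition. This NumPy version is illustrative only (WaveMix applies the DWT inside a full network block, typically via a wavelet library), and the division by 2 assumes the orthonormal Haar convention.

```python
import numpy as np

def haar_dwt2(x: np.ndarray):
    """Single-level 2D Haar DWT of an even-sized 2D array.

    Returns four half-resolution sub-bands: LL (approximation) and
    LH, HL, HH (detail). Concatenating the sub-bands along the
    channel dimension is what lets a WaveMix-style block mix spatial
    tokens without any learned mixing weights.
    """
    a = x[0::2, 0::2]  # top-left of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2
    lh = (a + b - c - d) / 2
    hl = (a - b + c - d) / 2
    hh = (a - b - c + d) / 2
    return ll, lh, hl, hh

# On a constant image all detail bands are zero.
ll, lh, hl, hh = haar_dwt2(np.ones((4, 4)))
print(ll.shape, float(np.abs(hh).max()))  # (2, 2) 0.0
```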
arXiv Detail & Related papers (2022-03-07T20:15:17Z) - Fourier Space Losses for Efficient Perceptual Image Super-Resolution [131.50099891772598]
We show that it is possible to improve the performance of a recently introduced efficient generator architecture solely with the application of our proposed loss functions.
We show that our losses' direct emphasis on the frequencies in Fourier-space significantly boosts the perceptual image quality.
The trained generator achieves comparable results with and is 2.4x and 48x faster than state-of-the-art perceptual SR methods RankSRGAN and SRFlow respectively.
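A Fourier-space loss of the kind described can be written in a few lines. This is a generic sketch (an L1 penalty on the difference of 2D Fourier spectra), not the exact loss formulation from the paper.

```python
import numpy as np

def fourier_l1_loss(pred: np.ndarray, target: np.ndarray) -> float:
    """L1 distance between the 2D Fourier spectra of two images.

    Penalizing errors frequency-by-frequency directly emphasizes the
    high-frequency content that perceptual SR quality depends on,
    complementing a pixel-space loss.
    """
    fp = np.fft.fft2(pred)
    ft = np.fft.fft2(target)
    return float(np.mean(np.abs(fp - ft)))

img = np.random.default_rng(0).random((8, 8))
print(fourier_l1_loss(img, img))  # 0.0
```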
arXiv Detail & Related papers (2021-06-01T20:34:52Z) - Asymmetric CNN for image super-resolution [102.96131810686231]
Deep convolutional neural networks (CNNs) have been widely applied for low-level vision over the past five years.
We propose an asymmetric CNN (ACNet) comprising an asymmetric block (AB), a memory enhancement block (MEB) and a high-frequency feature enhancement block (HFFEB) for image super-resolution.
Our ACNet can effectively address single image super-resolution (SISR), blind SISR and blind SISR of blind noise problems.
arXiv Detail & Related papers (2021-03-25T07:10:46Z) - UltraSR: Spatial Encoding is a Missing Key for Implicit Image Function-based Arbitrary-Scale Super-Resolution [74.82282301089994]
In this work, we propose UltraSR, a simple yet effective new network design based on implicit image functions.
We show that spatial encoding is indeed a missing key towards the next-stage high-accuracy implicit image function.
Our UltraSR sets new state-of-the-art performance on the DIV2K benchmark under all super-resolution scales.
arXiv Detail & Related papers (2021-03-23T17:36:42Z) - DDet: Dual-path Dynamic Enhancement Network for Real-World Image Super-Resolution [69.2432352477966]
Real image super-resolution (Real-SR) focuses on the relationship between real-world high-resolution (HR) and low-resolution (LR) images.
In this article, we propose a Dual-path Dynamic Enhancement Network(DDet) for Real-SR.
Unlike conventional methods, which stack up massive convolutional blocks for feature representation, we introduce a content-aware framework to study non-inherently aligned image pairs.
arXiv Detail & Related papers (2020-02-25T18:24:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.