Implicit Grid Convolution for Multi-Scale Image Super-Resolution
- URL: http://arxiv.org/abs/2408.09674v1
- Date: Mon, 19 Aug 2024 03:30:15 GMT
- Title: Implicit Grid Convolution for Multi-Scale Image Super-Resolution
- Authors: Dongheon Lee, Seokju Yun, Youngmin Ro
- Abstract summary: We propose a framework for training multiple integer scales simultaneously with a single model.
We use a single encoder to extract features and introduce a novel upsampler, Implicit Grid Convolution (IGConv).
Our experiments demonstrate that training multiple scales with a single model reduces the training budget and stored parameters by one-third.
- Score: 6.8410780175245165
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, Super-Resolution (SR) achieved significant performance improvement by employing neural networks. Most SR methods conventionally train a single model for each targeted scale, which increases redundancy in training and deployment in proportion to the number of scales targeted. This paper challenges this conventional fixed-scale approach. Our preliminary analysis reveals that, surprisingly, encoders trained at different scales extract similar features from images. Furthermore, the commonly used scale-specific upsampler, Sub-Pixel Convolution (SPConv), exhibits significant inter-scale correlations. Based on these observations, we propose a framework for training multiple integer scales simultaneously with a single model. We use a single encoder to extract features and introduce a novel upsampler, Implicit Grid Convolution~(IGConv), which integrates SPConv at all scales within a single module to predict multiple scales. Our extensive experiments demonstrate that training multiple scales with a single model reduces the training budget and stored parameters by one-third while achieving equivalent inference latency and comparable performance. Furthermore, we propose IGConv$^{+}$, which addresses spectral bias and input-independent upsampling and uses ensemble prediction to improve performance. As a result, SRFormer-IGConv$^{+}$ achieves a remarkable 0.25dB improvement in PSNR at Urban100$\times$4 while reducing the training budget, stored parameters, and inference cost compared to the existing SRFormer.
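To make the upsampling mechanism concrete, below is a minimal PyTorch sketch of the Sub-Pixel Convolution (SPConv) upsampler and of a single module that answers queries for several integer scales. This is an illustration only, not the authors' implementation: the per-scale branches here are explicit, whereas the paper's IGConv derives the scale-specific SPConv kernels from a single implicit grid convolution module.

```python
# Minimal sketch, NOT the authors' code. SPConvUpsampler is the standard
# sub-pixel convolution upsampler; MultiScaleHead only illustrates serving
# several integer scales from one module with explicit per-scale branches.
import torch
import torch.nn as nn


class SPConvUpsampler(nn.Module):
    """Sub-Pixel Convolution: predict s*s*out_ch channels, then PixelShuffle."""

    def __init__(self, in_ch: int, out_ch: int, scale: int):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, out_ch * scale * scale, kernel_size=3, padding=1)
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        return self.shuffle(self.proj(feat))        # (B, out_ch, s*H, s*W)


class MultiScaleHead(nn.Module):
    """One upsampling module queried with the requested integer scale."""

    def __init__(self, in_ch: int, out_ch: int, scales=(2, 3, 4)):
        super().__init__()
        self.heads = nn.ModuleDict(
            {str(s): SPConvUpsampler(in_ch, out_ch, s) for s in scales}
        )

    def forward(self, feat: torch.Tensor, scale: int) -> torch.Tensor:
        return self.heads[str(scale)](feat)


if __name__ == "__main__":
    feat = torch.randn(1, 64, 32, 32)               # features from a shared encoder
    head = MultiScaleHead(in_ch=64, out_ch=3)
    for s in (2, 3, 4):
        print(head(feat, s).shape)                  # torch.Size([1, 3, 32*s, 32*s])
```

In the deployment pattern the abstract describes, a shared encoder would produce `feat` once and the same head would then be queried at x2, x3, or x4.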
Related papers
- QuantVSR: Low-Bit Post-Training Quantization for Real-World Video Super-Resolution [53.13952833016505]
We propose a low-bit quantization model for real-world video super-resolution (VSR).
We use a calibration dataset to measure both spatial and temporal complexity for each layer.
We refine the full-precision (FP) and low-bit branches to achieve simultaneous optimization.
arXiv Detail & Related papers (2025-08-06T14:35:59Z)
- Robust Hyperspectral Image Pansharpening via Sparse Spatial-Spectral Representation [9.3350274016294]
S$^3$RNet is a novel framework for hyperspectral image pansharpening.
It combines low-resolution hyperspectral images (LRHSI) with high-resolution multispectral images (HRMSI) through sparse spatial-spectral representation.
S$^3$RNet achieves state-of-the-art performance across multiple evaluation metrics.
arXiv Detail & Related papers (2025-01-14T09:09:14Z)
- VRVVC: Variable-Rate NeRF-Based Volumetric Video Compression [59.14355576912495]
NeRF-based video has revolutionized visual media by delivering photorealistic Free-Viewpoint Video (FVV) experiences.
The substantial data volumes pose significant challenges for storage and transmission.
We propose VRVVC, a novel end-to-end joint variable-rate framework for volumetric video compression.
arXiv Detail & Related papers (2024-12-16T01:28:04Z)
- PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution [87.89013794655207]
Diffusion-based image super-resolution (SR) models have shown superior performance at the cost of multiple denoising steps.
We propose a novel post-training quantization approach with adaptive scale in one-step diffusion (OSD) image SR, PassionSR.
Our PassionSR achieves significant advantages over recent leading low-bit quantization methods for image SR.
arXiv Detail & Related papers (2024-11-26T04:49:42Z)
- Task-Aware Dynamic Transformer for Efficient Arbitrary-Scale Image Super-Resolution [8.78015409192613]
Arbitrary-scale super-resolution (ASSR) aims to learn a single model for image super-resolution at arbitrary magnifying scales.
Existing ASSR networks typically comprise an off-the-shelf scale-agnostic feature extractor and an arbitrary scale upsampler.
We propose a Task-Aware Dynamic Transformer (TADT) as an input-adaptive feature extractor for efficient image ASSR.
arXiv Detail & Related papers (2024-08-16T13:35:52Z)
- Scale Equalization for Multi-Level Feature Fusion [8.541075075344438]
We find that multi-level features from parallel branches are on different scales.
The scale disequilibrium is a universal and unwanted flaw that leads to detrimental gradients during gradient descent.
We propose injecting scale equalizers to achieve scale equilibrium across multi-level features after bilinear upsampling.
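A hedged reading of this idea is sketched below, with an assumed channel-wise normalization standing in for the scale equalizer; the concrete equalizer used in that paper may differ.

```python
# Generic sketch of equalizing feature scales before multi-level fusion.
# The "equalizer" here is a learnable channel-wise standardization
# (GroupNorm with one group), which is only an assumption for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class EqualizedFusion(nn.Module):
    def __init__(self, channels: int, num_levels: int):
        super().__init__()
        # One equalizer per feature level, applied after bilinear upsampling.
        self.equalizers = nn.ModuleList(
            [nn.GroupNorm(1, channels) for _ in range(num_levels)]
        )

    def forward(self, feats):
        """feats: list of (B, C, H_i, W_i) tensors, coarse to fine."""
        target = feats[-1].shape[-2:]      # fuse at the finest resolution
        fused = 0
        for eq, f in zip(self.equalizers, feats):
            f = F.interpolate(f, size=target, mode="bilinear", align_corners=False)
            fused = fused + eq(f)          # equalize scale, then sum-fuse
        return fused
```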
arXiv Detail & Related papers (2024-02-02T05:25:51Z)
- Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach [58.57026686186709]
We introduce the Convolutional Transformer layer (ConvFormer) and propose a ConvFormer-based Super-Resolution network (CFSR).
CFSR inherits the advantages of both convolution-based and transformer-based approaches.
Experiments demonstrate that CFSR strikes an optimal balance between computational cost and performance.
arXiv Detail & Related papers (2024-01-11T03:08:00Z)
- Efficient Controllable Multi-Task Architectures [85.76598445904374]
We propose a multi-task model consisting of a shared encoder and task-specific decoders where both encoder and decoder channel widths are slimmable.
Our key idea is to control the task importance by varying the capacities of task-specific decoders, while controlling the total computational cost.
This improves overall accuracy by allowing a stronger encoder for a given budget, increases control over computational cost, and delivers high-quality slimmed sub-architectures.
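For illustration, here is a minimal sketch of a slimmable convolution in the general sense of slimmable networks, not this paper's exact architecture: one weight tensor is trained and only a fraction of its channels is used at run time, which is how decoder capacity could be traded per task.

```python
# Generic slimmable convolution sketch (assumes groups=1); an assumption
# for illustration, not the architecture proposed in the paper above.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SlimmableConv2d(nn.Conv2d):
    def forward(self, x: torch.Tensor, width: float = 1.0) -> torch.Tensor:
        out_ch = max(1, int(self.out_channels * width))
        in_ch = x.shape[1]                 # follow the (possibly slimmed) input
        w = self.weight[:out_ch, :in_ch]
        b = self.bias[:out_ch] if self.bias is not None else None
        return F.conv2d(x, w, b, self.stride, self.padding,
                        self.dilation, self.groups)


if __name__ == "__main__":
    conv = SlimmableConv2d(64, 64, 3, padding=1)
    x = torch.randn(1, 64, 32, 32)
    # The same layer run at full and at half output width.
    print(conv(x, width=1.0).shape, conv(x, width=0.5).shape)
```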
arXiv Detail & Related papers (2023-08-22T19:09:56Z)
- Unifying Synergies between Self-supervised Learning and Dynamic Computation [53.66628188936682]
We present a novel perspective on the interplay between SSL and DC paradigms.
We show that it is feasible to simultaneously learn a dense and a gated sub-network from scratch in an SSL setting.
The co-evolution of the dense and gated encoders during pre-training offers a good accuracy-efficiency trade-off.
arXiv Detail & Related papers (2023-01-22T17:12:58Z)
- Effective Invertible Arbitrary Image Rescaling [77.46732646918936]
Invertible Neural Networks (INN) are able to increase upscaling accuracy significantly by optimizing the downscaling and upscaling cycle jointly.
In this work, a simple and effective invertible arbitrary rescaling network (IARN) is proposed to achieve arbitrary image rescaling by training only one model.
It is shown to achieve state-of-the-art (SOTA) performance in bidirectional arbitrary rescaling without compromising perceptual quality in LR outputs.
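As a generic illustration of the invertibility such methods rely on (not IARN's specific design), the sketch below shows a minimal additive coupling layer, assuming an even channel count.

```python
# Additive coupling layer: half of the channels pass through unchanged and
# parameterize an update of the other half, so the transform inverts exactly.
# Generic INN building block for illustration only.
import torch
import torch.nn as nn


class AdditiveCoupling(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        half = channels // 2               # assumes an even channel count
        self.net = nn.Sequential(
            nn.Conv2d(half, half, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(half, half, 3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x1, x2 = x.chunk(2, dim=1)
        return torch.cat([x1, x2 + self.net(x1)], dim=1)

    def inverse(self, y: torch.Tensor) -> torch.Tensor:
        y1, y2 = y.chunk(2, dim=1)
        return torch.cat([y1, y2 - self.net(y1)], dim=1)
```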
arXiv Detail & Related papers (2022-09-26T22:22:30Z)
- Scale-Aware Dynamic Network for Continuous-Scale Super-Resolution [16.67263192454279]
We propose a scale-aware dynamic network (SADN) for continuous-scale SR.
First, we propose a scale-aware dynamic convolutional (SAD-Conv) layer for the feature learning of multiple SR tasks with various scales.
Second, we devise a continuous-scale upsampling module (CSUM) with the multi-bilinear local implicit function (MBLIF) for any-scale upsampling.
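A hedged sketch of continuous-scale upsampling with an implicit function is given below; it is a simplified stand-in for CSUM/MBLIF that uses absolute query coordinates, whereas local implicit methods typically use coordinates relative to the nearest feature location and the query cell size.

```python
# Simplified coordinate-based upsampler: each high-resolution pixel queries a
# bilinearly sampled feature at its location, and an MLP maps
# (feature, coordinate) to RGB. Illustration only, not the paper's module.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ImplicitUpsampler(nn.Module):
    def __init__(self, feat_ch: int = 64, hidden: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_ch + 2, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, 3),
        )

    def forward(self, feat: torch.Tensor, scale: float) -> torch.Tensor:
        b, c, h, w = feat.shape
        H, W = round(h * scale), round(w * scale)
        # Normalized coordinates of every HR pixel centre in [-1, 1].
        ys = torch.linspace(-1 + 1 / H, 1 - 1 / H, H, device=feat.device)
        xs = torch.linspace(-1 + 1 / W, 1 - 1 / W, W, device=feat.device)
        gy, gx = torch.meshgrid(ys, xs, indexing="ij")
        grid = torch.stack([gx, gy], dim=-1).unsqueeze(0).expand(b, -1, -1, -1)
        # Bilinearly sample LR features at each HR coordinate.
        sampled = F.grid_sample(feat, grid, mode="bilinear", align_corners=False)
        sampled = sampled.permute(0, 2, 3, 1)                 # (B, H, W, C)
        rgb = self.mlp(torch.cat([sampled, grid], dim=-1))    # (B, H, W, 3)
        return rgb.permute(0, 3, 1, 2)
```

Because the scale enters only through the query grid, the same module can be evaluated at any magnification, including non-integer ones.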
arXiv Detail & Related papers (2021-10-29T09:57:48Z)
- Fully Quantized Image Super-Resolution Networks [81.75002888152159]
We propose a Fully Quantized image Super-Resolution framework (FQSR) to jointly optimize efficiency and accuracy.
We apply our quantization scheme on multiple mainstream super-resolution architectures, including SRResNet, SRGAN and EDSR.
Our FQSR with low-bit quantization achieves performance on par with full-precision counterparts on five benchmark datasets.
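As a generic illustration of low-bit quantization for SR convolutions (not FQSR's specific scheme), here is a minimal uniform quantize-dequantize sketch.

```python
# "Fake" uniform quantization: weights and activations are mapped to a small
# integer grid and back, so the quantization error is visible at evaluation
# time (training would additionally need a straight-through estimator).
import torch
import torch.nn as nn
import torch.nn.functional as F


def fake_quant(x: torch.Tensor, bits: int = 4) -> torch.Tensor:
    qmax = 2 ** bits - 1
    lo, hi = x.min(), x.max()
    scale = (hi - lo).clamp(min=1e-8) / qmax
    q = torch.round((x - lo) / scale).clamp(0, qmax)   # integer grid
    return q * scale + lo                              # dequantize


class QuantConv2d(nn.Conv2d):
    def __init__(self, *args, bits: int = 4, **kwargs):
        super().__init__(*args, **kwargs)
        self.bits = bits

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = fake_quant(self.weight, self.bits)
        x = fake_quant(x, self.bits)
        return F.conv2d(x, w, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)
```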
arXiv Detail & Related papers (2020-11-29T03:53:49Z)
- Exploring Multi-Scale Feature Propagation and Communication for Image Super Resolution [37.91175933401261]
We present a unified formulation over widely-used multi-scale structures.
We propose a generic and efficient multi-scale convolution unit -- Multi-Scale cross-Scale Share-weights convolution (MS$^3$-Conv).
arXiv Detail & Related papers (2020-08-01T10:44:06Z)