SwiftSRGAN -- Rethinking Super-Resolution for Efficient and Real-time
Inference
- URL: http://arxiv.org/abs/2111.14320v1
- Date: Mon, 29 Nov 2021 04:20:15 GMT
- Title: SwiftSRGAN -- Rethinking Super-Resolution for Efficient and Real-time
Inference
- Authors: Koushik Sivarama Krishnan, Karthik Sivarama Krishnan
- Abstract summary: We present an architecture that is faster and smaller in terms of its memory footprint.
Real-time super-resolution enables streaming high-resolution media content even under poor bandwidth conditions.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, there have been several advancements in the task of image
super-resolution using state-of-the-art Deep Learning-based architectures.
Many previously published super-resolution techniques require high-end,
top-of-the-line Graphics Processing Units (GPUs) to perform image
super-resolution. With the increasing advancements in Deep Learning approaches,
neural networks have become more and more compute-hungry. We took a step back
and focused on creating a real-time, efficient solution. We present an
architecture that is faster and smaller in terms of its memory footprint. The
proposed architecture uses Depth-wise Separable Convolutions to extract
features, and it performs on par with other super-resolution GANs (Generative
Adversarial Networks) while maintaining real-time inference and a low memory
footprint. Real-time super-resolution enables streaming high-resolution media
content even under poor bandwidth conditions. While maintaining an efficient
trade-off between accuracy and latency, we produce a model of comparable
performance that is one-eighth (1/8) the size of super-resolution GANs and
computes 74 times faster than super-resolution GANs.
Related papers
- Hierarchical Patch Diffusion Models for High-Resolution Video Generation [50.42746357450949] (arXiv: 2024-06-12)
We develop deep context fusion, which propagates context information from low-scale to high-scale patches in a hierarchical manner.
We also propose adaptive computation, which allocates more network capacity and computation towards coarse image details.
The resulting model sets a new state-of-the-art FVD score of 66.32 and Inception Score of 87.68 in class-conditional video generation.
- RDRN: Recursively Defined Residual Network for Image Super-Resolution [58.64907136562178] (arXiv: 2022-11-17)
Deep convolutional neural networks (CNNs) have obtained remarkable performance in single image super-resolution.
We propose a novel network architecture which utilizes attention blocks efficiently.
- Rethinking Resolution in the Context of Efficient Video Recognition [49.957690643214576] (arXiv: 2022-09-26)
Cross-resolution KD (ResKD) is a simple but effective method to boost recognition accuracy on low-resolution frames.
We extensively demonstrate its effectiveness over state-of-the-art architectures, i.e., 3D-CNNs and Video Transformers.
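For the ResKD entry above: cross-resolution knowledge distillation is usually set up as a high-resolution teacher providing soft targets for a low-resolution student. The sketch below shows that generic recipe only; the teacher/student models, temperature, and loss weighting are placeholder assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn.functional as F

def cross_resolution_kd_loss(student_logits: torch.Tensor,
                             teacher_logits: torch.Tensor,
                             labels: torch.Tensor,
                             temperature: float = 4.0,
                             alpha: float = 0.5) -> torch.Tensor:
    """Generic KD loss: soft targets from a high-resolution teacher guide a
    low-resolution student, blended with the ordinary cross-entropy term."""
    soft_student = F.log_softmax(student_logits / temperature, dim=1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=1)
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce

# Usage sketch (teacher/student are placeholders): the teacher sees the
# high-resolution frames, the student sees a downsampled copy of the same clip.
# with torch.no_grad():
#     teacher_logits = teacher(frames_high_res)
# student_logits = student(F.interpolate(frames_high_res, scale_factor=0.5))
# loss = cross_resolution_kd_loss(student_logits, teacher_logits, labels)
```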
- Image Super-resolution with An Enhanced Group Convolutional Neural Network [102.2483249598621] (arXiv: 2022-05-29)
CNNs with strong learning abilities are widely chosen to resolve the super-resolution problem.
We present an enhanced super-resolution group CNN (ESRGCNN) with a shallow architecture.
Experiments report that ESRGCNN surpasses the state of the art in SISR performance, complexity, execution speed, image quality evaluation, and visual effect.
- Hybrid Pixel-Unshuffled Network for Lightweight Image Super-Resolution [64.54162195322246] (arXiv: 2022-03-16)
Convolutional neural networks (CNNs) have achieved great success on image super-resolution (SR).
Most deep CNN-based SR models take massive computations to obtain high performance.
We propose a novel Hybrid Pixel-Unshuffled Network (HPUN) by introducing an efficient and effective downsampling module into the SR task.
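For the HPUN entry above: pixel-unshuffle is the inverse of pixel-shuffle; it folds a spatial neighborhood into channels, so feature maps can be downsampled without discarding information. A minimal round-trip with PyTorch's built-in modules is shown below; how HPUN wires this into its downsampling module is not reproduced here.

```python
import torch
import torch.nn as nn

# Pixel-unshuffle trades spatial resolution for channels losslessly:
# (N, C, H, W) -> (N, C*r*r, H/r, W/r) for a downscale factor r.
unshuffle = nn.PixelUnshuffle(downscale_factor=2)
shuffle = nn.PixelShuffle(upscale_factor=2)

x = torch.randn(1, 16, 32, 32)
down = unshuffle(x)       # torch.Size([1, 64, 16, 16])
restored = shuffle(down)  # torch.Size([1, 16, 32, 32])
print(down.shape, torch.allclose(x, restored))  # exact round-trip: True
```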
- Deep Networks for Image and Video Super-Resolution [30.75380029218373] (arXiv: 2022-01-28)
Our single image super-resolution (SISR) network is built using efficient convolutional units we refer to as mixed-dense connection blocks (MDCB).
We train two versions of our network to enhance complementary image qualities using different loss configurations.
We further employ our network for the video super-resolution task, where it learns to aggregate information from multiple frames and maintain temporal consistency.
- Attaining Real-Time Super-Resolution for Microscopic Images Using GAN [0.06345523830122167] (arXiv: 2020-10-09)
This paper focuses on improving an existing deep-learning based method to perform Super-Resolution Microscopy in real-time using a standard GPU.
We suggest simple changes to the architecture of the generator and the discriminator of SRGAN.
We compare the quality and running time of the outputs produced by our model, opening up applications in areas such as low-end benchtop and even mobile microscopy.
- OverNet: Lightweight Multi-Scale Super-Resolution with Overscaling Network [3.6683231417848283] (arXiv: 2020-08-05)
We introduce OverNet, a deep but lightweight convolutional network to solve SISR at arbitrary scale factors with a single model.
We show that our network outperforms previous state-of-the-art results in standard benchmarks while using fewer parameters than previous approaches.
- Real-time Semantic Segmentation with Fast Attention [94.88466483540692] (arXiv: 2020-07-07)
We propose a novel architecture for semantic segmentation of high-resolution images and videos in real-time.
The proposed architecture relies on our fast spatial attention, which is a simple yet efficient modification of the popular self-attention mechanism.
Results on multiple datasets demonstrate superior accuracy and speed compared to existing approaches.
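For the fast-attention entry above: standard self-attention computes softmax(QK^T)V, which is quadratic in the number of spatial positions. The usual way to make it real-time friendly is to replace the softmax with normalized features and reorder the products to Q(K^T V), which is linear in the spatial size. The sketch below illustrates only that reordering and may differ from the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def fast_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Linear-complexity attention sketch.

    q, k, v: (batch, n, c) with n = H*W spatial positions.
    Computing k^T @ v first gives a (c, c) matrix, so the cost is O(n * c^2)
    instead of the O(n^2 * c) of softmax(q @ k^T) @ v.
    """
    q = F.normalize(q, dim=-1)        # L2-normalization in place of softmax
    k = F.normalize(k, dim=-1)
    context = k.transpose(1, 2) @ v   # (batch, c, c)
    return (q @ context) / q.shape[1] # (batch, n, c)

q = torch.randn(2, 1024, 64)          # e.g. a 32x32 feature map with 64 channels
out = fast_attention(q, torch.randn(2, 1024, 64), torch.randn(2, 1024, 64))
print(out.shape)  # torch.Size([2, 1024, 64])
```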
- FarSee-Net: Real-Time Semantic Segmentation by Efficient Multi-scale Context Aggregation and Feature Space Super-resolution [14.226301825772174] (arXiv: 2020-03-09)
We introduce a novel and efficient module called Cascaded Factorized Atrous Spatial Pyramid Pooling (CF-ASPP).
It is a lightweight cascaded structure for Convolutional Neural Networks (CNNs) to efficiently leverage context information.
We achieve 68.4% mIoU at 84 fps on the Cityscapes test set with a single Nvidia Titan X (Maxwell) GPU card.
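For the FarSee-Net entry above: atrous (dilated) spatial pyramid pooling aggregates context by running dilated convolutions at several rates in parallel and fusing the results. The block below is a plain ASPP sketch for orientation only; the factorization and cascading that make CF-ASPP lightweight are not reproduced, and all sizes are assumptions.

```python
import torch
import torch.nn as nn

class SimpleASPP(nn.Module):
    """Plain atrous spatial pyramid pooling: parallel dilated 3x3 convolutions
    at several rates, concatenated and fused with a 1x1 convolution.
    (CF-ASPP factorizes and cascades these branches to cut the cost.)"""
    def __init__(self, in_ch: int, out_ch: int, rates=(1, 6, 12)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False)
            for r in rates
        ])
        self.fuse = nn.Conv2d(out_ch * len(rates), out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

x = torch.randn(1, 256, 32, 32)
print(SimpleASPP(256, 64)(x).shape)  # torch.Size([1, 64, 32, 32])
```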
This list is automatically generated from the titles and abstracts of the papers on this site.