Related papers: Revisiting Image Deblurring with an Efficient ConvNet

URL: http://arxiv.org/abs/2302.02234v1
Date: Sat, 4 Feb 2023 20:42:46 GMT
Title: Revisiting Image Deblurring with an Efficient ConvNet
Authors: Lingyan Ruan, Mojtaba Bemana, Hans-peter Seidel, Karol Myszkowski, Bin Chen
Abstract summary: We propose a lightweight CNN network that features a large effective receptive field (ERF) and demonstrates comparable or even better performance than Transformers. Our key design is an efficient CNN block dubbed LaKD, equipped with a large kernel depth-wise convolution and spatial-channel mixing structure. We achieve +0.17dB / +0.43dB PSNR over the state-of-the-art Restormer on defocus / motion deblurring benchmark datasets with 32% fewer parameters and 39% fewer MACs.
Score: 24.703240497171503
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Image deblurring aims to recover the latent sharp image from its blurry counterpart and has a wide range of applications in computer vision. Convolutional Neural Networks (CNNs) have performed well in this domain for many years, and recently an alternative network architecture, the Transformer, has demonstrated even stronger performance. One can attribute its superiority to the multi-head self-attention (MHSA) mechanism, which offers a larger receptive field and better input content adaptability than CNNs. However, as MHSA demands high computational costs that grow quadratically with respect to the input resolution, it becomes impractical for high-resolution image deblurring tasks. In this work, we propose a unified lightweight CNN network that features a large effective receptive field (ERF) and demonstrates comparable or even better performance than Transformers at a lower computational cost. Our key design is an efficient CNN block dubbed LaKD, equipped with a large kernel depth-wise convolution and spatial-channel mixing structure, attaining comparable or larger ERF than Transformers but with a smaller parameter scale. Specifically, we achieve +0.17dB / +0.43dB PSNR over the state-of-the-art Restormer on defocus / motion deblurring benchmark datasets with 32% fewer parameters and 39% fewer MACs. Extensive experiments demonstrate the superior performance of our network and the effectiveness of each module. Furthermore, we propose a compact and intuitive ERFMeter metric that quantitatively characterizes ERF and shows a high correlation with network performance. We hope this work can inspire the research community to further explore the pros and cons of CNN and Transformer architectures beyond image deblurring tasks.
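The cost argument in the abstract can be illustrated with some back-of-the-envelope arithmetic. The sketch below is not taken from the paper: the channel width (64), kernel size (9) and image sizes are hypothetical values chosen only for illustration, and the helper names are made up here. It compares the weight count of a dense convolution with that of a large-kernel depth-wise convolution followed by a 1x1 point-wise (channel-mixing) convolution, and shows how a global self-attention map over all pixels grows when the resolution doubles.

```python
def dense_conv_params(channels: int, kernel: int) -> int:
    """Weights in a standard k x k convolution, C channels in and out (no bias)."""
    return channels * channels * kernel * kernel


def depthwise_pointwise_params(channels: int, kernel: int) -> int:
    """Weights in a k x k depth-wise conv plus a 1 x 1 point-wise conv (no bias)."""
    return channels * kernel * kernel + channels * channels


def depthwise_macs(channels: int, kernel: int, height: int, width: int) -> int:
    """Multiply-accumulates of a k x k depth-wise conv on an H x W feature map."""
    return channels * kernel * kernel * height * width


def attention_map_entries(height: int, width: int) -> int:
    """Entries in one global self-attention map when every pixel is a token."""
    tokens = height * width
    return tokens * tokens


# Hypothetical sizes, not the paper's actual configuration.
C, K = 64, 9
print(dense_conv_params(C, K))           # 331776
print(depthwise_pointwise_params(C, K))  # 9280

# Doubling the resolution: convolution cost grows 4x, the attention map 16x.
print(depthwise_macs(C, K, 128, 128) // depthwise_macs(C, K, 64, 64))      # 4
print(attention_map_entries(128, 128) // attention_map_entries(64, 64))    # 16
```

Under these assumed sizes the depth-wise plus point-wise pair needs far fewer weights than a dense convolution with the same kernel, and its cost scales linearly with the number of pixels, whereas a global attention map scales quadratically, which is the scaling behaviour the abstract refers to.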

Related papers

Impoola: The Power of Average Pooling for Image-Based Deep Reinforcement Learning [1.2937020918620652]
We show that replacing the flattening of output feature maps in Impala-CNN with global average pooling leads to a notable performance improvement. A decrease in the network's translation sensitivity may be central to this improvement. Our results demonstrate that network scaling is not just about increasing model size - efficient network design is also an essential factor.
arXiv Detail & Related papers (2025-03-07T16:19:19Z)
OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation [70.17681136234202]
We reexamine the design distinctions and test the limits of what a sparse CNN can achieve. We propose two key components, i.e., adaptive receptive fields (spatially) and adaptive relation, to bridge the gap. This exploration led to the creation of Omni-Adaptive 3D CNNs (OA-CNNs), a family of networks that integrates a lightweight module.
arXiv Detail & Related papers (2024-03-21T14:06:38Z)
Multiscale Low-Frequency Memory Network for Improved Feature Extraction in Convolutional Neural Networks [13.815116154370834]
We introduce a novel framework, the Multiscale Low-Frequency Memory (MLFM) Network. The MLFM efficiently preserves low-frequency information, enhancing performance in targeted computer vision tasks. Our work builds upon the existing CNN foundations and paves the way for future advancements in computer vision.
arXiv Detail & Related papers (2024-03-13T00:48:41Z)
SCSC: Spatial Cross-scale Convolution Module to Strengthen both CNNs and Transformers [18.073368359464915]
This paper presents a module, Spatial Cross-scale Convolution (SCSC), which is verified to be effective in improving both CNNs and Transformers. On the face recognition task, FaceResNet with SCSC can improve 2.7% with 68% fewer FLOPs and 79% fewer parameters. On the ImageNet classification task, Swin Transformer with SCSC can achieve even better performance with 22% fewer FLOPs, and ResNet with SCSC can improve 5.3% with similar complexity.
arXiv Detail & Related papers (2023-08-14T12:49:39Z)
Cross-receptive Focused Inference Network for Lightweight Image Super-Resolution [64.25751738088015]
Transformer-based methods have shown impressive performance in single image super-resolution (SISR) tasks. However, the need for Transformers to incorporate contextual information to extract features dynamically has been neglected. We propose a lightweight Cross-receptive Focused Inference Network (CFIN) that consists of a cascade of CT Blocks mixed with CNN and Transformer.
arXiv Detail & Related papers (2022-07-06T16:32:29Z)
Image Super-resolution with An Enhanced Group Convolutional Neural Network [102.2483249598621]
CNNs with strong learning abilities are widely used to address the super-resolution problem. We present an enhanced super-resolution group CNN (ESRGCNN) with a shallow architecture. Experiments report that our ESRGCNN surpasses the state of the art in terms of SISR performance, complexity, execution speed, image quality evaluation and visual effect.
arXiv Detail & Related papers (2022-05-29T00:34:25Z)
DDCNet: Deep Dilated Convolutional Neural Network for Dense Prediction [0.0]
A larger effective receptive field (ERF) and a higher resolution of spatial features within a network are essential for providing higher-resolution dense estimates. We present a systematic approach to designing network architectures that can provide a larger receptive field while maintaining a higher spatial feature resolution.
arXiv Detail & Related papers (2021-07-09T23:15:34Z)
Asymmetric CNN for image super-resolution [102.96131810686231]
Deep convolutional neural networks (CNNs) have been widely applied for low-level vision over the past five years. We propose an asymmetric CNN (ACNet) comprising an asymmetric block (AB), a memory enhancement block (MEB) and a high-frequency feature enhancement block (HFFEB) for image super-resolution. Our ACNet can effectively address single image super-resolution (SISR), blind SISR and blind SISR of blind noise problems.
arXiv Detail & Related papers (2021-03-25T07:10:46Z)
Learning Frequency-aware Dynamic Network for Efficient Super-Resolution [56.98668484450857]
This paper explores a novel frequency-aware dynamic network for dividing the input into multiple parts according to its coefficients in the discrete cosine transform (DCT) domain. In practice, the high-frequency part will be processed using expensive operations and the lower-frequency part is assigned with cheap operations to relieve the computation burden. Experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures.
arXiv Detail & Related papers (2021-03-15T12:54:26Z)
Enhancing sensor resolution improves CNN accuracy given the same number of parameters or FLOPS [53.10151901863263]
We show that it is almost always possible to modify a network such that it achieves higher accuracy at a higher input resolution while having the same number of parameters and/or FLOPS. Preliminary empirical investigation over the MNIST, Fashion MNIST, and CIFAR10 datasets demonstrates the efficiency of the proposed approach.
arXiv Detail & Related papers (2021-03-09T06:47:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.