Revisiting Image Deblurring with an Efficient ConvNet
- URL: http://arxiv.org/abs/2302.02234v1
- Date: Sat, 4 Feb 2023 20:42:46 GMT
- Title: Revisiting Image Deblurring with an Efficient ConvNet
- Authors: Lingyan Ruan, Mojtaba Bemana, Hans-Peter Seidel, Karol Myszkowski, Bin Chen
- Abstract summary: We propose a lightweight CNN network that features a large effective receptive field (ERF) and demonstrates comparable or even better performance than Transformers.
Our key design is an efficient CNN block dubbed LaKD, equipped with a large kernel depth-wise convolution and spatial-channel mixing structure.
We achieve +0.17dB / +0.43dB PSNR over the state-of-the-art Restormer on defocus / motion deblurring benchmark datasets with 32% fewer parameters and 39% fewer MACs.
- Score: 24.703240497171503
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image deblurring aims to recover the latent sharp image from its blurry
counterpart and has a wide range of applications in computer vision.
Convolutional Neural Networks (CNNs) have performed well in this domain for many
years, but more recently an alternative network architecture, the
Transformer, has demonstrated even stronger performance. One can attribute its
superiority to the multi-head self-attention (MHSA) mechanism, which offers a
larger receptive field and better input content adaptability than CNNs.
However, as MHSA demands high computational costs that grow quadratically with
respect to the input resolution, it becomes impractical for high-resolution
image deblurring tasks. In this work, we propose a unified lightweight CNN
network that features a large effective receptive field (ERF) and demonstrates
comparable or even better performance than Transformers at a lower
computational cost. Our key design is an efficient CNN block dubbed LaKD,
equipped with a large kernel depth-wise convolution and spatial-channel mixing
structure, attaining comparable or larger ERF than Transformers but with a
smaller parameter scale. Specifically, we achieve +0.17dB / +0.43dB PSNR over
the state-of-the-art Restormer on defocus / motion deblurring benchmark
datasets with 32% fewer parameters and 39% fewer MACs. Extensive experiments
demonstrate the superior performance of our network and the effectiveness of
each module. Furthermore, we propose a compact and intuitive ERFMeter metric
that quantitatively characterizes ERF, and shows a high correlation to the
network performance. We hope this work can inspire the research community to
further explore the pros and cons of CNN and Transformer architectures beyond
image deblurring tasks.
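
The cost argument in the abstract can be illustrated with a back-of-envelope count. The sketch below is not the authors' LaKD block; it only compares the weight count of a standard large-kernel convolution with that of a depth-wise convolution followed by a 1x1 point-wise convolution, and shows how a global spatial self-attention map grows with resolution. The channel count, kernel size, image sizes, and function names are hypothetical values chosen for illustration.

```python
# Back-of-envelope cost comparison; all sizes below are illustrative
# assumptions, not values taken from the paper.

def dense_conv_params(channels: int, kernel: int) -> int:
    """Weights in a standard k x k convolution with equal in/out channels (no bias)."""
    return channels * channels * kernel * kernel


def depthwise_separable_params(channels: int, kernel: int) -> int:
    """Weights in a k x k depth-wise conv (spatial mixing) followed by a
    1 x 1 point-wise conv (channel mixing), no bias."""
    depthwise = channels * kernel * kernel
    pointwise = channels * channels
    return depthwise + pointwise


def attention_map_entries(height: int, width: int) -> int:
    """Entries in one global self-attention map over all pixels: (H*W)^2."""
    tokens = height * width
    return tokens * tokens


if __name__ == "__main__":
    c, k = 64, 9  # hypothetical channel count and kernel size
    print("dense conv params:      ", dense_conv_params(c, k))
    print("depth-wise + 1x1 params:", depthwise_separable_params(c, k))
    for side in (128, 256, 512):
        print(f"attention entries at {side}x{side}:", attention_map_entries(side, side))
```

Under these assumptions, the depth-wise variant needs far fewer weights for the same kernel size, while doubling the image side multiplies the attention map size by 16. This is the kind of trade-off that motivates a large-kernel depth-wise design.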
Related papers
- OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation [70.17681136234202]
We reexamine the design distinctions and test the limits of what a sparse CNN can achieve.
We propose two key components, i.e., adaptive receptive fields (spatially) and adaptive relation, to bridge the gap.
This exploration led to the creation of Omni-Adaptive 3D CNNs (OA-CNNs), a family of networks that integrates a lightweight module.
arXiv Detail & Related papers (2024-03-21T14:06:38Z)
- Multiscale Low-Frequency Memory Network for Improved Feature Extraction in Convolutional Neural Networks [13.815116154370834]
We introduce a novel framework, the Multiscale Low-Frequency Memory (MLFM) Network.
The MLFM efficiently preserves low-frequency information, enhancing performance in targeted computer vision tasks.
Our work builds upon the existing CNN foundations and paves the way for future advancements in computer vision.
arXiv Detail & Related papers (2024-03-13T00:48:41Z)
- SCSC: Spatial Cross-scale Convolution Module to Strengthen both CNNs and Transformers [18.073368359464915]
This paper presents a module, Spatial Cross-scale Convolution (SCSC), which is verified to be effective in improving both CNNs and Transformers.
On the face recognition task, FaceResNet with SCSC can improve 2.7% with 68% fewer FLOPs and 79% fewer parameters.
On the ImageNet classification task, Swin Transformer with SCSC can achieve even better performance with 22% fewer FLOPs, and ResNet with SCSC can improve 5.3% with similar complexity.
arXiv Detail & Related papers (2023-08-14T12:49:39Z)
- Cross-receptive Focused Inference Network for Lightweight Image Super-Resolution [64.25751738088015]
Transformer-based methods have shown impressive performance in single image super-resolution (SISR) tasks.
However, the capabilities of Transformers that need to incorporate contextual information to extract features dynamically are neglected.
We propose a lightweight Cross-receptive Focused Inference Network (CFIN) that consists of a cascade of CT Blocks mixed with CNN and Transformer.
arXiv Detail & Related papers (2022-07-06T16:32:29Z)
- Image Super-resolution with An Enhanced Group Convolutional Neural Network [102.2483249598621]
CNNs with strong learning abilities are widely chosen to resolve the super-resolution problem.
We present an enhanced super-resolution group CNN (ESRGCNN) with a shallow architecture.
Experiments report that our ESRGCNN surpasses the state of the art in terms of SISR performance, complexity, execution speed, image quality evaluation and visual effect in SISR.
arXiv Detail & Related papers (2022-05-29T00:34:25Z)
- DDCNet: Deep Dilated Convolutional Neural Network for Dense Prediction [0.0]
A larger effective receptive field (ERF) and a higher resolution of spatial features within a network are essential for providing higher-resolution dense estimates.
We present a systemic approach to design network architectures that can provide a larger receptive field while maintaining a higher spatial feature resolution.
arXiv Detail & Related papers (2021-07-09T23:15:34Z)
- Asymmetric CNN for image super-resolution [102.96131810686231]
Deep convolutional neural networks (CNNs) have been widely applied for low-level vision over the past five years.
We propose an asymmetric CNN (ACNet) comprising an asymmetric block (AB), a memory enhancement block (MEB) and a high-frequency feature enhancement block (HFFEB) for image super-resolution.
Our ACNet can effectively address single image super-resolution (SISR), blind SISR and blind SISR of blind noise problems.
arXiv Detail & Related papers (2021-03-25T07:10:46Z)
- Learning Frequency-aware Dynamic Network for Efficient Super-Resolution [56.98668484450857]
This paper explores a novel frequency-aware dynamic network for dividing the input into multiple parts according to its coefficients in the discrete cosine transform (DCT) domain.
In practice, the high-frequency part will be processed using expensive operations and the lower-frequency part is assigned with cheap operations to relieve the computation burden.
Experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures.
arXiv Detail & Related papers (2021-03-15T12:54:26Z)
- Enhancing sensor resolution improves CNN accuracy given the same number of parameters or FLOPS [53.10151901863263]
We show that it is almost always possible to modify a network such that it achieves higher accuracy at a higher input resolution while having the same number of parameters and/or FLOPS.
Preliminary empirical investigation over MNIST, Fashion MNIST, and CIFAR10 datasets demonstrates the efficiency of the proposed approach.
arXiv Detail & Related papers (2021-03-09T06:47:01Z)