SD-Conv: Towards the Parameter-Efficiency of Dynamic Convolution
- URL: http://arxiv.org/abs/2204.02227v3
- Date: Fri, 26 May 2023 12:26:12 GMT
- Title: SD-Conv: Towards the Parameter-Efficiency of Dynamic Convolution
- Authors: Shwai He, Chenbo Jiang, Daize Dong, Liang Ding
- Abstract summary: Dynamic convolution achieves better performance for efficient CNNs at the cost of negligible FLOPs increase.
We propose a new framework, Sparse Dynamic Convolution (SD-Conv), to naturally integrate these two paths.
- Score: 16.56592303409295
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dynamic convolution achieves better performance for efficient CNNs at the
cost of a negligible FLOPs increase. However, the performance gain cannot
match the significantly expanded number of parameters, which is the main
bottleneck in real-world applications. In contrast, mask-based unstructured
pruning obtains a lightweight network by removing redundancy in the heavy
network. In this paper, we propose a new framework, \textbf{Sparse Dynamic
Convolution} (\textsc{SD-Conv}), to naturally integrate these two paths such
that it can inherit the advantages of both the dynamic mechanism and sparsity. We first
design a binary mask derived from a learnable threshold to prune static
kernels, significantly reducing the parameters and computational cost while
achieving higher performance on ImageNet-1K. We further transfer pretrained
models into a variety of downstream tasks, showing consistently better results
than baselines. We hope our SD-Conv could be an efficient alternative to
conventional dynamic convolutions.
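
For concreteness, here is a minimal PyTorch sketch of the idea described above: a dynamic convolution keeps K static kernels and mixes them with input-dependent attention, which multiplies the kernel parameters roughly K-fold (the parameter bottleneck the abstract refers to), while SD-Conv additionally prunes each static kernel with a binary mask derived from a learnable threshold. The class and argument names, the sigmoid temperature, and the straight-through masking below are illustrative assumptions rather than the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseDynamicConv2d(nn.Module):
    """Minimal sketch of the SD-Conv idea: K static kernels combined with
    input-dependent attention (dynamic convolution), each pruned by a binary
    mask derived from a learnable threshold. Names, the sigmoid temperature,
    and the straight-through masking are illustrative assumptions."""

    def __init__(self, in_ch, out_ch, kernel_size=3, K=4, temperature=0.05):
        super().__init__()
        self.K = K
        self.temperature = temperature
        self.padding = kernel_size // 2
        # K parallel static kernels: (K, out_ch, in_ch, k, k).
        self.weight = nn.Parameter(
            torch.randn(K, out_ch, in_ch, kernel_size, kernel_size) * 0.02)
        # One learnable pruning threshold per candidate kernel.
        self.threshold = nn.Parameter(torch.zeros(K))
        # Attention branch: global average pooling -> FC -> softmax over K.
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(in_ch, K))

    def forward(self, x):
        B, C, H, W = x.shape
        thr = self.threshold.view(self.K, 1, 1, 1, 1)
        # Binary mask: keep weights whose magnitude exceeds the threshold.
        soft = torch.sigmoid((self.weight.abs() - thr) / self.temperature)
        hard = (soft > 0.5).float()
        mask = hard + soft - soft.detach()   # straight-through estimator
        pruned_w = self.weight * mask        # sparse static kernels

        # Input-dependent attention over the K pruned kernels.
        pi = F.softmax(self.attn(x), dim=1)                     # (B, K)
        agg_w = torch.einsum('bk,koihw->boihw', pi, pruned_w)   # (B, O, I, k, k)

        # Apply one aggregated kernel per sample via a grouped convolution.
        x = x.reshape(1, B * C, H, W)
        agg_w = agg_w.reshape(-1, C, *agg_w.shape[-2:])
        out = F.conv2d(x, agg_w, padding=self.padding, groups=B)
        return out.reshape(B, -1, *out.shape[-2:])


# Usage: a batch of 2 feature maps with 64 channels.
layer = SparseDynamicConv2d(in_ch=64, out_ch=128, kernel_size=3, K=4)
y = layer(torch.randn(2, 64, 56, 56))   # -> torch.Size([2, 128, 56, 56])
```

At inference the straight-through term vanishes and the hard mask leaves genuinely sparse static kernels, with the sparsity level governed by the learned thresholds.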
Related papers
- Convolutional Neural Network Compression via Dynamic Parameter Rank
Pruning [4.7027290803102675]
We propose an efficient training method for CNN compression via dynamic parameter rank pruning.
Our experiments show that the proposed method can yield substantial storage savings while maintaining or even enhancing classification performance.
arXiv Detail & Related papers (2024-01-15T23:52:35Z) - Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach [58.57026686186709]
We introduce the Convolutional Transformer layer (ConvFormer) and propose a ConvFormer-based Super-Resolution network (CFSR)
CFSR inherits the advantages of both convolution-based and transformer-based approaches.
Experiments demonstrate that CFSR strikes an optimal balance between computational cost and performance.
arXiv Detail & Related papers (2024-01-11T03:08:00Z) - Incorporating Transformer Designs into Convolutions for Lightweight
Image Super-Resolution [46.32359056424278]
Large convolutional kernels have become popular in designing convolutional neural networks.
The increase in kernel size also leads to a quadratic growth in the number of parameters, resulting in heavy computation and memory requirements.
We propose a neighborhood attention (NA) module that upgrades the standard convolution with a self-attention mechanism.
Building upon the NA module, we propose a lightweight single image super-resolution (SISR) network named TCSR.
arXiv Detail & Related papers (2023-03-25T01:32:18Z) - Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method that optimizes the sparse structure of a randomly initialized network at each iteration and tweaks unimportant weights on-the-fly by a small amount proportional to the magnitude scale.
arXiv Detail & Related papers (2023-03-16T21:06:13Z) - PAD-Net: An Efficient Framework for Dynamic Networks [72.85480289152719]
Common practice in implementing dynamic networks is to convert the given static layers into fully dynamic ones.
We propose a partially dynamic network, namely PAD-Net, to transform the redundant dynamic parameters into static ones.
Our method is comprehensively supported by large-scale experiments with two typical advanced dynamic architectures.
arXiv Detail & Related papers (2022-11-10T12:42:43Z) - DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and
Transformers [105.74546828182834]
We show a hardware-efficient dynamic inference regime, named dynamic weight slicing, which adaptively slices a part of the network parameters for inputs with diverse difficulty levels.
We present the dynamic slimmable network (DS-Net) and the dynamic slice-able network (DS-Net++), which input-dependently adjust the filter numbers of CNNs and multiple dimensions in both CNNs and transformers.
arXiv Detail & Related papers (2021-09-21T09:57:21Z) - Content-Aware Convolutional Neural Networks [98.97634685964819]
Convolutional Neural Networks (CNNs) have achieved great success due to the powerful feature learning ability of convolution layers.
We propose a Content-aware Convolution (CAC) that automatically detects the smooth windows and applies a 1x1 convolutional kernel to replace the original large kernel.
arXiv Detail & Related papers (2021-06-30T03:54:35Z) - Revisiting Dynamic Convolution via Matrix Decomposition [81.89967403872147]
We propose dynamic channel fusion to replace dynamic attention over channel groups (a rough sketch of this idea is given after this list).
Our method is easier to train and requires significantly fewer parameters without sacrificing accuracy.
arXiv Detail & Related papers (2021-03-15T23:03:18Z)
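
The last entry above replaces attention over kernels or channel groups with dynamic channel fusion. Below is a rough sketch of that idea for a 1x1 convolution, assuming a decomposition of the form W(x) = W0 + P Phi(x) Q^T with a small latent dimension L; the class name, the `latent` argument, and the pooling-plus-linear branch that produces Phi(x) are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn

class DynamicChannelFusion1x1(nn.Module):
    """Rough sketch of dynamic channel fusion for a 1x1 convolution,
    assuming W(x) = W0 + P @ Phi(x) @ Q^T with latent dimension L."""

    def __init__(self, in_ch, out_ch, latent=8):
        super().__init__()
        self.latent = latent
        self.w0 = nn.Parameter(torch.randn(out_ch, in_ch) * 0.02)   # static kernel
        self.P = nn.Parameter(torch.randn(out_ch, latent) * 0.02)   # expand
        self.Q = nn.Parameter(torch.randn(in_ch, latent) * 0.02)    # compress
        # Lightweight branch producing the L x L dynamic fusion matrix Phi(x).
        self.phi = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_ch, latent * latent))

    def forward(self, x):
        B, C, H, W = x.shape
        phi = self.phi(x).view(B, self.latent, self.latent)       # (B, L, L)
        # Per-sample dynamic residual P @ Phi(x) @ Q^T -> (B, out_ch, in_ch).
        dyn = torch.einsum('ol,bls,cs->boc', self.P, phi, self.Q)
        w = self.w0.unsqueeze(0) + dyn                             # (B, O, C)
        # Apply the per-sample 1x1 kernel as batched channel mixing.
        return torch.einsum('boc,bchw->bohw', w, x)
```

Because the input-dependent part lives in an L x L latent matrix plus two thin projections rather than in K full kernels, the extra parameters scale with L rather than with the whole kernel, which is one way to read the parameter saving claimed in that entry.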