Revisiting Dynamic Convolution via Matrix Decomposition
- URL: http://arxiv.org/abs/2103.08756v1
- Date: Mon, 15 Mar 2021 23:03:18 GMT
- Title: Revisiting Dynamic Convolution via Matrix Decomposition
- Authors: Yunsheng Li, Yinpeng Chen, Xiyang Dai, Mengchen Liu, Dongdong Chen, Ye
Yu, Lu Yuan, Zicheng Liu, Mei Chen, Nuno Vasconcelos
- Abstract summary: We propose dynamic channel fusion to replace dynamic attention over channel groups.
Our method is easier to train and requires significantly fewer parameters without sacrificing accuracy.
- Score: 81.89967403872147
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent research in dynamic convolution shows a substantial performance
for efficient CNNs, due to the adaptive aggregation of K static convolution
kernels. It has two limitations: (a) it increases the number of convolutional
weights by K-times, and (b) the joint optimization of dynamic attention and
static convolution kernels is challenging. In this paper, we revisit it from a
new perspective of matrix decomposition and reveal the key issue is that
dynamic convolution applies dynamic attention over channel groups after
projecting into a higher dimensional latent space. To address this issue, we
propose dynamic channel fusion to replace dynamic attention over channel
groups. Dynamic channel fusion not only enables significant dimension reduction
of the latent space, but also mitigates the joint optimization difficulty. As a
result, our method is easier to train and requires significantly fewer
parameters without sacrificing accuracy. Source code is at
https://github.com/liyunsheng13/dcd.
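In the paper's decomposition view, dynamic convolution's attention over K kernels is replaced by a residual of the form W(x) = W0 + P Phi(x) Q^T, where Phi(x) is a small, input-dependent L x L channel-fusion matrix with L much smaller than the channel count C. Below is a minimal PyTorch sketch of that idea for a 1x1 convolution; the pooling head, initialization, and latent size are illustrative assumptions, not the authors' implementation (see the linked repository for that).

```python
import torch
import torch.nn as nn

class DynamicChannelFusion1x1(nn.Module):
    """Sketch of dynamic convolution decomposition for a 1x1 conv:
    W(x) = W0 + P @ Phi(x) @ Q^T, with Phi(x) an input-dependent L x L
    fusion matrix (L << C), so the dynamic part lives in a small latent
    space instead of attending over K full kernels."""

    def __init__(self, channels: int, latent: int = 8):
        super().__init__()
        self.W0 = nn.Parameter(torch.randn(channels, channels) * 0.02)  # static kernel
        self.P = nn.Parameter(torch.randn(channels, latent) * 0.02)     # expand L -> C
        self.Q = nn.Parameter(torch.randn(channels, latent) * 0.02)     # compress C -> L
        # Lightweight head predicting the L*L fusion matrix from pooled input
        # (an assumed design, analogous to a squeeze-and-excitation head).
        self.phi = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, latent * latent),
        )
        self.latent = latent

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b = x.shape[0]
        phi = self.phi(x).view(b, self.latent, self.latent)      # (B, L, L)
        # Per-sample dynamic weight W0 + P Phi(x) Q^T, shape (B, C, C).
        w_dyn = self.W0 + self.P @ phi @ self.Q.transpose(0, 1)
        # Apply as a per-sample 1x1 convolution.
        return torch.einsum("boc,bchw->bohw", w_dyn, x)
```

Because Phi(x) has only L^2 entries, the input-dependent part stays cheap, which is where the claimed dimension reduction and parameter savings come from.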
Related papers
- SGDM: Static-Guided Dynamic Module Make Stronger Visual Models [0.9012198585960443]
The spatial attention mechanism has been widely used to improve object detection performance.
We propose Razor Dynamic Convolution (RDConv) to address the two flaws in dynamic weight convolution.
We introduce the mechanism of shared weights in static convolution to solve the problem of dynamic convolution being sensitive to high-frequency noise.
arXiv Detail & Related papers (2024-03-27T06:18:40Z)
- DBA: Efficient Transformer with Dynamic Bilinear Low-Rank Attention [53.02648818164273]
We present an efficient yet effective attention mechanism, namely the Dynamic Bilinear Low-Rank Attention (DBA).
DBA compresses the sequence length by input-sensitive dynamic projection matrices and achieves linear time and space complexity.
Experiments over tasks with diverse sequence length conditions show that DBA achieves state-of-the-art performance.
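As a rough illustration of input-sensitive dynamic projection (not DBA's exact bilinear formulation, which is in the paper), the sketch below compresses keys and values from sequence length N down to a small rank r using mixing weights predicted from the input, so attention cost grows linearly in N; all module names and sizes here are assumptions.

```python
import torch
import torch.nn as nn

class DynamicLowRankAttention(nn.Module):
    """Low-rank attention sketch: K and V are mixed down to r tokens by an
    input-dependent projection, giving O(N * r) time and memory."""

    def __init__(self, dim: int, rank: int = 32):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.to_proj = nn.Linear(dim, rank)  # per-token mixing scores
        self.rank = rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, N, D)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Input-sensitive projection: softmax over N mixes tokens into r slots.
        p = torch.softmax(self.to_proj(x), dim=1)        # (B, N, r)
        k_small = p.transpose(1, 2) @ k                  # (B, r, D)
        v_small = p.transpose(1, 2) @ v                  # (B, r, D)
        attn = torch.softmax(
            q @ k_small.transpose(1, 2) / q.shape[-1] ** 0.5, dim=-1)  # (B, N, r)
        return attn @ v_small                            # (B, N, D)
```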
arXiv Detail & Related papers (2022-11-24T03:06:36Z)
- Adaptive Dynamic Filtering Network for Image Denoising [8.61083713580388]
In image denoising networks, feature scaling is widely used to enlarge the receptive field size and reduce computational costs.
We propose to employ dynamic convolution to improve the learning of high-frequency and multi-scale features.
We build an efficient denoising network with the proposed DCB and MDCB, named ADFNet.
arXiv Detail & Related papers (2022-11-22T06:54:27Z)
- Learning Cross-view Geo-localization Embeddings via Dynamic Weighted Decorrelation Regularization [52.493240055559916]
Cross-view geo-localization aims to spot images of the same location shot from two platforms, e.g., the drone platform and the satellite platform.
Existing methods usually focus on optimizing the distance between one embedding and the others in the feature space.
In this paper, we argue that low redundancy is also important, which motivates the model to mine more diverse patterns.
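A generic decorrelation regularizer of the kind this summary points at penalizes off-diagonal entries of the embedding correlation matrix, pushing feature dimensions to carry non-redundant information. The sketch below is that baseline penalty; the weighting scheme that makes the paper's version "dynamic" is not reproduced here.

```python
import torch

def decorrelation_loss(embeddings: torch.Tensor) -> torch.Tensor:
    """Penalize off-diagonal correlations between embedding dimensions.
    embeddings: (batch, dim) feature matrix."""
    z = embeddings - embeddings.mean(dim=0, keepdim=True)
    z = z / (z.std(dim=0, keepdim=True) + 1e-6)          # per-dim standardize
    corr = (z.t() @ z) / z.shape[0]                      # (dim, dim) correlation
    off_diag = corr - torch.diag(torch.diag(corr))
    return (off_diag ** 2).sum() / corr.shape[0]
```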
arXiv Detail & Related papers (2022-11-10T02:13:10Z)
- Omni-Dimensional Dynamic Convolution [25.78940854339179]
Learning a single static convolutional kernel in each convolutional layer is the common training paradigm of modern Convolutional Neural Networks (CNNs).
Recent research in dynamic convolution shows that learning a linear combination of $n$ convolutional kernels weighted with their input-dependent attentions can significantly improve the accuracy of light-weight CNNs.
We present Omni-dimensional Dynamic Convolution (ODConv), a more generalized yet elegant dynamic convolution design.
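The baseline this summary describes, a linear combination of n static kernels weighted by input-dependent attention, can be sketched as follows; ODConv itself generalizes the attention to all four kernel dimensions, and the head design and grouped-convolution trick below are illustrative choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConv2d(nn.Module):
    """n-kernel dynamic convolution sketch: mix n static kernels with
    input-dependent softmax attention, then convolve with the result."""

    def __init__(self, in_ch: int, out_ch: int, k: int = 3, n_kernels: int = 4):
        super().__init__()
        self.weight = nn.Parameter(
            torch.randn(n_kernels, out_ch, in_ch, k, k) * 0.02)
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(in_ch, n_kernels))
        self.pad = k // 2

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b = x.shape[0]
        pi = torch.softmax(self.attn(x), dim=1)               # (B, n) attention
        w = torch.einsum("bn,noikl->boikl", pi, self.weight)  # per-sample kernel
        # Fold the batch into groups so each sample gets its own kernel.
        out = F.conv2d(x.reshape(1, -1, *x.shape[2:]),
                       w.reshape(-1, *w.shape[2:]),
                       padding=self.pad, groups=b)
        return out.view(b, -1, *out.shape[2:])
```

This is the aggregation scheme whose two limitations (K-fold weight growth and hard joint optimization) the main paper above revisits.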
arXiv Detail & Related papers (2022-09-16T14:05:38Z)
- SD-Conv: Towards the Parameter-Efficiency of Dynamic Convolution [16.56592303409295]
Dynamic convolution achieves better performance for efficient CNNs at the cost of negligible FLOPs increase.
We propose a new framework, Sparse Dynamic Convolution (SD-Conv), to naturally integrate these two paths.
arXiv Detail & Related papers (2022-04-05T14:03:54Z)
- Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with Variance Reduction and its Application to Optimization [50.83356836818667]
Stochastic gradient Langevin Dynamics is one of the most fundamental algorithms for solving non-convex optimization problems.
In this paper, we present two variants of this kind, namely the Variance Reduced Langevin Dynamics and the Recursive Gradient Langevin Dynamics.
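The variance-reduction idea can be illustrated with an SVRG-style Langevin update: the minibatch gradient is recentered around a periodic full-gradient snapshot before adding Gaussian noise. The sketch below is a generic SVRG-LD step with assumed names and signature, not the paper's exact VR-LD or RG-LD variants.

```python
import torch

def svrg_ld_step(theta, snapshot, full_grad_at_snapshot, grad_fn, idx, eta, beta):
    """One variance-reduced Langevin step (sketch):
      v = g_i(theta) - g_i(snapshot) + full_grad(snapshot)
      theta' = theta - eta * v + sqrt(2 * eta / beta) * N(0, I)
    grad_fn(params, idx) is an assumed minibatch-gradient callback."""
    v = grad_fn(theta, idx) - grad_fn(snapshot, idx) + full_grad_at_snapshot
    noise = torch.randn_like(theta) * (2.0 * eta / beta) ** 0.5
    return theta - eta * v + noise
```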
arXiv Detail & Related papers (2022-03-30T11:39:00Z)
- Decoupled Dynamic Filter Networks [85.38058820176047]
We propose the Decoupled Dynamic Filter (DDF) that can simultaneously tackle both of these shortcomings.
Inspired by recent advances in attention, DDF decouples a depth-wise dynamic filter into spatial and channel dynamic filters.
We observe a significant boost in performance when replacing standard convolution with DDF in classification networks.
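The decoupling described here factorizes a per-sample depth-wise filter into a channel part and a spatial part that are multiplied back together at every pixel. A minimal sketch follows; the head designs are assumed, and the filter normalization the method uses is omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoupledDynamicFilter(nn.Module):
    """Depth-wise dynamic filtering sketch: the k x k filter at each pixel
    and channel is the product of a per-channel and a per-pixel filter."""

    def __init__(self, channels: int, k: int = 3):
        super().__init__()
        self.k = k
        self.channel_head = nn.Sequential(    # one k*k filter per channel
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels * k * k))
        self.spatial_head = nn.Conv2d(channels, k * k, 1)  # one k*k filter per pixel

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        k = self.k
        ch = self.channel_head(x).view(b, c, k * k, 1)       # (B, C, k*k, 1)
        sp = self.spatial_head(x).view(b, 1, k * k, h * w)   # (B, 1, k*k, HW)
        filt = ch * sp                                       # decoupled filter
        patches = F.unfold(x, k, padding=k // 2).view(b, c, k * k, h * w)
        return (filt * patches).sum(dim=2).view(b, c, h, w)
```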
arXiv Detail & Related papers (2021-04-29T04:55:33Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computation.
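The weight sharing named in the title expresses every convolution kernel as a linear combination of a small shared dictionary of k x k atoms, so most parameters live in the per-kernel coefficients. Below is a hedged per-layer sketch; the atom count and initialization are illustrative, and the paper's exact sharing structure may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AtomDecomposedConv2d(nn.Module):
    """Convolution whose kernels are built from shared atoms:
    W[o, i] = sum_m coeffs[o, i, m] * atoms[m]."""

    def __init__(self, in_ch: int, out_ch: int, k: int = 3, num_atoms: int = 6):
        super().__init__()
        self.atoms = nn.Parameter(torch.randn(num_atoms, k, k) * 0.1)  # shared dictionary
        self.coeffs = nn.Parameter(torch.randn(out_ch, in_ch, num_atoms) * 0.1)
        self.pad = k // 2

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Compose full kernels on the fly: (O, I, m) x (m, k, k) -> (O, I, k, k).
        w = torch.einsum("oim,mhw->oihw", self.coeffs, self.atoms)
        return F.conv2d(x, w, padding=self.pad)
```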
arXiv Detail & Related papers (2020-09-04T20:41:47Z)