KernelDNA: Dynamic Kernel Sharing via Decoupled Naive Adapters
- URL: http://arxiv.org/abs/2503.23379v1
- Date: Sun, 30 Mar 2025 09:54:07 GMT
- Title: KernelDNA: Dynamic Kernel Sharing via Decoupled Naive Adapters
- Authors: Haiduo Huang, Yadong Zhang, Pengju Ren
- Abstract summary: We propose a lightweight convolution kernel plug-in, named KernelDNA. It decouples kernel adaptation into input-dependent dynamic routing and pre-trained static modulation. Experiments on image classification and dense prediction tasks demonstrate that KernelDNA achieves state-of-the-art accuracy-efficiency balance among dynamic convolution variants.
- Score: 4.279428843696974
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dynamic convolution enhances model capacity by adaptively combining multiple kernels, yet faces critical trade-offs: prior works either (1) incur significant parameter overhead by scaling kernel numbers linearly, (2) compromise inference speed through complex kernel interactions, or (3) struggle to jointly optimize dynamic attention and static kernels. We also observe that pre-trained Convolutional Neural Networks (CNNs) exhibit inter-layer redundancy akin to that in Large Language Models (LLMs). Specifically, dense convolutional layers can be efficiently replaced by derived "child" layers generated from a shared "parent" convolutional kernel through an adapter. To address these limitations and implement the weight-sharing mechanism, we propose a lightweight convolution kernel plug-in, named KernelDNA. It decouples kernel adaptation into input-dependent dynamic routing and pre-trained static modulation, ensuring both parameter efficiency and hardware-friendly inference. Unlike existing dynamic convolutions that expand parameters via multi-kernel ensembles, our method leverages cross-layer weight sharing and adapter-based modulation, enabling dynamic kernel specialization without altering the standard convolution structure. This design preserves the native computational efficiency of standard convolutions while enhancing representation power through input-adaptive kernel adjustments. Experiments on image classification and dense prediction tasks demonstrate that KernelDNA achieves state-of-the-art accuracy-efficiency balance among dynamic convolution variants. Our codes are available at https://github.com/haiduo/KernelDNA.
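As a rough illustration of the parent/child weight-sharing idea in the abstract (a minimal NumPy sketch, not the authors' implementation: `child_kernel`, the per-channel static scale, and the sigmoid gate are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared "parent" kernel: (out_channels, in_channels, k, k)
out_ch, in_ch, k = 8, 4, 3
parent = rng.standard_normal((out_ch, in_ch, k, k))

def child_kernel(parent, static_scale, gate):
    """Derive one child layer's kernel from the shared parent.

    static_scale: per-channel modulation fixed after training (static adapter)
    gate:         per-channel factor from an input-dependent routing branch
    """
    # Broadcast the per-output-channel factors over input channels and space
    return parent * (static_scale * gate)[:, None, None, None]

# Two hypothetical child layers share one parent; only their adapters differ
scale_a = 1.0 + 0.1 * rng.standard_normal(out_ch)
scale_b = 1.0 + 0.1 * rng.standard_normal(out_ch)

# Stand-in for the routing branch: sigmoid of globally pooled features
pooled = rng.standard_normal(out_ch)
gate = 1.0 / (1.0 + np.exp(-pooled))

k_a = child_kernel(parent, scale_a, gate)
k_b = child_kernel(parent, scale_b, gate)

extra_params = 2 * out_ch    # 16: the two adapter scale vectors
shared_params = parent.size  # 288: the parent kernel, stored once
print(extra_params, shared_params)
```

Each derived kernel keeps the standard convolution shape, which is what makes the scheme hardware-friendly: inference still runs a plain convolution with the modulated weights.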
Related papers
- Diagonal Over-parameterization in Reproducing Kernel Hilbert Spaces as an Adaptive Feature Model: Generalization and Adaptivity [11.644182973599788]
The diagonal adaptive kernel model learns kernel eigenvalues and output coefficients simultaneously during training. We show that the adaptivity comes from learning the right eigenvalues during training.
arXiv Detail & Related papers (2025-01-15T09:20:02Z)
- KernelWarehouse: Rethinking the Design of Dynamic Convolution [16.101179962553385]
KernelWarehouse redefines the basic concepts of "kernels", "assembling kernels" and "attention function".
We verify the effectiveness of KernelWarehouse on the ImageNet and MS-COCO datasets using various ConvNet architectures.
arXiv Detail & Related papers (2024-06-12T05:16:26Z)
- Kernel-U-Net: Multivariate Time Series Forecasting using Custom Kernels [1.8816077341295625]
We introduce Kernel-U-Net, a flexible and kernel-customizable U-shape neural network architecture.
Specifically, Kernel-U-Net separates the procedure of partitioning input time series into patches from kernel manipulation.
Our method offers two primary advantages: 1) Flexibility in kernel customization to adapt to specific datasets; and 2) Enhanced computational efficiency, with the complexity of the Transformer layer reduced to linear.
arXiv Detail & Related papers (2024-01-03T00:49:51Z)
- Compacting Binary Neural Networks by Sparse Kernel Selection [58.84313343190488]
This paper is motivated by a previously revealed phenomenon that the binary kernels in successful BNNs are nearly power-law distributed.
We develop the Permutation Straight-Through Estimator (PSTE) that is able to not only optimize the selection process end-to-end but also maintain the non-repetitive occupancy of selected codewords.
Experiments verify that our method reduces both the model size and bit-wise computational costs, and achieves accuracy improvements compared with state-of-the-art BNNs under comparable budgets.
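For context, the plain straight-through estimator that PSTE builds on can be sketched as below. This is the generic STE (hard sign in the forward pass, clipped-identity gradient in the backward pass), not the paper's permutation-based variant; the function names and the clipping threshold are illustrative:

```python
import numpy as np

def ste_binarize_forward(w):
    # Forward: hard sign quantization (non-differentiable on its own)
    return np.where(w >= 0, 1.0, -1.0)

def ste_binarize_backward(grad_out, w, clip=1.0):
    # Backward: pretend the forward pass was the identity, but zero the
    # gradient where |w| exceeds the clip (the usual hard-tanh STE)
    return grad_out * (np.abs(w) <= clip)

w = np.array([-1.5, -0.2, 0.3, 2.0])
b = ste_binarize_forward(w)
g = ste_binarize_backward(np.ones_like(w), w)
print(b)  # [-1. -1.  1.  1.]
print(g)  # [0. 1. 1. 0.]
```

PSTE replaces the elementwise sign with a selection over a codebook of binary kernels, which is why it must additionally keep the chosen codewords non-repetitive.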
arXiv Detail & Related papers (2023-03-25T13:53:02Z)
- Omni-Dimensional Dynamic Convolution [25.78940854339179]
Learning a single static convolutional kernel in each convolutional layer is the common training paradigm of modern Convolutional Neural Networks (CNNs).
Recent research in dynamic convolution shows that learning a linear combination of $n$ convolutional kernels weighted with their input-dependent attentions can significantly improve the accuracy of light-weight CNNs.
We present Omni-dimensional Dynamic Convolution (ODConv), a more generalized yet elegant dynamic convolution design.
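The basic kernel-number attention that ODConv generalizes can be sketched as below; the routing head (`attention_weights`, `proj`) is a stand-in for the lightweight pooled-feature branch such methods typically use, and ODConv itself additionally attends over spatial, input-channel, and output-channel dimensions:

```python
import numpy as np

rng = np.random.default_rng(1)

# n candidate kernels, each of shape (out_channels, in_channels, k, k)
n, out_ch, in_ch, k = 4, 8, 4, 3
kernels = rng.standard_normal((n, out_ch, in_ch, k, k))

def attention_weights(pooled_feats, proj):
    # Softmax attention over the n candidate kernels,
    # computed from globally pooled input features
    logits = pooled_feats @ proj            # shape (n,)
    e = np.exp(logits - logits.max())
    return e / e.sum()

pooled = rng.standard_normal(in_ch)         # stand-in for global average pool
proj = rng.standard_normal((in_ch, n))      # tiny routing head (illustrative)

a = attention_weights(pooled, proj)
# Aggregate once, then run a single standard convolution with the result
W = np.tensordot(a, kernels, axes=1)        # shape (out_ch, in_ch, k, k)

print(a.sum())   # softmax weights sum to 1
print(W.shape)
```

Aggregating the weighted kernels before the convolution (rather than running n convolutions) is what keeps the per-input cost close to a single static convolution.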
arXiv Detail & Related papers (2022-09-16T14:05:38Z)
- Kernel Identification Through Transformers [54.3795894579111]
Kernel selection plays a central role in determining the performance of Gaussian Process (GP) models.
This work addresses the challenge of constructing custom kernel functions for high-dimensional GP regression models.
We introduce a novel approach named KITT: Kernel Identification Through Transformers.
arXiv Detail & Related papers (2021-06-15T14:32:38Z)
- X-volution: On the unification of convolution and self-attention [52.80459687846842]
We propose a multi-branch elementary module composed of both convolution and self-attention operation.
The proposed X-volution achieves highly competitive visual understanding improvements.
arXiv Detail & Related papers (2021-06-04T04:32:02Z)
- Flow-based Kernel Prior with Application to Blind Super-Resolution [143.21527713002354]
Kernel estimation is generally one of the key problems for blind image super-resolution (SR).
This paper proposes a normalizing flow-based kernel prior (FKP) for kernel modeling.
Experiments on synthetic and real-world images demonstrate that the proposed FKP can significantly improve the kernel estimation accuracy.
arXiv Detail & Related papers (2021-03-29T22:37:06Z)
- Revisiting Dynamic Convolution via Matrix Decomposition [81.89967403872147]
We propose dynamic channel fusion to replace dynamic attention over channel groups.
Our method is easier to train and requires significantly fewer parameters without sacrificing accuracy.
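The matrix-decomposition view can, roughly, be written as W(x) = W0 + P Phi(x) Q^T, where Phi(x) is a small input-dependent channel-fusion matrix; a hedged NumPy sketch (the dimensions, names, and routing stub are illustrative, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)

c, L = 16, 4                      # channels, low-rank fusion dim (L << c)
W0 = rng.standard_normal((c, c))  # static weight (a 1x1 conv as a matrix)
P = rng.standard_normal((c, L))   # static projection into the small space
Q = rng.standard_normal((c, L))   # static projection back out

def dynamic_weight(phi):
    # phi: small L x L input-dependent matrix from a light routing branch;
    # channels are fused in the compressed space instead of gating n kernels
    return W0 + P @ phi @ Q.T

phi = 0.1 * rng.standard_normal((L, L))
W = dynamic_weight(phi)

# Dynamic parameters per input: L*L = 16, vs c*c = 256 for a full kernel
print(W.shape, L * L, c * c)
```

Because only the L x L block is input-dependent, the routing branch stays tiny, which is consistent with the summary's claim of significantly fewer parameters.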
arXiv Detail & Related papers (2021-03-15T23:03:18Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computations.
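The atom-coefficient idea can be sketched as factoring every kernel into shared dictionary atoms plus per-kernel coefficients; a minimal NumPy illustration (all sizes are made up, and this omits ACDC's actual structural regularization):

```python
import numpy as np

rng = np.random.default_rng(3)

num_kernels, num_atoms, k = 64, 6, 5
atoms = rng.standard_normal((num_atoms, k * k))         # shared dictionary
coeffs = rng.standard_normal((num_kernels, num_atoms))  # per-kernel mixing

# Every kernel is a linear combination of the shared atoms
kernels = (coeffs @ atoms).reshape(num_kernels, k, k)

full = num_kernels * k * k                  # 1600 params unconstrained
shared = num_atoms * k * k + num_kernels * num_atoms  # 150 + 384 = 534
print(kernels.shape, full, shared)
```

The saving grows with the number of kernels, since the k*k atoms are paid for once while each extra kernel only adds `num_atoms` coefficients.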
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.