Large Convolutional Model Tuning via Filter Subspace
- URL: http://arxiv.org/abs/2403.00269v3
- Date: Tue, 25 Feb 2025 21:42:28 GMT
- Title: Large Convolutional Model Tuning via Filter Subspace
- Authors: Wei Chen, Zichen Miao, Qiang Qiu
- Abstract summary: We propose to fine-tune pre-trained models by adjusting only filter atoms, which are responsible for spatial-only convolution. We show that such a simple scheme surpasses previous tuning baselines for both discriminative and generative tasks.
- Score: 28.223665047553016
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Efficient fine-tuning methods are critical for addressing the high computational and parameter complexity of adapting large pre-trained models to downstream tasks. Our study is inspired by prior research that represents each convolution filter as a linear combination of a small set of filter subspace elements, referred to as filter atoms. In this paper, we propose to fine-tune pre-trained models by adjusting only filter atoms, which are responsible for spatial-only convolution, while preserving spatially-invariant channel combination knowledge in atom coefficients. In this way, we bring a new filter subspace view for model tuning. Furthermore, each filter atom can be recursively decomposed as a combination of another set of atoms, which naturally expands the number of tunable parameters in the filter subspace. By adapting only filter atoms constructed from a small number of parameters, while keeping the rest of the model parameters constant, the proposed approach is highly parameter-efficient. It effectively preserves the capabilities of pre-trained models and prevents overfitting to downstream tasks. Extensive experiments show that such a simple scheme surpasses previous tuning baselines for both discriminative and generative tasks.
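The decomposition described in the abstract can be illustrated with a minimal NumPy sketch. It assumes a single convolution layer whose filters are built from `m` shared filter atoms of size `k x k` and a frozen tensor of atom coefficients. The helper name `compose_filters`, the layer sizes, and the random "update" standing in for a gradient step are all hypothetical choices for illustration; this is not the authors' implementation, and the recursive atom decomposition is not covered.

```python
import numpy as np

def compose_filters(coeffs, atoms):
    """Build conv filters as linear combinations of filter atoms.

    coeffs: (c_out, c_in, m) atom coefficients (channel mixing, kept frozen)
    atoms:  (m, k, k) filter atoms (spatial patterns, the only tuned part)
    returns filters of shape (c_out, c_in, k, k)
    """
    return np.einsum("oim,mhw->oihw", coeffs, atoms)

rng = np.random.default_rng(0)

# Hypothetical layer sizes, chosen only for illustration.
c_out, c_in, k, m = 64, 32, 3, 6

coeffs = rng.standard_normal((c_out, c_in, m))  # frozen during tuning
atoms = rng.standard_normal((m, k, k))          # tuned during fine-tuning

filters = compose_filters(coeffs, atoms)

# Stand-in for a fine-tuning update: only the atoms change.
delta_atoms = 0.01 * rng.standard_normal(atoms.shape)
tuned_filters = compose_filters(coeffs, atoms + delta_atoms)

tunable_params = atoms.size      # 6 * 3 * 3 = 54
full_filter_params = filters.size  # 64 * 32 * 3 * 3 = 18432

print("tunable parameters (atoms):", tunable_params)
print("parameters in the composed filters:", full_filter_params)
```

Because the composition is linear in the atoms, a change to the atoms alters every filter in the layer while the coefficients stay untouched, which is why so few parameters need to be tuned in this sketch.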
Related papers
- Coeff-Tuning: A Graph Filter Subspace View for Tuning Attention-Based Large Models [28.223665047553016]
Transformer-based large pre-trained models have shown remarkable generalization ability.
Various parameter-efficient fine-tuning (PEFT) methods have been proposed to customize these models on downstream tasks with minimal computational and memory budgets.
In this paper, we propose to tune the large pre-trained transformers by learning a small set of combination coefficients that construct a more expressive filter subspace.
arXiv Detail & Related papers (2025-03-24T04:42:40Z) - Extra Clients at No Extra Cost: Overcome Data Heterogeneity in Federated Learning with Filter Decomposition [25.658632928800962]
We propose a technique for decomposing a convolutional filter in federated learning (FL) into a linear combination of filter subspace elements.
This simple technique transforms global filter aggregation in FL into aggregating filter atoms and their atom coefficients.
Empirical results on benchmark datasets demonstrate that our filter decomposition technique substantially improves the accuracy of FL methods.
arXiv Detail & Related papers (2025-03-11T17:42:36Z) - Learning Differentiable Particle Filter on the Fly [18.466658684464598]
Differentiable particle filters are an emerging class of sequential Bayesian inference techniques.
We propose an online learning framework for differentiable particle filters so that model parameters can be updated as data arrive.
arXiv Detail & Related papers (2023-12-10T17:54:40Z) - Implicit Maximum a Posteriori Filtering via Adaptive Optimization [4.767884267554628]
We frame the standard Bayesian filtering problem as optimization over a time-varying objective.
We show that our framework results in filters that are effective, robust, and scalable to high-dimensional systems.
arXiv Detail & Related papers (2023-11-17T15:30:44Z) - Memory-efficient particle filter recurrent neural network for object
localization [53.68402839500528]
This study proposes a novel memory-efficient recurrent neural network (RNN) architecture designed to solve the object localization problem.
We take the idea of the classical particle filter and combine it with the GRU RNN architecture.
In our experiments, the mePFRNN model provides more precise localization than the considered competitors and requires fewer trained parameters.
arXiv Detail & Related papers (2023-10-02T19:41:19Z) - As large as it gets: Learning infinitely large Filters via Neural Implicit Functions in the Fourier Domain [22.512062422338914]
Recent work in neural networks for image classification has seen a strong tendency towards increasing the spatial context.
We propose a module for studying the effective filter size of convolutional neural networks.
Our analysis shows that, although the proposed networks could learn very large convolution kernels, the learned filters are well localized and relatively small in practice.
arXiv Detail & Related papers (2023-07-19T14:21:11Z) - Filter Pruning for Efficient CNNs via Knowledge-driven Differential
Filter Sampler [103.97487121678276]
Filter pruning simultaneously accelerates the computation and reduces the memory overhead of CNNs.
We propose a novel Knowledge-driven Differential Filter Sampler (KDFS) with Masked Filter Modeling (MFM) framework for filter pruning.
arXiv Detail & Related papers (2023-07-01T02:28:41Z) - Computational Doob's h-transforms for Online Filtering of Discretely
Observed Diffusions [65.74069050283998]
We propose a computational framework to approximate Doob's $h$-transforms.
The proposed approach can be orders of magnitude more efficient than state-of-the-art particle filters.
arXiv Detail & Related papers (2022-06-07T15:03:05Z) - Deep Learning for the Benes Filter [91.3755431537592]
We present a new numerical method based on the mesh-free neural network representation of the density of the solution of the Benes model.
We discuss the role of nonlinearity in the filtering model equations for the choice of the domain of the neural network.
arXiv Detail & Related papers (2022-03-09T14:08:38Z) - Direct design of biquad filter cascades with deep learning by sampling random polynomials [5.1118282767275005]
In this work, we learn a direct mapping from the target magnitude response to the filter coefficient space with a neural network trained on millions of random filters.
We demonstrate our approach enables both fast and accurate estimation of filter coefficients given a desired response.
We compare our method against existing methods including modified Yule-Walker and gradient descent and show IIRNet is, on average, both faster and more accurate.
arXiv Detail & Related papers (2021-10-07T17:58:08Z) - Learning Versatile Convolution Filters for Efficient Visual Recognition [125.34595948003745]
This paper introduces versatile filters to construct efficient convolutional neural networks.
We conduct theoretical analysis on network complexity and an efficient convolution scheme is introduced.
Experimental results on benchmark datasets and neural networks demonstrate that our versatile filters achieve accuracy comparable to that of the original filters.
arXiv Detail & Related papers (2021-09-20T06:07:14Z) - Adaptive Convolutions with Per-pixel Dynamic Filter Atom [24.691793951360914]
We introduce scalable dynamic convolutions with per-pixel adapted filters.
As plug-and-play replacements to convolutional layers, the introduced adaptive convolutions with per-pixel dynamic atoms enable explicit modeling of intra-image variance.
We present experiments to show that the proposed method delivers comparable or even better performance across tasks.
arXiv Detail & Related papers (2021-08-17T22:04:10Z) - Dependency Aware Filter Pruning [74.69495455411987]
Pruning a proportion of unimportant filters is an efficient way to mitigate the inference cost.
Previous work prunes filters according to their weight norms or the corresponding batch-norm scaling factors.
We propose a novel mechanism to dynamically control the sparsity-inducing regularization so as to achieve the desired sparsity.
arXiv Detail & Related papers (2020-05-06T07:41:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.