Related papers: Parameter-Free Fine-tuning via Redundancy Elimination for Vision Foundation Models

URL: http://arxiv.org/abs/2504.08915v2
Date: Sun, 09 Nov 2025 01:32:03 GMT
Title: Parameter-Free Fine-tuning via Redundancy Elimination for Vision Foundation Models
Authors: Jiahuan Long, Tingsong Jiang, Wen Yao, Yizhe Xiong, Zhengqin Xu, Shuai Jia, Hanqing Liu, Chao Ma
Abstract summary: In this paper, we investigate redundancies in the segment anything model (SAM) and then propose a novel parameter-free fine-tuning method. Unlike traditional fine-tuning methods that adjust parameters, our method emphasizes selecting, reusing, and enhancing pre-trained features. Experiments on both out-of-domain and in-domain datasets demonstrate the efficiency and effectiveness of our method.
Score: 29.977749265185917
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Vision foundation models (VFMs) have demonstrated remarkable capabilities in learning universal visual representations. However, adapting these models to downstream tasks conventionally requires parameter updates, with even parameter-efficient fine-tuning methods necessitating the modification of thousands to millions of weights. In this paper, we investigate the redundancies in the segment anything model (SAM) and then propose a novel parameter-free fine-tuning method. Unlike traditional fine-tuning methods that adjust parameters, our method emphasizes selecting, reusing, and enhancing pre-trained features, offering a new perspective on fine-tuning foundation models. Specifically, we introduce a channel selection algorithm based on the model's output difference to identify redundant and effective channels. By selectively replacing the redundant channels with more effective ones, we filter out less useful features and reuse more task-relevant features for downstream tasks, thereby enhancing the task-specific feature representation. Experiments on both out-of-domain and in-domain datasets demonstrate the efficiency and effectiveness of our method in different vision tasks (e.g., image segmentation, depth estimation, and image classification). Notably, our approach can seamlessly integrate with existing fine-tuning strategies (e.g., LoRA, Adapter), further boosting the performance of already fine-tuned models. Moreover, since our channel selection involves only model inference, our method significantly reduces GPU memory overhead.
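Illustrative sketch (not the authors' implementation): the abstract describes choosing channel replacements from the model's output difference using inference only. The abstract does not give the paper's actual criterion, layers, or search order, so everything below is an assumption made for illustration: the toy linear head `toy_head`, the mean-squared-error score, the greedy search order, and the name `select_replacements` are all hypothetical.

```python
import numpy as np


def toy_head(features, weights):
    """Frozen toy 'model head' (hypothetical): weighted sum over channels.

    features has shape (num_samples, num_channels).
    """
    return features @ weights


def select_replacements(features, weights, targets):
    """Greedy channel-replacement search that uses forward passes only.

    Returns (mapping, error). mapping[c] == c means channel c is kept;
    mapping[c] == s means channel c is overwritten by channel s.
    No weight is modified and no gradient is computed.
    """
    num_channels = features.shape[1]
    mapping = np.arange(num_channels)

    def error(candidate):
        # Difference between the frozen head's output and the target.
        outputs = toy_head(features[:, candidate], weights)
        return float(np.mean((outputs - targets) ** 2))

    best = error(mapping)
    for c in range(num_channels):
        for s in range(num_channels):
            if s == c:
                continue
            trial = mapping.copy()
            trial[c] = s  # try reusing channel s in place of channel c
            trial_error = error(trial)
            if trial_error < best:
                best, mapping = trial_error, trial
    return mapping, best


# Toy data: channel 0 carries the signal, channel 1 is unrelated noise.
rng = np.random.default_rng(0)
features = rng.normal(size=(32, 2))
weights = np.ones(2)
targets = 2.0 * features[:, 0]

mapping, err = select_replacements(features, weights, targets)
print(mapping, err)
```

In this toy setup the search overwrites the noisy channel 1 with the useful channel 0, and the frozen weights are never touched. Because only forward passes are evaluated, no gradients or optimizer state need to be stored, which is in line with the abstract's remark about reduced GPU memory overhead.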

Related papers

Structural Similarity-Inspired Unfolding for Lightweight Image Super-Resolution [88.20464308588889]
We propose a Structural Similarity-Inspired Unfolding (SSIU) method for efficient image SR. This method is designed through unfolding an SR optimization function constrained by structural similarity. Our model outperforms current state-of-the-art models, boasting lower parameter counts and reduced memory consumption.
arXiv Detail & Related papers (2025-06-13T14:29:40Z)
Neural Parameter Search for Slimmer Fine-Tuned Models and Better Transfer [17.463052541838504]
Fine-tuned models often struggle outside their specific domains and exhibit considerable redundancy. Recent studies suggest that combining a pruned fine-tuned model with the original pre-trained model can mitigate interference when merging model parameters across tasks. We introduce a novel method called Neural Parameter Search (NPS-Pruning) for slimming down fine-tuned models.
arXiv Detail & Related papers (2025-05-24T14:27:20Z)
Mitigating Parameter Interference in Model Merging via Sharpness-Aware Fine-Tuning [6.110846759317336]
Large-scale deep learning models with a pretraining-finetuning paradigm have led to a surge of numerous task-specific models fine-tuned from a common pre-trained model. Research efforts have been made on merging these large models into a single multi-task model, particularly with simple arithmetic on parameters. Such merging methodology faces a central challenge: interference between model parameters fine-tuned on different tasks. We propose to fine-tune pre-trained models via sharpness-aware minimization.
arXiv Detail & Related papers (2025-04-20T15:57:12Z)
Unsupervised Parameter Efficient Source-free Post-pretraining [52.27955794126508]
We introduce UpStep, an Unsupervised Parameter-efficient Source-free post-pretraining approach to adapt a base model from a source domain to a target domain. We use various general backbone architectures, both supervised and unsupervised, trained on ImageNet as our base model.
arXiv Detail & Related papers (2025-02-28T18:54:51Z)
Parameter Efficient Merging for Multimodal Large Language Models with Complementary Parameter Adaptation [17.39117429338763]
We propose CoPA-Merging, a training-free parameter efficient merging method with complementary parameter adaptation. We establish a benchmark consisting of diverse multimodal tasks, on which we conduct experiments to demonstrate the outstanding performance and generalizability of our method.
arXiv Detail & Related papers (2025-02-24T13:52:05Z)
Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent [74.02034188307857]
Merging multiple expert models offers a promising approach for performing multi-task learning without accessing their original data. We find existing methods inevitably discard task-specific information that, while causing conflicts, is crucial for performance. Our approach consistently outperforms previous methods, achieving state-of-the-art results across diverse architectures and tasks in both vision and NLP domains.
arXiv Detail & Related papers (2025-01-02T12:45:21Z)
ALoRE: Efficient Visual Adaptation via Aggregating Low Rank Experts [71.91042186338163]
ALoRE is a novel PETL method that reuses the hypercomplex parameterized space constructed by Kronecker product to Aggregate Low Rank Experts. Thanks to the artful design, ALoRE maintains negligible extra parameters and can be effortlessly merged into the frozen backbone.
arXiv Detail & Related papers (2024-12-11T12:31:30Z)
SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation [52.6922833948127]
In this work, we investigate the importance of parameters in pre-trained diffusion models. We propose a novel model fine-tuning method to make full use of these ineffective parameters. Our method enhances the generative capabilities of pre-trained models in downstream applications.
arXiv Detail & Related papers (2024-09-10T16:44:47Z)
Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach [17.678759882763078]
Fine-tuning for pre-trained Vision Transformers aims to adeptly tailor a model to downstream tasks. Striking a balance between retaining the generalizable representation capacity of the pre-trained model and acquiring task-specific features is a key challenge. We propose a Residual-based Low-Rank Rescaling (RLRR) fine-tuning strategy.
arXiv Detail & Related papers (2024-03-28T00:14:53Z)
E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning [55.50908600818483]
Fine-tuning large-scale pretrained vision models for new tasks has become increasingly parameter-intensive. We propose an Effective and Efficient Visual Prompt Tuning (E2VPT) approach for large-scale transformer-based model adaptation. Our approach outperforms several state-of-the-art baselines on two benchmarks.
arXiv Detail & Related papers (2023-07-25T19:03:21Z)
TIES-Merging: Resolving Interference When Merging Models [95.59265307318752]
Transfer learning can confer significant advantages, including improved downstream performance, faster convergence, and better sample efficiency. Model merging has emerged as a solution to combine multiple task-specific models into a single model without performing additional training. Existing merging methods often ignore the interference between parameters of different models, resulting in large performance drops when merging multiple models. We propose TIES-Merging, which introduces three novel steps when merging models: resetting parameters that only changed a small amount during fine-tuning, resolving sign conflicts, and merging only the parameters that are in alignment with the final agreed-upon sign.
arXiv Detail & Related papers (2023-06-02T17:31:32Z)
EnfoMax: Domain Entropy and Mutual Information Maximization for Domain Generalized Face Anti-spoofing [0.0]
Face anti-spoofing (FAS) methods perform well under intra-domain setups. Domain generalization (DG) methods have gained more attention in FAS. This paper proposes the EnfoMax framework, which uses information theory to analyze cross-domain FAS tasks.
arXiv Detail & Related papers (2023-02-17T03:54:18Z)
Parameter-efficient Model Adaptation for Vision Transformers [45.3460867776953]
We study parameter-efficient model adaptation strategies for vision transformers on the image classification task. We propose a parameter-efficient model adaptation framework, which first selects submodules by measuring local intrinsic dimensions. Our method performs the best in terms of the tradeoff between accuracy and parameter efficiency across 20 image classification datasets.
arXiv Detail & Related papers (2022-03-29T05:30:09Z)
Towards a Unified View of Parameter-Efficient Transfer Learning [108.94786930869473]
Fine-tuning large pre-trained language models on downstream tasks has become the de-facto learning paradigm in NLP. Recent work has proposed a variety of parameter-efficient transfer learning methods that only fine-tune a small number of (extra) parameters to attain strong performance. We break down the design of state-of-the-art parameter-efficient transfer learning methods and present a unified framework that establishes connections between them.
arXiv Detail & Related papers (2021-10-08T20:22:26Z)
Operation-Aware Soft Channel Pruning using Differentiable Masks [51.04085547997066]
We propose a data-driven algorithm, which compresses deep neural networks in a differentiable way by exploiting the characteristics of operations. We perform extensive experiments and achieve outstanding performance in terms of the accuracy of output networks.
arXiv Detail & Related papers (2020-07-08T07:44:00Z)

This list is automatically generated from the titles and abstracts of the papers on this site.