Preserving Domain Generalization in Fine-Tuning via Joint Parameter Selection
- URL: http://arxiv.org/abs/2508.16976v1
- Date: Sat, 23 Aug 2025 10:00:45 GMT
- Title: Preserving Domain Generalization in Fine-Tuning via Joint Parameter Selection
- Authors: Bin Pan, Shiyu Shen, Zongbin Wang, Zhenwei Shi, Xia Xu
- Abstract summary: We introduce Joint Parameter Selection (JPS), a novel method that restricts updates to a small subset of parameters, thereby retaining and harnessing the strength of pre-trained models. JPS achieves superior performance compared to state-of-the-art domain generalization methods, substantiating both the efficiency and efficacy of the proposed approach.
- Score: 26.366275954455514
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Domain generalization seeks to develop models trained on a limited set of source domains that are capable of generalizing effectively to unseen target domains. While the predominant approach leverages large-scale pre-trained vision models as initialization, recent studies have highlighted that full fine-tuning can compromise the intrinsic generalization capabilities of these models. To address this limitation, parameter-efficient adaptation strategies have emerged, wherein only a subset of model parameters is selectively fine-tuned, thereby balancing task adaptation with the preservation of generalization. Motivated by this paradigm, we introduce Joint Parameter Selection (JPS), a novel method that restricts updates to a small, sparse subset of parameters, thereby retaining and harnessing the generalization strength of pre-trained models. Theoretically, we establish a generalization error bound that explicitly accounts for the sparsity of parameter updates, thereby providing a principled justification for selective fine-tuning. Practically, we design a selection mechanism employing dual operators to identify and update parameters exhibiting consistent and significant gradients across all source domains. Extensive benchmark experiments demonstrate that JPS achieves superior performance compared to state-of-the-art domain generalization methods, substantiating both the efficiency and efficacy of the proposed approach.
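The selection mechanism is described only at the level of "dual operators" acting on gradients that are consistent and significant across all source domains. The PyTorch sketch below shows one plausible reading of that idea, not the authors' implementation: the function name `joint_parameter_mask`, the `keep_ratio` hyperparameter, and the sign-consistency and top-magnitude criteria are illustrative assumptions.

```python
import torch
import torch.nn as nn

def joint_parameter_mask(per_domain_grads, keep_ratio=0.01):
    """Keep only parameters whose gradients are sign-consistent across all
    source domains and among the largest in average magnitude (assumed criteria)."""
    masks = {}
    for name in per_domain_grads[0]:
        grads = torch.stack([g[name] for g in per_domain_grads])  # [num_domains, ...]
        signs = grads.sign()
        consistent = (signs == signs[0]).all(dim=0) & (signs[0] != 0)
        score = grads.abs().mean(dim=0)
        k = max(1, int(keep_ratio * score.numel()))
        threshold = score.flatten().topk(k).values.min()
        masks[name] = (consistent & (score >= threshold)).float()
    return masks

# Usage sketch: collect one gradient snapshot per source domain, then fine-tune
# with gradients masked so that only the selected sparse subset is updated.
model = nn.Linear(16, 4)                       # stand-in for a pre-trained backbone
domains = [torch.randn(32, 16) for _ in range(3)]
labels = [torch.randint(0, 4, (32,)) for _ in range(3)]

per_domain_grads = []
for x, y in zip(domains, labels):
    model.zero_grad()
    nn.functional.cross_entropy(model(x), y).backward()
    per_domain_grads.append({n: p.grad.detach().clone()
                             for n, p in model.named_parameters()})

masks = joint_parameter_mask(per_domain_grads, keep_ratio=0.05)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
for x, y in zip(domains, labels):              # one masked fine-tuning pass
    optimizer.zero_grad()
    nn.functional.cross_entropy(model(x), y).backward()
    for n, p in model.named_parameters():
        p.grad.mul_(masks[n])                  # update only the selected subset
    optimizer.step()
```

In this reading, sparsity is controlled directly by `keep_ratio`, matching the abstract's claim that only a small, sparse subset of parameters is ever updated.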
Related papers
- DSP-Reg: Domain-Sensitive Parameter Regularization for Robust Domain Generalization [21.0252973774713]
Domain Generalization is a critical area that focuses on developing models capable of performing well on data from unseen distributions. Existing approaches primarily concentrate on learning domain-invariant features, assuming that a model robust to variations in the source domains will generalize well to unseen target domains. We propose Domain-Sensitive Parameter Regularization (DSP-Reg), a principled framework that guides model optimization with a soft regularization technique.
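The summary only states that optimization is guided by a soft regularization. As a hypothetical illustration (not DSP-Reg's actual formulation), one could weight an L2 pull toward the pre-trained weights by how differently each parameter's gradient behaves across source domains:

```python
import torch

def domain_sensitivity(per_domain_grads):
    """Assumed proxy for domain sensitivity: the variance of each parameter's
    gradient across source domains (the paper's exact measure is not given here)."""
    return {name: torch.stack([g[name] for g in per_domain_grads]).var(dim=0)
            for name in per_domain_grads[0]}

def soft_regularizer(model, pretrained_state, sensitivity, weight=1e-3):
    """Soft penalty: pull parameters toward their pre-trained values, with a
    stronger pull on parameters flagged as domain-sensitive."""
    reg = 0.0
    for name, p in model.named_parameters():
        reg = reg + (sensitivity[name] * (p - pretrained_state[name]) ** 2).sum()
    return weight * reg
```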
arXiv Detail & Related papers (2026-01-27T09:24:51Z) - Detecting and Pruning Prominent but Detrimental Neurons in Large Language Models [68.57424628540907]
Large language models (LLMs) often develop learned mechanisms specialized to specific datasets. We introduce a fine-tuning approach designed to enhance generalization by identifying and pruning neurons associated with dataset-specific mechanisms. Our method employs Integrated Gradients to quantify each neuron's influence on high-confidence predictions, pinpointing those that disproportionately contribute to dataset-specific performance.
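The summary names Integrated Gradients over neurons but gives no further detail. The following is a minimal, self-contained sketch of that general idea: attribute high-confidence predictions to hidden units, then zero out the most influential ones. The toy MLP, the zero baseline, the step count, and the number of pruned neurons are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    def __init__(self, d_in=16, d_hidden=32, d_out=4):
        super().__init__()
        self.fc1 = nn.Linear(d_in, d_hidden)
        self.fc2 = nn.Linear(d_hidden, d_out)

    def forward(self, x, hidden_mask=None):
        h = torch.relu(self.fc1(x))
        if hidden_mask is not None:
            h = h * hidden_mask          # "pruning" = zeroing selected neurons
        return self.fc2(h)

def integrated_gradients_hidden(model, x, target, steps=32):
    """Attribute each example's target logit to hidden neurons by integrating
    gradients along a straight path from a zero hidden state to the actual one."""
    with torch.no_grad():
        h = torch.relu(model.fc1(x))     # actual hidden activations
    total = torch.zeros_like(h)
    idx = torch.arange(len(x))
    for alpha in torch.linspace(0.0, 1.0, steps):
        h_step = (alpha * h).clone().requires_grad_(True)
        logit = model.fc2(h_step)[idx, target].sum()
        grad, = torch.autograd.grad(logit, h_step)
        total += grad
    return h * total / steps             # Riemann-sum IG approximation

# Mask out the hidden neurons with the largest average attribution.
model = MLP()
x, y = torch.randn(64, 16), torch.randint(0, 4, (64,))
attr = integrated_gradients_hidden(model, x, y).mean(dim=0)
mask = torch.ones_like(attr)
mask[attr.topk(4).indices] = 0.0
pruned_logits = model(x, hidden_mask=mask)
```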
arXiv Detail & Related papers (2025-07-12T08:10:10Z) - Continual Adaptation: Environment-Conditional Parameter Generation for Object Detection in Dynamic Scenarios [54.58186816693791]
Environments constantly change over time and space, posing significant challenges for object detectors trained under a closed-set assumption. We propose a new mechanism that converts the fine-tuning process into environment-conditional parameter generation. In particular, we first design a dual-path LoRA-based domain-aware adapter that disentangles features into domain-invariant and domain-specific components.
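The phrase "dual-path LoRA-based domain-aware adapter" is all the summary provides. The snippet below is one hypothetical instantiation, with a shared (intended domain-invariant) low-rank path and per-domain low-rank paths on top of a frozen linear layer; class and parameter names, the fixed domain index, and the rank are assumptions, and the paper itself generates parameters conditioned on the environment rather than indexing a fixed set.

```python
import torch
import torch.nn as nn

class DualPathLoRALinear(nn.Module):
    """Frozen linear layer plus two low-rank paths: one shared (intended to be
    domain-invariant) and one selected per domain (domain-specific)."""
    def __init__(self, base: nn.Linear, rank=4, num_domains=3):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False            # keep the pre-trained weights frozen
        d_in, d_out = base.in_features, base.out_features
        self.inv_A = nn.Parameter(torch.randn(rank, d_in) * 0.01)
        self.inv_B = nn.Parameter(torch.zeros(d_out, rank))
        self.spec_A = nn.Parameter(torch.randn(num_domains, rank, d_in) * 0.01)
        self.spec_B = nn.Parameter(torch.zeros(num_domains, d_out, rank))

    def forward(self, x, domain: int):
        shared = x @ self.inv_A.t() @ self.inv_B.t()
        specific = x @ self.spec_A[domain].t() @ self.spec_B[domain].t()
        return self.base(x) + shared + specific

layer = DualPathLoRALinear(nn.Linear(16, 16), rank=4, num_domains=3)
y = layer(torch.randn(8, 16), domain=1)        # pick the adapter path for domain 1
```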
arXiv Detail & Related papers (2025-06-30T17:14:12Z) - Partial Transportability for Domain Generalization [56.37032680901525]
Building on the theory of partial identification and transportability, this paper introduces new results for bounding the value of a functional of the target distribution. Our contribution is to provide the first general estimation technique for transportability problems. We propose a gradient-based optimization scheme for making scalable inferences in practice.
arXiv Detail & Related papers (2025-03-30T22:06:37Z) - Unsupervised Parameter Efficient Source-free Post-pretraining [52.27955794126508]
We introduce UpStep, an unsupervised, parameter-efficient, source-free post-pretraining approach to adapt a base model from a source domain to a target domain. We use various general backbone architectures, both supervised and unsupervised, trained on ImageNet as our base models.
arXiv Detail & Related papers (2025-02-28T18:54:51Z) - SoMA: Singular Value Decomposed Minor Components Adaptation for Domain Generalizable Representation Learning [6.262268096839562]
Domain generalization aims to adapt a model using one or multiple source domains to ensure robust performance in unseen target domains. Existing PEFT methods struggle to strike a balance between preserving generalizable components of the pre-trained model and learning task-specific features. We introduce Singular Value Decomposed Minor Components Adaptation (SoMA), an approach that selectively tunes minor singular components while keeping the residual parts frozen.
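A minimal sketch of the stated idea follows: split a pre-trained weight by SVD, freeze the dominant (major) components, and train only the minor singular components. The split rank, the factorized storage, and the class name are illustrative assumptions rather than the paper's exact parameterization.

```python
import torch
import torch.nn as nn

class MinorComponentLinear(nn.Module):
    """Split a pre-trained weight via SVD; the top-r (major) components stay
    frozen, while only the remaining minor components are trainable."""
    def __init__(self, base: nn.Linear, major_rank=8):
        super().__init__()
        W = base.weight.data                              # [d_out, d_in]
        U, S, Vh = torch.linalg.svd(W, full_matrices=False)
        r = major_rank
        self.register_buffer("W_major", U[:, :r] @ torch.diag(S[:r]) @ Vh[:r])
        # Minor components are stored factorized and remain trainable.
        self.U_minor = nn.Parameter(U[:, r:].clone())
        self.S_minor = nn.Parameter(S[r:].clone())
        self.Vh_minor = nn.Parameter(Vh[r:].clone())
        self.bias = nn.Parameter(base.bias.data.clone()) if base.bias is not None else None

    def forward(self, x):
        W_minor = self.U_minor @ torch.diag(self.S_minor) @ self.Vh_minor
        return nn.functional.linear(x, self.W_major + W_minor, self.bias)

layer = MinorComponentLinear(nn.Linear(16, 16), major_rank=8)
y = layer(torch.randn(4, 16))
```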
arXiv Detail & Related papers (2024-12-05T11:17:57Z) - Learn to Preserve and Diversify: Parameter-Efficient Group with Orthogonal Regularization for Domain Generalization [28.977757627384165]
Domain Generalization (DG) aims to avoid performance degradation when a distribution shift occurs between the limited training data and unseen test data.
Recently, foundation models with enormous parameters have been pre-trained with huge datasets, demonstrating strong generalization ability.
Our framework achieves SOTA performance on five DG benchmarks, while requiring training of only a small number of parameters and adding no extra testing cost.
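The abstract does not spell out the orthogonal regularization. As a loose, hypothetical reading of the title (not the paper's actual loss), one might penalize overlap among a group of parameter-efficient adapters so that they stay diverse:

```python
import torch

def orthogonal_regularizer(adapter_weights, weight=1e-2):
    """Encourage a group of same-shaped adapter weight matrices to be mutually
    orthogonal (diverse): penalize pairwise cosine similarity of flattened weights."""
    flat = torch.stack([w.flatten() for w in adapter_weights])   # [G, P]
    flat = torch.nn.functional.normalize(flat, dim=1)
    gram = flat @ flat.t()                                       # [G, G]
    off_diag = gram - torch.eye(len(adapter_weights))
    return weight * (off_diag ** 2).sum()

adapters = [torch.randn(16, 4, requires_grad=True) for _ in range(3)]
loss_reg = orthogonal_regularizer(adapters)
```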
arXiv Detail & Related papers (2024-07-21T07:50:49Z) - Scaling Exponents Across Parameterizations and Optimizers [94.54718325264218]
We propose a new perspective on parameterization by investigating a key assumption in prior work.
Our empirical investigation includes tens of thousands of models trained with all combinations of the parameterizations and optimizers under study.
We find that the best learning rate scaling prescription would often have been excluded by the assumptions in prior work.
arXiv Detail & Related papers (2024-07-08T12:32:51Z) - Systematic Analysis for Pretrained Language Model Priming for Parameter-Efficient Fine-tuning [45.99877631719761]
We propose a general PE priming framework to enhance and explore the few-shot adaptation and generalization ability of PE methods.
We conduct experiments on a few-shot cross-domain benchmark containing 160 diverse NLP tasks.
arXiv Detail & Related papers (2022-12-02T08:56:53Z) - Learning to Learn Domain-invariant Parameters for Domain Generalization [29.821634033299855]
Domain generalization (DG) aims to overcome domain shift by capturing domain-invariant representations from source domains.
We propose two modules: Domain Decoupling and Combination (DDC) and Domain-invariance-guided Backpropagation (DIGB).
Our proposed method has achieved state-of-the-art performance with strong generalization capability.
arXiv Detail & Related papers (2022-11-04T07:19:34Z) - Variational Model Perturbation for Source-Free Domain Adaptation [64.98560348412518]
We introduce perturbations into the model parameters by variational Bayesian inference in a probabilistic framework.
We demonstrate the theoretical connection to learning Bayesian neural networks, which proves the generalizability of the perturbed model to target domains.
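A rough sketch of what parameter perturbation by variational inference can look like: the source-trained weight serves as the mean of a Gaussian, only the perturbation scale is learned via the reparameterization trick, and a KL term keeps the perturbed model close to the source model. The layer name, the log-sigma parameterization, and the prior scale below are assumptions for illustration, not the paper's exact construction.

```python
import torch
import torch.nn as nn

class PerturbedLinear(nn.Module):
    """Keep the source-trained weight as the mean and learn a per-weight scale;
    each forward pass samples a perturbed weight (reparameterization trick)."""
    def __init__(self, base: nn.Linear):
        super().__init__()
        self.register_buffer("mu", base.weight.data.clone())
        self.log_sigma = nn.Parameter(torch.full_like(base.weight, -5.0))
        self.bias = nn.Parameter(base.bias.data.clone()) if base.bias is not None else None

    def forward(self, x):
        eps = torch.randn_like(self.mu)
        W = self.mu + self.log_sigma.exp() * eps       # W ~ N(mu, sigma^2)
        return nn.functional.linear(x, W, self.bias)

    def kl_to_prior(self, prior_sigma=0.1):
        """KL(N(mu, sigma^2) || N(mu, prior_sigma^2)): keeps perturbations small."""
        sigma2 = (2 * self.log_sigma).exp()
        return 0.5 * (sigma2 / prior_sigma**2 - 1 - 2 * self.log_sigma
                      + 2 * torch.log(torch.tensor(prior_sigma))).sum()

layer = PerturbedLinear(nn.Linear(16, 16))
loss = layer(torch.randn(8, 16)).pow(2).mean() + 1e-3 * layer.kl_to_prior()
```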
arXiv Detail & Related papers (2022-10-19T08:41:19Z) - Improving Hyperparameter Optimization by Planning Ahead [3.8673630752805432]
We propose a novel transfer learning approach, defined within the context of model-based reinforcement learning.
We propose a new variant of model predictive control which employs a simple look-ahead strategy as a policy.
Our experiments on three meta-datasets, comparing against state-of-the-art HPO algorithms, show that the proposed method can outperform all baselines.
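A toy sketch of the look-ahead idea (not the paper's MPC formulation): roll a learned surrogate of the loss dynamics a few steps forward for each candidate hyperparameter and pick the one whose simulated trajectory ends lowest. The surrogate, horizon, and candidate set below are placeholders.

```python
def lookahead_select(configs, current_loss, surrogate, horizon=3):
    """Pick the hyperparameter config whose simulated loss trajectory,
    rolled `horizon` steps ahead with a learned surrogate, ends lowest."""
    best_cfg, best_pred = None, float("inf")
    for cfg in configs:
        loss = current_loss
        for _ in range(horizon):             # simple look-ahead roll-out
            loss = surrogate(loss, cfg)      # predicted next-step loss
        if loss < best_pred:
            best_cfg, best_pred = cfg, loss
    return best_cfg

# Toy surrogate standing in for a model learned on HPO meta-data.
surrogate = lambda loss, lr: loss * (1.0 - min(lr, 0.5)) + 0.01 * lr
best_lr = lookahead_select([0.001, 0.01, 0.1], current_loss=1.0,
                           surrogate=surrogate, horizon=3)
```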
arXiv Detail & Related papers (2021-10-15T11:46:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.