Variational Task Vector Composition
- URL: http://arxiv.org/abs/2509.18208v1
- Date: Sun, 21 Sep 2025 02:46:02 GMT
- Title: Variational Task Vector Composition
- Authors: Boyuan Zhang, Yingjun Du, Xiantong Zhen, Ling Shao
- Abstract summary: We propose variational task vector composition, where composition coefficients are taken as latent variables and estimated in a Bayesian inference framework. Motivated by the observation of structural redundancy in task vectors, we introduce a Spike-and-Slab prior that promotes sparsity. We develop a gated sampling mechanism that constructs a controllable posterior by filtering the composition coefficients based on both uncertainty and importance.
- Score: 53.476598858325985
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Task vectors capture how a model changes during fine-tuning by recording the difference between pre-trained and task-specific weights. The composition of task vectors, a key operator in task arithmetic, enables models to integrate knowledge from multiple tasks without incurring additional inference costs. In this paper, we propose variational task vector composition, where composition coefficients are taken as latent variables and estimated in a Bayesian inference framework. Unlike previous methods that operate at the task level, our framework focuses on sample-specific composition. Motivated by the observation of structural redundancy in task vectors, we introduce a Spike-and-Slab prior that promotes sparsity and preserves only the most informative components. To further address the high variance and sampling inefficiency in sparse, high-dimensional spaces, we develop a gated sampling mechanism that constructs a controllable posterior by filtering the composition coefficients based on both uncertainty and importance. This yields a more stable and interpretable variational framework by deterministically selecting reliable task components, reducing sampling variance while improving transparency and generalization. Experimental results demonstrate that our method consistently outperforms existing approaches across all datasets by selectively leveraging the most reliable and informative components in task vectors. These findings highlight the practical value of our approach, establishing a new standard for efficient and effective task vector composition.
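The core operations the abstract describes can be illustrated concretely. Below is a minimal numpy sketch of task vectors (fine-tuned minus pre-trained weights) and a sparse, gated composition; the magnitude-based gate is a crude deterministic stand-in for the paper's Spike-and-Slab prior and gated sampling mechanism, and all names and sizes are illustrative, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for model weights (a real model would have many tensors).
theta_pre = rng.normal(size=128)                  # pre-trained weights
theta_tasks = [theta_pre + 0.1 * rng.normal(size=128) for _ in range(3)]

# A task vector is the difference between fine-tuned and pre-trained weights.
task_vectors = [t - theta_pre for t in theta_tasks]

def compose(theta_pre, task_vectors, coeffs, keep_ratio=0.5):
    """Sparse composition: zero out low-magnitude components of each task
    vector (a crude stand-in for Spike-and-Slab sparsity), then add the
    gated, coefficient-weighted vectors to the backbone."""
    theta = theta_pre.copy()
    for c, tv in zip(coeffs, task_vectors):
        k = int(keep_ratio * tv.size)
        thresh = np.sort(np.abs(tv))[-k]          # magnitude cutoff
        gate = (np.abs(tv) >= thresh).astype(tv.dtype)
        theta += c * gate * tv
    return theta

theta_merged = compose(theta_pre, task_vectors, coeffs=[0.4, 0.4, 0.2])
print(theta_merged.shape)  # (128,)
```

In the paper the coefficients and gates are inferred per sample in a variational framework rather than fixed as here.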
Related papers
- Decomposing Task Vectors for Refined Model Editing [21.799465464971092]
We propose a principled decomposition method that separates each task vector into two components. By identifying invariant subspaces across projections, our approach enables more precise control over concept manipulation.
arXiv Detail & Related papers (2025-12-27T07:53:44Z)
- Purifying Task Vectors in Knowledge-Aware Subspace for Model Merging [83.5273168208788]
Model merging aims to integrate task-specific abilities from individually fine-tuned models into a single model without extra training. The merged model often suffers from notable performance degradation due to the conflicts caused by task-irrelevant redundancy in task vectors. We propose Purifying TAsk Vectors (PAVE) in knowledge-aware subspace to overcome these challenges.
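One way to picture purification in a knowledge-aware subspace is to project a task vector onto the directions that dominate a layer's inputs, discarding the rest as task-irrelevant redundancy. The numpy sketch below does exactly that; it is an illustrative construction under assumed shapes, not PAVE's actual procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: a task vector for one weight matrix, plus input
# activations representing the "knowledge" the layer should preserve.
task_vector = rng.normal(size=(64, 32))
activations = rng.normal(size=(32, 200))          # feature dim x samples

# Knowledge-aware subspace: top-r left-singular directions of the inputs.
r = 8
U, _, _ = np.linalg.svd(activations, full_matrices=False)
basis = U[:, :r]                                  # (32, r) orthonormal basis

# Keep only the part of the task vector acting on that subspace.
purified = task_vector @ basis @ basis.T
print(purified.shape)  # (64, 32)
```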
arXiv Detail & Related papers (2025-10-16T14:02:57Z)
- Reverse Probing: Evaluating Knowledge Transfer via Finetuned Task Embeddings for Coreference Resolution [23.375053899418504]
Instead of probing frozen representations from a complex source task, we explore the effectiveness of embeddings from multiple simple source tasks on a single target task. Our findings reveal that task embeddings vary significantly in utility for coreference resolution, with semantic similarity tasks proving most beneficial.
arXiv Detail & Related papers (2025-01-31T17:12:53Z)
- Revisiting Weight Averaging for Model Merging [16.503826062785773]
Model merging aims to build a multi-task learner by combining the parameters of individually fine-tuned models without additional training. Weight averaging implicitly induces task vectors centered around the weight average itself. Applying a low-rank approximation to these centered task vectors significantly improves merging performance.
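The centering and low-rank step described above can be sketched in a few lines of numpy: re-express each model's weights relative to the weight average, stack these centered task vectors, and truncate their SVD. Sizes and the rank are arbitrary toy choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

# Flattened weights of several fine-tuned models (toy sizes).
models = rng.normal(size=(4, 256))
avg = models.mean(axis=0)

# Task vectors re-centered around the weight average rather than the
# pre-trained checkpoint, as the summary above describes.
centered = models - avg                           # (4, 256)

# Low-rank approximation of the stacked centered vectors via truncated SVD.
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
rank = 2
low_rank = U[:, :rank] * S[:rank] @ Vt[:rank]     # rank-2 reconstruction

# Relative reconstruction error of the truncated approximation.
err = np.linalg.norm(centered - low_rank) / np.linalg.norm(centered)
print(round(err, 3))
```

The denoised vectors `avg + low_rank[i]` could then stand in for the original fine-tuned weights when merging.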
arXiv Detail & Related papers (2024-12-11T06:29:20Z)
- Knowledge Composition using Task Vectors with Learned Anisotropic Scaling [51.4661186662329]
We introduce aTLAS, an algorithm that linearly combines parameter blocks with different learned coefficients, resulting in anisotropic scaling at the task vector level.
We show that such linear combinations explicitly exploit the low intrinsic dimensionality of pre-trained models, with only a few coefficients being the learnable parameters.
We demonstrate the effectiveness of our method in task arithmetic, few-shot recognition and test-time adaptation, with supervised or unsupervised objectives.
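Anisotropic scaling at the parameter-block level can be sketched as one learned coefficient per (task vector, block) pair rather than a single global scalar per task. The numpy example below uses fixed coefficients for illustration; block names, shapes, and values are assumptions, and in aTLAS the coefficients would be optimized against a task objective.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy model as named parameter blocks.
blocks = {"attn": (16, 16), "mlp": (16, 64)}
theta_pre = {k: rng.normal(size=s) for k, s in blocks.items()}
task_vectors = [{k: 0.1 * rng.normal(size=s) for k, s in blocks.items()}
                for _ in range(2)]

# Coefficients of shape (num_tasks, num_blocks): each task vector gets a
# different scale per block, i.e. anisotropic rather than uniform scaling.
coeffs = np.array([[0.8, 0.1],
                   [0.2, 0.9]])

merged = {}
for j, name in enumerate(blocks):
    merged[name] = theta_pre[name] + sum(
        coeffs[i, j] * tv[name] for i, tv in enumerate(task_vectors))
print(sorted(merged))  # ['attn', 'mlp']
```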
arXiv Detail & Related papers (2024-07-03T07:54:08Z)
- CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning [101.81127587760831]
Current fine-tuning methods build adapters widely agnostic of the context of the downstream task to learn, or of the context of the important knowledge to maintain. We propose CorDA, a Context-oriented Decomposition Adaptation method that builds learnable task-aware adapters. Our method enables two options: knowledge-preserved adaptation and instruction-previewed adaptation.
arXiv Detail & Related papers (2024-06-07T19:10:35Z)
- ComPtr: Towards Diverse Bi-source Dense Prediction Tasks via A Simple yet General Complementary Transformer [71.82644727907146]
We propose a novel ComPlementary transformer, ComPtr, for diverse bi-source dense prediction tasks. ComPtr treats different inputs equally and builds an efficient dense interaction model in the form of sequence-to-sequence on top of the transformer.
arXiv Detail & Related papers (2023-07-23T15:17:45Z) - Automated Concatenation of Embeddings for Structured Prediction [75.44925576268052]
We propose Automated Concatenation of Embeddings (ACE) to automate the process of finding better concatenations of embeddings for structured prediction tasks.
We follow strategies in reinforcement learning to optimize the parameters of the controller and compute the reward based on the accuracy of a task model.
arXiv Detail & Related papers (2020-10-10T14:03:20Z)
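The embedding-concatenation search that ACE automates can be pictured as a controller emitting a binary mask over candidate embeddings, with only the selected ones concatenated as features. The sketch below shows one such mask applied by hand; the embedding names and sizes are invented for illustration, and the reinforcement-learning loop that updates the controller from task accuracy is omitted.

```python
import numpy as np

rng = np.random.default_rng(4)

# Candidate embeddings for the same 10 tokens (e.g. word, char, contextual).
candidates = {"word": rng.normal(size=(10, 8)),
              "char": rng.normal(size=(10, 4)),
              "ctx":  rng.normal(size=(10, 16))}

def concatenate(candidates, mask):
    """Concatenate only the embeddings the controller's binary mask selects."""
    chosen = [emb for (name, emb), m in zip(candidates.items(), mask) if m]
    return np.concatenate(chosen, axis=1)

# One controller action: keep word + contextual embeddings, drop char.
features = concatenate(candidates, mask=[1, 0, 1])
print(features.shape)  # (10, 24)
```

An RL controller would sample many such masks, train a task model on each resulting feature set, and use its accuracy as the reward for updating the selection probabilities.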
This list is automatically generated from the titles and abstracts of the papers in this site.