Related papers: Task Vector Bases: A Unified and Scalable Framework for Compressed Task Arithmetic

Task Vector Bases: A Unified and Scalable Framework for Compressed Task Arithmetic

URL: http://arxiv.org/abs/2502.01015v4
Date: Thu, 09 Oct 2025 05:18:04 GMT
Title: Task Vector Bases: A Unified and Scalable Framework for Compressed Task Arithmetic
Authors: Siqi Zeng, Yifei He, Meitong Liu, Weiqiu You, Yifan Hao, Yao-Hung Hubert Tsai, Makoto Yamada, Han Zhao,
Abstract summary: We propose Task Vector Bases, a framework compressing $T$ task vectors into $M T$ basis vectors while preserving the functionality of task arithmetic.<n>By representing each task vector as a structured linear combination of basis atoms, our approach supports standard operations such as addition, negation, as well as more advanced arithmetic ones.
Score: 24.40854328492979
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Task arithmetic, representing downstream tasks through linear operations on task vectors, has emerged as a simple yet powerful paradigm for transferring knowledge across diverse settings. However, maintaining a large collection of task vectors introduces scalability challenges in both storage and computation. We propose Task Vector Bases, a framework compressing $T$ task vectors into $M < T$ basis vectors while preserving the functionality of task arithmetic. By representing each task vector as a structured linear combination of basis atoms, our approach supports standard operations such as addition, negation, as well as more advanced arithmetic ones. The framework is orthogonal to other efficiency-oriented improvements in task arithmetic and can be used in combination with them. We provide theoretical analysis showing that basis compression retains addition generalization guarantees and enables principled unlearning, with error bounds depending on reconstruction quality. Empirically, our proposed basis construction methods consistently outperform heuristic basis construction baselines and, in some cases, even surpass the performance of full task vector collections across diverse downstream applications while reducing storage and computational requirements. The code is available at https://github.com/uiuctml/TaskVectorBasis.

Related papers

Escaping Optimization Stagnation: Taking Steps Beyond Task Arithmetic via Difference Vectors [7.805099851866648]
Current methods for editing pre-trained models face significant challenges, primarily high computational costs and limited scalability.<n> Task arithmetic has recently emerged as a promising solution, using simple arithmetic operations-addition and negation-based on task vectors.<n>We propose Difference Vector-based Anisotropic Scaling Iterative algorithm (DV-BASI) to enable a continuous optimization process for task arithmetic methods.
arXiv Detail & Related papers (2025-11-22T09:01:05Z)
Purifying Task Vectors in Knowledge-Aware Subspace for Model Merging [83.5273168208788]
Model merging aims to integrate task-specific abilities from individually fine-tuned models into a single model without extra training.<n>The merged model often suffers from notable performance degradation due to the conflicts caused by task-irrelevant redundancy in task vectors.<n>We propose Purifying TAsk Vectors (PAVE) in knowledge-aware subspace to overcome these challenges.
arXiv Detail & Related papers (2025-10-16T14:02:57Z)
Variational Task Vector Composition [53.476598858325985]
We propose variational task vector composition, where composition coefficients are taken as latent variables and estimated in a Bayesian inference framework.<n>Motivated by the observation of structural redundancy in task vectors, we introduce a Spike-and-Slab prior that promotes sparsity.<n>We develop a gated sampling mechanism that constructs a controllable posterior by filtering the composition coefficients based on both uncertainty and importance.
arXiv Detail & Related papers (2025-09-21T02:46:02Z)
On Fairness of Task Arithmetic: The Role of Task Vectors [1.236974227340167]
We study how manipulating task vectors affects fairness metrics, including Demographic Parity and Equalized Odds.<n>Our results offer novel insights into the fairness implications of model editing and establish a foundation for fairness-aware and responsible model editing practices.
arXiv Detail & Related papers (2025-05-30T06:34:01Z)
When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers [64.1656365676171]
Task arithmetic refers to editing the pre-trained model by adding a weighted sum of task vectors. This paper theoretically prove the effectiveness of task addition in simultaneously learning a set of irrelevant or irrelevant tasks. We prove the proper selection for task arithmetic to achieve negation to out-of-domain tasks.
arXiv Detail & Related papers (2025-04-15T08:04:39Z)
Efficient Model Editing with Task-Localized Sparse Fine-tuning [14.792099973449794]
We propose TaLoS which allows to build sparse task vectors with minimal interference without requiring explicit linearization. We find that pre-trained models contain a subset of parameters with consistently low gradient sensitivity across tasks. Our experiments prove that TaLoS improves training and inference efficiency while outperforming current methods in task addition and negation.
arXiv Detail & Related papers (2025-04-03T14:20:06Z)
Multi-Task Model Merging via Adaptive Weight Disentanglement [69.7292615212444]
We introduce an Adaptive Weight Disentanglement method for model merging.<n>We successfully extract redundant vectors, and after their subtraction, the task vectors retain robust performance.
arXiv Detail & Related papers (2024-11-27T20:08:55Z)
Task Arithmetic Through The Lens Of One-Shot Federated Learning [3.8230727103887943]
Task Arithmetic is a model merging technique that enables the combination of multiple models' capabilities into a single model.<n>We show that Task Arithmetic is mathematically equivalent to the commonly used algorithm in Federated Learning.<n>We adapt several algorithms from Federated Learning to improve the effectiveness of Task Arithmetic.
arXiv Detail & Related papers (2024-11-27T18:53:41Z)
Self-Attention Based Semantic Decomposition in Vector Symbolic Architectures [6.473177443214531]
We introduce a new variant of the resonator network, based on self-attention based update rules in iterative search problem. Our algorithm enables a larger capacity for associative memory, enabling applications in many tasks like perception based pattern recognition, scene decomposition, and object reasoning.
arXiv Detail & Related papers (2024-03-20T00:37:19Z)
Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models [96.9373147383119]
We show that weight disentanglement is the crucial factor that makes task arithmetic effective. We show that fine-tuning models in their tangent space by linearizing them amplifies weight disentanglement. This leads to substantial performance improvements across task arithmetic benchmarks and diverse models.
arXiv Detail & Related papers (2023-05-22T08:39:25Z)
Editing Models with Task Arithmetic [69.97273155842966]
Changing how pre-trained models behave is a common practice when developing machine learning systems. We build task vectors by subtracting the weights of a pre-trained model from the weights of the same model after fine-tuning on a task. We show that these task vectors can be modified and combined together through arithmetic operations such as negation and addition.
arXiv Detail & Related papers (2022-12-08T05:50:53Z)
An Algorithm for Routing Vectors in Sequences [0.0]
We propose a routing algorithm that takes a sequence of vectors and computes a new sequence with specified length and vector size. Each output vector maximizes "bang per bit," the difference between a net benefit to use and net cost to ignore data, by better predicting the input vectors.
arXiv Detail & Related papers (2022-11-20T16:20:45Z)
Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph. Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks. Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Adaptive Task Adapting Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers. Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters. We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
arXiv Detail & Related papers (2022-03-30T23:16:07Z)
Parameter-Efficient Transfer Learning with Diff Pruning [108.03864629388404]
diff pruning is a simple approach to enable parameter-efficient transfer learning within the pretrain-finetune framework. We find that models finetuned with diff pruning can match the performance of fully finetuned baselines on the GLUE benchmark.
arXiv Detail & Related papers (2020-12-14T12:34:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.