Related papers: Cross-Model Transfer of Task Vectors via Few-Shot Orthogonal Alignment

URL: http://arxiv.org/abs/2505.12021v1
Date: Sat, 17 May 2025 14:24:06 GMT
Title: Cross-Model Transfer of Task Vectors via Few-Shot Orthogonal Alignment
Authors: Kazuhiko Kawamoto, Atsuhiro Endo, Hiroshi Kera
Abstract summary: Task arithmetic enables efficient model editing by representing task-specific changes as vectors in parameter space. It typically assumes that the source and target models are initialized from the same pre-trained parameters, which limits its applicability in cross-model transfer settings, where models are independently pre-trained on different datasets. We propose a method based on few-shot orthogonal alignment, which aligns task vectors to the parameter space of a differently pre-trained target model.
Score: 5.2980803808373516
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Task arithmetic enables efficient model editing by representing task-specific changes as vectors in parameter space. Task arithmetic typically assumes that the source and target models are initialized from the same pre-trained parameters. This assumption limits its applicability in cross-model transfer settings, where models are independently pre-trained on different datasets. To address this challenge, we propose a method based on few-shot orthogonal alignment, which aligns task vectors to the parameter space of a differently pre-trained target model. These transformations preserve key properties of task vectors, such as norm and rank, and are learned using only a small number of labeled examples. We evaluate the method using two Vision Transformers pre-trained on YFCC100M and LAION400M, and test on eight classification datasets. Experimental results show that our method improves transfer accuracy over direct task vector application and achieves performance comparable to few-shot fine-tuning, while maintaining the modularity and reusability of task vectors. Our code is available at https://github.com/kawakera-lab/CrossModelTransfer.
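
The abstract's claim that the transformations preserve norm and rank follows from a basic property of orthogonal matrices, and it can be checked numerically. The sketch below is illustrative only and is not the authors' implementation (their repository is linked above). It assumes a task vector for a single 2 x 2 weight matrix and a two-sided transform; the helper names and the matrices `U` and `V` are hypothetical, and fixed rotations stand in for transformations that the paper learns from a few labeled examples.

```python
import math

def rotation(theta):
    """2 x 2 rotation matrix, a simple example of an orthogonal matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Transpose a matrix given as nested lists."""
    return [list(col) for col in zip(*a)]

def frobenius_norm(a):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in a for x in row))

def det2(a):
    """Determinant of a 2 x 2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

# Hypothetical rank-1 task vector for a single 2 x 2 weight matrix.
tau = [[1.0, 2.0], [2.0, 4.0]]

# Hypothetical orthogonal maps; the paper learns its transformations from data.
U = rotation(0.7)
V = rotation(-1.3)

# Two-sided orthogonal transform (an assumed form, chosen for illustration).
tau_aligned = matmul(matmul(U, tau), transpose(V))

norm_preserved = math.isclose(frobenius_norm(tau), frobenius_norm(tau_aligned))
# A nonzero 2 x 2 matrix has rank 1 exactly when its determinant is zero.
rank_preserved = math.isclose(det2(tau_aligned), 0.0, abs_tol=1e-9)
```

Both flags evaluate to true here: the aligned matrix has different entries but the same Frobenius norm and the same rank as the original task vector.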

Related papers

Efficient Model Editing with Task-Localized Sparse Fine-tuning [14.792099973449794]
We propose TaLoS, which builds sparse task vectors with minimal interference without requiring explicit linearization. We find that pre-trained models contain a subset of parameters with consistently low gradient sensitivity across tasks. Our experiments show that TaLoS improves training and inference efficiency while outperforming current methods in task addition and negation.
arXiv Detail & Related papers (2025-04-03T14:20:06Z)
Multi-Task Model Merging via Adaptive Weight Disentanglement [69.7292615212444]
We introduce an Adaptive Weight Disentanglement method for model merging. We successfully extract redundant vectors, and after their subtraction, the task vectors retain robust performance.
arXiv Detail & Related papers (2024-11-27T20:08:55Z)
Knowledge Composition using Task Vectors with Learned Anisotropic Scaling [51.4661186662329]
We introduce aTLAS, an algorithm that linearly combines parameter blocks with different learned coefficients, resulting in anisotropic scaling at the task vector level. We show that such linear combinations explicitly exploit the low intrinsic dimensionality of pre-trained models, with only a few coefficients being the learnable parameters. We demonstrate the effectiveness of our method in task arithmetic, few-shot recognition and test-time adaptation, with supervised or unsupervised objectives.
arXiv Detail & Related papers (2024-07-03T07:54:08Z)
MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining [73.81862342673894]
Foundation models have reshaped the landscape of Remote Sensing (RS) by enhancing various image interpretation tasks. However, transferring the pretrained models to downstream tasks may encounter task discrepancy due to their formulation of pretraining as image classification or object discrimination tasks. We conduct multi-task supervised pretraining on the SAMRS dataset, encompassing semantic segmentation, instance segmentation, and rotated object detection. Our models are finetuned on various RS downstream tasks, such as scene classification, horizontal and rotated object detection, semantic segmentation, and change detection.
arXiv Detail & Related papers (2024-03-20T09:17:22Z)
Transferability Metrics for Object Detection [0.0]
Transfer learning aims to make the most of existing pre-trained models to achieve better performance on a new task in limited data scenarios. We extend transferability metrics to object detection using ROI-Align and TLogME. We show that TLogME provides a robust correlation with transfer performance and outperforms other transferability metrics on local and global level features.
arXiv Detail & Related papers (2023-06-27T08:49:31Z)
Editing Models with Task Arithmetic [69.97273155842966]
Changing how pre-trained models behave is a common practice when developing machine learning systems. We build task vectors by subtracting the weights of a pre-trained model from the weights of the same model after fine-tuning on a task. We show that these task vectors can be modified and combined together through arithmetic operations such as negation and addition.
arXiv Detail & Related papers (2022-12-08T05:50:53Z)
Aligning Pretraining for Detection via Object-Level Contrastive Learning [57.845286545603415]
Image-level contrastive representation learning has proven to be highly effective as a generic model for transfer learning. We argue that this could be sub-optimal and thus advocate a design principle which encourages alignment between the self-supervised pretext task and the downstream task. Our method, called Selective Object COntrastive learning (SoCo), achieves state-of-the-art results for transfer performance on COCO detection.
arXiv Detail & Related papers (2021-06-04T17:59:52Z)
Parameter-Efficient Transfer Learning with Diff Pruning [108.03864629388404]
Diff pruning is a simple approach to enable parameter-efficient transfer learning within the pretrain-finetune framework. We find that models finetuned with diff pruning can match the performance of fully finetuned baselines on the GLUE benchmark.
arXiv Detail & Related papers (2020-12-14T12:34:01Z)
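
Several entries above build on the operations summarized under "Editing Models with Task Arithmetic": a task vector is the fine-tuned weights minus the pre-trained weights, and task vectors can be added or negated before being applied back to the pre-trained model. A minimal sketch of those operations follows, assuming toy models whose parameters are plain scalars stored in dictionaries. The function names, the `scale` parameter, and the values are hypothetical and are not taken from any of the listed papers' code.

```python
def task_vector(pretrained, finetuned):
    """Task vector: fine-tuned weights minus pre-trained weights, per parameter."""
    return {k: finetuned[k] - pretrained[k] for k in pretrained}

def apply_task_vectors(pretrained, vectors, scale=1.0):
    """Add a scaled sum of task vectors to the pre-trained weights."""
    return {k: pretrained[k] + scale * sum(v[k] for v in vectors) for k in pretrained}

def negate(vector):
    """Flip the sign of every entry in a task vector."""
    return {k: -v for k, v in vector.items()}

# Toy models with scalar "weights"; real models would hold tensors.
pretrained = {"w": 1.0, "b": 0.5}
finetuned_a = {"w": 1.5, "b": 0.25}
finetuned_b = {"w": 0.75, "b": 1.0}

tau_a = task_vector(pretrained, finetuned_a)
tau_b = task_vector(pretrained, finetuned_b)

# Addition: combine both tasks in one edited model.
multi_task = apply_task_vectors(pretrained, [tau_a, tau_b])

# Negation: move the pre-trained model away from task A.
forget_a = apply_task_vectors(pretrained, [negate(tau_a)])
```

This only works directly when every model shares the same pre-trained starting point, which is the assumption the cross-model transfer paper at the top of this page sets out to relax.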

This list is automatically generated from the titles and abstracts of the papers on this site.