Model Merging with Functional Dual Anchors
- URL: http://arxiv.org/abs/2510.21223v1
- Date: Fri, 24 Oct 2025 07:54:06 GMT
- Title: Model Merging with Functional Dual Anchors
- Authors: Kexuan Shi, Yandong Wen, Weiyang Liu
- Abstract summary: Model merging is an efficient strategy for integrating knowledge from multiple finetuned checkpoints of a shared foundation model. We propose Functional Dual Anchors (FDAs), a framework that instead models the input-representation space. FDAs are synthetic inputs whose induced gradients align with task vectors, capturing task-specific functional shifts relative to the pretrained model.
- Score: 21.76214716818033
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Model merging is an efficient post-training strategy for integrating knowledge from multiple finetuned checkpoints of a shared foundation model. Existing methods operate in the parameter space, combining task vectors to mitigate conflicts, but remain constrained by parameter inconsistencies. We propose Functional Dual Anchors (FDAs), a framework that instead models the input-representation space. FDAs are synthetic inputs whose induced gradients align with task vectors, capturing task-specific functional shifts relative to the pretrained model. This perspective bridges joint multi-task training and post-hoc merging, offering both robustness and flexibility. We further introduce a principled initialization scheme and show that FDAs are complementary to parameter-space model merging. Comprehensive experiments demonstrate the effectiveness of FDAs in model merging.
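To make the FDA idea concrete, here is a minimal PyTorch sketch of one plausible instantiation: synthetic inputs are optimized so that the descent direction they induce at the pretrained weights aligns with the task vector. The functional discrepancy loss, the helper name `fit_fda_inputs`, and all hyperparameters are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def fit_fda_inputs(pretrained, finetuned, task_vector, input_shape,
                   num_anchors=32, num_steps=200, lr=0.05):
    """Optimize synthetic inputs whose induced gradient at the pretrained
    weights aligns with the flattened task vector. Hypothetical sketch."""
    params = [p for p in pretrained.parameters() if p.requires_grad]
    x = torch.randn(num_anchors, *input_shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(num_steps):
        # Functional gap between pretrained and finetuned outputs on x
        loss = F.mse_loss(pretrained(x), finetuned(x).detach())
        grads = torch.autograd.grad(loss, params, create_graph=True)
        g = torch.cat([gi.flatten() for gi in grads])
        # The descent direction -g should point along the task vector
        align = F.cosine_similarity(-g, task_vector, dim=0)
        opt.zero_grad()
        (-align).backward()  # maximize alignment by gradient ascent on x
        opt.step()
    return x.detach()
```

Once fitted, such anchors capture each task's functional shift in input space, which is consistent with the abstract's claim that FDAs complement parameter-space merging.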
Related papers
- Model Merging in the Essential Subspace [78.5390284258307]
Model merging aims to integrate multiple task-specific fine-tuned models into a single multi-task model without additional training. Despite extensive research, task interference remains a major obstacle that often undermines the performance of merged models. We propose ESM (Essential Subspace Merging), a robust framework for effective model merging.
arXiv Detail & Related papers (2026-02-23T00:33:38Z)
- Unraveling LoRA Interference: Orthogonal Subspaces for Robust Model Merging [38.12136955174922]
Fine-tuning large language models (LLMs) for individual tasks yields strong performance but is expensive to deploy and store. Recent works explore model merging to combine multiple task-specific models into a single multi-task model without additional training. Existing merging methods often fail for models fine-tuned with low-rank adaptation (LoRA), exhibiting significant performance degradation.
arXiv Detail & Related papers (2025-05-28T23:28:12Z)
- RobustMerge: Parameter-Efficient Model Merging for MLLMs with Direction Robustness [28.437105789298244]
RobustMerge is a training-free parameter-efficient merging method with complementary parameter adaptation to maintain direction robustness. We establish a benchmark consisting of diverse multimodal tasks, on which we conduct experiments to certify the outstanding performance and generalizability of our method.
arXiv Detail & Related papers (2025-02-24T13:52:05Z)
- Merging Models on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging [75.93960998357812]
Deep model merging represents an emerging research direction that combines multiple fine-tuned models to harness their capabilities across different tasks and domains. Current model merging techniques focus on merging all available models simultaneously, with weight-matrix-based methods being the predominant approach. We propose a training-free, projection-based continual merging method that processes models sequentially.
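The abstract leaves the projection unspecified; one illustrative reading is Gram-Schmidt-style sequential merging, where each incoming task vector contributes only its component orthogonal to the directions already absorbed. The sketch below assumes flat task-vector tensors; `continual_merge` and the projection rule are assumptions, not the paper's published algorithm.

```python
import torch

def continual_merge(base, task_vectors):
    """Fold task vectors into a merged model one at a time, projecting each
    new vector onto the orthogonal complement of those already merged."""
    merged = base.clone()
    basis = []  # orthonormal directions already absorbed
    for tau in task_vectors:
        v = tau.clone()
        for b in basis:              # strip components along earlier tasks
            v -= (v @ b) * b
        norm = v.norm()
        if norm > 1e-8:
            basis.append(v / norm)
        merged += v                  # only the novel component is added
    return merged
```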
arXiv Detail & Related papers (2025-01-16T13:17:24Z)
- Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent [72.10987117380584]
Merging multiple expert models offers a promising approach for performing multi-task learning without accessing their original data. We find that existing methods discard task-specific information that, while causing conflicts, is crucial for performance. Our approach consistently outperforms previous methods, achieving state-of-the-art results across diverse architectures and tasks in both vision and NLP domains.
arXiv Detail & Related papers (2025-01-02T12:45:21Z)
- Training-Free Pretrained Model Merging [38.16269074353077]
We propose an innovative model merging framework, coined merging under dual-space constraints (MuDSC).
In order to enhance usability, we have also incorporated adaptations for group structure, including Multi-Head Attention and Group Normalization.
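The abstract does not spell out the dual-space constraints; a common way to realize them is to match units between two checkpoints with a similarity that mixes weight space and activation space before averaging. The sketch below is such a reading, with `alpha` as a hypothetical mixing weight and `match_units` an invented helper, not the paper's exact procedure.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_units(w_a, w_b, act_a, act_b, alpha=0.5):
    """Align the units of checkpoint B to checkpoint A using a mixed
    weight-space / activation-space similarity (illustrative sketch).
    w_*: (units, in_dim) weights; act_*: (samples, units) activations."""
    sim_w = w_a @ w_b.T            # weight-space unit similarity
    sim_act = act_a.T @ act_b      # activation-space unit similarity
    cost = -(alpha * sim_w + (1 - alpha) * sim_act)
    _, perm = linear_sum_assignment(cost)  # maximize mixed similarity
    return perm  # w_b[perm] is now aligned with w_a and can be averaged
```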
arXiv Detail & Related papers (2024-03-04T06:19:27Z)
- Merging by Matching Models in Task Parameter Subspaces [87.8712523378141]
Model merging aims to cheaply combine individual task-specific models into a single multitask model.
We formalize how this approach to model merging can be seen as solving a linear system of equations.
We show that using the conjugate gradient method can outperform closed-form solutions.
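Read this way, merging amounts to solving (sum_t C_t) theta = sum_t C_t theta_t, where each C_t weights one checkpoint's parameters (e.g., a Fisher or Gram approximation). Below is a minimal conjugate-gradient sketch; the diagonal-C_t restriction and the function name `merge_by_cg` are simplifying assumptions, not the paper's exact formulation.

```python
import torch

def merge_by_cg(thetas, mats, num_iters=100, tol=1e-6):
    """Solve (sum_t C_t) theta = sum_t C_t theta_t by conjugate gradient.
    thetas: flat parameter vectors; mats: diagonal entries of each C_t."""
    b = sum(C * th for C, th in zip(mats, thetas))  # right-hand side
    A = lambda v: sum(C * v for C in mats)          # matvec with sum_t C_t
    x = torch.zeros_like(b)
    r = b - A(x)
    p = r.clone()
    rs = r @ r
    for _ in range(num_iters):
        Ap = A(p)
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if rs_new.sqrt() < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x
```

With identity C_t this reduces to plain parameter averaging, which is one way to see why richer weighting matrices can help.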
arXiv Detail & Related papers (2023-12-07T14:59:15Z)
- AdaMerging: Adaptive Model Merging for Multi-Task Learning [68.75885518081357]
This paper introduces an innovative technique called Adaptive Model Merging (AdaMerging).
It aims to autonomously learn the coefficients for model merging, either in a task-wise or layer-wise manner, without relying on the original training data.
Compared to the current state-of-the-art task arithmetic merging scheme, AdaMerging showcases a remarkable 11% improvement in performance.
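A task-wise sketch in the spirit of AdaMerging's label-free objective, entropy minimization on unlabeled data, is shown below. `build_model` is a hypothetical helper that runs a functional forward pass with flat parameters (e.g., via torch.func.functional_call) so that gradients reach the coefficients; all hyperparameters are illustrative.

```python
import torch

def learn_merge_coeffs(base, task_vectors, unlabeled_loader, build_model,
                       num_epochs=1, lr=1e-3):
    """Task-wise AdaMerging-style sketch: learn one coefficient per task
    vector by minimizing prediction entropy on unlabeled inputs."""
    lambdas = torch.full((len(task_vectors),), 0.3, requires_grad=True)
    opt = torch.optim.Adam([lambdas], lr=lr)
    for _ in range(num_epochs):
        for x in unlabeled_loader:
            theta = base + sum(l * t for l, t in zip(lambdas, task_vectors))
            logits = build_model(theta)(x)   # hypothetical functional call
            probs = logits.softmax(dim=-1)
            entropy = -(probs * probs.clamp_min(1e-8).log()).sum(-1).mean()
            opt.zero_grad()
            entropy.backward()
            opt.step()
    return lambdas.detach()
```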
arXiv Detail & Related papers (2023-10-04T04:26:33Z)
- TIES-Merging: Resolving Interference When Merging Models [95.59265307318752]
Transfer learning can confer significant advantages, including improved downstream performance, faster convergence, and better sample efficiency.
Model merging has emerged as a solution to combine multiple task-specific models into a single model without performing additional training.
Existing merging methods often ignore the interference between parameters of different models, resulting in large performance drops when merging multiple models.
We propose TIES-Merging, which introduces three novel steps when merging models: resetting parameters that only changed a small amount during fine-tuning, resolving sign conflicts, and merging only the parameters that are in alignment with the final agreed-upon sign.
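Those three steps translate directly into code. The sketch below operates on flat parameter vectors; the trim fraction `k` and scaling `lam` are hypothetical knobs, and the paper's exact tie-breaking details are simplified.

```python
import torch

def ties_merge(base, finetuned, k=0.2, lam=1.0):
    """TIES-Merging sketch: trim small updates, elect a sign per entry,
    and average only the updates that agree with the elected sign."""
    taus = [ft - base for ft in finetuned]            # task vectors
    trimmed = []
    for t in taus:
        m = max(1, int((1 - k) * t.numel()))          # keep top-k fraction
        thresh = t.abs().flatten().kthvalue(m).values
        trimmed.append(torch.where(t.abs() >= thresh, t, torch.zeros_like(t)))
    stacked = torch.stack(trimmed)
    sign = stacked.sum(0).sign()                      # elected sign per entry
    agree = (stacked.sign() == sign) & (stacked != 0)
    merged_tau = (stacked * agree).sum(0) / agree.sum(0).clamp_min(1)
    return base + lam * merged_tau
```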
arXiv Detail & Related papers (2023-06-02T17:31:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.