Will it Merge? On The Causes of Model Mergeability
- URL: http://arxiv.org/abs/2601.06672v1
- Date: Sat, 10 Jan 2026 20:12:25 GMT
- Title: Will it Merge? On The Causes of Model Mergeability
- Authors: Adir Rahamim, Asaf Yehudai, Boaz Carmeli, Leshem Choshen, Yosi Mass, Yonatan Belinkov
- Abstract summary: We investigate why some models merge better than others. We highlight base model knowledge as a dominant factor. Based on our mergeability definition, we explore a simple weighted merging technique.
- Score: 53.26238805048332
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Model merging has emerged as a promising technique for combining multiple fine-tuned models into a single multitask model without retraining. However, the factors that determine whether merging will succeed or fail remain poorly understood. In this work, we investigate why some models merge better than others. To do so, we propose a concrete, measurable definition of mergeability. We investigate several potential causes for high or low mergeability, highlighting base model knowledge as a dominant factor: models fine-tuned on instances that the base model knows better are more mergeable than models fine-tuned on instances that the base model struggles with. Based on our mergeability definition, we explore a simple weighted merging technique that better preserves weak knowledge in the base model.
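The abstract does not give the weighting formula, so the following is only a minimal sketch of one plausible reading: task-arithmetic merging in which hypothetical per-model weights `alphas` scale each model's contribution (the paper derives its weights from the mergeability measure; that derivation is not reproduced here).

```python
import torch

def weighted_merge(base, finetuned, alphas):
    """Weighted task-arithmetic merge: base + sum_i alpha_i * (theta_i - base).

    `alphas` stands in for whatever per-model weights the paper derives
    from its mergeability measure; here they are plain floats."""
    merged = {name: w.clone() for name, w in base.items()}
    for state_dict, alpha in zip(finetuned, alphas):
        for name, w in state_dict.items():
            # task vector: how fine-tuning moved this parameter
            merged[name] += alpha * (w - base[name])
    return merged

# toy usage with a single 2x2 "layer"
base = {"w": torch.zeros(2, 2)}
ft_a = {"w": torch.ones(2, 2)}
ft_b = {"w": -torch.ones(2, 2)}
print(weighted_merge(base, [ft_a, ft_b], alphas=[0.7, 0.3])["w"])
# every entry is 0.7 * 1.0 + 0.3 * (-1.0) = 0.4
```

With equal weights this reduces to standard task-arithmetic merging; the paper's contribution lies in how the weights are chosen, which the abstract does not specify.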
Related papers
- Towards Reversible Model Merging For Low-rank Weights [5.100622189286672]
Model merging aims to combine multiple fine-tuned models into a single set of weights that performs well across all source tasks. We show that applying conventional merging methods to low-rank weights leads to severe performance degradation in the merged model. We propose a fundamentally different approach: instead of collapsing all adapters into one set of weights, we construct a compact basis. This reframes merging as generating a reconstruction-capable model space rather than producing a single merged model.
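The abstract leaves the basis construction unspecified; below is a minimal sketch of the general idea, assuming a shared SVD basis over stacked low-rank updates. The function name `compact_basis` and the rank choice are illustrative, not the paper's.

```python
import torch

def compact_basis(deltas, rank):
    """Build a shared low-rank basis for several adapter weight updates.

    deltas : list of (d_out, d_in) updates (e.g., LoRA products B @ A)
    rank   : size of the shared basis (an assumption; the paper's
             construction may differ)"""
    # stack all updates side by side and factor the combined column space
    stacked = torch.cat([d.reshape(d.shape[0], -1) for d in deltas], dim=1)
    U, S, Vh = torch.linalg.svd(stacked, full_matrices=False)
    basis = U[:, :rank]                      # shared column space
    # small per-task coefficients that reconstruct each original delta
    coeffs = [basis.T @ d.reshape(d.shape[0], -1) for d in deltas]
    return basis, coeffs

# any source delta is recoverable as basis @ coeffs[i], so merging becomes
# storing one basis plus per-task coefficients instead of one
# irreversible averaged weight set
deltas = [torch.randn(8, 2) @ torch.randn(2, 8) for _ in range(3)]
basis, coeffs = compact_basis(deltas, rank=6)
recon = basis @ coeffs[0]
print(torch.dist(recon, deltas[0]))  # near zero when rank covers the union
```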
arXiv Detail & Related papers (2025-10-15T23:22:38Z)
- Model Unmerging: Making Your Models Unmergeable for Secure Model Sharing [47.204542615541364]
Unauthorized merging may infringe on developers' rights and risk leaking sensitive personal information. We propose MergeLock, an active protection mechanism that disrupts model parameters to render them unmergeable. Experiments demonstrate that MergeLock can degrade the performance of merged models by over 95% when a protected model is involved.
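The abstract does not describe MergeLock's mechanism. As an assumption-laden sketch of one way such protection could work (not necessarily MergeLock's actual procedure), the snippet below pushes a random invertible matrix into a query projection and its inverse-transpose into the paired key projection: attention scores are unchanged, but the raw parameters no longer align with other fine-tunes of the same base, so naive averaging degrades.

```python
import torch

def protect_qk(w_q, w_k):
    """Function-preserving reparameterization of query/key projections.

    Attention scores depend only on W_q @ W_k.T, which is unchanged by
    (W_q @ T, W_k @ T^{-T}); the protected model behaves identically,
    but its weights drift away from the shared basin other fine-tunes
    occupy. A sketch of one plausible mechanism, hedged."""
    d = w_q.shape[1]
    T = torch.randn(d, d) + d * torch.eye(d)  # comfortably invertible
    return w_q @ T, w_k @ torch.linalg.inv(T).T

w_q, w_k = torch.randn(16, 16), torch.randn(16, 16)
pq, pk = protect_qk(w_q, w_k)
# the score matrix W_q @ W_k.T is preserved ...
print(torch.allclose(w_q @ w_k.T, pq @ pk.T, atol=1e-3))
# ... but the individual weight matrices are far from the originals
print(torch.dist(w_q, pq))
```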
arXiv Detail & Related papers (2025-09-01T15:24:41Z)
- Why Do More Experts Fail? A Theoretical Analysis of Model Merging [51.18155031364046]
Model merging dramatically reduces storage and computational resources by combining multiple expert models into a single multi-task model. Recent model merging methods have shown promising results, but struggle to maintain performance gains as the number of merged models increases. We show that the limited effective parameter space imposes a strict constraint on the number of models that can be successfully merged.
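This degradation with model count can be illustrated with a toy computation (a generic interference argument over random high-dimensional task vectors, not the paper's actual analysis): the average of T near-orthogonal task vectors aligns with each individual vector only at roughly 1/sqrt(T), so each expert's direction is increasingly diluted as models are added.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
d = 10_000                                       # toy parameter dimension
for num_models in (2, 4, 8, 16, 32):
    task_vectors = torch.randn(num_models, d)    # near-orthogonal in high dim
    merged = task_vectors.mean(dim=0)
    align = F.cosine_similarity(merged.expand(num_models, d),
                                task_vectors, dim=1).mean()
    print(num_models, round(align.item(), 3))    # decays roughly as 1/sqrt(T)
```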
arXiv Detail & Related papers (2025-05-27T14:10:46Z)
- Multi-Level Collaboration in Model Merging [56.31088116526825]
This paper explores the intrinsic connections between model merging and model ensembling. We find that even when previous restrictions are not met, there is still a way for model merging to attain performance nearly identical or even superior to that of ensembling.
arXiv Detail & Related papers (2025-03-03T07:45:04Z)
- Exploring Model Kinship for Merging Large Language Models [73.98345036483299]
We study model evolution through iterative merging, drawing an analogy to biological evolution. We show that model kinship is closely linked to the performance improvements achieved by merging. We propose a new model merging strategy: Top-k Greedy Merging with Model Kinship.
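As a rough sketch of the strategy's shape, assuming kinship is measured as cosine similarity between task vectors (the paper's exact definition may differ), greedy kinship-guided merging could look like this:

```python
import torch
import torch.nn.functional as F

def kinship(sd_a, sd_b, base):
    """Stand-in for the paper's model-kinship metric (exact definition may
    differ): cosine similarity between the two models' task vectors."""
    ta = torch.cat([(sd_a[n] - base[n]).flatten() for n in base])
    tb = torch.cat([(sd_b[n] - base[n]).flatten() for n in base])
    return F.cosine_similarity(ta, tb, dim=0)

def topk_greedy_merge(base, pool, k):
    """Seed with one expert, then repeatedly average in the remaining
    candidate most akin to the current merged model."""
    merged = {n: w.clone() for n, w in pool[0].items()}
    rest = list(range(1, len(pool)))
    for _ in range(min(k, len(rest))):
        best = max(rest, key=lambda i: kinship(merged, pool[i], base))
        rest.remove(best)  # best is an index into the candidate pool
        merged = {n: 0.5 * merged[n] + 0.5 * pool[best][n] for n in merged}
    return merged

base = {"w": torch.zeros(4)}
pool = [{"w": torch.randn(4)} for _ in range(4)]
print(topk_greedy_merge(base, pool, k=2)["w"])
```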
arXiv Detail & Related papers (2024-10-16T14:29:29Z)
- What Matters for Model Merging at Scale? [94.26607564817786]
Model merging aims to combine multiple expert models into a more capable single model.
Previous studies have primarily focused on merging a few small models.
This study systematically evaluates the utility of model merging at scale.
arXiv Detail & Related papers (2024-10-04T17:17:19Z)
- EMR-Merging: Tuning-Free High-Performance Model Merging [55.03509900949149]
We show that Elect, Mask & Rescale-Merging (EMR-Merging) delivers outstanding performance compared to existing merging methods. EMR-Merging is tuning-free: it requires no data availability or any additional training while showing impressive performance.
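A hedged sketch of the three stages the name suggests (elect a unified task vector, mask each task's disagreeing entries, rescale per task); the formulas below are one plausible reading, not necessarily the paper's exact ones:

```python
import torch

def emr_merge(base, finetuned):
    """Elect, Mask & Rescale-style merge, sketched from the method's name.

    Returns a unified task vector plus lightweight per-task masks and
    rescalers, so each task recovers a model close to its own expert."""
    names = list(base)
    tasks = [{n: sd[n] - base[n] for n in names} for sd in finetuned]
    unified, masks, scales = {}, [dict() for _ in tasks], []
    for n in names:
        stack = torch.stack([t[n] for t in tasks])   # (T, ...) task vectors
        sign = torch.sign(stack.sum(dim=0))          # elected per-param sign
        agree = torch.sign(stack) == sign            # which tasks agree
        mag = (stack.abs() * agree).amax(dim=0)      # largest agreeing update
        unified[n] = sign * mag
        for t in range(len(tasks)):
            masks[t][n] = agree[t].float()           # per-task binary mask
    for t, task in enumerate(tasks):
        # rescale so the masked unified vector matches the task's magnitude
        num = sum(task[n].abs().sum() for n in names)
        den = sum((masks[t][n] * unified[n]).abs().sum() for n in names)
        scales.append((num / den.clamp_min(1e-8)).item())
    return unified, masks, scales

def task_model(base, unified, mask, scale):
    """Recover task-specific weights from the shared merged representation."""
    return {n: base[n] + scale * mask[n] * unified[n] for n in base}

base = {"w": torch.zeros(3)}
experts = [{"w": torch.tensor([1.0, -2.0, 0.5])},
           {"w": torch.tensor([2.0, 1.0, -0.5])}]
uni, masks, scales = emr_merge(base, experts)
print(task_model(base, uni, masks[0], scales[0])["w"])
```

No data or training enters the sketch, which is consistent with the abstract's tuning-free claim; only the original fine-tuned weights are needed.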
arXiv Detail & Related papers (2024-05-23T05:25:45Z)