DMM: Building a Versatile Image Generation Model via Distillation-Based Model Merging
- URL: http://arxiv.org/abs/2504.12364v1
- Date: Wed, 16 Apr 2025 15:09:45 GMT
- Title: DMM: Building a Versatile Image Generation Model via Distillation-Based Model Merging
- Authors: Tianhui Song, Weixin Feng, Shuai Wang, Xubin Li, Tiezheng Ge, Bo Zheng, Limin Wang
- Abstract summary: We introduce a style-promptable image generation pipeline which can accurately generate arbitrary-style images under the control of style vectors. Based on this design, we propose the score-distillation-based model merging paradigm (DMM), compressing multiple models into a single versatile T2I model. Our experiments demonstrate that DMM can compactly reorganize the knowledge from multiple teacher models and achieve controllable arbitrary-style generation.
- Score: 32.97010533998294
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The success of text-to-image (T2I) generation models has spurred a proliferation of model checkpoints fine-tuned from the same base model on various specialized datasets. This flood of specialized models introduces new challenges of high parameter redundancy and huge storage cost, necessitating effective methods to consolidate and unify the capabilities of diverse powerful models into a single one. A common practice in model merging adopts static linear interpolation in the parameter space to achieve style mixing. However, it neglects a key feature of the T2I generation task: the numerous distinct models cover sundry styles, which may lead to incompatibility and confusion in the merged model. To address this issue, we introduce a style-promptable image generation pipeline which can accurately generate arbitrary-style images under the control of style vectors. Based on this design, we propose the score-distillation-based model merging paradigm (DMM), compressing multiple models into a single versatile T2I model. Moreover, we rethink and reformulate the model merging task in the context of T2I generation, presenting new merging goals and evaluation protocols. Our experiments demonstrate that DMM can compactly reorganize the knowledge from multiple teacher models and achieve controllable arbitrary-style generation.
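To make the contrast concrete, here is a minimal, self-contained PyTorch sketch. The first half implements the static linear-interpolation baseline the abstract critiques; the second half is a hedged toy version of the distillation idea, where a single student conditioned on a style embedding is trained to match whichever teacher the style id selects. The module shapes, the additive conditioning scheme, and the MSE objective are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

def make_model():
    # Toy stand-in for a diffusion denoiser; hypothetical, only to keep the sketch runnable.
    return nn.Sequential(nn.Linear(16, 64), nn.SiLU(), nn.Linear(64, 16))

teachers = [make_model() for _ in range(3)]  # e.g., three style-specialized checkpoints

def interpolate(models, weights):
    # Baseline: static linear interpolation in parameter space (the common practice).
    merged = make_model()
    with torch.no_grad():
        for name, p in merged.named_parameters():
            p.copy_(sum(w * dict(m.named_parameters())[name]
                        for m, w in zip(models, weights)))
    return merged

baseline = interpolate(teachers, [1/3, 1/3, 1/3])  # fixed blend, no runtime style control

class StyleStudent(nn.Module):
    # Hedged sketch of a style-promptable student: a learned style embedding
    # steers one shared backbone toward a chosen teacher's behavior.
    def __init__(self, num_styles):
        super().__init__()
        self.backbone = make_model()
        self.style_emb = nn.Embedding(num_styles, 16)

    def forward(self, x, style_id):
        return self.backbone(x + self.style_emb(style_id))

student = StyleStudent(num_styles=len(teachers))
opt = torch.optim.Adam(student.parameters(), lr=1e-4)

for step in range(100):  # toy distillation loop; real score distillation differs
    x = torch.randn(8, 16)                      # stand-in for noised latents
    sid = torch.randint(len(teachers), (8,))    # one style per sample
    with torch.no_grad():                       # teacher prediction as target
        target = torch.stack([teachers[i](x[j]) for j, i in enumerate(sid.tolist())])
    loss = nn.functional.mse_loss(student(x, sid), target)
    opt.zero_grad(); loss.backward(); opt.step()
```

The key contrast: interpolation fixes one blend at merge time, while the distilled student keeps every teacher's style individually addressable at inference through its style vector, which is the property the abstract emphasizes.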
Related papers
- AdaMMS: Model Merging for Heterogeneous Multimodal Large Language Models with Unsupervised Coefficient Optimization [86.8133939108057]
We propose AdaMMS, a novel model merging method tailored for heterogeneous MLLMs. Our method tackles the challenges in three steps: mapping, merging and searching. As the first model merging method capable of merging heterogeneous MLLMs without labeled data, AdaMMS outperforms previous model merging methods on various vision-language benchmarks.
arXiv Detail & Related papers (2025-03-31T05:13:02Z)
- Model Assembly Learning with Heterogeneous Layer Weight Merging [57.8462476398611]
We introduce Model Assembly Learning (MAL), a novel paradigm for model merging. MAL integrates parameters from diverse models in an open-ended model zoo to enhance the base model's capabilities.
arXiv Detail & Related papers (2025-03-27T16:21:53Z)
- UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing [59.590505989071175]
Text-to-Image (T2I) diffusion models have shown impressive results in generating visually compelling images following user prompts. We introduce UniVG, a generalist diffusion model capable of supporting a diverse range of image generation tasks with a single set of weights.
arXiv Detail & Related papers (2025-03-16T21:11:25Z)
- Disrupting Model Merging: A Parameter-Level Defense Without Sacrificing Accuracy [0.0]
Model merging is a technique that combines multiple finetuned models into a single model without additional training.
Existing methods such as model watermarking or fingerprinting can only detect merging in hindsight.
We propose the first proactive defense against model merging.
arXiv Detail & Related papers (2025-03-08T06:08:47Z)
- Exploring Model Kinship for Merging Large Language Models [52.01652098827454]
We introduce model kinship, the degree of similarity or relatedness between Large Language Models.
We find that model kinship is related to the performance gains obtained after model merging.
We propose a new model merging strategy: Top-k Greedy Merging with Model Kinship, which can yield better performance on benchmark datasets.
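As a rough illustration of how such a strategy could look, the sketch below approximates kinship as cosine similarity between each model's weight delta from a shared base, then greedily averages in the top-k most kin candidates. The metric, the seeding choice, and the simple averaging are all assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def flat_delta(params, base):
    # Flattened weight delta of a model from the shared base checkpoint.
    return torch.cat([(p - b).flatten() for p, b in zip(params, base)])

def kinship(a, b, base):
    # Illustrative kinship metric: cosine similarity of the two deltas.
    return F.cosine_similarity(flat_delta(a, base), flat_delta(b, base), dim=0)

def topk_greedy_merge(base, candidates, k):
    # Greedily absorb the k candidates most kin to the running merge.
    pool = list(candidates)
    merged = [p.clone() for p in pool.pop(0)]   # seed with the first candidate
    for _ in range(k - 1):
        idx = max(range(len(pool)),
                  key=lambda i: kinship(merged, pool[i], base).item())
        best = pool.pop(idx)
        merged = [(m + c) / 2 for m, c in zip(merged, best)]  # simple average-in
    return merged

# Toy usage: a base and four fine-tuned variants of a single weight tensor.
base = [torch.zeros(4, 4)]
cands = [[torch.randn(4, 4)] for _ in range(4)]
merged = topk_greedy_merge(base, cands, k=3)
```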
arXiv Detail & Related papers (2024-10-16T14:29:29Z)
- PLeaS -- Merging Models with Permutations and Least Squares [43.17620198572947]
We propose a new two-step algorithm to merge models, termed PLeaS, which relaxes constraints. PLeaS partially matches nodes in each layer by maximizing alignment. We also demonstrate how our method can be extended to address a challenging scenario where no data is available from the fine-tuning domains.
arXiv Detail & Related papers (2024-07-02T17:24:04Z)
- EMR-Merging: Tuning-Free High-Performance Model Merging [55.03509900949149]
We show that Elect, Mask & Rescale-Merging (EMR-Merging) achieves outstanding performance compared to existing merging methods.
EMR-Merging is tuning-free, thus requiring no data availability or any additional training while showing impressive performance.
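Reading the abstract, the elect-mask-rescale recipe operates on task vectors (fine-tuned minus pretrained weights): elect a unified vector from per-parameter sign agreement, mask each task to the parameters that agree, and rescale to restore each task's magnitude. The sketch below is this reading applied to a single weight tensor; the details may differ from the paper.

```python
import torch

def emr_merge(pretrained, finetuned_list):
    # Task vectors: each fine-tuned model's offset from the shared pretrained weights.
    taus = torch.stack([w - pretrained for w in finetuned_list])
    sign = torch.sign(taus.sum(dim=0))                 # Elect: dominant per-parameter sign
    agree = torch.sign(taus) == sign                   # which tasks agree with the election
    tau_uni = (taus.abs() * agree).amax(dim=0) * sign  # unified task vector
    masks = list(agree)                                # Mask: agreeing parameters per task
    scales = [t.abs().sum() / (m * tau_uni).abs().sum().clamp_min(1e-8)
              for t, m in zip(taus, masks)]            # Rescale: match each task's magnitude
    return tau_uni, masks, scales

# Reconstructing task i's weights: pretrained + scales[i] * masks[i] * tau_uni
pre = torch.zeros(3, 3)
fts = [pre + 0.1 * torch.randn(3, 3) for _ in range(4)]
tau_uni, masks, scales = emr_merge(pre, fts)
w0 = pre + scales[0] * masks[0] * tau_uni
```

Note how this stores one unified vector plus lightweight per-task masks and scalars, which is consistent with the tuning-free, no-extra-training claim above.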
arXiv Detail & Related papers (2024-05-23T05:25:45Z)
- MaxFusion: Plug&Play Multi-Modal Generation in Text-to-Image Diffusion Models [34.611309081801345]
Large diffusion-based Text-to-Image (T2I) models have shown impressive generative powers for text-to-image generation.
In this paper, we propose a novel strategy to scale a generative model across new tasks with minimal compute.
arXiv Detail & Related papers (2024-04-15T17:55:56Z)
- Training-Free Pretrained Model Merging [38.16269074353077]
We propose an innovative model merging framework, coined merging under dual-space constraints (MuDSC).
In order to enhance usability, we have also incorporated adaptations for group structure, including Multi-Head Attention and Group Normalization.
arXiv Detail & Related papers (2024-03-04T06:19:27Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
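One way to read "merging in parameter space" beyond plain averaging is a regression-style closed form per linear layer: choose the merged weight that best reproduces every model's outputs on its own inputs. The sketch below shows that least-squares merge; the Gram matrices G_i = X_i^T X_i would come from cached activation statistics, and whether this matches the paper's exact method is an assumption.

```python
import torch

def fuse_linear(weights, grams, ridge=1e-4):
    # Least-squares merge of one linear layer: minimize sum_i ||X_i W - X_i W_i||^2.
    # weights: list of (d_in, d_out) tensors; grams: list of (d_in, d_in) X_i^T X_i.
    num = sum(g @ w for g, w in zip(grams, weights))
    den = sum(grams) + ridge * torch.eye(grams[0].shape[0])  # ridge for invertibility
    return torch.linalg.solve(den, num)   # W* = (sum G_i)^{-1} (sum G_i W_i)

# Toy usage with random statistics (hypothetical, for illustration only).
ws = [torch.randn(8, 4) for _ in range(3)]
xs = [torch.randn(32, 8) for _ in range(3)]
gs = [x.T @ x for x in xs]
w_star = fuse_linear(ws, gs)
```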