GAN Cocktail: mixing GANs without dataset access
- URL: http://arxiv.org/abs/2106.03847v1
- Date: Mon, 7 Jun 2021 17:59:04 GMT
- Title: GAN Cocktail: mixing GANs without dataset access
- Authors: Omri Avrahami, Dani Lischinski, Ohad Fried
- Abstract summary: We tackle the problem of model merging under two constraints that often come up in the real world: no access to the original training data, and no increase in the size of the network.
In the first stage, we transform the weights of all the models to the same parameter space by a technique we term model rooting.
In the second stage, we merge the rooted models by averaging their weights and fine-tuning them for each specific domain, using only data generated by the original trained models.
- Score: 18.664733153082146
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Today's generative models are capable of synthesizing high-fidelity images,
but each model specializes in a specific target domain. This raises the need
for model merging: combining two or more pretrained generative models into a
single unified one. In this work we tackle the problem of model merging, given
two constraints that often come up in the real world: (1) no access to the
original training data, and (2) no increase in the size of the neural
network. To the best of our knowledge, model merging under these constraints
has not been studied thus far. We propose a novel, two-stage solution. In the
first stage, we transform the weights of all the models to the same parameter
space by a technique we term model rooting. In the second stage, we merge the
rooted models by averaging their weights and fine-tuning them for each specific
domain, using only data generated by the original trained models. We
demonstrate that our approach is superior to baseline methods and to existing
transfer learning techniques, and investigate several applications.
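The two-stage recipe can be sketched compactly. Below is a minimal, hypothetical PyTorch illustration: stage 1 (model rooting) is trivialized by giving both toy generators a shared ancestor checkpoint, and the paper's adversarial fine-tuning is replaced by a simple reconstruction loss against samples drawn from the frozen originals. All names (ToyGenerator, finetune_on_generated) are ours, not the authors' code.

```python
import copy
import torch
import torch.nn as nn

class ToyGenerator(nn.Module):
    def __init__(self, z_dim=16, out_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))

    def forward(self, z):
        return self.net(z)

def average_weights(model_a, model_b):
    # Stage 2a: merge the rooted models by averaging their parameters.
    merged = copy.deepcopy(model_a)
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
    merged.load_state_dict({k: 0.5 * (sd_a[k] + sd_b[k]) for k in sd_a})
    return merged

def finetune_on_generated(merged, originals, steps=200, z_dim=16):
    # Stage 2b: fine-tune using only samples drawn from the frozen
    # original generators -- no access to the real training data.
    opt = torch.optim.Adam(merged.parameters(), lr=1e-3)
    for _ in range(steps):
        for g in originals:
            z = torch.randn(8, z_dim)
            with torch.no_grad():
                target = g(z)             # "data" synthesized by an original
            loss = nn.functional.mse_loss(merged(z), target)
            opt.zero_grad(); loss.backward(); opt.step()
    return merged

model_a = ToyGenerator()
model_b = copy.deepcopy(model_a)          # stage 1 stand-in: a shared root
merged = finetune_on_generated(average_weights(model_a, model_b),
                               [model_a, model_b])
```

In the paper the fine-tuning objective is adversarial and per-domain; the reconstruction loss above only stands in for it to keep the sketch self-contained.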
Related papers
- Truncated Consistency Models [57.50243901368328]
Training consistency models requires learning to map all intermediate points along probability flow (PF) ODE trajectories to their corresponding endpoints.
We empirically find that this training paradigm limits the one-step generation performance of consistency models.
We propose a new parameterization of the consistency function and a two-stage training procedure that prevents the truncated-time training from collapsing to a trivial solution.
arXiv Detail & Related papers (2024-10-18T22:38:08Z)
- Training-Free Model Merging for Multi-target Domain Adaptation [6.00960357022946]
We study multi-target domain adaptation of scene understanding models.
Our solution involves two components, merging model parameters and merging model buffers.
Our method is simple yet effective, achieving performance comparable to baselines trained on the combined data.
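A minimal sketch of what merging parameters and buffers could look like for two architecture-identical PyTorch models; the function name and the choice to leave integer buffers untouched are our assumptions, not details from the paper.

```python
import copy
import torch
import torch.nn as nn

def merge_training_free(model_a, model_b):
    """Average learnable parameters and floating-point buffers."""
    merged = copy.deepcopy(model_a)
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
    sd = {}
    for k, v in sd_a.items():
        if v.is_floating_point():
            sd[k] = 0.5 * (v + sd_b[k])   # weights and e.g. BN running stats
        else:
            sd[k] = v                     # integer buffers (step counters)
    merged.load_state_dict(sd)
    return merged

a = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8))
b = copy.deepcopy(a)          # stand-in for a second fine-tuned model
merged = merge_training_free(a, b)
```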
arXiv Detail & Related papers (2024-07-18T17:59:57Z)
- PLeaS -- Merging Models with Permutations and Least Squares [43.17620198572947]
We propose a new two-step algorithm to merge models, termed PLeaS.
PLeaS partially matches nodes in each layer by maximizing alignment.
It computes the weights of the merged model as a layer-wise Least Squares solution.
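A hypothetical single-layer illustration of the two steps: neurons are matched with the Hungarian algorithm, and the merged weights are the least-squares fit to the averaged post-activation outputs of the aligned layers. The probe inputs and the cosine-similarity matching cost are our choices; the actual method operates layer-wise over entire networks.

```python
import torch
import torch.nn.functional as F
from scipy.optimize import linear_sum_assignment  # Hungarian matching

torch.manual_seed(0)
w_a = torch.randn(64, 32)   # one layer of model A: (out_features, in_features)
w_b = torch.randn(64, 32)   # the corresponding layer of model B

# Step 1: permute B's neurons (rows) to maximize cosine alignment with A's.
sim = F.normalize(w_a, dim=1) @ F.normalize(w_b, dim=1).T
_, col = linear_sum_assignment(sim.numpy(), maximize=True)
w_b_perm = w_b[torch.from_numpy(col)]

# Step 2: merged weights as the least-squares fit to the averaged
# post-activation outputs of the two aligned layers on probe inputs.
x = torch.randn(256, 32)
y = 0.5 * (torch.relu(x @ w_a.T) + torch.relu(x @ w_b_perm.T))
w_merged = torch.linalg.lstsq(x, y).solution.T   # solves x @ W.T ~ y
```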
arXiv Detail & Related papers (2024-07-02T17:24:04Z)
- EMR-Merging: Tuning-Free High-Performance Model Merging [55.03509900949149]
We show that Elect, Mask & Rescale-Merging (EMR-Merging) achieves outstanding performance compared to existing merging methods.
EMR-Merging is tuning-free, thus requiring no data availability or any additional training while showing impressive performance.
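The summary only names the three steps, so the sketch below is our reading of them on flattened task vectors (fine-tuned weights minus a shared pretrained base): elect a unified sign per parameter, keep per-task agreement masks, and rescale per task to preserve magnitude. Treat every detail as an assumption.

```python
import torch

def emr_merge(task_vectors):
    """Sketch of Elect / Mask / Rescale over flattened task vectors."""
    stacked = torch.stack(task_vectors)            # (num_tasks, num_params)
    sign = torch.sign(stacked.sum(dim=0))          # Elect: unified sign
    agree = torch.sign(stacked) == sign            # entries matching that sign
    mags = torch.where(agree, stacked.abs(), torch.zeros(()))
    unified = sign * mags.max(dim=0).values        # one shared task vector
    masks, scales = [], []
    for tv in task_vectors:                        # per-task Mask and Rescale
        mask = (tv * unified) > 0
        scale = tv.abs().sum() / (mask * unified).abs().sum().clamp_min(1e-8)
        masks.append(mask)
        scales.append(scale)
    return unified, masks, scales

tvs = [torch.randn(1000) for _ in range(3)]        # toy task vectors
unified, masks, scales = emr_merge(tvs)
task0 = scales[0] * masks[0] * unified             # vector applied for task 0
```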
arXiv Detail & Related papers (2024-05-23T05:25:45Z)
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- Adapt & Align: Continual Learning with Generative Models Latent Space Alignment [15.729732755625474]
We introduce Adapt & Align, a method for continual learning of neural networks that aligns latent representations in generative models.
Neural networks suffer from an abrupt loss in performance when retrained with additional data.
Our method mitigates this problem by employing generative models and splitting their update process into two parts.
arXiv Detail & Related papers (2023-12-21T10:02:17Z)
- Heterogeneous Federated Learning Using Knowledge Codistillation [23.895665011884102]
We propose a method that involves training a small model on the entire pool and a larger model on a subset of clients with higher capacity.
The models exchange information bidirectionally via knowledge distillation, utilizing an unlabeled dataset on a server without sharing parameters.
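A minimal sketch of bidirectional codistillation under these assumptions: two stand-in models of different capacity, a server-side unlabeled batch, and symmetric KL terms in which each model distills from a detached snapshot of the other. No parameters are exchanged.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

small = nn.Linear(20, 10)   # stand-ins for the two federated models
large = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 10))
opt = torch.optim.Adam(list(small.parameters()) + list(large.parameters()),
                       lr=1e-3)

unlabeled = torch.randn(32, 20)            # server-side unlabeled data
for _ in range(10):
    p_small = F.log_softmax(small(unlabeled), dim=-1)
    p_large = F.log_softmax(large(unlabeled), dim=-1)
    # Each model matches a frozen snapshot of the other's predictions.
    loss = (F.kl_div(p_small, p_large.detach(), log_target=True,
                     reduction="batchmean")
            + F.kl_div(p_large, p_small.detach(), log_target=True,
                       reduction="batchmean"))
    opt.zero_grad(); loss.backward(); opt.step()
```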
arXiv Detail & Related papers (2023-10-04T03:17:26Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
Training each model separately creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
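For a single linear layer, parameter-space fusion of this kind admits a closed form when posed as a regression problem: pick the weight matrix whose outputs stay closest to each model's own outputs on that model's inputs. The sketch below uses random stand-in weights and inputs; the Gram-matrix formulation is our illustration, not necessarily the paper's exact estimator.

```python
import torch

torch.manual_seed(0)
d_in, d_out, n = 16, 8, 128
w1, w2 = torch.randn(d_out, d_in), torch.randn(d_out, d_in)  # two linear layers
x1, x2 = torch.randn(n, d_in), torch.randn(n, d_in)          # per-model inputs

g1, g2 = x1.T @ x1, x2.T @ x2              # input Gram matrices
# Closed form for the W minimizing the summed squared deviation from each
# model's own outputs: (G1 + G2) W^T = G1 W1^T + G2 W2^T.
w_merged = torch.linalg.solve(g1 + g2, g1 @ w1.T + g2 @ w2.T).T
```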
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning [92.89846887298852]
Consider making predictions over new test data without any opportunity to learn from a training set of labelled data.
You are given access to a set of expert models and their predictions, alongside some limited information about the datasets used to train them.
arXiv Detail & Related papers (2022-10-11T10:20:31Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
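A hypothetical single-layer sketch of the idea: compute an entropic-OT transport plan between the two layers' neurons (uniform marginals, Euclidean cost, a few Sinkhorn iterations), map model B's neurons into model A's coordinates with the barycentric projection, then average. The hyperparameters and cost normalization are our choices; the paper applies this layer by layer through the whole network.

```python
import torch

def sinkhorn(cost, n_iters=200, eps=0.1):
    """Entropic OT with uniform marginals; returns the transport plan."""
    n, m = cost.shape
    a = torch.full((n,), 1.0 / n)             # uniform source marginal
    b = torch.full((m,), 1.0 / m)             # uniform target marginal
    k = torch.exp(-cost / eps)
    u, v = torch.ones(n), torch.ones(m)
    for _ in range(n_iters):
        u = a / (k @ v)
        v = b / (k.T @ u)
    return u[:, None] * k * v[None, :]

torch.manual_seed(0)
w_a, w_b = torch.randn(64, 32), torch.randn(64, 32)
cost = torch.cdist(w_a, w_b)                  # neuron-to-neuron distances
cost = cost / cost.max()                      # normalize for stability
plan = sinkhorn(cost)
w_b_aligned = (plan / plan.sum(dim=1, keepdim=True)) @ w_b  # barycentric map
w_fused = 0.5 * (w_a + w_b_aligned)
```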
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.