$C^2M^3$: Cycle-Consistent Multi-Model Merging
- URL: http://arxiv.org/abs/2405.17897v2
- Date: Wed, 30 Oct 2024 07:18:46 GMT
- Title: $C^2M^3$: Cycle-Consistent Multi-Model Merging
- Authors: Donato Crisostomi, Marco Fumero, Daniele Baieri, Florian Bernard, Emanuele Rodolà
- Abstract summary: We present a novel data-free method for merging neural networks in weight space.
We enforce cycle consistency of the permutations when merging $N \geq 3$ models.
- Score: 27.845750236662575
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we present a novel data-free method for merging neural networks in weight space. Differently from most existing works, our method optimizes for the permutations of network neurons globally across all layers. This allows us to enforce cycle consistency of the permutations when merging $N \geq 3$ models, allowing circular compositions of permutations to be computed without accumulating error along the path. We qualitatively and quantitatively motivate the need for such a constraint, showing its benefits when merging sets of models in scenarios spanning varying architectures and datasets. We finally show that, when coupled with activation renormalization, our approach yields the best results in the task.
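The cycle-consistency constraint mentioned in the abstract can be illustrated with a small sketch. It assumes, purely for illustration, that each model's neurons are mapped into a shared reference ordering and that pairwise maps are obtained by composing through it; the names `random_perm_matrix`, `pairwise`, `P_A`, `P_B`, `P_C` are hypothetical, and the code does not reproduce the paper's optimization procedure. It only shows why maps built this way compose to the identity along a cycle A -> B -> C -> A.

```python
import numpy as np


def random_perm_matrix(n, rng):
    """Return a random n x n permutation matrix (rows of the identity, shuffled)."""
    return np.eye(n, dtype=int)[rng.permutation(n)]


def pairwise(P_src, P_dst):
    """Map neurons of the source model to the destination model by passing
    through the shared reference ordering: first apply P_src, then undo P_dst.
    For permutation matrices the transpose is the inverse."""
    return P_dst.T @ P_src


rng = np.random.default_rng(0)
n = 5  # number of neurons in one layer (illustrative value)

# One permutation per model, each mapping that model into the shared ordering.
P_A, P_B, P_C = (random_perm_matrix(n, rng) for _ in range(3))

# Pairwise maps derived from the per-model permutations.
P_AB = pairwise(P_A, P_B)
P_BC = pairwise(P_B, P_C)
P_CA = pairwise(P_C, P_A)

# Going around the cycle A -> B -> C -> A gives
# P_A^T P_C P_C^T P_B P_B^T P_A = I, so no error accumulates along the path.
cycle = P_CA @ P_BC @ P_AB
assert np.array_equal(cycle, np.eye(n, dtype=int))

# By contrast, three pairwise maps chosen independently carry no such guarantee.
Q_AB, Q_BC, Q_CA = (random_perm_matrix(n, rng) for _ in range(3))
independent_cycle = Q_CA @ Q_BC @ Q_AB
print("independent maps close the cycle:",
      np.array_equal(independent_cycle, np.eye(n, dtype=int)))
```

In this construction the constraint holds by design rather than being checked after the fact, which is one reason a joint treatment of all models can be preferable to aligning each pair separately.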
Related papers
- Zeroth-Order Adaptive Neuron Alignment Based Pruning without Re-Training [3.195234044113248]
We propose NeuroAL, a top-up algorithm for network pruning.
It modifies the block-wise and row-wise sparsity exploiting information from both the dense model and its sparse version.
It consistently outperforms the latest state-of-the-art methods in terms of performance-runtime trade-off.
arXiv Detail & Related papers (2024-11-11T15:30:16Z) - Provable Imbalanced Point Clustering [19.74926864871558]
We suggest efficient and provable methods to compute an approximation for imbalanced point clustering.
We provide experiments that show the empirical contribution of our suggested methods for real images (novel and reference), synthetic data, and real-world data.
arXiv Detail & Related papers (2024-08-26T12:41:41Z) - Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis [17.989809995141044]
We propose CCA Merge, which is based on Canonical Correlation Analysis.
We show that CCA works significantly better than past methods when more than 2 models are merged.
arXiv Detail & Related papers (2024-07-07T14:21:04Z) - Neural Inverse Kinematics [72.85330210991508]
Inverse kinematic (IK) methods recover the parameters of the joints, given the desired position of selected elements in the kinematic chain.
We propose a neural IK method that employs the hierarchical structure of the problem to sequentially sample valid joint angles conditioned on the desired position.
arXiv Detail & Related papers (2022-05-22T14:44:07Z) - Random Manifold Sampling and Joint Sparse Regularization for Multi-label Feature Selection [0.0]
The model proposed in this paper can obtain the most relevant few features by solving the joint constrained optimization problems of $\ell_{2,1}$ and $\ell_F$ regularization.
Comparative experiments on real-world data sets show that the proposed method outperforms other methods.
arXiv Detail & Related papers (2022-04-13T15:06:12Z) - Improving the Sample-Complexity of Deep Classification Networks with Invariant Integration [77.99182201815763]
Leveraging prior knowledge on intraclass variance due to transformations is a powerful method to improve the sample complexity of deep neural networks.
We propose a novel monomial selection algorithm based on pruning methods to allow an application to more complex problems.
We demonstrate the improved sample complexity on the Rotated-MNIST, SVHN and CIFAR-10 datasets.
arXiv Detail & Related papers (2022-02-08T16:16:11Z) - Fitting large mixture models using stochastic component selection [0.0]
We propose a combination of the expectation-maximization and the Metropolis-Hastings algorithms to evaluate only a small number of components.
The Markov chain of component assignments is sequentially generated across the algorithm's iterations.
We put emphasis on generality of our method, equipping it with the ability to train both shallow and deep mixture models.
arXiv Detail & Related papers (2021-10-10T12:39:53Z) - Eliminating Multicollinearity Issues in Neural Network Ensembles: Incremental, Negatively Correlated, Optimal Convex Blending [0.2294014185517203]
We introduce an incremental algorithm that constructs an aggregate regressor, using an ensemble of neural networks.
We optimally blend the aggregate regressor with a newly trained neural network under a convexity constraint.
Under this framework, collinearity issues do not arise at all, rendering the method both accurate and robust.
arXiv Detail & Related papers (2021-04-30T01:32:08Z) - Solving weakly supervised regression problem using low-rank manifold regularization [77.34726150561087]
We solve a weakly supervised regression problem.
By "weakly" we mean that for some training points the labels are known, for some they are unknown, and for others they are uncertain due to random noise or other reasons such as lack of resources.
In the numerical section, we applied the suggested method to artificial and real datasets using Monte-Carlo modeling.
arXiv Detail & Related papers (2021-04-13T23:21:01Z) - Neural Subdivision [58.97214948753937]
This paper introduces Neural Subdivision, a novel framework for data-driven coarse-to-fine geometry modeling.
We optimize for the same set of network weights across all local mesh patches, thus providing an architecture that is not constrained to a specific input mesh, fixed genus, or category.
We demonstrate that even when trained on a single high-resolution mesh our method generates reasonable subdivisions for novel shapes.
arXiv Detail & Related papers (2020-05-04T20:03:21Z) - Stochastic Flows and Geometric Optimization on the Orthogonal Group [52.50121190744979]
We present a new class of geometrically-driven optimization algorithms on the orthogonal group $O(d)$.
We show that our methods can be applied in various fields of machine learning including deep, convolutional and recurrent neural networks, reinforcement learning, flows and metric learning.
arXiv Detail & Related papers (2020-03-30T15:37:50Z) - Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)