Merging of neural networks
- URL: http://arxiv.org/abs/2204.09973v1
- Date: Thu, 21 Apr 2022 08:52:54 GMT
- Title: Merging of neural networks
- Authors: Martin Pašen, Vladimír Boža
- Abstract summary: We show that training two networks and merging them leads to better performance than training a single network for an extended period of time.
Our procedure might be used as a finalization step after one tries multiple starting seeds to avoid an unlucky one.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a simple scheme for merging two neural networks trained with
different starting initialization into a single one with the same size as the
original ones. We do this by carefully selecting channels from each input
network. Our procedure might be used as a finalization step after one tries
multiple starting seeds to avoid an unlucky one. We also show that training two
networks and merging them leads to better performance than training a single
network for an extended period of time.
Availability: https://github.com/fmfi-compbio/neural-network-merging
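The abstract describes merging by "carefully selecting channels" but does not state the criterion here; the sketch below is a rough illustration only (the authors' actual procedure is in the linked repository). It merges two equally sized layers by keeping the highest-L1-norm output channels from each network, so the merged layer has the same size as each input:

```python
import numpy as np

def merge_by_channel_selection(w_a, w_b, keep_from_a):
    """Merge two same-shape conv weight tensors (out_ch, in_ch, kH, kW)
    by picking high-scoring output channels from each network.
    keep_from_a channels come from network A, the rest from network B,
    so the merged layer keeps the original size."""
    out_ch = w_a.shape[0]
    # Score each output channel by its L1 norm -- a simple stand-in
    # for the paper's selection criterion.
    score_a = np.abs(w_a).reshape(out_ch, -1).sum(axis=1)
    score_b = np.abs(w_b).reshape(out_ch, -1).sum(axis=1)
    top_a = np.argsort(score_a)[::-1][:keep_from_a]
    top_b = np.argsort(score_b)[::-1][: out_ch - keep_from_a]
    return np.concatenate([w_a[top_a], w_b[top_b]], axis=0)

rng = np.random.default_rng(0)
w_a = rng.normal(size=(8, 4, 3, 3))
w_b = rng.normal(size=(8, 4, 3, 3))
merged = merge_by_channel_selection(w_a, w_b, keep_from_a=4)
print(merged.shape)  # (8, 4, 3, 3) -- same size as each input network
```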
Related papers
- Network Fission Ensembles for Low-Cost Self-Ensembles [20.103367702014474]
We propose a low-cost ensemble learning and inference method called Network Fission Ensembles (NFE).
We first prune some of the weights to reduce the training burden.
We then group the remaining weights into several sets and create multiple auxiliary paths using each set to construct multi-exits.
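The three steps above (prune, group the survivors, give each group its own path) can be sketched as mask construction; the round-robin split below is an assumption for illustration, not necessarily NFE's actual grouping rule:

```python
import numpy as np

def fission_masks(weight_shape, prune_frac, n_groups, seed=0):
    """Prune a fraction of weight positions, then split the survivors
    into disjoint groups -- one mask per ensemble member/exit."""
    rng = np.random.default_rng(seed)
    n = int(np.prod(weight_shape))
    order = rng.permutation(n)
    kept = order[int(n * prune_frac):]   # positions surviving the prune
    masks = []
    for g in range(n_groups):
        m = np.zeros(n, dtype=bool)
        m[kept[g::n_groups]] = True      # round-robin split of survivors
        masks.append(m.reshape(weight_shape))
    return masks

masks = fission_masks((16, 16), prune_frac=0.5, n_groups=2)
# The groups are disjoint and together cover exactly the kept weights.
print(int((masks[0] & masks[1]).sum()), int((masks[0] | masks[1]).sum()))  # 0 128
```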
arXiv Detail & Related papers (2024-08-05T08:23:59Z)
- Stitching for Neuroevolution: Recombining Deep Neural Networks without Breaking Them [0.0]
Traditional approaches to neuroevolution often start from scratch.
Recombining trained networks is non-trivial because architectures and feature representations typically differ.
We employ stitching, which merges the networks by introducing new layers at crossover points.
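A minimal illustration of a stitch layer, assuming the crossover point is merged with a linear map fit by least squares (the paper's actual stitching layers and training procedure may differ):

```python
import numpy as np

def fit_stitch_layer(acts_a, acts_b):
    """Fit a linear 'stitch' mapping activations from a layer of network A
    onto the representation network B expects, via least squares."""
    W, *_ = np.linalg.lstsq(acts_a, acts_b, rcond=None)
    return W

rng = np.random.default_rng(1)
acts_a = rng.normal(size=(256, 32))   # batch of activations from A
true_map = rng.normal(size=(32, 48))
acts_b = acts_a @ true_map            # what B expects (toy: exactly linear)
W = fit_stitch_layer(acts_a, acts_b)
print(np.allclose(acts_a @ W, acts_b))  # True in this toy linear case
```

In practice the two representations are not exactly linearly related, so the stitch layer is trained rather than solved in closed form; this toy case just shows the shape of the construction.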
arXiv Detail & Related papers (2024-03-21T08:30:44Z)
- StitchNet: Composing Neural Networks from Pre-Trained Fragments [3.638431342539701]
We propose StitchNet, a novel neural network creation paradigm.
It stitches together fragments from multiple pre-trained neural networks.
We show that these fragments can be stitched together to create neural networks with accuracy comparable to that of traditionally trained networks.
arXiv Detail & Related papers (2023-01-05T08:02:30Z)
- Bit-wise Training of Neural Network Weights [4.56877715768796]
We introduce an algorithm where the individual bits representing the weights of a neural network are learned.
This method allows training weights with integer values on arbitrary bit-depths and naturally uncovers sparse networks.
We show better results than standard training for fully connected networks, and comparable performance to standard training for convolutional and residual networks.
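As a toy illustration of representing weights bit-by-bit (the decoding below, a sign bit plus binary magnitude bits, is an assumed encoding, not necessarily the paper's exact scheme):

```python
import numpy as np

def bits_to_weights(sign_bits, mag_bits):
    """Decode integer weights from learned bits: one sign bit plus
    k magnitude bits per weight (values in [-(2^k - 1), 2^k - 1])."""
    k = mag_bits.shape[-1]
    powers = 2 ** np.arange(k)              # bit 0 is least significant
    mags = (mag_bits * powers).sum(axis=-1)
    signs = np.where(sign_bits, -1, 1)
    return signs * mags

sign_bits = np.array([0, 1, 0])
mag_bits = np.array([[1, 0, 1],    # 1 + 4 = 5
                     [0, 1, 1],    # 2 + 4 = 6
                     [0, 0, 0]])   # all-zero bits -> a pruned (sparse) weight
print(bits_to_weights(sign_bits, mag_bits))  # [ 5 -6  0]
```

The all-zero row shows how the representation "naturally uncovers sparse networks": a weight whose magnitude bits all turn off is effectively pruned.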
arXiv Detail & Related papers (2022-02-19T10:46:54Z)
- FreeTickets: Accurate, Robust and Efficient Deep Ensemble by Training with Dynamic Sparsity [74.58777701536668]
We introduce the FreeTickets concept, which can boost the performance of sparse convolutional neural networks over their dense network equivalents by a large margin.
We propose two novel efficient ensemble methods with dynamic sparsity, which yield in one shot many diverse and accurate tickets "for free" during the sparse training process.
arXiv Detail & Related papers (2021-06-28T10:48:20Z) - Simultaneous Training of Partially Masked Neural Networks [67.19481956584465]
We show that it is possible to train neural networks in such a way that a predefined 'core' subnetwork can be split off from the trained full network with remarkably good performance.
We show that training a Transformer with a low-rank core gives a low-rank model that outperforms the same low-rank model trained alone.
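A toy sketch of the core/full split, assuming the core is a low-rank factorized block and the masked-off remainder is additive (the paper's masking scheme for Transformers is more involved):

```python
import numpy as np

rng = np.random.default_rng(2)
d, r = 8, 2
U = rng.normal(size=(d, r))   # low-rank core factors, trained jointly
V = rng.normal(size=(r, d))
R = rng.normal(size=(d, d))   # the part that gets masked off

W_full = U @ V + R    # the full network's weight during joint training
W_core = U @ V        # the split-off core: rank <= r by construction
print(np.linalg.matrix_rank(W_core))  # 2
```

After joint training, dropping `R` leaves a standalone low-rank model, which is the "split-off core" the summary describes.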
arXiv Detail & Related papers (2021-06-16T15:57:51Z)
- MutualNet: Adaptive ConvNet via Mutual Learning from Different Model Configurations [51.85020143716815]
We propose MutualNet to train a single network that can run at a diverse set of resource constraints.
Our method trains a cohort of model configurations with various network widths and input resolutions.
MutualNet is a general training methodology that can be applied to various network structures.
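Width-adjustable execution of this kind can be illustrated by slicing a layer's output channels (a simplification: MutualNet also varies input resolution and distills knowledge mutually between configurations):

```python
import numpy as np

def sliced_forward(x, w, width_mult):
    """Run a layer at a reduced width: keep only the first fraction of
    output channels, so one set of weights serves many budgets."""
    out_ch = int(w.shape[1] * width_mult)
    return x @ w[:, :out_ch]

rng = np.random.default_rng(3)
x = rng.normal(size=(2, 8))
w = rng.normal(size=(8, 16))
print(sliced_forward(x, w, 1.0).shape)   # (2, 16) full width
print(sliced_forward(x, w, 0.25).shape)  # (2, 4)  quarter width
```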
arXiv Detail & Related papers (2021-05-14T22:30:13Z)
- Learning Neural Network Subspaces [74.44457651546728]
Recent observations have advanced our understanding of the neural network optimization landscape.
With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks.
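The "lines" case can be sketched as follows, assuming the subspace is parameterized by two learned endpoint parameter vectors and training samples a random point on the segment at each step:

```python
import numpy as np

def point_on_line(theta_1, theta_2, t):
    """One point on a learned line segment of networks: interpolate the
    two endpoint parameter vectors. Training samples t ~ U(0, 1) each
    step so the whole segment stays accurate, at roughly the cost of
    training one model."""
    return (1 - t) * theta_1 + t * theta_2

theta_1 = np.array([1.0, 0.0, 2.0])   # toy endpoint parameters
theta_2 = np.array([3.0, 4.0, 0.0])
print(point_on_line(theta_1, theta_2, 0.5))  # [2. 2. 1.]
```

Curves and simplexes generalize this by adding more learned anchor points and interpolating between them.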
arXiv Detail & Related papers (2021-02-20T23:26:58Z)
- Feature Sharing Cooperative Network for Semantic Segmentation [10.305130700118399]
We propose a semantic segmentation method using cooperative learning.
By sharing feature maps, each of the two networks can obtain information that a single network could not obtain on its own.
The proposed method achieved better segmentation accuracy than the conventional single network and ensemble of networks.
arXiv Detail & Related papers (2021-01-20T00:22:00Z)
- Data-Free Knowledge Amalgamation via Group-Stack Dual-GAN [80.17705319689139]
We propose a data-free knowledge amalgamation strategy to craft a well-behaved multi-task student network from multiple single/multi-task teachers.
The proposed method, without any training data, achieves surprisingly competitive results, even compared with some fully supervised methods.
arXiv Detail & Related papers (2020-03-20T03:20:52Z)
- Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks [95.51368472949308]
Adaptation can be useful in cases when training data is scarce, or when one wishes to encode priors in the network.
In this paper, we propose a straightforward alternative: side-tuning.
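Side-tuning's additive adaptation can be sketched as a blend of a frozen base network and a small trainable side network (the linear layers and blend weight here are placeholders for illustration):

```python
import numpy as np

def side_tuned_forward(x, w_base, w_side, alpha):
    """Side-tuning: the pretrained base is frozen; only the small
    additive side network (and the blend weight alpha) is trained."""
    base_out = x @ w_base          # frozen pretrained features
    side_out = x @ w_side          # trainable side network
    return (1 - alpha) * base_out + alpha * side_out

x = np.ones((1, 4))
w_base = np.eye(4)                 # stands in for the frozen base
w_side = np.zeros((4, 4))          # freshly initialized side network
out = side_tuned_forward(x, w_base, w_side, alpha=0.0)
print(out)  # with alpha=0 the output equals the frozen base features
```

At `alpha=0` the model reproduces the base exactly; as the side network trains, the blend shifts toward the adapted output.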
arXiv Detail & Related papers (2019-12-31T18:52:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.