How to Train Your Metamorphic Deep Neural Network
- URL: http://arxiv.org/abs/2505.05510v1
- Date: Wed, 07 May 2025 09:01:06 GMT
- Title: How to Train Your Metamorphic Deep Neural Network
- Authors: Thomas Sommariva, Simone Calderara, Angelo Porrello
- Abstract summary: Neural Metamorphosis (NeuMeta) is a recent paradigm for generating neural networks of varying width and depth. We propose a training algorithm that extends the capabilities of NeuMeta to enable full-network metamorphosis with minimal accuracy degradation.
- Score: 16.734594915917647
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural Metamorphosis (NeuMeta) is a recent paradigm for generating neural networks of varying width and depth. Based on Implicit Neural Representation (INR), NeuMeta learns a continuous weight manifold, enabling the direct generation of compressed models, including those with configurations not seen during training. While promising, the original formulation of NeuMeta proves effective only for the final layers of the underlying model, limiting its broader applicability. In this work, we propose a training algorithm that extends the capabilities of NeuMeta to enable full-network metamorphosis with minimal accuracy degradation. Our approach follows a structured recipe comprising block-wise incremental training, INR initialization, and strategies for replacing batch normalization. The resulting metamorphic networks maintain competitive accuracy across a wide range of compression ratios, offering a scalable solution for adaptable and efficient deployment of deep models. The code is available at: https://github.com/TSommariva/HTTY_NeuMeta.
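The abstract describes an INR that parameterizes a continuous weight manifold from which networks of different widths can be sampled. The sketch below illustrates only that weight-manifold idea in PyTorch; the class names, coordinate scheme, and dimensions are illustrative assumptions, not the authors' released code (see the linked repository for the actual recipe, including block-wise incremental training and batch-normalization replacement).

```python
# Minimal sketch (assumed names/shapes, not the authors' code): an INR-style
# hypernetwork maps normalized weight coordinates (layer, row, column) to
# scalar weight values, so models of varying width can be sampled from one
# continuous weight manifold.
import torch
import torch.nn as nn

class WeightINR(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, coords):  # coords: (N, 3) with entries in [0, 1]
        return self.net(coords).squeeze(-1)

def sample_linear_weight(inr, layer_idx, out_dim, in_dim, num_layers):
    """Generate an (out_dim, in_dim) weight matrix for one layer at an arbitrary width."""
    rows = torch.linspace(0, 1, out_dim)
    cols = torch.linspace(0, 1, in_dim)
    r, c = torch.meshgrid(rows, cols, indexing="ij")
    l = torch.full_like(r, layer_idx / max(num_layers - 1, 1))
    coords = torch.stack([l.flatten(), r.flatten(), c.flatten()], dim=-1)
    return inr(coords).view(out_dim, in_dim)

# Usage: materialize a compressed 2-layer MLP (hidden width 48) from the same INR.
inr = WeightINR()
w1 = sample_linear_weight(inr, layer_idx=0, out_dim=48, in_dim=784, num_layers=2)
w2 = sample_linear_weight(inr, layer_idx=1, out_dim=10, in_dim=48, num_layers=2)
```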
Related papers
- NeRF-based CBCT Reconstruction needs Normalization and Initialization [53.58395475423445]
NeRF-based methods suffer from a local-global training mismatch between their two key components: the hash encoder and the neural network. We introduce a Normalized Hash, which enhances feature consistency and mitigates the mismatch. The neural network exhibits improved stability during early training, enabling faster convergence and enhanced reconstruction performance.
arXiv Detail & Related papers (2025-06-24T16:01:45Z) - Neural Metamorphosis [72.88137795439407]
This paper introduces a new learning paradigm termed Neural Metamorphosis (NeuMeta), which aims to build self-morphable neural networks.
NeuMeta directly learns the continuous weight manifold of neural networks.
It sustains full-size performance even at a 75% compression rate.
arXiv Detail & Related papers (2024-10-10T14:49:58Z) - Structure-Preserving Network Compression Via Low-Rank Induced Training Through Linear Layers Composition [11.399520888150468]
We present a theoretically justified technique termed Low-Rank Induced Training (LoRITa). As a rough illustration, see the sketch below.
LoRITa promotes low-rankness through the composition of linear layers and compresses by using singular value truncation.
We demonstrate the effectiveness of our approach using MNIST on Fully Connected Networks, CIFAR10 on Vision Transformers, and CIFAR10/100 and ImageNet on Convolutional Neural Networks.
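The following hedged PyTorch sketch illustrates the general idea summarized above (training a layer as a composition of linear factors, then compressing via singular value truncation); the class name, two-factor structure, and dimensions are assumptions for illustration, not the paper's implementation.

```python
# Hedged sketch of low-rank-inducing composition (not the LoRITa code):
# a layer is trained as W2 @ W1 with no nonlinearity in between, then
# compressed by truncating small singular values of the composed matrix.
import torch
import torch.nn as nn

class ComposedLinear(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.f1 = nn.Linear(in_dim, in_dim, bias=False)   # first linear factor
        self.f2 = nn.Linear(in_dim, out_dim, bias=False)  # second linear factor

    def forward(self, x):
        return self.f2(self.f1(x))

    def compress(self, rank):
        # Collapse the composition into one matrix and keep the top singular values.
        W = self.f2.weight @ self.f1.weight               # (out_dim, in_dim)
        U, S, Vh = torch.linalg.svd(W, full_matrices=False)
        return (U[:, :rank] * S[:rank]) @ Vh[:rank]       # rank-r approximation

layer = ComposedLinear(256, 128)
W_compressed = layer.compress(rank=16)                    # (128, 256), rank <= 16
```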
arXiv Detail & Related papers (2024-05-06T00:58:23Z) - MixtureGrowth: Growing Neural Networks by Recombining Learned Parameters [19.358670728803336]
Most deep neural networks are trained under fixed network architectures and require retraining when the architecture changes.
To avoid this, one can grow from a small network by adding random weights over time to gradually achieve the target network size.
This naive approach falls short in practice as it brings too much noise to the growing process.
arXiv Detail & Related papers (2023-11-07T11:37:08Z) - Adaptive Convolutional Dictionary Network for CT Metal Artifact Reduction [62.691996239590125]
We propose an adaptive convolutional dictionary network (ACDNet) for metal artifact reduction.
Our ACDNet can automatically learn the prior for artifact-free CT images via training data and adaptively adjust the representation kernels for each input CT image.
Our method inherits the clear interpretability of model-based methods and maintains the powerful representation ability of learning-based methods.
arXiv Detail & Related papers (2022-05-16T06:49:36Z) - FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z) - Training Sparse Neural Networks using Compressed Sensing [13.84396596420605]
We develop and test a novel method based on compressed sensing which combines pruning and training into a single step.
Specifically, we utilize an adaptively weighted $\ell_1$ penalty on the weights during training, which we combine with a generalization of the regularized dual averaging (RDA) algorithm in order to train sparse neural networks; a rough illustration follows below.
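As a hedged illustration of the adaptively weighted $\ell_1$ idea only (not the paper's RDA-based algorithm), the sketch below reweights the penalty so that smaller weights are pushed harder toward exact sparsity; names and hyperparameters are assumptions.

```python
# Sketch of a reweighted L1 penalty added to the training loss (assumed form,
# not the paper's algorithm): each |w| is scaled by 1 / (|w| + eps), so
# near-zero weights receive a larger effective penalty.
import torch
import torch.nn as nn

def adaptive_l1_penalty(model, lam=1e-4, eps=1e-3):
    """Sum of |w| / (|w|_detached + eps) over all parameters, scaled by lam."""
    penalty = 0.0
    for p in model.parameters():
        w = p.abs()
        penalty = penalty + (w / (w.detach() + eps)).sum()
    return lam * penalty

model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))

opt.zero_grad()
loss = nn.functional.cross_entropy(model(x), y) + adaptive_l1_penalty(model)
loss.backward()
opt.step()
```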
arXiv Detail & Related papers (2020-08-21T19:35:54Z) - Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - Neural Network Compression Framework for fast model inference [59.65531492759006]
We present a new framework for neural network compression with fine-tuning, which we call the Neural Network Compression Framework (NNCF).
It leverages recent advances of various network compression methods and implements some of them, such as sparsity, quantization, and binarization.
The framework can be used with the training samples that are supplied with it, or as a standalone package that can be seamlessly integrated into existing training code.
arXiv Detail & Related papers (2020-02-20T11:24:01Z) - Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)