Shared-Weights Extender and Gradient Voting for Neural Network Expansion
- URL: http://arxiv.org/abs/2509.18842v1
- Date: Tue, 23 Sep 2025 09:27:47 GMT
- Title: Shared-Weights Extender and Gradient Voting for Neural Network Expansion
- Authors: Nikolas Chatzis, Ioannis Kordonis, Manos Theodosis, Petros Maragos
- Abstract summary: Expanding neural networks during training is a promising way to augment capacity without retraining larger models from scratch. Newly added neurons often fail to adjust to a trained network and become inactive, providing no contribution to capacity growth. We propose a novel method explicitly designed to prevent inactivity of new neurons by coupling them with existing ones for smooth integration.
- Score: 15.3744306569115
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Expanding neural networks during training is a promising way to augment capacity without retraining larger models from scratch. However, newly added neurons often fail to adjust to a trained network and become inactive, providing no contribution to capacity growth. We propose the Shared-Weights Extender (SWE), a novel method explicitly designed to prevent inactivity of new neurons by coupling them with existing ones for smooth integration. In parallel, we introduce the Steepest Voting Distributor (SVoD), a gradient-based method for allocating neurons across layers during deep network expansion. Our extensive benchmarking on four datasets shows that our method can effectively suppress neuron inactivity and achieve better performance compared to other expanding methods and baselines.
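The abstract describes SWE and SVoD only at a high level, so the following is a minimal illustrative sketch, not the authors' implementation. It assumes that SWE's "coupling" means initializing a newcomer with a shared (copied) version of a donor neuron's incoming weights while splitting the donor's outgoing weights, and that SVoD's "voting" scores layers by gradient norm; all function names and details are hypothetical.

```python
# Illustrative sketch only -- the paper's code is not reproduced here.
import torch
import torch.nn as nn

def extend_layer_swe(layer: nn.Linear, next_layer: nn.Linear, donor: int):
    """Add one neuron to `layer`, coupled to neuron `donor` so it starts active."""
    out_f, in_f = layer.weight.shape
    wider = nn.Linear(in_f, out_f + 1)
    wider_next = nn.Linear(out_f + 1, next_layer.out_features)
    with torch.no_grad():
        wider.weight[:out_f] = layer.weight          # keep old incoming weights
        wider.bias[:out_f] = layer.bias
        # Assumed coupling: the newcomer copies the donor's incoming weights,
        # plus tiny noise so the pair can diverge during later training.
        wider.weight[out_f] = layer.weight[donor] + 1e-3 * torch.randn(in_f)
        wider.bias[out_f] = layer.bias[donor]
        # Split the donor's outgoing weights between donor and newcomer so the
        # network's function is (approximately) preserved at insertion time.
        wider_next.weight[:, :out_f] = next_layer.weight
        wider_next.weight[:, donor] = 0.5 * next_layer.weight[:, donor]
        wider_next.weight[:, out_f] = 0.5 * next_layer.weight[:, donor]
        wider_next.bias.copy_(next_layer.bias)
    return wider, wider_next

def steepest_voting(layer_grad_norms, budget: int):
    """Assumed SVoD-style rule: allocate `budget` new neurons across layers in
    proportion to gradient magnitude; rounding makes the total approximate."""
    votes = torch.tensor(layer_grad_norms, dtype=torch.float32)
    shares = votes / votes.sum()
    return [int(n) for n in torch.round(shares * budget)]
```

Under these assumptions, the newcomer initially mimics its donor, so its downstream influence and gradients are nonzero from the first step; that is the property the abstract emphasizes: the new neuron cannot start out inactive.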
Related papers
- NAT: Learning to Attack Neurons for Enhanced Adversarial Transferability [77.1713948526578]
Neuron Attack for Transferability (NAT) is a method designed to target specific neurons within the embedding. Our approach is motivated by the observation that previous layer-level optimizations often disproportionately focus on a few neurons. We find that targeting individual neurons effectively disrupts the core units of the neural network.
arXiv Detail & Related papers (2025-08-23T08:06:31Z)
- Long-Tailed Data Classification by Increasing and Decreasing Neurons During Training [4.32776344138537]
Real-world datasets often exhibit class imbalance, where certain classes have far fewer samples than others. We propose a method that periodically adds and removes neurons during training, thereby boosting representational power for minority classes. Our results underscore the effectiveness of dynamic, biologically inspired network designs in improving performance on class-imbalanced data.
arXiv Detail & Related papers (2025-07-14T05:29:16Z)
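A toy version of the periodic grow/shrink step described in the entry above, for a single hidden nn.Linear layer; the criteria for when to add or remove neurons, and every name here, are assumptions for illustration, not the paper's published rule.

```python
import torch
import torch.nn as nn

def grow_neurons(layer: nn.Linear, next_layer: nn.Linear, k: int):
    """Append k neurons to a hidden layer, silent downstream at first."""
    out_f, in_f = layer.weight.shape
    grown = nn.Linear(in_f, out_f + k)            # new incoming rows stay random
    grown_next = nn.Linear(out_f + k, next_layer.out_features)
    with torch.no_grad():
        grown.weight[:out_f] = layer.weight
        grown.bias[:out_f] = layer.bias
        grown_next.weight[:, :out_f] = next_layer.weight
        grown_next.weight[:, out_f:] = 0.0        # outputs unchanged at first;
        grown_next.bias.copy_(next_layer.bias)    # gradients can recruit them
    return grown, grown_next

def prune_neurons(layer: nn.Linear, next_layer: nn.Linear, keep: torch.Tensor):
    """Keep only the neurons indexed by `keep` (e.g. those with high utility)."""
    pruned = nn.Linear(layer.in_features, len(keep))
    pruned_next = nn.Linear(len(keep), next_layer.out_features)
    with torch.no_grad():
        pruned.weight.copy_(layer.weight[keep])
        pruned.bias.copy_(layer.bias[keep])
        pruned_next.weight.copy_(next_layer.weight[:, keep])
        pruned_next.bias.copy_(next_layer.bias)
    return pruned, pruned_next
```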
- Channel-wise Parallelizable Spiking Neuron with Multiplication-free Dynamics and Large Temporal Receptive Fields [32.349167886062105]
Spiking Neural Networks (SNNs) are distinguished from Artificial Neural Networks (ANNs) by their sophisticated neuronal dynamics and sparse binary activations (spikes) inspired by the biological neural system. Traditional neuron models use iterative step-by-step dynamics, resulting in serial computation and slow training of SNNs. Recent parallelizable spiking neuron models have been proposed to fully utilize the massive parallel computing ability of graphics processing units to accelerate the training of SNNs.
arXiv Detail & Related papers (2025-01-24T13:44:08Z)
- EntryPrune: Neural Network Feature Selection using First Impressions [19.217750941193472]
EntryPrune is a novel supervised feature selection algorithm using a dense neural network with a dynamic sparse input layer. It employs entry-based pruning, a novel approach that compares neurons based on the relative change induced since they entered the network.
arXiv Detail & Related papers (2024-10-03T09:56:39Z)
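A hypothetical reading of the "entry-based pruning" summarized above: snapshot each input neuron's weights when it enters the network, score it later by the relative change induced since entry, and drop the least-changed entries. The class and scoring rule below are assumptions, not EntryPrune's published algorithm.

```python
import torch

class EntryScore:
    def __init__(self, w_entry: torch.Tensor):
        self.w0 = w_entry.detach().clone()        # weights at entry: [out, in]

    def scores(self, w_now: torch.Tensor) -> torch.Tensor:
        # Relative change of each input neuron's outgoing weight column; little
        # change since entry suggests the feature was never found useful.
        delta = (w_now - self.w0).norm(dim=0)
        return delta / (self.w0.norm(dim=0) + 1e-12)

    def to_drop(self, w_now: torch.Tensor, n_drop: int) -> torch.Tensor:
        # Indices of the n_drop least-changed (weakest) input neurons.
        return torch.argsort(self.scores(w_now))[:n_drop]
```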
- Growing Deep Neural Network Considering with Similarity between Neurons [4.32776344138537]
We explore a novel approach of progressively increasing the number of neurons in compact models during training.
We propose a method that reduces feature extraction biases and neuronal redundancy by introducing constraints based on neuron similarity distributions.
Results on the CIFAR-10 and CIFAR-100 datasets demonstrate improved accuracy.
arXiv Detail & Related papers (2024-08-23T11:16:37Z)
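One plausible form of the neuron-similarity constraint mentioned in the entry above (an assumed regularizer for illustration, not the paper's exact loss): penalize high pairwise cosine similarity between the incoming weight vectors of neurons in the same layer, pushing them toward diverse features.

```python
import torch
import torch.nn.functional as F

def similarity_penalty(weight: torch.Tensor) -> torch.Tensor:
    """weight: [out_features, in_features]; mean off-diagonal |cosine|."""
    w = F.normalize(weight, dim=1)                        # unit-norm rows
    sim = w @ w.t()                                       # pairwise cosines
    off = sim - torch.eye(sim.size(0), device=sim.device) # drop self-similarity
    return off.abs().mean()

# Usage sketch: loss = task_loss + lam * similarity_penalty(hidden_layer.weight)
```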
- Harnessing Neural Unit Dynamics for Effective and Scalable Class-Incremental Learning [38.09011520275557]
Class-incremental learning (CIL) aims to train a model to learn new classes from non-stationary data streams without forgetting old ones.
We propose a new kind of connectionist model by tailoring neural unit dynamics that adapt the behavior of neural networks for CIL.
arXiv Detail & Related papers (2024-06-04T15:47:03Z)
- Fully Spiking Actor Network with Intra-layer Connections for Reinforcement Learning [51.386945803485084]
We focus on tasks where the agent needs to learn multi-dimensional deterministic policies for control.
Most existing spike-based RL methods take the firing rate as the output of SNNs, and convert it to represent continuous action space (i.e., the deterministic policy) through a fully-connected layer.
To develop a fully spiking actor network without any floating-point matrix operations, we draw inspiration from the non-spiking interneurons found in insects.
arXiv Detail & Related papers (2024-01-09T07:31:34Z)
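For context, a generic sketch of the rate-decoding baseline the entry above describes (firing rate, then a fully-connected layer, then a continuous deterministic action); this is the construction the paper seeks to replace, and the class below is illustrative only, not code from the paper.

```python
import torch
import torch.nn as nn

class RateDecodingHead(nn.Module):
    def __init__(self, n_neurons: int, action_dim: int, act_limit: float = 1.0):
        super().__init__()
        self.fc = nn.Linear(n_neurons, action_dim)   # the floating-point matrix op
        self.act_limit = act_limit

    def forward(self, spikes: torch.Tensor) -> torch.Tensor:
        # spikes: [T, batch, n_neurons] binary spike trains over T timesteps.
        rate = spikes.float().mean(dim=0)            # firing rate in [0, 1]
        return self.act_limit * torch.tanh(self.fc(rate))
```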
- Efficient and Flexible Neural Network Training through Layer-wise Feedback Propagation [49.44309457870649]
Layer-wise Feedback Propagation (LFP) is a novel training principle for neural-network-like predictors. LFP decomposes a reward to individual neurons based on their respective contributions. Our method then implements a greedy approach, reinforcing helpful parts of the network and weakening harmful ones.
arXiv Detail & Related papers (2023-08-23T10:48:28Z) - Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
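A toy, single-layer illustration of the reward decomposition idea summarized in the entry above: credit for a scalar reward is split across connections in proportion to their contribution, and helpful connections are reinforced while harmful ones are weakened. The credit-assignment and update rules here are assumptions, not the published LFP algorithm.

```python
import torch

def lfp_style_step(w: torch.Tensor, x: torch.Tensor, reward: float,
                   lr: float = 0.01) -> torch.Tensor:
    """w: [out, in] weights; x: [in] input activations; reward: scalar."""
    contrib = w * x                              # per-connection contribution
    z = contrib.sum(dim=1, keepdim=True)         # per-output pre-activation
    share = contrib / (z + 1e-12)                # fraction of credit per input
    feedback = reward * share                    # reward decomposed to connections
    with torch.no_grad():
        # Greedy update: positive feedback strengthens a connection's
        # magnitude, negative feedback weakens it.
        w += lr * feedback * torch.sign(w)
    return w
```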
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Neuro-Inspired Deep Neural Networks with Sparse, Strong Activations [11.707981310045742]
End-to-end training of Deep Neural Networks (DNNs) yields state-of-the-art performance in an increasing array of applications.
We report here on a promising neuro-inspired approach for robustness to perturbations, based on sparser and stronger activations.
arXiv Detail & Related papers (2022-02-26T06:19:05Z)
- Dynamic Neural Diversification: Path to Computationally Sustainable Neural Networks [68.8204255655161]
Small neural networks with a constrained number of trainable parameters can be suitable resource-efficient candidates for many simple tasks.
We explore the diversity of the neurons within the hidden layer during the learning process.
We analyze how the diversity of the neurons affects predictions of the model.
arXiv Detail & Related papers (2021-09-20T15:12:16Z)
- Non-Gradient Manifold Neural Network [79.44066256794187]
Deep neural networks (DNNs) generally take thousands of iterations to optimize via gradient descent.
We propose a novel manifold neural network based on non-gradient optimization.
arXiv Detail & Related papers (2021-06-15T06:39:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.