Shrinkage Initialization for Smooth Learning of Neural Networks
- URL: http://arxiv.org/abs/2504.09107v1
- Date: Sat, 12 Apr 2025 07:11:35 GMT
- Title: Shrinkage Initialization for Smooth Learning of Neural Networks
- Authors: Miao Cheng, Feiyan Zhou, Hongwei Zou, Limin Wang
- Abstract summary: An improved approach to neural learning is presented, which adopts a shrinkage approach to initialize the transformation of each layer of the network. It can be universally adapted to the structure of any network with random layers, while attaining stable performance.
- Score: 14.176838845627744
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The success of intelligent systems has relied heavily on the artificial learning of information, which has led to the broad application of neural learning solutions. It is well known that the training of neural networks can be largely improved by carefully chosen initialization, neuron layers, and activation functions. Although sequential layer-based initialization schemes are available, a generalized solution for the initial stage is still desired. In this work, an improved approach to the initialization of neural learning is presented, which adopts a shrinkage approach to initialize the transformation of each layer of the network. It can be universally adapted to the structure of any network with random layers, while attaining stable performance. Furthermore, smooth learning of networks is adopted in this work, owing to its diverse influence on neural learning. Experimental results on several artificial data sets demonstrate that the proposed method produces robust results with shrinkage initialization and is competent for smooth learning of neural networks.
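The abstract does not spell out the exact shrinkage rule, so the following is only a minimal sketch of the general idea under stated assumptions: each layer's random transformation is shrunk toward a structured target through a convex combination with a mixing coefficient. The function name `shrinkage_init`, the identity-like target, and the coefficient `alpha` are illustrative assumptions, not the authors' formulation.

```python
import numpy as np

def shrinkage_init(fan_in, fan_out, alpha=0.5, rng=None):
    """Hypothetical shrinkage-style layer initializer (illustrative sketch only).

    Draws a random Gaussian transformation and shrinks it toward a scaled,
    identity-like target, blending the two with coefficient `alpha`.
    The exact rule used in the paper is not given in this abstract.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Random baseline transformation with He-style scaling.
    W = rng.standard_normal((fan_out, fan_in)) * np.sqrt(2.0 / fan_in)
    # Structured shrinkage target: scaled (rectangular) identity.
    target = np.eye(fan_out, fan_in) * np.sqrt(2.0 / fan_in)
    # Convex combination: alpha = 0 keeps the random draw,
    # alpha = 1 collapses to the structured target.
    return (1.0 - alpha) * W + alpha * target

# Example: initialize the transformations of a 3-layer MLP.
layer_sizes = [64, 128, 128, 10]
weights = [shrinkage_init(m, n) for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
```

The convex-combination form is a common way to express shrinkage estimators; whether the paper shrinks the weights themselves or statistics derived from them (e.g., a covariance of layer activations) cannot be determined from the abstract alone.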
Related papers
- From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks [47.13391046553908]
In artificial networks, the effectiveness of these models relies on their ability to build task-specific representations. Prior studies highlight that different initializations can place networks in either a lazy regime, where representations remain static, or a rich/feature-learning regime, where representations evolve dynamically. These solutions capture the evolution of representations and the Neural Tangent Kernel across the spectrum from the rich to the lazy regime.
arXiv Detail & Related papers (2024-09-22T23:19:04Z) - Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z) - NeuralFastLAS: Fast Logic-Based Learning from Raw Data [54.938128496934695]
Symbolic rule learners generate interpretable solutions; however, they require the input to be encoded symbolically.
Neuro-symbolic approaches overcome this issue by mapping raw data to latent symbolic concepts using a neural network.
We introduce NeuralFastLAS, a scalable and fast end-to-end approach that trains a neural network jointly with a symbolic learner.
arXiv Detail & Related papers (2023-10-08T12:33:42Z) - Enhanced quantum state preparation via stochastic prediction of neural
network [0.8287206589886881]
In this paper, we explore an intriguing avenue for enhancing algorithm effectiveness by exploiting the knowledge blindness of neural networks.
Our approach centers around a machine learning algorithm utilized for preparing arbitrary quantum states in a semiconductor double quantum dot system.
By leveraging predictions generated by the neural network, we are able to guide the optimization process to escape local optima.
arXiv Detail & Related papers (2023-07-27T09:11:53Z) - Multi-Grade Deep Learning [3.0069322256338906]
Current deep learning models are single-grade neural networks.
We propose a multi-grade learning model that enables us to learn deep neural networks much more effectively and efficiently.
arXiv Detail & Related papers (2023-02-01T00:09:56Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - Dynamic Neural Diversification: Path to Computationally Sustainable
Neural Networks [68.8204255655161]
Small neural networks with a constrained number of trainable parameters can be suitable resource-efficient candidates for many simple tasks.
We explore the diversity of the neurons within the hidden layer during the learning process.
We analyze how the diversity of the neurons affects predictions of the model.
arXiv Detail & Related papers (2021-09-20T15:12:16Z) - A Weight Initialization Based on the Linear Product Structure for Neural
Networks [0.0]
We study neural networks from a nonlinear point of view and propose a novel weight initialization strategy that is based on the linear product structure (LPS) of neural networks.
The proposed strategy is derived from the approximation of activation functions using theories of numerical algebra, in order to guarantee finding all the local minima.
arXiv Detail & Related papers (2021-09-01T00:18:59Z) - Supervised Learning with First-to-Spike Decoding in Multilayer Spiking
Neural Networks [0.0]
We propose a new supervised learning method that can train multilayer spiking neural networks to solve classification problems.
The proposed learning rule supports multiple spikes fired by hidden neurons, yet remains stable by relying on first-spike responses generated by a deterministic output layer.
We also explore several distinct spike-based encoding strategies in order to form compact representations of input data.
arXiv Detail & Related papers (2020-08-16T15:34:48Z) - MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The use of gradient descent combined with non-convexity renders learning susceptible to initialization problems.
We propose fusing neighboring layers of deeper networks that are trained with random variables.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.