Fast Training of Sinusoidal Neural Fields via Scaling Initialization
- URL: http://arxiv.org/abs/2410.04779v2
- Date: Fri, 28 Feb 2025 14:20:04 GMT
- Title: Fast Training of Sinusoidal Neural Fields via Scaling Initialization
- Authors: Taesun Yeom, Sangyoon Lee, Jaeho Lee
- Abstract summary: We focus on a popular family of neural fields, called sinusoidal neural fields (SNFs). We show that by simply multiplying each weight (except for the last layer) by a constant, we can accelerate SNF training by 10$\times$.
- Score: 16.912112402718584
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural fields are an emerging paradigm that represent data as continuous functions parameterized by neural networks. Despite many advantages, neural fields often have a high training cost, which prevents a broader adoption. In this paper, we focus on a popular family of neural fields, called sinusoidal neural fields (SNFs), and study how it should be initialized to maximize the training speed. We find that the standard initialization scheme for SNFs -- designed based on the signal propagation principle -- is suboptimal. In particular, we show that by simply multiplying each weight (except for the last layer) by a constant, we can accelerate SNF training by 10$\times$. This method, coined $\textit{weight scaling}$, consistently provides a significant speedup over various data domains, allowing the SNFs to train faster than more recently proposed architectures. To understand why the weight scaling works well, we conduct extensive theoretical and empirical analyses which reveal that the weight scaling not only resolves the spectral bias quite effectively but also enjoys a well-conditioned optimization trajectory.
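To make the weight-scaling recipe concrete, here is a minimal sketch, assuming a SIREN-style SNF trained in PyTorch: weights get the standard sinusoidal initialization, and every layer except the last is then multiplied by a constant. The class name `ScaledSNF`, the scale factor `alpha=2.0`, `omega_0=30`, and the layer sizes are illustrative assumptions, not values prescribed by the paper.
```python
import math
import torch
import torch.nn as nn


class ScaledSNF(nn.Module):
    """SIREN-style sinusoidal neural field with a weight-scaled initialization.

    Weights first follow the standard SIREN scheme; every layer except the
    last is then multiplied by a constant `alpha` (illustrative value only).
    """

    def __init__(self, in_dim=2, hidden=256, depth=5, out_dim=3,
                 omega_0=30.0, alpha=2.0):
        super().__init__()
        dims = [in_dim] + [hidden] * (depth - 1) + [out_dim]
        self.layers = nn.ModuleList(
            [nn.Linear(dims[i], dims[i + 1]) for i in range(depth)]
        )
        self.omega_0 = omega_0
        with torch.no_grad():
            for i, layer in enumerate(self.layers):
                n = layer.in_features
                # Standard SIREN bounds: U(-1/n, 1/n) for the first layer,
                # U(-sqrt(6/n)/omega_0, sqrt(6/n)/omega_0) for the rest.
                bound = 1.0 / n if i == 0 else math.sqrt(6.0 / n) / omega_0
                layer.weight.uniform_(-bound, bound)
                # Weight scaling: multiply all but the last layer by alpha.
                if i < depth - 1:
                    layer.weight.mul_(alpha)

    def forward(self, x):
        for layer in self.layers[:-1]:
            x = torch.sin(self.omega_0 * layer(x))
        return self.layers[-1](x)


# Usage sketch: fit a field f(x, y) -> (r, g, b) with plain Adam.
model = ScaledSNF()
coords = torch.rand(1024, 2) * 2 - 1   # coordinates in [-1, 1]^2
target = torch.rand(1024, 3)           # stand-in pixel values
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
for _ in range(100):
    opt.zero_grad()
    loss = ((model(coords) - target) ** 2).mean()
    loss.backward()
    opt.step()
```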
Related papers
- Sinusoidal Initialization, Time for a New Start [0.5242869847419834]
Initialization plays a critical role in Deep Neural Network training, directly influencing convergence, stability, and generalization. We introduce a novel deterministic method that employs sinusoidal functions to construct structured weight matrices, improving the spread and balance of weights throughout the network. Our experiments show an increase of 4.9% in final validation accuracy and 20.9% in convergence speed.
arXiv Detail & Related papers (2025-05-19T09:45:18Z) - Efficient Event-based Delay Learning in Spiking Neural Networks [0.1350479308585481]
Spiking Neural Networks (SNNs) are attracting increased attention as an energy-efficient alternative to traditional Neural Networks.
We propose a novel event-based training method for SNNs, grounded in the EventProp formalism.
We show that our approach uses less than half the memory of the current state-of-the-art delay-learning method and is up to 26x faster.
arXiv Detail & Related papers (2025-01-13T13:44:34Z) - ETTFS: An Efficient Training Framework for Time-to-First-Spike Neuron [38.194529226257735]
Time-to-First-Spike (TTFS) coding, where neurons fire only once during inference, offers the benefits of reduced spike counts, enhanced energy efficiency, and faster processing.
This paper presents an efficient training framework for TTFS that not only improves accuracy but also accelerates the training process.
arXiv Detail & Related papers (2024-10-31T04:14:47Z) - Deep activity propagation via weight initialization in spiking neural networks [10.69085409825724]
Spiking Neural Networks (SNNs) offer bio-inspired advantages such as sparsity and ultra-low power consumption.
Deep SNNs process and transmit information by quantizing the real-valued membrane potentials into binary spikes.
We show theoretically that, unlike standard approaches, this method enables the propagation of activity in deep SNNs without loss of spikes.
arXiv Detail & Related papers (2024-10-01T11:02:34Z) - Tuning the Frequencies: Robust Training for Sinusoidal Neural Networks [1.5124439914522694]
We introduce a theoretical framework that explains the capacity property of sinusoidal networks.
We show how its layer compositions produce a large number of new frequencies expressed as integer combinations of the input frequencies.
Our method, referred to as TUNER, greatly improves the stability and convergence of sinusoidal INR training, leading to detailed reconstructions.
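As a hedged illustration (one way to see where these integer-combination frequencies come from, not necessarily TUNER's own derivation), take a hidden sinusoid applied to the sum of two input sinusoids; the Jacobi-Anger expansion gives
$$\sin\!\big(a\sin(\omega_1 x) + b\sin(\omega_2 x)\big) = \sum_{k_1, k_2 \in \mathbb{Z}} J_{k_1}(a)\, J_{k_2}(b)\, \sin\!\big((k_1\omega_1 + k_2\omega_2)\,x\big),$$
where $J_k$ is the Bessel function of the first kind, so the composed layer's spectrum contains every integer combination $k_1\omega_1 + k_2\omega_2$ of the input frequencies.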
arXiv Detail & Related papers (2024-07-30T18:24:46Z) - Demystifying Lazy Training of Neural Networks from a Macroscopic Viewpoint [5.9954962391837885]
We study the gradient descent dynamics of neural networks through the lens of macroscopic limits.
Our study reveals that gradient descent can rapidly drive deep neural networks to zero training loss.
Our approach draws inspiration from the Neural Tangent Kernel (NTK) paradigm.
arXiv Detail & Related papers (2024-04-07T08:07:02Z) - Speed Limits for Deep Learning [67.69149326107103]
Recent advances in thermodynamics allow bounding the speed at which one can go from the initial weight distribution to the final distribution of the fully trained network.
We provide analytical expressions for these speed limits for linear and linearizable neural networks.
Remarkably, under some plausible scaling assumptions on the NTK spectra and the spectral decomposition of the labels, learning is optimal in a scaling sense.
arXiv Detail & Related papers (2023-07-27T06:59:46Z) - Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z) - Towards Theoretically Inspired Neural Initialization Optimization [66.04735385415427]
We propose a differentiable quantity, named GradCosine, with theoretical insights to evaluate the initial state of a neural network.
We show that both the training and test performance of a network can be improved by maximizing GradCosine under a norm constraint.
Generalized from the sample-wise analysis to the real batch setting, the resulting method, NIO, automatically searches for a better initialization with negligible cost.
arXiv Detail & Related papers (2022-10-12T06:49:16Z) - Dynamic Neural Diversification: Path to Computationally Sustainable Neural Networks [68.8204255655161]
Small neural networks with a constrained number of trainable parameters can be suitable, resource-efficient candidates for many simple tasks.
We explore the diversity of the neurons within the hidden layer during the learning process.
We analyze how the diversity of the neurons affects predictions of the model.
arXiv Detail & Related papers (2021-09-20T15:12:16Z) - Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
arXiv Detail & Related papers (2020-04-20T18:12:56Z)