VI3NR: Variance Informed Initialization for Implicit Neural Representations
- URL: http://arxiv.org/abs/2504.19270v1
- Date: Sun, 27 Apr 2025 14:58:27 GMT
- Title: VI3NR: Variance Informed Initialization for Implicit Neural Representations
- Authors: Chamin Hewa Koneputugodage, Yizhak Ben-Shabat, Sameera Ramasinghe, Stephen Gould
- Abstract summary: Implicit Neural Representations (INRs) are a versatile and powerful tool for encoding various forms of data, including images, videos, sound, and 3D shapes. A critical factor in the success of INRs is the initialization of the network, which can significantly impact the convergence and accuracy of the learned model.
- Score: 41.395193251292895
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Implicit Neural Representations (INRs) are a versatile and powerful tool for encoding various forms of data, including images, videos, sound, and 3D shapes. A critical factor in the success of INRs is the initialization of the network, which can significantly impact the convergence and accuracy of the learned model. Unfortunately, commonly used neural network initializations are not widely applicable for many activation functions, especially those used by INRs. In this paper, we improve upon previous initialization methods by deriving an initialization that has stable variance across layers, and applies to any activation function. We show that this generalizes many previous initialization methods, and has even better stability for well studied activations. We also show that our initialization leads to improved results with INR activation functions in multiple signal modalities. Our approach is particularly effective for Gaussian INRs, where we demonstrate that the theory of our initialization matches with task performance in multiple experiments, allowing us to achieve improvements in image, audio, and 3D surface reconstruction.
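The abstract states the core principle — choose the weight variance so that the pre-activation variance stays stable through any elementwise activation — without the derivation. A minimal sketch of that principle follows; the function name and the Monte Carlo estimation of the activation's second moment are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def variance_informed_std(activation, fan_in, pre_var=1.0, n_samples=1_000_000):
    """Weight std that keeps the pre-activation variance at `pre_var` across
    layers, for an arbitrary elementwise activation g.

    Assumes pre-activations are approximately N(0, pre_var) and estimates the
    second moment E[g(z)^2] by Monte Carlo, so no closed form for g is needed.
    """
    rng = np.random.default_rng(0)
    z = rng.normal(0.0, np.sqrt(pre_var), n_samples)
    second_moment = np.mean(activation(z) ** 2)
    # For zero-mean weights: Var(sum_i w_i * g(z_i)) = fan_in * Var(w) * E[g(z)^2]
    return np.sqrt(pre_var / (fan_in * second_moment))

# Example: a Gaussian activation, as used by Gaussian INRs
gauss = lambda x, s=0.1: np.exp(-(x ** 2) / (2 * s ** 2))
std = variance_informed_std(gauss, fan_in=256)
W = np.random.default_rng(1).normal(0.0, std, size=(256, 256))
```

As a sanity check, plugging in ReLU gives E[g(z)^2] = pre_var / 2 and the sketch recovers the familiar He standard deviation sqrt(2 / fan_in), consistent with the claim that the method generalizes previous initializations.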
Related papers
- A Theory of How Pretraining Shapes Inductive Bias in Fine-Tuning [51.505728136705564]
We develop an analytical theory of the pretraining-fine-tuning pipeline in diagonal linear networks. We find that different initialization choices place the network into four distinct fine-tuning regimes. A smaller scale in earlier layers enables the network to both reuse and refine its features, leading to superior generalization.
arXiv Detail & Related papers (2026-02-23T17:19:33Z)
- A new initialisation to Control Gradients in Sinusoidal Neural network [9.341735544356167]
We propose a new initialisation for networks with sinusoidal activation functions such as SIREN. Controlling both gradients and targeting vanishing pre-activations helps prevent the emergence of inappropriate frequencies during estimation. The new initialisation consistently outperforms state-of-the-art methods across a wide range of reconstruction tasks.
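The paper's new scheme is not spelled out in this summary; for reference, the standard SIREN initialization it aims to improve on (Sitzmann et al., 2020) is sketched below, with `omega_0` the frequency scale:

```python
import numpy as np

def siren_layer_init(fan_in, fan_out, is_first=False, omega_0=30.0, seed=0):
    """Standard SIREN initialization: the first layer is U(-1/n, 1/n); hidden
    layers are U(-sqrt(6/n)/omega_0, sqrt(6/n)/omega_0), chosen so that the
    input to sin(omega_0 * (Wx + b)) keeps a stable distribution across layers."""
    rng = np.random.default_rng(seed)
    bound = 1.0 / fan_in if is_first else np.sqrt(6.0 / fan_in) / omega_0
    return rng.uniform(-bound, bound, size=(fan_out, fan_in))
```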
arXiv Detail & Related papers (2025-12-06T13:23:03Z)
- Optimized Weight Initialization on the Stiefel Manifold for Deep ReLU Neural Networks [5.363441578662801]
Improper weight initialization of ReLU networks can cause neuron inactivation (dying ReLU) and exacerbate instability as network depth increases. We introduce an optimization problem on the Stiefel manifold, thereby preserving scale and calibrating the pre-activation statistics. We show prevention of the dying ReLU problem, slower decay of activation variance, and mitigation of gradient vanishing, which together stabilize signal and gradient flow in deep architectures.
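The paper's specific Stiefel-manifold optimization problem is not given in this summary; a minimal related sketch is drawing a random point on the Stiefel manifold (an orthonormal weight matrix) via QR decomposition, which already preserves the scale of pre-activations:

```python
import numpy as np

def orthogonal_init(fan_out, fan_in, gain=np.sqrt(2.0), seed=0):
    """Random orthonormal matrix (a point on the Stiefel manifold) via QR.
    Orthonormal rows/columns preserve the signal norm; the sqrt(2) gain
    compensates for ReLU zeroing out roughly half the variance."""
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(max(fan_out, fan_in), min(fan_out, fan_in)))
    q, r = np.linalg.qr(a)
    q = q * np.sign(np.diag(r))   # fix signs so the distribution is uniform
    if fan_out < fan_in:
        q = q.T
    return gain * q
```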
arXiv Detail & Related papers (2025-08-30T05:17:31Z)
- SL$^{2}$A-INR: Single-Layer Learnable Activation for Implicit Neural Representation [6.572456394600755]
Implicit Neural Representation (INR), which leverages a neural network to transform coordinate inputs into corresponding attributes, has driven significant advances in vision-related domains. We show that key challenges faced by INRs can be alleviated by introducing a novel approach to the INR architecture. Specifically, we propose SL$^{2}$A-INR, a hybrid network that combines a single-layer learnable activation function with an MLP that uses traditional ReLU activations.
arXiv Detail & Related papers (2024-09-17T02:02:15Z)
- Improved weight initialization for deep and narrow feedforward neural network [3.0784574277021397]
The problem of "dying ReLU," where ReLU neurons become inactive and yield zero output, presents a significant challenge in the training of deep neural networks with the ReLU activation function.
We propose a novel weight initialization method to address this issue.
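The proposed initialization itself is not detailed in this summary; a simple hypothetical diagnostic for the problem it targets — counting ReLU units that are inactive for an entire batch at initialization — looks like this:

```python
import numpy as np

def dead_unit_fractions(weights, biases, x):
    """Per-layer fraction of ReLU units that output zero for every input in
    the batch `x` -- a quick check for dying ReLU at initialization."""
    fractions, h = [], x
    for W, b in zip(weights, biases):
        pre = h @ W.T + b
        fractions.append(float(np.mean(np.all(pre <= 0, axis=0))))
        h = np.maximum(pre, 0.0)
    return fractions
```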
arXiv Detail & Related papers (2023-11-07T05:28:12Z)
- ENN: A Neural Network with DCT Adaptive Activation Functions [2.2713084727838115]
We present Expressive Neural Network (ENN), a novel model in which the non-linear activation functions are modeled using the Discrete Cosine Transform (DCT).
This parametrization keeps the number of trainable parameters low, is appropriate for gradient-based schemes, and adapts to different learning tasks.
ENN outperforms state-of-the-art benchmarks, providing accuracy gaps above 40% in some scenarios.
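The exact basis, support, and parameter sharing used by ENN are not given in this summary; a hypothetical sketch of an elementwise activation parameterized by a few DCT coefficients might look like:

```python
import numpy as np

class DCTActivation:
    """Elementwise activation g(x) = sum_k c_k * cos(pi * k * u(x)), where the
    K coefficients c_k are the (few) trainable parameters and u maps inputs
    to the DCT support [0, 1]. An illustrative sketch, not ENN's exact form."""
    def __init__(self, coeffs):
        self.coeffs = np.asarray(coeffs, dtype=float)   # shape (K,)

    def __call__(self, x):
        u = (np.clip(np.asarray(x, dtype=float), -1.0, 1.0) + 1.0) / 2.0
        k = np.arange(self.coeffs.shape[0])
        return np.cos(np.pi * u[..., None] * k) @ self.coeffs

act = DCTActivation([0.0, 1.0, 0.3, -0.1])   # 4 trainable coefficients
y = act(np.linspace(-2.0, 2.0, 5))
```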
arXiv Detail & Related papers (2023-07-02T21:46:30Z)
- Principles for Initialization and Architecture Selection in Graph Neural Networks with ReLU Activations [17.51364577113718]
We show three principles for architecture selection in finite width graph neural networks (GNNs) with ReLU activations.
First, we theoretically derive what is essentially the unique generalization to ReLU GNNs of the well-known He-initialization.
Second, we prove in finite width vanilla ReLU GNNs that oversmoothing is unavoidable at large depth when using a fixed aggregation operator.
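The GNN-specific derivation additionally accounts for the aggregation operator; for reference, the classical dense-layer He initialization that it generalizes is simply:

```python
import numpy as np

def he_init(fan_out, fan_in, seed=0):
    """He initialization for ReLU layers: Var(W) = 2 / fan_in keeps the
    pre-activation variance constant from layer to layer."""
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_out, fan_in))
```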
arXiv Detail & Related papers (2023-06-20T16:40:41Z)
- Neural Characteristic Activation Analysis and Geometric Parameterization for ReLU Networks [2.2713084727838115]
We introduce a novel approach for analyzing the training dynamics of ReLU networks by examining the characteristic activation boundaries of individual neurons.
Our proposed analysis reveals a critical instability in common neural network parameterizations and normalizations during optimization, which impedes fast convergence and hurts performance.
arXiv Detail & Related papers (2023-05-25T10:19:13Z)
- Modality-Agnostic Variational Compression of Implicit Neural Representations [96.35492043867104]
We introduce a modality-agnostic neural compression algorithm based on a functional view of data, parameterised as an Implicit Neural Representation (INR).
Bridging the gap between latent coding and sparsity, we obtain compact latent representations non-linearly mapped to a soft gating mechanism.
After obtaining a dataset of such latent representations, we directly optimise the rate/distortion trade-off in a modality-agnostic space using neural compression.
arXiv Detail & Related papers (2023-01-23T15:22:42Z)
- Versatile Neural Processes for Learning Implicit Neural Representations [57.090658265140384]
We propose Versatile Neural Processes (VNP), which significantly increases the capability of approximating functions. Specifically, we introduce a bottleneck encoder that produces fewer but more informative context tokens, relieving the high computational cost.
We demonstrate the effectiveness of the proposed VNP on a variety of tasks involving 1D, 2D and 3D signals.
arXiv Detail & Related papers (2023-01-21T04:08:46Z)
- Learning A 3D-CNN and Transformer Prior for Hyperspectral Image Super-Resolution [80.93870349019332]
We propose a novel HSISR method that uses a Transformer instead of a CNN to learn the prior of hyperspectral images (HSIs).
Specifically, we first use a gradient algorithm to solve the HSISR model, and then use an unfolding network to simulate the iterative solution process.
arXiv Detail & Related papers (2021-11-27T15:38:57Z)
- Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
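The paper's exact estimator is not reproduced here; one standard way to track the Hessian norm as a diagnostic is power iteration on Hessian-vector products, sketched below in PyTorch (the function name is illustrative):

```python
import torch

def hessian_spectral_norm(loss, params, n_iters=20):
    """Estimate the spectral norm of the Hessian of `loss` w.r.t. `params`
    by power iteration on Hessian-vector products (double backward)."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    v = [torch.randn_like(p) for p in params]
    norm = torch.tensor(0.0)
    for _ in range(n_iters):
        # Hessian-vector product: differentiate <grads, v> w.r.t. params
        hv = torch.autograd.grad(grads, params, grad_outputs=v, retain_graph=True)
        norm = torch.sqrt(sum((h * h).sum() for h in hv))
        v = [h / (norm + 1e-12) for h in hv]
    return norm.item()
```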
arXiv Detail & Related papers (2020-04-20T18:12:56Z)
- MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The use of gradient-based optimization combined with nonconvexity renders learning sensitive to initialization.
We propose fusing neighboring layers of deeper networks that are trained from random initializations.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)
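The MSE-optimal fusion criterion itself is not given in this summary; in the purely linear case, fusing neighboring layers has an exact closed form, which illustrates the basic operation (a hypothetical minimal sketch):

```python
import numpy as np

def fuse_linear(W1, b1, W2, b2):
    """Collapse y = W2 @ (W1 @ x + b1) + b2 into a single affine layer
    y = W @ x + b. Exact only when no nonlinearity sits between the layers;
    with activations, an MSE-optimal approximation is needed instead."""
    return W2 @ W1, W2 @ b1 + b2
```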