Variance-Aware Weight Initialization for Point Convolutional Neural
Networks
- URL: http://arxiv.org/abs/2112.03777v1
- Date: Tue, 7 Dec 2021 15:47:14 GMT
- Title: Variance-Aware Weight Initialization for Point Convolutional Neural
Networks
- Authors: Pedro Hermosilla and Michael Schelling and Tobias Ritschel and Timo
Ropinski
- Abstract summary: We propose a framework to unify the multitude of continuous convolutions.
We show that this framework can avoid batch normalization while achieving similar and, in some cases, better performance.
- Score: 23.46612653627991
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Appropriate weight initialization has been of key importance to successfully
train neural networks. Recently, batch normalization has diminished the role of
weight initialization by simply normalizing each layer based on batch
statistics. Unfortunately, batch normalization has several drawbacks when
applied to the small batch sizes that are often required to cope with memory
limitations when learning on point clouds. While well-founded weight
initialization strategies can render batch normalization unnecessary and thus
avoid these drawbacks, no such approaches have been proposed for point
convolutional networks. To fill this gap, we propose a framework to unify the
multitude of continuous convolutions. This enables our main contribution,
variance-aware weight initialization. We show that this initialization can
avoid batch normalization while achieving similar and, in some cases, better
performance.
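The abstract does not spell out the derivation, but the core idea can be illustrated with a He-style rule adapted to neighbourhood aggregation. Below is a minimal sketch, assuming the kernel weights are drawn from a zero-mean normal whose variance is shrunk by both the feature fan-in and the expected number of neighbours summed per output point; the function name, the `avg_neighbors` parameter, and the gain of 2 are assumptions of this sketch, not the paper's exact scheme.

```python
import numpy as np

def variance_aware_normal(fan_in, fan_out, avg_neighbors, gain=2.0, rng=None):
    """He-style weight initializer for a point convolution (illustrative sketch).

    Assumes each output point sums contributions over `fan_in` input features
    and roughly `avg_neighbors` neighbouring points, so the initial standard
    deviation is shrunk by sqrt(gain / (fan_in * avg_neighbors)) to keep the
    output variance comparable to the input variance.
    """
    rng = np.random.default_rng() if rng is None else rng
    std = np.sqrt(gain / (fan_in * avg_neighbors))
    return rng.normal(0.0, std, size=(fan_in, fan_out))

# Example: 64 input features, 128 output features, ~16 neighbours per point.
W = variance_aware_normal(fan_in=64, fan_out=128, avg_neighbors=16)
print(W.std())  # close to sqrt(2 / (64 * 16)) ~= 0.044
```

The point of the extra `avg_neighbors` factor is that a point convolution sums over a neighbourhood as well as over input channels, so an initializer that only accounts for fan-in would inflate the output variance by roughly the neighbourhood size.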
Related papers
- Post-Training Quantization for Re-parameterization via Coarse & Fine
Weight Splitting [13.270381125055275]
We propose a coarse & fine weight splitting (CFWS) method to reduce the quantization error of weights.
We develop an improved KL metric to determine optimal quantization scales for activation.
For example, the quantized RepVGG-A1 model exhibits a mere 0.3% accuracy loss.
arXiv Detail & Related papers (2023-12-17T02:31:20Z) - Batchless Normalization: How to Normalize Activations Across Instances with Minimal Memory Requirements [0.0]
In training neural networks, batch normalization has many benefits, not all of them entirely understood, but it also comes at a memory cost.
In this paper I show a simple and straightforward way to address these issues.
Among other benefits, this will hopefully contribute to the democratization of AI research by means of lowering the hardware requirements for training larger models.
arXiv Detail & Related papers (2022-12-30T14:15:54Z) - BiTAT: Neural Network Binarization with Task-dependent Aggregated
Transformation [116.26521375592759]
Quantization aims to transform high-precision weights and activations of a given neural network into low-precision weights/activations for reduced memory usage and computation.
Extreme quantization (1-bit weight/1-bit activations) of compactly-designed backbone architectures results in severe performance degeneration.
This paper proposes a novel Quantization-Aware Training (QAT) method that can effectively alleviate performance degeneration.
arXiv Detail & Related papers (2022-07-04T13:25:49Z) - ZerO Initialization: Initializing Residual Networks with only Zeros and
Ones [44.66636787050788]
Deep neural networks are usually initialized with random weights, with an adequately selected initial variance to ensure stable signal propagation during training.
There is no consensus on how to select the variance, and this becomes challenging as the number of layers grows.
In this work, we replace the widely used random weight initialization with a fully deterministic initialization scheme, ZerO, which initializes residual networks with only zeros and ones (a minimal sketch appears after this list).
Surprisingly, we find that ZerO achieves state-of-the-art performance over various image classification datasets, including ImageNet.
arXiv Detail & Related papers (2021-10-25T06:17:33Z) - Cluster-Promoting Quantization with Bit-Drop for Minimizing Network
Quantization Loss [61.26793005355441]
Cluster-Promoting Quantization (CPQ) finds the optimal quantization grids for neural networks.
DropBits is a new bit-drop technique that revises the standard dropout regularization to randomly drop bits instead of neurons.
We experimentally validate our method on various benchmark datasets and network architectures.
arXiv Detail & Related papers (2021-09-05T15:15:07Z) - Fractional moment-preserving initialization schemes for training deep
neural networks [1.14219428942199]
A traditional approach to initializing deep neural networks (DNNs) is to sample the network weights randomly so as to preserve the variance of pre-activations.
In this paper, we show that weights and therefore pre-activations can be modeled with a heavy-tailed distribution.
We show through numerical experiments that our schemes can improve the training and test performance.
arXiv Detail & Related papers (2020-05-25T01:10:01Z) - Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
arXiv Detail & Related papers (2020-04-20T18:12:56Z) - Gradient $\ell_1$ Regularization for Quantization Robustness [70.39776106458858]
We derive a simple regularization scheme that improves robustness against post-training quantization.
By training quantization-ready networks, our approach enables storing a single set of weights that can be quantized on-demand to different bit-widths.
arXiv Detail & Related papers (2020-02-18T12:31:34Z) - Cross-Iteration Batch Normalization [67.83430009388678]
We present Cross-Iteration Batch Normalization (CBN), in which examples from multiple recent iterations are jointly utilized to enhance estimation quality.
CBN is found to outperform the original batch normalization and a direct calculation of statistics over previous iterations without the proposed compensation technique.
arXiv Detail & Related papers (2020-02-13T18:52:57Z)
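As referenced in the ZerO entry above, here is a minimal zeros-and-ones style sketch, assuming a two-layer residual block whose first weight matrix is the identity and whose second starts at zero, so the network is an identity mapping at initialization; this illustrates the spirit of a deterministic zeros-and-ones initialization but is not the paper's exact scheme.

```python
import numpy as np

def zeros_and_ones_init(dim):
    """Deterministic residual-block init using only zeros and ones (illustrative).

    The first weight matrix is the identity (ones on the diagonal) and the
    second is all zeros, so the block computes f(x) = x at initialization
    and signals propagate unchanged regardless of depth.
    """
    w1 = np.eye(dim)            # identity: ones on the diagonal
    w2 = np.zeros((dim, dim))   # residual branch contributes nothing at step 0
    return w1, w2

def residual_block(x, w1, w2):
    h = np.maximum(x @ w1, 0.0)  # linear layer followed by ReLU
    return x + h @ w2            # skip connection plus residual branch

x = np.random.randn(4, 8)
w1, w2 = zeros_and_ones_init(8)
assert np.allclose(residual_block(x, w1, w2), x)  # identity mapping at init
```

Because every block is the identity at step 0, depth does not distort the signal, which is the property such deterministic schemes exploit in place of carefully tuned random variances.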