Bit-wise Training of Neural Network Weights
- URL: http://arxiv.org/abs/2202.09571v1
- Date: Sat, 19 Feb 2022 10:46:54 GMT
- Title: Bit-wise Training of Neural Network Weights
- Authors: Cristian Ivan
- Abstract summary: We introduce an algorithm where the individual bits representing the weights of a neural network are learned.
This method allows training weights with integer values on arbitrary bit-depths and naturally uncovers sparse networks.
We show better results than standard training for fully connected networks, and comparable performance to standard training for convolutional and residual networks.
- Score: 4.56877715768796
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce an algorithm where the individual bits representing the weights
of a neural network are learned. This method allows training weights with
integer values on arbitrary bit-depths and naturally uncovers sparse networks,
without additional constraints or regularization techniques. We show better
results than the standard training technique with fully connected networks, and
performance comparable to standard training for convolutional and residual
networks. By training bits in a selective manner, we found that the
biggest contribution to achieving high accuracy is given by the first three
most significant bits, while the rest provide an intrinsic regularization. As a
consequence, more than 90% of a network can be used to store arbitrary codes
without affecting its accuracy. These codes may be random noise, binary files
or even the weights of previously trained networks.
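The abstract describes learning the individual bits of integer-valued weights at an arbitrary bit-depth. A minimal sketch of that representation, assembling signed integer weights from per-weight bit planes (the function name, shapes, and sign handling are illustrative assumptions, not the author's implementation):

```python
import numpy as np

def assemble_weights(bits, signs):
    """Combine trainable bit planes into signed integer weights.

    bits  : (k, n) array of 0/1 values; bits[0] is the most significant bit.
    signs : (n,) array of +1/-1 sign bits.
    """
    k = bits.shape[0]
    # place-value sum: bit plane i contributes 2**(k-1-i) to each magnitude
    place_values = 2 ** np.arange(k - 1, -1, -1)
    magnitudes = place_values @ bits   # (n,) integer magnitudes
    return signs * magnitudes

# k=2 bit-depth, 3 weights: magnitudes are 0b10=2, 0b01=1, 0b11=3
bits = np.array([[1, 0, 1],
                 [0, 1, 1]])
signs = np.array([1, -1, 1])
print(assemble_weights(bits, signs))   # [ 2 -1  3]
```

In this picture, "training bits selectively" amounts to updating only some of the rows of `bits`; the abstract's observation is that the most significant rows dominate accuracy.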
Related papers
- Gradient-Free Neural Network Training on the Edge [12.472204825917629]
Training neural networks is computationally heavy and energy-intensive.
This work presents a novel technique for training neural networks without needing gradient.
We show that it is possible to train models without gradient-based optimization techniques by identifying erroneous contributions of each neuron towards the expected classification.
arXiv Detail & Related papers (2024-10-13T05:38:39Z) - Training Your Sparse Neural Network Better with Any Mask [106.134361318518]
Pruning large neural networks to create high-quality, independently trainable sparse masks is desirable.
In this paper we demonstrate an alternative opportunity: one can customize the sparse training techniques to deviate from the default dense network training protocols.
Our new sparse training recipe is generally applicable to improving training from scratch with various sparse masks.
arXiv Detail & Related papers (2022-06-26T00:37:33Z) - FreeTickets: Accurate, Robust and Efficient Deep Ensemble by Training
with Dynamic Sparsity [74.58777701536668]
We introduce the FreeTickets concept, which can boost the performance of sparse convolutional neural networks over their dense network equivalents by a large margin.
We propose two novel efficient ensemble methods with dynamic sparsity, which yield in one shot many diverse and accurate tickets "for free" during the sparse training process.
arXiv Detail & Related papers (2021-06-28T10:48:20Z) - Multi-Prize Lottery Ticket Hypothesis: Finding Accurate Binary Neural
Networks by Pruning A Randomly Weighted Network [13.193734014710582]
We propose an algorithm for finding multi-prize tickets (MPTs) and test it by performing a series of experiments on CIFAR-10 and ImageNet datasets.
Our MPTs-1/32 not only set new binary weight network state-of-the-art (SOTA) Top-1 accuracy -- 94.8% on CIFAR-10 and 74.03% on ImageNet -- but also outperform their full-precision counterparts by 1.78% and 0.76%, respectively.
arXiv Detail & Related papers (2021-03-17T00:31:24Z) - Learning Neural Network Subspaces [74.44457651546728]
Recent observations have advanced our understanding of the neural network optimization landscape.
With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks.
arXiv Detail & Related papers (2021-02-20T23:26:58Z) - Training Sparse Neural Networks using Compressed Sensing [13.84396596420605]
We develop and test a novel method based on compressed sensing which combines the pruning and training into a single step.
Specifically, we utilize an adaptively weighted $\ell_1$ penalty on the weights during training, which we combine with a generalization of the regularized dual averaging (RDA) algorithm in order to train sparse neural networks.
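The weighted $\ell_1$ penalty above is typically handled with a proximal (soft-thresholding) step, which zeroes small weights exactly. A minimal proximal-gradient sketch with per-weight penalty strengths (all values illustrative; this is a simpler stand-in, not the paper's RDA generalization):

```python
import numpy as np

def soft_threshold(w, tau):
    """Proximal operator of tau * |w|: shrink each weight toward zero by tau."""
    return np.sign(w) * np.maximum(np.abs(w) - tau, 0.0)

# One proximal-gradient step with adaptive, per-weight l1 strengths.
w = np.array([0.8, -0.05, 0.3, -0.9])
grad = np.array([0.1, 0.02, -0.05, 0.2])   # stand-in for a loss gradient
lam = np.array([0.1, 0.3, 0.1, 0.05])      # per-weight penalty strengths
lr = 0.5
w = soft_threshold(w - lr * grad, lr * lam)  # small weights become exactly 0
print(w)
```

The exact zeros produced by soft-thresholding are what make the pruning and training happen in a single step, rather than pruning after training.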
arXiv Detail & Related papers (2020-08-21T19:35:54Z) - Training highly effective connectivities within neural networks with
randomly initialized, fixed weights [4.56877715768796]
We introduce a novel way of training a network by flipping the signs of the weights.
We obtain good results even when the weights have constant magnitude or are drawn from highly asymmetric distributions.
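The sign-flipping idea above can be sketched as follows: magnitudes stay fixed at their random initial values and only the signs are trainable. The flip rule below is a naive gradient heuristic of my own, not the paper's actual update:

```python
import numpy as np

rng = np.random.default_rng(0)
magnitudes = np.abs(rng.normal(size=8))   # fixed at init, never updated
signs = rng.choice([-1.0, 1.0], size=8)   # the only trainable parameters

def flip_step(signs, grad):
    """Flip a sign where the gradient pushes the weight through zero,
    i.e. where the gradient and the effective weight agree in sign."""
    return np.where(grad * signs * magnitudes > 0, -signs, signs)

grad = rng.normal(size=8)   # stand-in for a real backprop gradient
signs = flip_step(signs, grad)
```

The effective weights are always `signs * magnitudes`, so the search space is the $2^n$ sign configurations of a fixed random network.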
arXiv Detail & Related papers (2020-06-30T09:41:18Z) - ESPN: Extremely Sparse Pruned Networks [50.436905934791035]
We show that a simple iterative mask discovery method can achieve state-of-the-art compression of very deep networks.
Our algorithm represents a hybrid approach between single shot network pruning methods and Lottery-Ticket type approaches.
arXiv Detail & Related papers (2020-06-28T23:09:27Z) - Distance-Based Regularisation of Deep Networks for Fine-Tuning [116.71288796019809]
We develop an algorithm that constrains a hypothesis class to a small sphere centred on the initial pre-trained weights.
Empirical evaluation shows that our algorithm works well, corroborating our theoretical results.
arXiv Detail & Related papers (2020-02-19T16:00:47Z) - Side-Tuning: A Baseline for Network Adaptation via Additive Side
Networks [95.51368472949308]
Adaptation can be useful in cases when training data is scarce, or when one wishes to encode priors in the network.
In this paper, we propose a straightforward alternative: side-tuning.
arXiv Detail & Related papers (2019-12-31T18:52:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.