Sparsity-Control Ternary Weight Networks
- URL: http://arxiv.org/abs/2011.00580v2
- Date: Fri, 22 Oct 2021 01:10:12 GMT
- Title: Sparsity-Control Ternary Weight Networks
- Authors: Xiang Deng and Zhongfei Zhang
- Abstract summary: We focus on training ternary weight {-1, 0, +1} networks, which can avoid multiplications and dramatically reduce the memory and computation requirements.
Existing approaches to training ternary weight networks cannot control the sparsity of the ternary weights.
We propose the first sparsity-control approach (SCA) to training ternary weight networks.
- Score: 34.00378876525579
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks (DNNs) have been widely and successfully applied to
various applications, but they require large amounts of memory and
computational power. This severely restricts their deployment on
resource-limited devices. To address this issue, many efforts have been made on
training low-bit weight DNNs. In this paper, we focus on training ternary
weight \{-1, 0, +1\} networks which can avoid multiplications and dramatically
reduce the memory and computation requirements. A ternary weight network can be
considered as a sparser version of the binary weight counterpart by replacing
some -1s or 1s in the binary weights with 0s, thus leading to more efficient
inference but more memory cost. However, the existing approaches to training
ternary weight networks cannot control the sparsity (i.e., percentage of 0s) of
the ternary weights, which undermines the advantage of ternary weights. In this
paper, we propose, to the best of our knowledge, the first sparsity-control
approach (SCA) to training ternary weight networks, which is achieved simply
through a weight
discretization regularizer (WDR). SCA is different from all the existing
regularizer-based approaches in that it can control the sparsity of the ternary
weights through a controller $\alpha$ and does not rely on gradient estimators.
We theoretically and empirically show that the sparsity of the trained ternary
weights is positively related to $\alpha$. SCA is extremely simple and easy to
implement, and it is shown to consistently and significantly outperform the
state-of-the-art approaches on several benchmark datasets while even matching
the performance of the full-precision weight counterparts.
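To make the sparsity-control mechanism concrete, below is a minimal sketch, assuming PyTorch, of how a weight discretization regularizer with a controller $\alpha$ could be implemented. The specific polynomial form, the helpers `wdr`, `ternarize`, and `sparsity`, and the threshold `delta` are illustrative assumptions, not necessarily the exact formulation used by SCA.

```python
import torch


def wdr(w: torch.Tensor, alpha: float) -> torch.Tensor:
    """Illustrative discretization penalty: larger alpha favors 0 (more sparsity)."""
    # w^2 (w - 1)^2 (w + 1)^2 vanishes exactly at the ternary values {-1, 0, +1},
    # so minimizing it pushes each real-valued weight toward one of them.
    poly = (w ** 2) * (w - 1.0) ** 2 * (w + 1.0) ** 2
    # alpha * w^2 leaves the basin at 0 untouched but raises the cost of +-1,
    # so increasing alpha drives more weights toward 0 (higher sparsity).
    return (poly + alpha * w ** 2).mean()


def ternarize(w: torch.Tensor, delta: float = 0.5) -> torch.Tensor:
    """Map trained real-valued weights to {-1, 0, +1} with a fixed threshold."""
    return torch.sign(w) * (w.abs() > delta).float()


def sparsity(w_ternary: torch.Tensor) -> float:
    """Fraction of zeros, i.e. the quantity the paper controls via alpha."""
    return (w_ternary == 0).float().mean().item()


# Usage sketch (model, criterion, x, y, and lam are placeholders):
# loss = criterion(model(x), y) + lam * sum(wdr(p, alpha=0.5) for p in model.parameters())
```

Since the regularizer is an ordinary differentiable term added to the task loss, no gradient estimator for a discretization step is needed during training, which is consistent with the abstract's statement that SCA does not rely on gradient estimators.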
Related papers
- OvSW: Overcoming Silent Weights for Accurate Binary Neural Networks [19.41917323210239]
We investigate the efficiency of weight sign updates in Binary Neural Networks (BNNs).
For vanilla BNNs, over 50% of the weights keep their signs unchanged during training.
We propose Overcoming Silent Weights (OvSW) to address this issue.
arXiv Detail & Related papers (2024-07-07T05:01:20Z)
- Improved Generalization of Weight Space Networks via Augmentations [53.87011906358727]
Learning in deep weight spaces (DWS) is an emerging research direction, with applications to 2D and 3D neural fields (INRs, NeRFs).
We empirically analyze the reasons for overfitting in this setting and find that a key reason is the lack of diversity in DWS datasets.
To address this, we explore strategies for data augmentation in weight spaces and propose a MixUp method adapted for weight spaces.
arXiv Detail & Related papers (2024-02-06T15:34:44Z)
- ELSA: Partial Weight Freezing for Overhead-Free Sparse Network Deployment [95.04504362111314]
We present ELSA, a practical solution for creating deep networks that can easily be deployed at different levels of sparsity.
The core idea is to embed one or more sparse networks within a single dense network as a proper subset of the weights.
At prediction time, any sparse model can be extracted simply by zeroing out weights according to a predefined mask (a minimal masking sketch appears after this list).
arXiv Detail & Related papers (2023-12-11T22:44:05Z)
- Weight Compander: A Simple Weight Reparameterization for Regularization [5.744133015573047]
We introduce weight compander, a novel effective method to improve generalization of deep neural networks.
We show experimentally that using weight compander in addition to standard regularization methods improves the performance of neural networks.
arXiv Detail & Related papers (2023-06-29T14:52:04Z)
- BiTAT: Neural Network Binarization with Task-dependent Aggregated Transformation [116.26521375592759]
Quantization aims to transform high-precision weights and activations of a given neural network into low-precision weights/activations for reduced memory usage and computation.
Extreme quantization (1-bit weight/1-bit activations) of compactly-designed backbone architectures results in severe performance degeneration.
This paper proposes a novel Quantization-Aware Training (QAT) method that can effectively alleviate performance degeneration.
arXiv Detail & Related papers (2022-07-04T13:25:49Z)
- Compact representations of convolutional neural networks via weight pruning and quantization [63.417651529192014]
We propose a novel storage format for convolutional neural networks (CNNs) based on source coding and leveraging both weight pruning and quantization.
We achieve a reduction of space occupancy up to 0.6% on fully connected layers and 5.44% on the whole network, while performing at least as competitively as the baseline.
arXiv Detail & Related papers (2021-08-28T20:39:54Z)
- Multi-Prize Lottery Ticket Hypothesis: Finding Accurate Binary Neural Networks by Pruning A Randomly Weighted Network [13.193734014710582]
We propose an algorithm for finding multi-prize tickets (MPTs) and test it by performing a series of experiments on CIFAR-10 and ImageNet datasets.
Our MPTs-1/32 not only set new binary weight network state-of-the-art (SOTA) Top-1 accuracy -- 94.8% on CIFAR-10 and 74.03% on ImageNet -- but also outperform their full-precision counterparts by 1.78% and 0.76%, respectively.
arXiv Detail & Related papers (2021-03-17T00:31:24Z)
- SiMaN: Sign-to-Magnitude Network Binarization [165.5630656849309]
We show that our weight binarization provides an analytical solution by encoding high-magnitude weights into +1s, and 0s otherwise.
We prove that the learned weights of binarized networks roughly follow a Laplacian distribution that does not allow entropy maximization.
Our method, dubbed sign-to-magnitude network binarization (SiMaN), is evaluated on CIFAR-10 and ImageNet.
arXiv Detail & Related papers (2021-02-16T07:03:51Z)
- SmartDeal: Re-Modeling Deep Network Weights for Efficient Inference and Training [82.35376405568975]
Deep neural networks (DNNs) come with heavy parameterization, leading to reliance on external dynamic random-access memory (DRAM) for storage.
We present SmartDeal (SD), an algorithm framework to trade higher-cost memory storage/access for lower-cost computation.
We show that SD leads to 10.56x and 4.48x reduction in the storage and training energy, with negligible accuracy loss compared to state-of-the-art training baselines.
arXiv Detail & Related papers (2021-01-04T18:54:07Z)
- Training Sparse Neural Networks using Compressed Sensing [13.84396596420605]
We develop and test a novel method based on compressed sensing which combines the pruning and training into a single step.
Specifically, we utilize an adaptively weighted $\ell_1$ penalty on the weights during training, which we combine with a generalization of the regularized dual averaging (RDA) algorithm in order to train sparse neural networks.
arXiv Detail & Related papers (2020-08-21T19:35:54Z)
- Training highly effective connectivities within neural networks with randomly initialized, fixed weights [4.56877715768796]
We introduce a novel way of training a network by flipping the signs of the weights.
We obtain good results even when the weights have constant magnitude or are drawn from highly asymmetric distributions.
arXiv Detail & Related papers (2020-06-30T09:41:18Z)
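As a companion to the ELSA entry above, here is a minimal sketch, assuming PyTorch, of the general mask-based extraction idea: a sparse model is obtained from a dense one by zeroing out weights under predefined 0/1 masks. The function name `extract_sparse_model` and the per-parameter mask dictionary are hypothetical, for illustration only, and not ELSA's actual interface.

```python
import copy

import torch


def extract_sparse_model(dense_model: torch.nn.Module,
                         masks: dict[str, torch.Tensor]) -> torch.nn.Module:
    """Return a sparse copy of `dense_model` with weights zeroed under 0/1 masks."""
    sparse_model = copy.deepcopy(dense_model)
    with torch.no_grad():
        for name, param in sparse_model.named_parameters():
            if name in masks:
                # Keep weights where the mask is 1, zero them where it is 0.
                param.mul_(masks[name])
    return sparse_model
```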
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.