Weight Fixing Networks
- URL: http://arxiv.org/abs/2210.13554v1
- Date: Mon, 24 Oct 2022 19:18:02 GMT
- Title: Weight Fixing Networks
- Authors: Christopher Subia-Waud and Srinandan Dasmahapatra
- Abstract summary: We look to whole-network quantisation to minimise the entropy and number of unique parameters in a network.
We propose a new method, which we call Weight Fixing Networks (WFN), designed to realise four model outcome objectives.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern iterations of deep learning models contain millions (billions) of
unique parameters, each represented by a b-bit number. Popular attempts at
compressing neural networks (such as pruning and quantisation) have shown that
many of the parameters are superfluous, which we can remove (pruning) or
express with less than b-bits (quantisation) without hindering performance.
Here we look to go much further in minimising the information content of
networks. Rather than a channel or layer-wise encoding, we look to lossless
whole-network quantisation to minimise the entropy and number of unique
parameters in a network. We propose a new method, which we call Weight Fixing
Networks (WFN), designed to realise four model outcome objectives: i) very
few unique weights, ii) low-entropy weight encodings, iii) unique weight values
which are amenable to energy-saving versions of hardware multiplication, and
iv) lossless task-performance. Some of these goals are conflicting. To best
balance these conflicts, we combine a few novel (and some well-trodden) tricks:
a novel regularisation term (i, ii), a view of clustering cost as relative
distance change (i, ii, iv), and a focus on whole-network re-use of weights (i,
iii). Our ImageNet experiments demonstrate lossless compression using 56x fewer
unique weights and a 1.9x lower weight-space entropy than SOTA quantisation
approaches.
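The abstract sketches the WFN recipe: push the whole network onto a small, shared set of weight values (ideally hardware-friendly ones such as powers of two), judge each move by how large it is relative to the weight being moved, and track success via the number of unique weights and the weight-space entropy. The toy Python sketch below is not the authors' implementation; the codebook, tolerance, and function names are illustrative assumptions. It only shows how snapping weights to a shared power-of-two codebook, gated by a relative-distance rule, drives down both the unique-weight count and the entropy of the weight distribution.

```python
# Toy sketch only (not the authors' code): two quantities WFN targets --
# the number of unique weights and the weight-space entropy -- plus a
# simplified "relative distance" rule for snapping weights onto a shared,
# hardware-friendly codebook. All values below are illustrative assumptions.
import numpy as np


def weight_entropy_bits(weights: np.ndarray) -> float:
    """Shannon entropy (in bits) of the empirical distribution of weight values."""
    _, counts = np.unique(weights, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def fix_weights(weights: np.ndarray, codebook: np.ndarray, rel_tol: float) -> np.ndarray:
    """Snap each weight to its nearest codebook value, but only when the move is
    small relative to the weight's own magnitude; weights whose move would be
    too large are left untouched in this toy version."""
    nearest = codebook[np.abs(weights[:, None] - codebook[None, :]).argmin(axis=1)]
    rel_change = np.abs(nearest - weights) / (np.abs(weights) + 1e-12)
    return np.where(rel_change <= rel_tol, nearest, weights)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(0.0, 0.05, size=100_000)  # stand-in for all network weights
    # Power-of-two values are one example of "energy-saving" multiplicands.
    codebook = np.array([-0.125, -0.0625, -0.03125, 0.0, 0.03125, 0.0625, 0.125])
    # Generous tolerance so every weight is fixed in one pass; the paper's
    # method is iterative and preserves task performance through training.
    w_fixed = fix_weights(w, codebook, rel_tol=1.0)
    print("unique weights:", np.unique(w).size, "->", np.unique(w_fixed).size)
    print("entropy (bits): %.2f -> %.2f"
          % (weight_entropy_bits(np.round(w, 4)), weight_entropy_bits(np.round(w_fixed, 4))))
```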
Related papers
- Neural Metamorphosis [72.88137795439407]
This paper introduces a new learning paradigm termed Neural Metamorphosis (NeuMeta), which aims to build self-morphable neural networks.
NeuMeta directly learns the continuous weight manifold of neural networks.
It sustains full-size performance even at a 75% compression rate.
arXiv Detail & Related papers (2024-10-10T14:49:58Z)
- Post-Training Quantization for Re-parameterization via Coarse & Fine Weight Splitting [13.270381125055275]
We propose a coarse & fine weight splitting (CFWS) method to reduce the quantization error of weights.
We develop an improved KL metric to determine optimal quantization scales for activation.
For example, the quantized RepVGG-A1 model exhibits a mere 0.3% accuracy loss.
arXiv Detail & Related papers (2023-12-17T02:31:20Z)
- Learning to Compose SuperWeights for Neural Parameter Allocation Search [61.078949532440724]
We show that our approach can generate parameters for many networks using the same set of weights.
This enables us to support tasks like efficient ensembling and anytime prediction.
arXiv Detail & Related papers (2023-12-03T04:20:02Z)
- Random Weights Networks Work as Loss Prior Constraint for Image Restoration [50.80507007507757]
We present our belief that Random Weights Networks can act as a loss prior constraint for image restoration.
Our belief can be applied directly to existing networks without any additional training or test-time computational cost.
To emphasise, our main focus is to spark interest in loss-function design and rescue it from its currently neglected status.
arXiv Detail & Related papers (2023-03-29T03:43:51Z)
- Post-training Quantization for Neural Networks with Provable Guarantees [9.58246628652846]
We modify a post-training neural-network quantization method, GPFQ, that is based on a greedy path-following mechanism.
We prove that for quantizing a single-layer network, the relative square error essentially decays linearly in the number of weights.
arXiv Detail & Related papers (2022-01-26T18:47:38Z)
- DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers [105.74546828182834]
We show a hardware-efficient dynamic inference regime, named dynamic weight slicing, which adaptively slices a part of the network parameters for inputs with diverse difficulty levels.
We present dynamic slimmable network (DS-Net) and dynamic slice-able network (DS-Net++) by input-dependently adjusting filter numbers of CNNs and multiple dimensions in both CNNs and transformers.
arXiv Detail & Related papers (2021-09-21T09:57:21Z)
- Compact representations of convolutional neural networks via weight pruning and quantization [63.417651529192014]
We propose a novel storage format for convolutional neural networks (CNNs) based on source coding and leveraging both weight pruning and quantization.
We achieve a reduction of space occupancy up to 0.6% on fully connected layers and 5.44% on the whole network, while performing at least as competitively as the baseline.
arXiv Detail & Related papers (2021-08-28T20:39:54Z)
- FAT: Learning Low-Bitwidth Parametric Representation via Frequency-Aware Transformation [31.546529106932205]
Frequency-Aware Transformation (FAT) learns to transform network weights in the frequency domain before quantization.
FAT can be easily trained in low precision using simple standard quantizers.
Code will be available soon.
arXiv Detail & Related papers (2021-02-15T10:35:20Z)
- Direct Quantization for Training Highly Accurate Low Bit-width Deep Neural Networks [73.29587731448345]
This paper proposes two novel techniques to train deep convolutional neural networks with low bit-width weights and activations.
First, to obtain low bit-width weights, most existing methods obtain the quantized weights by performing quantization on the full-precision network weights.
Second, to obtain low bit-width activations, existing works consider all channels equally.
arXiv Detail & Related papers (2020-12-26T15:21:18Z)
- Exploiting Weight Redundancy in CNNs: Beyond Pruning and Quantization [0.2538209532048866]
Pruning and quantization are proven methods for improving the performance and storage efficiency of convolutional neural networks (CNNs).
We identify another form of redundancy in CNN weight tensors, in the form of repeated patterns of similar values; a minimal illustrative sketch of spotting such repeats appears after this list.
arXiv Detail & Related papers (2020-06-22T01:54:04Z)
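As a purely hypothetical illustration of the repeated-value redundancy the last entry describes (not that paper's actual procedure), the sketch below rounds each k x k filter slice of a random convolutional weight tensor onto a coarse grid and counts how often the same rounded pattern recurs; the grid step and tensor shape are arbitrary assumptions.

```python
# Hypothetical illustration only: surface repeated patterns of similar values
# in a convolutional weight tensor by rounding each k x k slice to a coarse
# integer grid and counting duplicates. Not the referenced paper's method.
from collections import Counter

import numpy as np


def count_repeated_patterns(conv_weights: np.ndarray, step: float = 0.05) -> Counter:
    """conv_weights has shape (out_channels, in_channels, k, k); returns a
    Counter keyed by each slice's rounded pattern."""
    grid = np.round(conv_weights / step).astype(np.int64)     # coarse grid indices
    slices = grid.reshape(grid.shape[0] * grid.shape[1], -1)  # one row per k x k slice
    return Counter(row.tobytes() for row in slices)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    w = rng.normal(0.0, 0.02, size=(64, 32, 3, 3))  # toy conv layer weights
    counts = count_repeated_patterns(w)
    shared = sum(c for c in counts.values() if c > 1)
    print(f"{shared} of {64 * 32} filter slices share their rounded pattern with another slice")
```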
This list is automatically generated from the titles and abstracts of the papers listed on this site.