Neural Networks at a Fraction with Pruned Quaternions
- URL: http://arxiv.org/abs/2308.06780v1
- Date: Sun, 13 Aug 2023 14:25:54 GMT
- Title: Neural Networks at a Fraction with Pruned Quaternions
- Authors: Sahel Mohammad Iqbal and Subhankar Mishra
- Abstract summary: Pruning is one technique to remove unnecessary weights and reduce resource requirements for training and inference.
For ML tasks where the input data is multi-dimensional, using higher-dimensional data embeddings such as complex numbers or quaternions has been shown to reduce the parameter count while maintaining accuracy.
We find that for some architectures, at very high sparsity levels, quaternion models provide higher accuracies than their real counterparts.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Contemporary state-of-the-art neural networks have increasingly large numbers
of parameters, which prevents their deployment on devices with limited
computational power. Pruning is one technique to remove unnecessary weights and
reduce resource requirements for training and inference. In addition, for ML
tasks where the input data is multi-dimensional, using higher-dimensional data
embeddings such as complex numbers or quaternions has been shown to reduce the
parameter count while maintaining accuracy. In this work, we conduct pruning on
real and quaternion-valued implementations of different architectures on
classification tasks. We find that for some architectures, at very high
sparsity levels, quaternion models provide higher accuracies than their real
counterparts. For example, at the task of image classification on CIFAR-10
using Conv-4, with $3\%$ of the parameters of the original model, the pruned
quaternion version outperforms the pruned real model by more than $10\%$.
Experiments on various network architectures and datasets show that for
deployment in extremely resource-constrained environments, a sparse quaternion
network might be a better candidate than a real sparse model of similar
architecture.
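To make the abstract's two ingredients concrete, here is a minimal PyTorch sketch (not the authors' code) of a quaternion fully connected layer built from the Hamilton product, which uses a quarter of the real parameters of an equally wide real layer, combined with simple global magnitude pruning to a target sparsity. Layer sizes, initialization, and the pruning criterion are illustrative assumptions.

```python
import torch
import torch.nn as nn

class QuaternionLinear(nn.Module):
    """Maps n_in input quaternions to n_out output quaternions with 4*n_out*n_in real
    weights, versus 16*n_out*n_in for a real nn.Linear of the same width (4*n_in -> 4*n_out)."""

    def __init__(self, n_in: int, n_out: int):
        super().__init__()
        # One real matrix per quaternion component, shared across the Hamilton product.
        self.w_r = nn.Parameter(0.02 * torch.randn(n_out, n_in))
        self.w_i = nn.Parameter(0.02 * torch.randn(n_out, n_in))
        self.w_j = nn.Parameter(0.02 * torch.randn(n_out, n_in))
        self.w_k = nn.Parameter(0.02 * torch.randn(n_out, n_in))

    def forward(self, x):
        # x: (batch, 4*n_in), laid out as [real | i | j | k] components.
        xr, xi, xj, xk = x.chunk(4, dim=-1)
        r = xr @ self.w_r.T - xi @ self.w_i.T - xj @ self.w_j.T - xk @ self.w_k.T
        i = xi @ self.w_r.T + xr @ self.w_i.T + xk @ self.w_j.T - xj @ self.w_k.T
        j = xj @ self.w_r.T - xk @ self.w_i.T + xr @ self.w_j.T + xi @ self.w_k.T
        k = xk @ self.w_r.T + xj @ self.w_i.T - xi @ self.w_j.T + xr @ self.w_k.T
        return torch.cat([r, i, j, k], dim=-1)

def magnitude_prune_(model: nn.Module, sparsity: float):
    """Zero the smallest-magnitude weights globally until `sparsity` of them are removed."""
    magnitudes = torch.cat([p.detach().abs().flatten() for p in model.parameters()])
    threshold = torch.quantile(magnitudes, sparsity)
    with torch.no_grad():
        for p in model.parameters():
            p.mul_((p.abs() > threshold).float())

layer = QuaternionLinear(n_in=64, n_out=32)   # 4 * 32 * 64 = 8192 real weights
magnitude_prune_(layer, sparsity=0.97)        # keep roughly 3% of the weights
out = layer(torch.randn(8, 4 * 64))           # -> shape (8, 128)
```

At sparsity 0.97, roughly 246 of the 8192 weights survive, whereas a real nn.Linear(256, 128) of the same width would start from 32,768 weights; this is the kind of extreme-sparsity regime the paper compares.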
Related papers
- Just How Flexible are Neural Networks in Practice? [89.80474583606242]
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters.
In practice, however, we only find solutions reachable through our training procedure, including its gradient-based optimization and regularizers, which limits flexibility.
arXiv Detail & Related papers (2024-06-17T12:24:45Z)
- Low-Resource Crop Classification from Multi-Spectral Time Series Using Lossless Compressors [6.379065975644869]
Deep learning has significantly improved the accuracy of crop classification using multispectral temporal data.
In low-resource situations with fewer labeled samples, deep learning models perform poorly due to insufficient data.
We propose a non-training alternative to deep learning models, aiming to address these situations.
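The summary above does not spell out the pipeline, so the following is only a generic illustration of a training-free, compressor-based classifier: serialized multispectral time series are compared with the normalized compression distance (zlib here) and labeled by nearest neighbor. The serialization, compressor, and 1-NN rule are assumptions, not necessarily the paper's choices.

```python
import zlib
import numpy as np

def _clen(b: bytes) -> int:
    return len(zlib.compress(b, 9))

def ncd(a: bytes, b: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    ca, cb, cab = _clen(a), _clen(b), _clen(a + b)
    return (cab - min(ca, cb)) / max(ca, cb)

def serialize(series: np.ndarray) -> bytes:
    """Quantize a (time, bands) array to 8 bits so the compressor sees repeated symbols."""
    lo, hi = float(series.min()), float(series.max())
    q = np.round(255.0 * (series - lo) / (hi - lo + 1e-9)).astype(np.uint8)
    return q.tobytes()

def classify(query: np.ndarray, train_series: list, train_labels: list):
    """1-nearest-neighbour under NCD; nothing is trained, only compressed and compared."""
    qb = serialize(query)
    distances = [ncd(qb, serialize(s)) for s in train_series]
    return train_labels[int(np.argmin(distances))]

# Toy usage with random arrays standing in for multispectral time series:
train = [np.random.rand(24, 10) for _ in range(6)]
labels = [0, 0, 0, 1, 1, 1]
print(classify(np.random.rand(24, 10), train, labels))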
arXiv Detail & Related papers (2024-05-28T12:28:12Z)
- Dataset Quantization [72.61936019738076]
We present dataset quantization (DQ), a new framework to compress large-scale datasets into small subsets.
DQ is the first method that can successfully distill large-scale datasets such as ImageNet-1k with a state-of-the-art compression ratio.
arXiv Detail & Related papers (2023-08-21T07:24:29Z)
- Variable Bitrate Neural Fields [75.24672452527795]
We present a dictionary method for compressing feature grids, reducing their memory consumption by up to 100x.
We formulate the dictionary optimization as a vector-quantized auto-decoder problem which lets us learn end-to-end discrete neural representations in a space where no direct supervision is available.
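As a rough illustration of the feature-grid compression described above (codebook size, feature width, and grid resolution are made-up values, and the learning of the codebook is omitted), storing one small integer index per grid cell in place of a full feature vector is what yields the large memory savings:

```python
import torch

codebook = torch.randn(256, 16)        # 256 learned code vectors, 16 features each
# One byte of codebook index per grid cell instead of a 16-float feature vector.
grid_indices = torch.randint(0, 256, (64, 64, 64), dtype=torch.uint8)

def features_at(cells: torch.Tensor) -> torch.Tensor:
    """Decode feature vectors for integer cell coordinates of shape (N, 3)."""
    idx = grid_indices[cells[:, 0], cells[:, 1], cells[:, 2]].long()
    return codebook[idx]               # (N, 16), looked up on the fly

cells = torch.randint(0, 64, (1024, 3))
print(features_at(cells).shape)        # torch.Size([1024, 16])

# Dense grid: 64**3 cells * 16 floats * 4 bytes ~= 16.8 MB.
# Index grid + codebook: 64**3 bytes + 256 * 16 * 4 bytes ~= 0.28 MB (about 60x smaller).
```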
arXiv Detail & Related papers (2022-06-15T17:58:34Z)
- Compact Multi-level Sparse Neural Networks with Input Independent Dynamic Rerouting [33.35713740886292]
Sparse deep neural networks can substantially reduce the complexity and memory consumption of the models.
To meet these real-life challenges, we propose to train a sparse model that supports multiple sparsity levels.
In this way, one can dynamically select the appropriate sparsity level during inference, while the storage cost is capped by the least sparse sub-model.
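A minimal sketch of the nested-mask idea this entry describes, assuming magnitude-based masks (the authors' training recipe is not given here): masks for higher sparsity levels are subsets of the lower-sparsity mask, so storing the least sparse sub-model suffices and the level can be switched at inference time.

```python
import torch

def nested_masks(weight: torch.Tensor, sparsities=(0.5, 0.8, 0.95)):
    """Return boolean masks from least to most sparse; each later mask is a subset of the
    previous one because all of them keep the largest-magnitude weights."""
    magnitudes = weight.abs().flatten()
    masks = []
    for s in sorted(sparsities):
        kept = max(1, int((1.0 - s) * magnitudes.numel()))
        threshold = torch.topk(magnitudes, kept).values.min()
        masks.append(weight.abs() >= threshold)
    return masks

w = torch.randn(256, 256)
m50, m80, m95 = nested_masks(w)
assert not bool((m95 & ~m80).any())      # the 95%-sparse mask lies inside the 80% one

level = m80                              # pick a sparsity level dynamically at inference
y = torch.randn(1, 256) @ (w * level).T  # only the unmasked weights contribute
```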
arXiv Detail & Related papers (2021-12-21T01:35:51Z)
- Efficient deep learning models for land cover image classification [0.29748898344267777]
This work experiments with the BigEarthNet dataset for land use land cover (LULC) image classification.
We benchmark several state-of-the-art models, including Convolutional Neural Networks, Multi-Layer Perceptrons, Visual Transformers, EfficientNets and Wide Residual Networks (WRN).
Our proposed lightweight model has an order of magnitude fewer trainable parameters, achieves a 4.5% higher averaged F-score across all 19 LULC classes, and trains two times faster than the ResNet50 state-of-the-art model we use as a baseline.
arXiv Detail & Related papers (2021-11-18T00:03:14Z)
- Compact representations of convolutional neural networks via weight pruning and quantization [63.417651529192014]
We propose a novel storage format for convolutional neural networks (CNNs) based on source coding and leveraging both weight pruning and quantization.
We achieve a reduction of space occupancy of up to 0.6% on fully connected layers and 5.44% on the whole network, while performing at least as competitively as the baseline.
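The exact source-coding format is not described in this summary; the sketch below only illustrates the general recipe of combining pruning and quantization before compact storage, using magnitude pruning, uniform 8-bit quantization, and a simple coordinate-list layout as stand-ins.

```python
import numpy as np

w = np.random.randn(512, 512).astype(np.float32)

# 1) Magnitude pruning: drop the smallest 90% of the weights.
threshold = np.quantile(np.abs(w), 0.90)
w[np.abs(w) < threshold] = 0.0

# 2) Uniform 8-bit quantization of the surviving weights.
scale = np.abs(w).max() / 127.0
q = np.round(w / scale).astype(np.int8)

# 3) Compact storage: 1-byte values plus coordinates instead of a dense float32 matrix.
rows, cols = np.nonzero(q)
values = q[rows, cols]
packed = values.nbytes + rows.astype(np.uint16).nbytes + cols.astype(np.uint16).nbytes
print(f"dense: {w.nbytes} B, pruned + quantized: {packed} B")

# Approximate reconstruction for inference:
w_hat = np.zeros_like(w)
w_hat[rows, cols] = values.astype(np.float32) * scale
```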
arXiv Detail & Related papers (2021-08-28T20:39:54Z)
- Depthwise Multiception Convolution for Reducing Network Parameters without Sacrificing Accuracy [2.0088802641040604]
Multiception convolution introduces layer-wise multiscale kernels to learn representations of all individual input channels simultaneously.
It significantly reduces the number of parameters of standard convolution-based models by 32.48% on average while still preserving accuracy.
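A hedged sketch of the multiception idea as summarized above: several depthwise convolutions with different kernel sizes act on every input channel, followed by a pointwise convolution to mix channels. The kernel sizes and the summation of branches are illustrative choices, not necessarily the paper's.

```python
import torch
import torch.nn as nn

class DepthwiseMultiception(nn.Module):
    """Depthwise convolutions at several kernel sizes per channel, then a 1x1 pointwise
    convolution; far fewer parameters than standard multiscale convolutions."""

    def __init__(self, channels: int, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, k, padding=k // 2, groups=channels, bias=False)
            for k in kernel_sizes
        )
        self.pointwise = nn.Conv2d(channels, channels, kernel_size=1, bias=False)

    def forward(self, x):
        return self.pointwise(sum(branch(x) for branch in self.branches))

x = torch.randn(1, 32, 56, 56)
print(DepthwiseMultiception(32)(x).shape)   # torch.Size([1, 32, 56, 56])
```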
arXiv Detail & Related papers (2020-11-07T05:33:54Z)
- Generative Multi-Stream Architecture For American Sign Language Recognition [15.717424753251674]
Training on datasets with low feature-richness for complex applications limits optimal convergence to below human performance.
We propose a generative multi-stream architecture that eliminates the need for additional hardware, with the intent of improving feature convergence without risking impracticability.
Our methods have achieved 95.62% validation accuracy with a variance of 1.42% from training, outperforming past models by 0.45% in validation accuracy and 5.53% in variance.
arXiv Detail & Related papers (2020-03-09T21:04:51Z)
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantization neural networks (QNNs) are very attractive to the industry because of their extremely cheap computation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)
- Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation [111.44445634272235]
In this paper, we develop a parameter-efficient transfer learning architecture, termed PeterRec.
PeterRec allows the pre-trained parameters to remain unaltered during fine-tuning by injecting a series of re-learned neural networks.
We perform extensive experimental ablation to show the effectiveness of the learned user representation in five downstream tasks.
arXiv Detail & Related papers (2020-01-13T14:09:54Z)
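The PeterRec entry keeps pre-trained parameters frozen and injects small re-learned networks during fine-tuning; the following is a minimal sketch of that general adapter pattern, with module sizes and placement chosen for illustration rather than taken from the paper.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Small bottleneck network added around a frozen pre-trained layer."""

    def __init__(self, dim: int, bottleneck: int = 16):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, h):
        return h + self.up(torch.relu(self.down(h)))   # residual form

class AdaptedBlock(nn.Module):
    def __init__(self, pretrained_layer: nn.Module, dim: int):
        super().__init__()
        self.pretrained = pretrained_layer
        for p in self.pretrained.parameters():
            p.requires_grad = False      # the pre-trained weights stay unaltered
        self.adapter = Adapter(dim)      # only these small modules are learned per task

    def forward(self, x):
        return self.adapter(self.pretrained(x))

block = AdaptedBlock(nn.Linear(256, 256), dim=256)
trainable = sum(p.numel() for p in block.parameters() if p.requires_grad)
print(trainable)   # adapter parameters only: 256*16 + 16 + 16*256 + 256 = 8464
```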