Neural Networks at a Fraction with Pruned Quaternions
- URL: http://arxiv.org/abs/2308.06780v1
- Date: Sun, 13 Aug 2023 14:25:54 GMT
- Title: Neural Networks at a Fraction with Pruned Quaternions
- Authors: Sahel Mohammad Iqbal and Subhankar Mishra
- Abstract summary: Pruning is one technique to remove unnecessary weights and reduce resource requirements for training and inference.
For ML tasks where the input data is multi-dimensional, using higher-dimensional data embeddings such as complex numbers or quaternions has been shown to reduce the parameter count while maintaining accuracy.
We find that for some architectures, at very high sparsity levels, quaternion models provide higher accuracies than their real counterparts.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Contemporary state-of-the-art neural networks have increasingly large numbers
of parameters, which prevents their deployment on devices with limited
computational power. Pruning is one technique to remove unnecessary weights and
reduce resource requirements for training and inference. In addition, for ML
tasks where the input data is multi-dimensional, using higher-dimensional data
embeddings such as complex numbers or quaternions has been shown to reduce the
parameter count while maintaining accuracy. In this work, we conduct pruning on
real and quaternion-valued implementations of different architectures on
classification tasks. We find that for some architectures, at very high
sparsity levels, quaternion models provide higher accuracies than their real
counterparts. For example, at the task of image classification on CIFAR-10
using Conv-4, with $3\%$ of the parameters of the original model, the pruned
quaternion version outperforms the pruned real model by more than $10\%$.
Experiments on various network architectures and datasets show that for
deployment in extremely resource-constrained environments, a sparse quaternion
network might be a better candidate than a real sparse model of similar
architecture.
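To make the abstract's two ingredients concrete, here is a minimal PyTorch sketch (not the authors' code) of a quaternion fully connected layer built from the Hamilton product, which uses a quarter of the real parameters of an equally wide real layer, combined with simple global magnitude pruning to a target sparsity. Layer sizes, initialization, and the pruning criterion are illustrative assumptions.

```python
import torch
import torch.nn as nn

class QuaternionLinear(nn.Module):
    """Maps n_in input quaternions to n_out output quaternions with 4*n_out*n_in real
    weights, versus 16*n_out*n_in for a real nn.Linear of the same width (4*n_in -> 4*n_out)."""

    def __init__(self, n_in: int, n_out: int):
        super().__init__()
        # One real matrix per quaternion component, shared across the Hamilton product.
        self.w_r = nn.Parameter(0.02 * torch.randn(n_out, n_in))
        self.w_i = nn.Parameter(0.02 * torch.randn(n_out, n_in))
        self.w_j = nn.Parameter(0.02 * torch.randn(n_out, n_in))
        self.w_k = nn.Parameter(0.02 * torch.randn(n_out, n_in))

    def forward(self, x):
        # x: (batch, 4*n_in), laid out as [real | i | j | k] components.
        xr, xi, xj, xk = x.chunk(4, dim=-1)
        r = xr @ self.w_r.T - xi @ self.w_i.T - xj @ self.w_j.T - xk @ self.w_k.T
        i = xi @ self.w_r.T + xr @ self.w_i.T + xk @ self.w_j.T - xj @ self.w_k.T
        j = xj @ self.w_r.T - xk @ self.w_i.T + xr @ self.w_j.T + xi @ self.w_k.T
        k = xk @ self.w_r.T + xj @ self.w_i.T - xi @ self.w_j.T + xr @ self.w_k.T
        return torch.cat([r, i, j, k], dim=-1)

def magnitude_prune_(model: nn.Module, sparsity: float):
    """Zero the smallest-magnitude weights globally until `sparsity` of them are removed."""
    magnitudes = torch.cat([p.detach().abs().flatten() for p in model.parameters()])
    threshold = torch.quantile(magnitudes, sparsity)
    with torch.no_grad():
        for p in model.parameters():
            p.mul_((p.abs() > threshold).float())

layer = QuaternionLinear(n_in=64, n_out=32)   # 4 * 32 * 64 = 8192 real weights
magnitude_prune_(layer, sparsity=0.97)        # keep roughly 3% of the weights
out = layer(torch.randn(8, 4 * 64))           # -> shape (8, 128)
```

At sparsity 0.97, roughly 246 of the 8192 weights survive, whereas a real nn.Linear(256, 128) of the same width would start from 32,768 weights; this is the kind of extreme-sparsity regime the paper compares.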
Related papers
- Just How Flexible are Neural Networks in Practice? [89.80474583606242]
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters.
In practice, however, we only find solutions reachable through our training procedure, including its gradient-based optimization and regularizers, which limits flexibility.
arXiv Detail & Related papers (2024-06-17T12:24:45Z)
- Low-Resource Crop Classification from Multi-Spectral Time Series Using Lossless Compressors [6.379065975644869]
Deep learning has significantly improved the accuracy of crop classification using multispectral temporal data.
In low-resource situations with fewer labeled samples, deep learning models perform poorly due to insufficient data.
We propose a non-training alternative to deep learning models, aiming to address these situations.
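The summary above does not spell out the pipeline, so the following is only a generic illustration of a training-free, compressor-based classifier: serialized multispectral time series are compared with the normalized compression distance (zlib here) and labeled by nearest neighbor. The serialization, compressor, and 1-NN rule are assumptions, not necessarily the paper's choices.

```python
import zlib
import numpy as np

def _clen(b: bytes) -> int:
    return len(zlib.compress(b, 9))

def ncd(a: bytes, b: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    ca, cb, cab = _clen(a), _clen(b), _clen(a + b)
    return (cab - min(ca, cb)) / max(ca, cb)

def serialize(series: np.ndarray) -> bytes:
    """Quantize a (time, bands) array to 8 bits so the compressor sees repeated symbols."""
    lo, hi = float(series.min()), float(series.max())
    q = np.round(255.0 * (series - lo) / (hi - lo + 1e-9)).astype(np.uint8)
    return q.tobytes()

def classify(query: np.ndarray, train_series: list, train_labels: list):
    """1-nearest-neighbour under NCD; nothing is trained, only compressed and compared."""
    qb = serialize(query)
    distances = [ncd(qb, serialize(s)) for s in train_series]
    return train_labels[int(np.argmin(distances))]

# Toy usage with random arrays standing in for multispectral time series:
train = [np.random.rand(24, 10) for _ in range(6)]
labels = [0, 0, 0, 1, 1, 1]
print(classify(np.random.rand(24, 10), train, labels))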
arXiv Detail & Related papers (2024-05-28T12:28:12Z)
- Dataset Quantization [72.61936019738076]
We present dataset quantization (DQ), a new framework to compress large-scale datasets into small subsets.
DQ is the first method that can successfully distill large-scale datasets such as ImageNet-1k with a state-of-the-art compression ratio.
arXiv Detail & Related papers (2023-08-21T07:24:29Z)
- Variable Bitrate Neural Fields [75.24672452527795]
We present a dictionary method for compressing feature grids, reducing their memory consumption by up to 100x.
We formulate the dictionary optimization as a vector-quantized auto-decoder problem which lets us learn end-to-end discrete neural representations in a space where no direct supervision is available.
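As a rough illustration of the feature-grid compression described above (codebook size, feature width, and grid resolution are made-up values, and the learning of the codebook is omitted), storing one small integer index per grid cell in place of a full feature vector is what yields the large memory savings:

```python
import torch

codebook = torch.randn(256, 16)        # 256 learned code vectors, 16 features each
# One byte of codebook index per grid cell instead of a 16-float feature vector.
grid_indices = torch.randint(0, 256, (64, 64, 64), dtype=torch.uint8)

def features_at(cells: torch.Tensor) -> torch.Tensor:
    """Decode feature vectors for integer cell coordinates of shape (N, 3)."""
    idx = grid_indices[cells[:, 0], cells[:, 1], cells[:, 2]].long()
    return codebook[idx]               # (N, 16), looked up on the fly

cells = torch.randint(0, 64, (1024, 3))
print(features_at(cells).shape)        # torch.Size([1024, 16])

# Dense grid: 64**3 cells * 16 floats * 4 bytes ~= 16.8 MB.
# Index grid + codebook: 64**3 bytes + 256 * 16 * 4 bytes ~= 0.28 MB (about 60x smaller).
```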
arXiv Detail & Related papers (2022-06-15T17:58:34Z)
- Compact Multi-level Sparse Neural Networks with Input Independent Dynamic Rerouting [33.35713740886292]
Sparse deep neural networks can substantially reduce the complexity and memory consumption of the models.
To meet these real-life challenges, we propose to train a sparse model that supports multiple sparsity levels.
In this way, one can dynamically select the appropriate sparsity level during inference, while the storage cost is capped by the least sparse sub-model.
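A minimal sketch of the nested-mask idea this entry describes, assuming magnitude-based masks (the authors' training recipe is not given here): masks for higher sparsity levels are subsets of the lower-sparsity mask, so storing the least sparse sub-model suffices and the level can be switched at inference time.

```python
import torch

def nested_masks(weight: torch.Tensor, sparsities=(0.5, 0.8, 0.95)):
    """Return boolean masks from least to most sparse; each later mask is a subset of the
    previous one because all of them keep the largest-magnitude weights."""
    magnitudes = weight.abs().flatten()
    masks = []
    for s in sorted(sparsities):
        kept = max(1, int((1.0 - s) * magnitudes.numel()))
        threshold = torch.topk(magnitudes, kept).values.min()
        masks.append(weight.abs() >= threshold)
    return masks

w = torch.randn(256, 256)
m50, m80, m95 = nested_masks(w)
assert not bool((m95 & ~m80).any())      # the 95%-sparse mask lies inside the 80% one

level = m80                              # pick a sparsity level dynamically at inference
y = torch.randn(1, 256) @ (w * level).T  # only the unmasked weights contribute
```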
arXiv Detail & Related papers (2021-12-21T01:35:51Z)
- Efficient deep learning models for land cover image classification [0.29748898344267777]
This work experiments with the BigEarthNet dataset for land use land cover (LULC) image classification.
We benchmark several state-of-the-art models, including Convolutional Neural Networks, Multi-Layer Perceptrons, Visual Transformers, EfficientNets and Wide Residual Networks (WRN).
Our proposed lightweight model has an order of magnitude fewer trainable parameters, achieves a 4.5% higher averaged F-score across all 19 LULC classes, and trains two times faster than the ResNet50 state-of-the-art model we use as a baseline.
arXiv Detail & Related papers (2021-11-18T00:03:14Z)
- Compact representations of convolutional neural networks via weight pruning and quantization [63.417651529192014]
We propose a novel storage format for convolutional neural networks (CNNs) based on source coding and leveraging both weight pruning and quantization.
We achieve a reduction of space occupancy of up to 0.6% on fully connected layers and 5.44% on the whole network, while performing at least as competitively as the baseline.
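The exact source-coding format is not described in this summary; the sketch below only illustrates the general recipe of combining pruning and quantization before compact storage, using magnitude pruning, uniform 8-bit quantization, and a simple coordinate-list layout as stand-ins.

```python
import numpy as np

w = np.random.randn(512, 512).astype(np.float32)

# 1) Magnitude pruning: drop the smallest 90% of the weights.
threshold = np.quantile(np.abs(w), 0.90)
w[np.abs(w) < threshold] = 0.0

# 2) Uniform 8-bit quantization of the surviving weights.
scale = np.abs(w).max() / 127.0
q = np.round(w / scale).astype(np.int8)

# 3) Compact storage: 1-byte values plus coordinates instead of a dense float32 matrix.
rows, cols = np.nonzero(q)
values = q[rows, cols]
packed = values.nbytes + rows.astype(np.uint16).nbytes + cols.astype(np.uint16).nbytes
print(f"dense: {w.nbytes} B, pruned + quantized: {packed} B")

# Approximate reconstruction for inference:
w_hat = np.zeros_like(w)
w_hat[rows, cols] = values.astype(np.float32) * scale
```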
arXiv Detail & Related papers (2021-08-28T20:39:54Z)
- Depthwise Multiception Convolution for Reducing Network Parameters without Sacrificing Accuracy [2.0088802641040604]
Multiception convolution introduces layer-wise multiscale kernels to learn representations of all individual input channels simultaneously.
It significantly reduces the number of parameters of standard convolution-based models by 32.48% on average while still preserving accuracy.
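A hedged sketch of the multiception idea as summarized above: several depthwise convolutions with different kernel sizes act on every input channel, followed by a pointwise convolution to mix channels. The kernel sizes and the summation of branches are illustrative choices, not necessarily the paper's.

```python
import torch
import torch.nn as nn

class DepthwiseMultiception(nn.Module):
    """Depthwise convolutions at several kernel sizes per channel, then a 1x1 pointwise
    convolution; far fewer parameters than standard multiscale convolutions."""

    def __init__(self, channels: int, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, k, padding=k // 2, groups=channels, bias=False)
            for k in kernel_sizes
        )
        self.pointwise = nn.Conv2d(channels, channels, kernel_size=1, bias=False)

    def forward(self, x):
        return self.pointwise(sum(branch(x) for branch in self.branches))

x = torch.randn(1, 32, 56, 56)
print(DepthwiseMultiception(32)(x).shape)   # torch.Size([1, 32, 56, 56])
```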
arXiv Detail & Related papers (2020-11-07T05:33:54Z)
- Generative Multi-Stream Architecture For American Sign Language Recognition [15.717424753251674]
Training on datasets with low feature-richness for complex applications limits optimal convergence to below human performance.
We propose a generative multi-stream architecture that eliminates the need for additional hardware, with the intent of improving feature convergence without risking impracticability.
Our methods have achieved 95.62% validation accuracy with a variance of 1.42% from training, outperforming past models by 0.45% in validation accuracy and 5.53% in variance.
arXiv Detail & Related papers (2020-03-09T21:04:51Z)
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantization neural networks (QNNs) are very attractive to the industry because of their extremely cheap computation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)
- Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation [111.44445634272235]
In this paper, we develop a parameter-efficient transfer learning architecture, termed PeterRec.
PeterRec allows the pre-trained parameters to remain unaltered during fine-tuning by injecting a series of re-learned neural networks.
We perform extensive experimental ablation to show the effectiveness of the learned user representation in five downstream tasks.
arXiv Detail & Related papers (2020-01-13T14:09:54Z)
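The PeterRec entry keeps pre-trained parameters frozen and injects small re-learned networks during fine-tuning; the following is a minimal sketch of that general adapter pattern, with module sizes and placement chosen for illustration rather than taken from the paper.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Small bottleneck network added around a frozen pre-trained layer."""

    def __init__(self, dim: int, bottleneck: int = 16):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, h):
        return h + self.up(torch.relu(self.down(h)))   # residual form

class AdaptedBlock(nn.Module):
    def __init__(self, pretrained_layer: nn.Module, dim: int):
        super().__init__()
        self.pretrained = pretrained_layer
        for p in self.pretrained.parameters():
            p.requires_grad = False      # the pre-trained weights stay unaltered
        self.adapter = Adapter(dim)      # only these small modules are learned per task

    def forward(self, x):
        return self.adapter(self.pretrained(x))

block = AdaptedBlock(nn.Linear(256, 256), dim=256)
trainable = sum(p.numel() for p in block.parameters() if p.requires_grad)
print(trainable)   # adapter parameters only: 256*16 + 16 + 16*256 + 256 = 8464
```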