Any-Width Networks
- URL: http://arxiv.org/abs/2012.03153v1
- Date: Sun, 6 Dec 2020 00:22:01 GMT
- Title: Any-Width Networks
- Authors: Thanh Vu, Marc Eder, True Price, Jan-Michael Frahm
- Abstract summary: We propose an adjustable-width CNN architecture that allows for fine-grained control over speed and accuracy during inference.
Our key innovation is the use of lower-triangular weight matrices which explicitly address width-varying batch statistics.
We empirically demonstrate that our proposed AWNs compare favorably to existing methods while providing maximally granular control during inference.
- Score: 43.98007529334065
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite remarkable improvements in speed and accuracy, convolutional neural networks (CNNs) still typically operate as monolithic entities at inference time. This poses a challenge for resource-constrained practical applications, where both computational budgets and performance needs can vary with the situation. To address these constraints, we propose the Any-Width Network (AWN), an adjustable-width CNN architecture and associated training routine that allow for fine-grained control over speed and accuracy during inference. Our key innovation is the use of lower-triangular weight matrices which explicitly address width-varying batch statistics while being naturally suited for multi-width operations. We also show that this design facilitates an efficient training routine based on random width sampling. We empirically demonstrate that our proposed AWNs compare favorably to existing methods while providing maximally granular control during inference.
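To make the idea concrete, below is a minimal, hypothetical PyTorch sketch of a fully connected layer with a fixed lower-triangular weight mask, trained with random width sampling. The class name, layer sizes, sampling range, and loss are illustrative assumptions, not the authors' implementation (which targets CNNs and also handles width-varying batch statistics).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TriangularLinear(nn.Module):
    """Fully connected layer with a fixed lower-triangular weight mask.

    Output unit i only receives input units 0..i, so truncating the layer
    to any prefix width yields a self-consistent sub-network -- the intuition
    behind the lower-triangular matrices described in the abstract.
    """
    def __init__(self, features: int):
        super().__init__()
        self.weight = nn.Parameter(0.01 * torch.randn(features, features))
        self.bias = nn.Parameter(torch.zeros(features))
        # Fixed mask, not a learned parameter.
        self.register_buffer("mask", torch.tril(torch.ones(features, features)))

    def forward(self, x: torch.Tensor, width: int) -> torch.Tensor:
        w = (self.weight * self.mask)[:width, :width]  # prefix slice of the masked weight
        return F.linear(x[:, :width], w, self.bias[:width])

# Training with random width sampling (placeholder data and objective).
layer = TriangularLinear(64)
opt = torch.optim.SGD(layer.parameters(), lr=0.1)
for step in range(100):
    x = torch.randn(32, 64)
    width = int(torch.randint(low=8, high=65, size=(1,)))  # sample a width each step
    out = layer(x, width)
    loss = out.pow(2).mean()  # stand-in loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```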
Related papers
- RTF-Q: Efficient Unsupervised Domain Adaptation with Retraining-free Quantization [14.447148108341688]
We propose efficient unsupervised domain adaptation with ReTraining-Free Quantization (RTF-Q)
Our approach uses low-precision quantization architectures with varying computational costs, adapting to devices with dynamic budgets.
We demonstrate that our network achieves competitive accuracy with state-of-the-art methods across three benchmarks.
arXiv Detail & Related papers (2024-08-11T11:53:29Z)
- AdaQAT: Adaptive Bit-Width Quantization-Aware Training [0.873811641236639]
Large-scale deep neural networks (DNNs) have achieved remarkable success in many application scenarios.
Model quantization is a common approach to deal with deployment constraints, but searching for optimized bit-widths can be challenging.
We present Adaptive Bit-Width Quantization-Aware Training (AdaQAT), a learning-based method that automatically optimizes bit-widths during training for more efficient inference.
arXiv Detail & Related papers (2024-04-22T09:23:56Z)
- Optimization Guarantees of Unfolded ISTA and ADMM Networks With Smooth Soft-Thresholding [57.71603937699949]
We study optimization guarantees, i.e., achieving near-zero training loss as the number of training epochs increases.
We show that the threshold on the number of training samples increases with the network width.
arXiv Detail & Related papers (2023-09-12T13:03:47Z)
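For reference, the soft-thresholding step that unfolded ISTA networks are built around can be sketched as follows. This is the textbook non-smooth operator with illustrative dimensions, not the smoothed variant analyzed in the paper.

```python
import numpy as np

def soft_threshold(x, lam):
    """Textbook soft-thresholding: sign(x) * max(|x| - lam, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def ista_step(x, A, y, lam, step):
    """One (unfolded) ISTA iteration for min_x 0.5*||Ax - y||^2 + lam*||x||_1."""
    grad = A.T @ (A @ x - y)
    return soft_threshold(x - step * grad, step * lam)

# Illustrative problem sizes.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
y = rng.standard_normal(20)
x = np.zeros(50)
step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1/L, with L the squared spectral norm of A
for _ in range(10):
    x = ista_step(x, A, y, lam=0.1, step=step)
```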
- Feature-Learning Networks Are Consistent Across Widths At Realistic Scales [72.27228085606147]
We study the effect of width on the dynamics of feature-learning neural networks across a variety of architectures and datasets.
Early in training, wide neural networks trained on online data not only have identical loss curves but also agree in their point-wise test predictions throughout training.
We observe, however, that ensembles of narrower networks perform worse than a single wide network.
arXiv Detail & Related papers (2023-05-28T17:09:32Z)
- Training Integer-Only Deep Recurrent Neural Networks [3.1829446824051195]
We present a quantization-aware training method for obtaining a highly accurate integer-only recurrent neural network (iRNN)
Our approach supports layer normalization, attention, and an adaptive piecewise linear (PWL) approximation of activation functions.
The proposed method enables RNN-based language models to run on edge devices with a $2\times$ improvement in runtime.
arXiv Detail & Related papers (2022-12-22T15:22:36Z)
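As a generic illustration of piecewise-linear activation approximation (fixed knots here; the paper uses an adaptive PWL scheme with integer-only arithmetic), one can tabulate the activation at a few breakpoints and interpolate linearly:

```python
import numpy as np

# Knots for a piecewise-linear (PWL) approximation of the sigmoid.
# Eight fixed segments are assumed here for illustration.
knots_x = np.linspace(-6.0, 6.0, 9)
knots_y = 1.0 / (1.0 + np.exp(-knots_x))

def pwl_sigmoid(x):
    """Linear interpolation between knots; saturates outside [-6, 6]."""
    return np.interp(x, knots_x, knots_y)

print(pwl_sigmoid(np.array([-8.0, 0.0, 2.5])))  # roughly [0.002, 0.5, 0.91]
```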
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Standard Deviation-Based Quantization for Deep Neural Networks [17.495852096822894]
Quantization of deep neural networks is a promising approach that reduces the inference cost.
We propose a new framework to learn the quantization intervals (discrete values) using the knowledge of the network's weight and activation distributions.
Our scheme simultaneously prunes the network's parameters and allows us to flexibly adjust the pruning ratio during the quantization process.
arXiv Detail & Related papers (2022-02-24T23:33:47Z)
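A generic way to tie a uniform quantizer's range to the weight distribution is sketched below; the clipping multiple `alpha` is a hand-picked assumption, whereas the paper learns the quantization intervals from the weight and activation statistics.

```python
import numpy as np

def quantize_by_std(w, bits=4, alpha=3.0):
    """Uniform symmetric quantization with clipping range alpha * std(w)."""
    qmax = 2 ** (bits - 1) - 1
    scale = alpha * w.std() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale  # de-quantized ("fake quantized") weights

w = np.random.default_rng(0).standard_normal(1000)
w_q = quantize_by_std(w, bits=4)
print(np.unique(w_q).size)  # at most 2**bits distinct values
```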
- LCS: Learning Compressible Subspaces for Adaptive Network Compression at Inference Time [57.52251547365967]
We propose a method for training a "compressible subspace" of neural networks that contains a fine-grained spectrum of models.
We present results for achieving arbitrarily fine-grained accuracy-efficiency trade-offs at inference time for structured and unstructured sparsity.
Our algorithm extends to quantization at variable bit widths, achieving accuracy on par with individually trained networks.
arXiv Detail & Related papers (2021-10-08T17:03:34Z)
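One common way to realize a subspace of models is to parameterize the weights as a convex combination of two learned endpoints and pick a point at inference time. The sketch below shows only that evaluation step, with assumed endpoints; the paper's training procedure and how compression varies along the subspace are omitted.

```python
import torch
import torch.nn as nn

def interpolate_state(theta_a, theta_b, alpha):
    """Weights on the line segment between two endpoint parameter sets."""
    return {k: (1 - alpha) * theta_a[k] + alpha * theta_b[k] for k in theta_a}

# Hypothetical endpoints: two copies of the same architecture's weights.
net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 10))
theta_a = {k: v.clone() for k, v in net.state_dict().items()}
theta_b = {k: v.clone() + 0.01 * torch.randn_like(v) for k, v in net.state_dict().items()}

# Pick a point in the subspace at inference time according to the budget.
net.load_state_dict(interpolate_state(theta_a, theta_b, alpha=0.3))
```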
- A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks [56.084798078072396]
We take a step towards closing the gap between theory and practice by significantly improving the known theoretical bounds on both the network width and the convergence time.
We show that convergence to a global minimum is guaranteed for networks whose width is quadratic in the sample size and linear in their depth, in time logarithmic in both.
Our analysis and convergence bounds are derived via the construction of a surrogate network with fixed activation patterns that can be transformed at any time to an equivalent ReLU network of a reasonable size.
arXiv Detail & Related papers (2021-01-12T00:40:45Z)
- SRDCNN: Strongly Regularized Deep Convolution Neural Network Architecture for Time-series Sensor Signal Classification Tasks [4.950427992960756]
We present SRDCNN, a Strongly Regularized Deep Convolutional Neural Network (DCNN) based architecture for time-series classification tasks.
The novelty of the proposed approach is that the network weights are regularized by both L1 and L2 norm penalties.
arXiv Detail & Related papers (2020-07-14T08:42:39Z)
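The combined L1/L2 weight penalty described above is straightforward to add to any training objective; a minimal, generic sketch (the coefficients are placeholders, not the paper's settings):

```python
import torch
import torch.nn as nn

def l1_l2_penalty(model: nn.Module, l1: float = 1e-5, l2: float = 1e-4) -> torch.Tensor:
    """Elastic-net style penalty summed over all weight tensors of a model."""
    reg = torch.zeros(())
    for name, param in model.named_parameters():
        if "weight" in name:
            reg = reg + l1 * param.abs().sum() + l2 * param.pow(2).sum()
    return reg

# Usage: total_loss = task_loss + l1_l2_penalty(model)
```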
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.