Related papers: Combinatorial optimization for low bit-width neural networks

Combinatorial optimization for low bit-width neural networks

URL: http://arxiv.org/abs/2206.02006v1
Date: Sat, 4 Jun 2022 15:02:36 GMT
Title: Combinatorial optimization for low bit-width neural networks
Authors: Han Zhou, Aida Ashrafi and Matthew B. Blaschko
Abstract summary: Low-bit width neural networks have been extensively explored for deployment on edge devices to reduce computational resources. Existing approaches have focused on gradient-based optimization in a two-stage train-and-compress setting. We show that a combination of greedy coordinate descent and this novel approach can attain competitive accuracy on binary classification tasks.
Score: 23.466606660363016
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Low-bit width neural networks have been extensively explored for deployment on edge devices to reduce computational resources. Existing approaches have focused on gradient-based optimization in a two-stage train-and-compress setting or as a combined optimization where gradients are quantized during training. Such schemes require high-performance hardware during the training phase and usually store an equivalent number of full-precision weights apart from the quantized weights. In this paper, we explore methods of direct combinatorial optimization in the problem of risk minimization with binary weights, which can be made equivalent to a non-monotone submodular maximization under certain conditions. We employ an approximation algorithm for the cases with single and multilayer neural networks. For linear models, it has $\mathcal{O}(nd)$ time complexity where $n$ is the sample size and $d$ is the data dimension. We show that a combination of greedy coordinate descent and this novel approach can attain competitive accuracy on binary classification tasks.

Related papers

A foundation for exact binarized morphological neural networks [2.8925699537310137]
Training and running deep neural networks (NNs) often demands a lot of computation and energy-intensive specialized hardware. One way to reduce the computation and power cost is to use binary weight NNs, but these are hard to train because the sign function has a non-smooth gradient. We present a model based on Mathematical Morphology (MM), which can binarize ConvNets without losing performance under certain conditions.
arXiv Detail & Related papers (2024-01-08T11:37:44Z)
Kronecker-Factored Approximate Curvature for Modern Neural Network Architectures [85.76673783330334]
Two different settings of linear weight-sharing layers motivate two flavours of Kronecker-Factored Approximate Curvature (K-FAC) We show they are exact for deep linear networks with weight-sharing in their respective setting. We observe little difference between these two K-FAC variations when using them to train both a graph neural network and a vision transformer.
arXiv Detail & Related papers (2023-11-01T16:37:00Z)
On Model Compression for Neural Networks: Framework, Algorithm, and Convergence Guarantee [21.818773423324235]
This paper focuses on two model compression techniques: low-rank approximation and weight approximation. In this paper, a holistic framework is proposed for model compression from a novel perspective of non optimization.
arXiv Detail & Related papers (2023-03-13T02:14:42Z)
Decentralized Gossip-Based Stochastic Bilevel Optimization over Communication Networks [42.76623191830371]
We propose a gossip-based distributed bilevel optimization algorithm. Agents can solve both networked and outer problems in a single time. Our algorithm achieves the state-of-the-art efficiency and test accuracy.
arXiv Detail & Related papers (2022-06-22T06:38:54Z)
Algorithms for Efficiently Learning Low-Rank Neural Networks [12.916132936159713]
We study algorithms for learning low-rank neural networks. We present a provably efficient algorithm which learns an optimal low-rank approximation to a single-hidden-layer ReLU network. We propose a novel low-rank framework for training low-rank $textitdeep$ networks.
arXiv Detail & Related papers (2022-02-02T01:08:29Z)
Instant Neural Graphics Primitives with a Multiresolution Hash Encoding [67.33850633281803]
We present a versatile new input encoding that permits the use of a smaller network without sacrificing quality. A small neural network is augmented by a multiresolution hash table of trainable feature vectors whose values are optimized through a gradient descent. We achieve a combined speed of several orders of magnitude, enabling training of high-quality neural graphics primitives in a matter of seconds.
arXiv Detail & Related papers (2022-01-16T07:22:47Z)
Joint inference and input optimization in equilibrium networks [68.63726855991052]
deep equilibrium model is a class of models that foregoes traditional network depth and instead computes the output of a network by finding the fixed point of a single nonlinear layer. We show that there is a natural synergy between these two settings. We demonstrate this strategy on various tasks such as training generative models while optimizing over latent codes, training models for inverse problems like denoising and inpainting, adversarial training and gradient based meta-learning.
arXiv Detail & Related papers (2021-11-25T19:59:33Z)
Learning Neural Network Subspaces [74.44457651546728]
Recent observations have advanced our understanding of the neural network optimization landscape. With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks. With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks.
arXiv Detail & Related papers (2021-02-20T23:26:58Z)
FracBits: Mixed Precision Quantization via Fractional Bit-Widths [29.72454879490227]
Mixed precision quantization is favorable with customized hardwares supporting arithmetic operations at multiple bit-widths. We propose a novel learning-based algorithm to derive mixed precision models end-to-end under target computation constraints.
arXiv Detail & Related papers (2020-07-04T06:09:09Z)
Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study a distributed variable for large-scale AUC for a neural network as with a deep neural network. Our model requires a much less number of communication rounds and still a number of communication rounds in theory. Our experiments on several datasets show the effectiveness of our theory and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
Optimal Gradient Quantization Condition for Communication-Efficient Distributed Training [99.42912552638168]
Communication of gradients is costly for training deep neural networks with multiple devices in computer vision applications. In this work, we deduce the optimal condition of both the binary and multi-level gradient quantization for textbfANY gradient distribution. Based on the optimal condition, we develop two novel quantization schemes: biased BinGrad and unbiased ORQ for binary and multi-level gradient quantization respectively.
arXiv Detail & Related papers (2020-02-25T18:28:39Z)

This list is automatically generated from the titles and abstracts of the papers in this site.