Related papers: The Hidden Power of Pure 16-bit Floating-Point Neural Networks

The Hidden Power of Pure 16-bit Floating-Point Neural Networks

URL: http://arxiv.org/abs/2301.12809v2
Date: Fri, 3 May 2024 02:56:49 GMT
Title: The Hidden Power of Pure 16-bit Floating-Point Neural Networks
Authors: Juyoung Yun, Byungkon Kang, Zhoulai Fu,
Abstract summary: Lowering the precision of neural networks from the prevalent 32-bit precision has long been considered harmful to performance. This paper investigates the unexpected performance gain of pure 16-bit neural networks over the 32-bit networks in classification tasks.
Score: 1.9594704501292781
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Lowering the precision of neural networks from the prevalent 32-bit precision has long been considered harmful to performance, despite the gain in space and time. Many works propose various techniques to implement half-precision neural networks, but none study pure 16-bit settings. This paper investigates the unexpected performance gain of pure 16-bit neural networks over the 32-bit networks in classification tasks. We present extensive experimental results that favorably compare various 16-bit neural networks' performance to those of the 32-bit models. In addition, a theoretical analysis of the efficiency of 16-bit models is provided, which is coupled with empirical evidence to back it up. Finally, we discuss situations in which low-precision training is indeed detrimental.

Related papers

Standalone 16-bit Training: Missing Study for Hardware-Limited Deep Learning Practitioners [2.075190620803526]
Mixed precision techniques leverage different numerical precisions during model training and inference to optimize resource usage. For many with limited resources, the available options are restricted to using 32-bit, 16-bit, or a combination of the two. This study fills a critical gap, proving for the first time that standalone 16-bit precision neural networks match 32-bit and mixed-precision in accuracy.
arXiv Detail & Related papers (2023-05-18T13:09:45Z)
RedBit: An End-to-End Flexible Framework for Evaluating the Accuracy of Quantized CNNs [9.807687918954763]
Convolutional Neural Networks (CNNs) have become the standard class of deep neural network for image processing, classification and segmentation tasks. RedBit is an open-source framework that provides a transparent, easy-to-use interface to evaluate the effectiveness of different algorithms on network accuracy.
arXiv Detail & Related papers (2023-01-15T21:27:35Z)
Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by analyzing the geometric structure of the objective function. We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z)
PositNN: Training Deep Neural Networks with Mixed Low-Precision Posit [5.534626267734822]
The presented research aims to evaluate the feasibility to train deep convolutional neural networks using posits. A software framework was developed to use simulated posits and quires in end-to-end training and inference. Results suggest that 8-bit posits can substitute 32-bit floats during training with no negative impact on the resulting loss and accuracy.
arXiv Detail & Related papers (2021-04-30T19:30:37Z)
Scalable Verification of Quantized Neural Networks (Technical Report) [14.04927063847749]
We show that bit-exact implementation of quantized neural networks with bit-vector specifications is PSPACE-hard. We propose three techniques for making SMT-based verification of quantized neural networks more scalable.
arXiv Detail & Related papers (2020-12-15T10:05:37Z)
Model Rubik's Cube: Twisting Resolution, Depth and Width for TinyNets [65.28292822614418]
Giant formula for simultaneously enlarging the resolution, depth and width provides us a Rubik's cube for neural networks. This paper aims to explore the twisting rules for obtaining deep neural networks with minimum model sizes and computational costs.
arXiv Detail & Related papers (2020-10-28T08:49:45Z)
Searching for Low-Bit Weights in Quantized Neural Networks [129.8319019563356]
Quantized neural networks with low-bit weights and activations are attractive for developing AI accelerators. We present to regard the discrete weights in an arbitrary quantized neural network as searchable variables, and utilize a differential method to search them accurately.
arXiv Detail & Related papers (2020-09-18T09:13:26Z)
WrapNet: Neural Net Inference with Ultra-Low-Resolution Arithmetic [57.07483440807549]
We propose a method that adapts neural networks to use low-resolution (8-bit) additions in the accumulators, achieving classification accuracy comparable to their 32-bit counterparts. We demonstrate the efficacy of our approach on both software and hardware platforms.
arXiv Detail & Related papers (2020-07-26T23:18:38Z)
Efficient Integer-Arithmetic-Only Convolutional Neural Networks [87.01739569518513]
We replace conventional ReLU with Bounded ReLU and find that the decline is due to activation quantization. Our integer networks achieve equivalent performance as the corresponding FPN networks, but have only 1/4 memory cost and run 2x faster on modern GPU.
arXiv Detail & Related papers (2020-06-21T08:23:03Z)
Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantization neural networks (QNNs) are very attractive to the industry because their extremely cheap calculation and storage overhead, but their performance is still worse than that of networks with full-precision parameters. Most of existing methods aim to enhance performance of QNNs especially binary neural networks by exploiting more effective training techniques. We address this problem by projecting features in original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)

This list is automatically generated from the titles and abstracts of the papers in this site.