BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to
Real-Network Performance
- URL: http://arxiv.org/abs/2211.06987v1
- Date: Sun, 13 Nov 2022 18:31:45 GMT
- Title: BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to
Real-Network Performance
- Authors: Haotong Qin, Xudong Ma, Yifu Ding, Xiaoyang Li, Yang Zhang, Zejun Ma,
Jiakai Wang, Jie Luo, Xianglong Liu
- Abstract summary: Deep neural networks, such as the Deep-FSMN, have been widely studied for keyword spotting (KWS) applications.
We present a strong yet efficient binary neural network for KWS, namely BiFSMNv2, pushing it toward real-network accuracy.
We highlight that benefiting from the compact architecture and optimized hardware kernel, BiFSMNv2 can achieve an impressive 25.1x speedup and 20.2x storage-saving on edge hardware.
- Score: 54.214426436283134
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks, such as the Deep-FSMN, have been widely studied for
keyword spotting (KWS) applications while suffering from expensive computation and
storage. Therefore, network compression technologies like binarization are
studied to deploy KWS models on edge devices. In this paper, we present a strong yet
efficient binary neural network for KWS, namely BiFSMNv2, pushing it toward
real-network accuracy. First, we present a Dual-scale Thinnable
1-bit-Architecture to recover the representation capability of the binarized
computation units by dual-scale activation binarization and liberate the
speedup potential from an overall architecture perspective. Second, we
construct a Frequency Independent Distillation scheme for KWS
binarization-aware training, which distills the high and low-frequency
components independently to mitigate the information mismatch between
full-precision and binarized representations. Moreover, we implement BiFSMNv2
on ARMv8 real-world hardware with a novel Fast Bitwise Computation Kernel,
which is proposed to fully utilize registers and increase instruction
throughput. Comprehensive experiments show our BiFSMNv2 outperforms existing
binary networks for KWS by convincing margins across different datasets and
even achieves accuracy comparable to the full-precision networks (e.g., only a
1.59% drop on Speech Commands V1-12). We highlight that, benefiting from the
compact architecture and optimized hardware kernel, BiFSMNv2 can achieve an
impressive 25.1x speedup and 20.2x storage-saving on edge hardware.
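
To make the Dual-scale Thinnable 1-bit-Architecture more concrete, here is a minimal, hedged sketch of dual-scale activation binarization. The function name `binarize_dual_scale`, the choice of a tensor-level plus a per-row scale, and the equal treatment of the two reconstructions are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def binarize_dual_scale(x, eps=1e-8):
    """Sketch of dual-scale activation binarization (assumed formulation).

    Activations are binarized with sign(); two scaling factors are kept so a
    thinnable path can choose either a cheap coarse reconstruction or a
    higher-fidelity fine one.
    """
    sign = np.where(x >= 0.0, 1.0, -1.0)                        # 1-bit codes
    scale_coarse = np.abs(x).mean() + eps                       # tensor-level scale
    scale_fine = np.abs(x).mean(axis=-1, keepdims=True) + eps   # per-row scale
    return sign * scale_coarse, sign * scale_fine

x = np.random.randn(4, 8).astype(np.float32)
coarse, fine = binarize_dual_scale(x)
print("coarse error:", np.abs(x - coarse).mean(), "fine error:", np.abs(x - fine).mean())
```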
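The Frequency Independent Distillation scheme distills high- and low-frequency components separately. The sketch below is an assumed illustration of that idea: it splits student and teacher features with an rFFT, using a hypothetical `cutoff` fraction to decide which bins count as low frequency, and sums two independent MSE terms; the paper's actual decomposition and loss weighting may differ.

```python
import numpy as np

def split_frequency(feat, cutoff=0.25):
    """Split features along the last axis into low-/high-frequency parts.

    The rFFT-based split and the `cutoff` fraction are assumptions made for
    illustration only.
    """
    spec = np.fft.rfft(feat, axis=-1)
    k = max(1, int(cutoff * spec.shape[-1]))
    low_spec, high_spec = spec.copy(), spec.copy()
    low_spec[..., k:] = 0.0      # keep only low-frequency bins
    high_spec[..., :k] = 0.0     # keep only high-frequency bins
    low = np.fft.irfft(low_spec, n=feat.shape[-1], axis=-1)
    high = np.fft.irfft(high_spec, n=feat.shape[-1], axis=-1)
    return low, high

def frequency_independent_distill_loss(student_feat, teacher_feat):
    """Distill low- and high-frequency components with independent MSE terms."""
    s_low, s_high = split_frequency(student_feat)
    t_low, t_high = split_frequency(teacher_feat)
    return np.mean((s_low - t_low) ** 2) + np.mean((s_high - t_high) ** 2)

student = np.random.randn(2, 16)
teacher = np.random.randn(2, 16)
print(frequency_independent_distill_loss(student, teacher))
```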
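The Fast Bitwise Computation Kernel itself is an ARMv8-specific routine that packs registers and raises instruction throughput, but the arithmetic it accelerates is the standard XNOR-plus-popcount dot product of binarized vectors. The pure-Python sketch below only illustrates that arithmetic, with hypothetical helpers `pack_bits` and `binary_dot`; it does not reproduce the actual NEON kernel.

```python
def pack_bits(signs):
    """Pack a sequence of +1/-1 values into an integer bitmask (+1 -> 1, -1 -> 0)."""
    word = 0
    for i, s in enumerate(signs):
        if s > 0:
            word |= 1 << i
    return word

def binary_dot(a_word, b_word, n):
    """Dot product of two +/-1 vectors via XNOR + popcount on their bitmasks."""
    xnor = ~(a_word ^ b_word) & ((1 << n) - 1)   # bit set where the signs match
    matches = bin(xnor).count("1")               # popcount
    return 2 * matches - n                       # (#matches) - (#mismatches)

a = [+1, -1, +1, +1, -1, -1, +1, -1]
b = [+1, +1, -1, +1, -1, +1, +1, -1]
assert binary_dot(pack_bits(a), pack_bits(b), len(a)) == sum(x * y for x, y in zip(a, b))
```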
Related papers
- Tiled Bit Networks: Sub-Bit Neural Network Compression Through Reuse of Learnable Binary Vectors [4.95475852994362]
We propose a new form of quantization to tile neural network layers with sequences of bits to achieve sub-bit compression of binary-weighted neural networks.
We employ the approach to both fully-connected and convolutional layers, which make up the breadth of space in most neural architectures.
arXiv Detail & Related papers (2024-07-16T15:55:38Z)
- Input Layer Binarization with Bit-Plane Encoding [4.872439392746007]
We present a new method to binarize the first layer using directly the 8-bit representation of input data.
The resulting model is fully binarized and our first layer binarization approach is model independent.
arXiv Detail & Related papers (2023-05-04T14:49:07Z)
- BiBench: Benchmarking and Analyzing Network Binarization [72.59760752906757]
Network binarization emerges as one of the most promising compression approaches offering extraordinary computation and memory savings.
Common challenges of binarization, such as accuracy degradation and efficiency limitation, suggest that its attributes are not fully understood.
We present BiBench, a rigorously designed benchmark with in-depth analysis for network binarization.
arXiv Detail & Related papers (2023-01-26T17:17:16Z)
- BiFSMN: Binary Neural Network for Keyword Spotting [47.46397208920726]
BiFSMN is an accurate and extremely efficient binary neural network for KWS.
We show that BiFSMN can achieve an impressive 22.3x speedup and 15.5x storage-saving on real-world edge hardware.
arXiv Detail & Related papers (2022-02-14T05:16:53Z)
- High-Capacity Expert Binary Networks [56.87581500474093]
Network binarization is a promising hardware-aware direction for creating efficient deep models.
Despite its memory and computational advantages, reducing the accuracy gap between binary models and their real-valued counterparts remains an unsolved challenging research problem.
We propose Expert Binary Convolution, which, for the first time, tailors conditional computing to binary networks by learning to select one data-specific expert binary filter at a time conditioned on input features.
arXiv Detail & Related papers (2020-10-07T17:58:10Z)
- SoFAr: Shortcut-based Fractal Architectures for Binary Convolutional Neural Networks [7.753767947048147]
We propose two Shortcut-based Fractal Architectures (SoFAr) specifically designed for Binary Convolutional Neural Networks (BCNNs).
Our proposed SoFAr combines the adoption of shortcuts and the fractal architectures in one unified model, which is helpful in the training of BCNNs.
Results show that our proposed SoFAr achieves better accuracy compared with shortcut-based BCNNs.
arXiv Detail & Related papers (2020-09-11T10:00:47Z)
- Training Binary Neural Networks with Real-to-Binary Convolutions [52.91164959767517]
We show how to train binary networks to within a few percentage points of their full-precision counterparts.
We show how to build a strong baseline, which already achieves state-of-the-art accuracy.
We show that, when putting all of our improvements together, the proposed model beats the current state of the art by more than 5% top-1 accuracy on ImageNet.
arXiv Detail & Related papers (2020-03-25T17:54:38Z)
- ReActNet: Towards Precise Binary Neural Network with Generalized Activation Functions [76.05981545084738]
We propose several ideas for enhancing a binary network to close its accuracy gap from real-valued networks without incurring any additional computational cost.
We first construct a baseline network by modifying and binarizing a compact real-valued network with parameter-free shortcuts.
We show that the proposed ReActNet outperforms all state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2020-03-07T02:12:02Z)