BiFSMN: Binary Neural Network for Keyword Spotting
- URL: http://arxiv.org/abs/2202.06483v2
- Date: Tue, 15 Feb 2022 01:54:22 GMT
- Title: BiFSMN: Binary Neural Network for Keyword Spotting
- Authors: Haotong Qin, Xudong Ma, Yifu Ding, Xiaoyang Li, Yang Zhang, Yao Tian,
Zejun Ma, Jie Luo, Xianglong Liu
- Abstract summary: BiFSMN is an accurate and extremely efficient binary neural network for KWS.
We show that BiFSMN can achieve an impressive 22.3x speedup and 15.5x storage-saving on real-world edge hardware.
- Score: 47.46397208920726
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks, such as the Deep-FSMN, have been widely studied for
keyword spotting (KWS) applications. However, computational resources for these
networks are significantly constrained since they usually run on-call on edge
devices. In this paper, we present BiFSMN, an accurate and extremely efficient
binary neural network for KWS. We first construct a High-frequency Enhancement
Distillation scheme for the binarization-aware training, which emphasizes the
high-frequency information from the full-precision network's representation
that is more crucial for the optimization of the binarized network. Then, to
allow the instant and adaptive accuracy-efficiency trade-offs at runtime, we
also propose a Thinnable Binarization Architecture to further liberate the
acceleration potential of the binarized network from the topology perspective.
Moreover, we implement a Fast Bitwise Computation Kernel for BiFSMN on ARMv8
devices which fully utilizes registers and increases instruction throughput to
push the limit of deployment efficiency. Extensive experiments show that BiFSMN
outperforms existing binarization methods by convincing margins on various
datasets and is even comparable with the full-precision counterpart (e.g., less
than 3% drop on Speech Commands V1-12). We highlight that benefiting from the
thinnable architecture and the optimized 1-bit implementation, BiFSMN can
achieve an impressive 22.3x speedup and 15.5x storage-saving on real-world edge
hardware.
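
The two training-side ideas above can be made concrete with a short sketch. The following is a minimal, illustrative PyTorch example, not the authors' code: the sign/straight-through binarizer is the standard formulation for binary networks, while the high-frequency extraction used here (a feature map minus an average-pooled, low-pass copy of it) and the names `BinarizeSTE`, `high_frequency`, and `hed_distillation_loss` are assumptions about how a High-frequency Enhancement Distillation term could be written; BiFSMN's exact filtering and loss may differ.

```python
import torch
import torch.nn.functional as F


class BinarizeSTE(torch.autograd.Function):
    """1-bit quantizer: sign() in the forward pass, clipped straight-through gradient backward."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Pass gradients only where |x| <= 1 (straight-through estimator).
        return grad_out * (x.abs() <= 1).float()


def high_frequency(feat: torch.Tensor, kernel: int = 3) -> torch.Tensor:
    # Illustrative high-pass filter: subtract an average-pooled (low-pass) copy,
    # leaving mostly high-frequency detail. feat has shape (batch, channels, time).
    low = F.avg_pool1d(feat, kernel_size=kernel, stride=1, padding=kernel // 2)
    return feat - low


def hed_distillation_loss(student_feat, teacher_feat):
    # Match the binarized student's high-frequency content to the full-precision teacher's.
    return F.mse_loss(high_frequency(student_feat), high_frequency(teacher_feat.detach()))


# Usage: binarize weights for the forward pass and add the distillation term to the task loss.
w = torch.randn(64, 64, requires_grad=True)
w_bin = BinarizeSTE.apply(w)                       # {-1, +1} weights, differentiable via STE
student = torch.randn(8, 64, 100, requires_grad=True)
teacher = torch.randn(8, 64, 100)
loss = hed_distillation_loss(student, teacher)
loss.backward()
```

The reported speedup and storage figures rest on the standard property of binary networks that a {-1, +1} dot product collapses to XNOR plus popcount on packed words, which is the operation an ARMv8 bitwise kernel optimizes for register use and instruction throughput. A tiny pure-Python illustration of that arithmetic (not the kernel itself):

```python
def binary_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two length-n {-1, +1} vectors packed as bitmasks (bit 1 -> +1, bit 0 -> -1)."""
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ b_bits) & mask   # bit set where the two signs agree
    matches = bin(xnor).count("1")     # popcount
    return 2 * matches - n             # +1 per agreement, -1 per disagreement


# a = [+1, -1, +1, +1] packs to 0b1101 (element 0 in the least-significant bit),
# b = [+1, +1, -1, +1] packs to 0b1011; their dot product is 1 - 1 - 1 + 1 = 0.
assert binary_dot(0b1101, 0b1011, 4) == 0
```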
Related papers
- BDC-Occ: Binarized Deep Convolution Unit For Binarized Occupancy Network [55.21288428359509]
Existing 3D occupancy networks demand significant hardware resources, hindering deployment on edge devices.
We propose a novel binarized deep convolution (BDC) unit that effectively enhances performance while increasing the number of binarized convolutional layers.
Our BDC-Occ model is created by applying the proposed BDC unit to binarize the existing 3D occupancy networks.
arXiv Detail & Related papers (2024-05-27T10:44:05Z) - Signed Binary Weight Networks [17.07866119979333]
Two important algorithmic techniques have shown promise for enabling efficient inference - sparsity and binarization.
We propose a new method called signed-binary networks to improve efficiency further.
Our method achieves accuracy comparable to binary networks on the ImageNet and CIFAR10 datasets and can lead to 69% sparsity.
arXiv Detail & Related papers (2022-11-25T00:19:21Z) - BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to
Real-Network Performance [54.214426436283134]
Deep neural networks, such as the Deep-FSMN, have been widely studied for keyword spotting (KWS) applications.
We present a strong yet efficient binary neural network for KWS, namely BiFSMNv2, pushing it to real-network accuracy.
We highlight that benefiting from the compact architecture and optimized hardware kernel, BiFSMNv2 can achieve an impressive 25.1x speedup and 20.2x storage-saving on edge hardware.
arXiv Detail & Related papers (2022-11-13T18:31:45Z) - Distribution-sensitive Information Retention for Accurate Binary Neural
Network [49.971345958676196]
We present a novel Distribution-sensitive Information Retention Network (DIR-Net) to retain the information of the forward activations and backward gradients.
Our DIR-Net consistently outperforms the SOTA binarization approaches under mainstream and compact architectures.
We deploy our DIR-Net on real-world resource-limited devices, achieving 11.1x storage saving and 5.4x speedup.
arXiv Detail & Related papers (2021-09-25T10:59:39Z) - Quantized Neural Networks via {-1, +1} Encoding Decomposition and
Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z) - ReActNet: Towards Precise Binary Neural Network with Generalized
Activation Functions [76.05981545084738]
We propose several ideas for enhancing a binary network to close its accuracy gap to real-valued networks without incurring any additional computational cost.
We first construct a baseline network by modifying and binarizing a compact real-valued network with parameter-free shortcuts.
We show that the proposed ReActNet outperforms all state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2020-03-07T02:12:02Z) - Exploring the Connection Between Binary and Spiking Neural Networks [1.329054857829016]
We bridge the recent algorithmic progress in training Binary Neural Networks and Spiking Neural Networks.
We show that training Spiking Neural Networks in the extreme quantization regime results in near-full-precision accuracy on large-scale datasets.
arXiv Detail & Related papers (2020-02-24T03:46:51Z)