Global-Local Convolution with Spiking Neural Networks for Energy-efficient Keyword Spotting
- URL: http://arxiv.org/abs/2406.13179v1
- Date: Wed, 19 Jun 2024 03:19:25 GMT
- Title: Global-Local Convolution with Spiking Neural Networks for Energy-efficient Keyword Spotting
- Authors: Shuai Wang, Dehao Zhang, Kexin Shi, Yuchen Wang, Wenjie Wei, Jibin Wu, Malu Zhang,
- Abstract summary: We take advantage of spiking neural networks' energy efficiency and propose an end-to-end lightweight KWS model.
The model consists of two innovative modules: 1) a Global-Local Spiking Convolution (GLSC) module and 2) a Bottleneck-PLIF module.
The Bottleneck-PLIF module further processes the signals from GLSC with the aim of achieving higher accuracy with fewer parameters.
- Score: 17.795498397570675
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Thanks to Deep Neural Networks (DNNs), the accuracy of Keyword Spotting (KWS) has made substantial progress. However, as KWS systems are usually implemented on edge devices, energy efficiency becomes a critical requirement besides performance. Here, we take advantage of spiking neural networks' energy efficiency and propose an end-to-end lightweight KWS model. The model consists of two innovative modules: 1) a Global-Local Spiking Convolution (GLSC) module and 2) a Bottleneck-PLIF module. Compared to hand-crafted feature extraction methods, the GLSC module performs speech feature extraction that is sparser and more energy-efficient, and it yields better performance. The Bottleneck-PLIF module further processes the signals from GLSC with the aim of achieving higher accuracy with fewer parameters. Extensive experiments are conducted on the Google Speech Commands Dataset (V1 and V2). The results show that our method achieves competitive performance among SNN-based KWS models with fewer parameters.
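For intuition, below is a minimal PyTorch sketch of the two modules the abstract describes. The kernel sizes, channel widths, the additive fusion of the global and local branches, the bottleneck layout, and the PLIF formulation are all illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class PLIF(nn.Module):
    """Parametric LIF neuron with a learnable membrane decay (assumed formulation;
    a surrogate gradient would be needed for training and is omitted here)."""

    def __init__(self, v_threshold: float = 1.0):
        super().__init__()
        self.w = nn.Parameter(torch.zeros(1))       # learnable decay parameter
        self.v_threshold = v_threshold

    def forward(self, x):                           # x: (T, B, C, L) input currents over T time steps
        decay = torch.sigmoid(self.w)
        v, spikes = torch.zeros_like(x[0]), []
        for t in range(x.shape[0]):
            v = v + decay * (x[t] - v)              # leaky integration
            s = (v >= self.v_threshold).float()     # fire when the threshold is crossed
            v = v - s * self.v_threshold            # soft reset
            spikes.append(s)
        return torch.stack(spikes)


class GLSC(nn.Module):
    """Global-Local Spiking Convolution on the raw waveform: a large-kernel (global)
    and a small-kernel (local) 1-D convolution are fused (additively, by assumption)
    and passed through a spiking neuron, replacing hand-crafted features."""

    def __init__(self, in_ch=1, out_ch=16, global_k=127, local_k=7):
        super().__init__()
        self.global_conv = nn.Conv1d(in_ch, out_ch, global_k, padding=global_k // 2)
        self.local_conv = nn.Conv1d(in_ch, out_ch, local_k, padding=local_k // 2)
        self.bn = nn.BatchNorm1d(out_ch)
        self.spike = PLIF()

    def forward(self, x):                           # x: (T, B, 1, L) raw audio repeated over T steps
        T, B, _, L = x.shape
        flat = x.flatten(0, 1)                      # fold time steps into the batch dimension
        y = self.global_conv(flat) + self.local_conv(flat)
        return self.spike(self.bn(y).view(T, B, -1, L))


class BottleneckPLIF(nn.Module):
    """Bottleneck block with PLIF neurons: 1x1 reduce -> 3x1 conv -> 1x1 expand
    (assumed layout) to keep the parameter budget small."""

    def __init__(self, ch=16, reduction=4):
        super().__init__()
        mid = ch // reduction
        self.body = nn.Sequential(
            nn.Conv1d(ch, mid, 1), nn.BatchNorm1d(mid),
            nn.Conv1d(mid, mid, 3, padding=1), nn.BatchNorm1d(mid),
            nn.Conv1d(mid, ch, 1), nn.BatchNorm1d(ch),
        )
        self.spike = PLIF()

    def forward(self, x):                           # x: (T, B, C, L) spike trains from GLSC
        T, B, C, L = x.shape
        y = self.body(x.flatten(0, 1)).view(T, B, C, L)
        return self.spike(y)


# Toy usage: 4 time steps, batch of 2, one-second 16 kHz clips.
waveform = torch.randn(4, 2, 1, 16000)
features = GLSC()(waveform)                         # (4, 2, 16, 16000) binary spike tensor
refined = BottleneckPLIF()(features)                # same shape, refined spike features
```

In the full model, the GLSC output would feed a stack of such bottleneck blocks and a classification head over the keyword classes; refer to the paper for the actual configuration.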
Related papers
- SparseSpikformer: A Co-Design Framework for Token and Weight Pruning in Spiking Transformer [12.717450255837178]
Spiking Neural Networks (SNNs) have the advantages of low power consumption and high energy efficiency.
The most advanced SNN, Spikformer, combines the self-attention module from the Transformer with an SNN to achieve remarkable performance.
We present SparseSpikformer, a co-design framework aimed at achieving sparsity in Spikformer through token and weight pruning techniques.
arXiv Detail & Related papers (2023-11-15T09:22:52Z)
- Systematic Architectural Design of Scale Transformed Attention Condenser DNNs via Multi-Scale Class Representational Response Similarity Analysis [93.0013343535411]
We propose a novel type of analysis called Multi-Scale Class Representational Response Similarity Analysis (ClassRepSim).
We show that adding STAC modules to ResNet style architectures can result in up to a 1.6% increase in top-1 accuracy.
Results from ClassRepSim analysis can be used to select an effective parameterization of the STAC module resulting in competitive performance.
arXiv Detail & Related papers (2023-06-16T18:29:26Z)
- Rewarded meta-pruning: Meta Learning with Rewards for Channel Pruning [19.978542231976636]
This paper proposes a novel method to reduce the parameters and FLOPs for computational efficiency in deep learning models.
We introduce accuracy and efficiency coefficients to control the trade-off between the accuracy of the network and its computing efficiency.
arXiv Detail & Related papers (2023-01-26T12:32:01Z)
- BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to Real-Network Performance [54.214426436283134]
Deep neural networks, such as the Deep-FSMN, have been widely studied for keyword spotting (KWS) applications.
We present a strong yet efficient binary neural network for KWS, namely BiFSMNv2, pushing it to real-network accuracy.
We highlight that benefiting from the compact architecture and optimized hardware kernel, BiFSMNv2 can achieve an impressive 25.1x speedup and 20.2x storage-saving on edge hardware.
arXiv Detail & Related papers (2022-11-13T18:31:45Z)
- Neural Attentive Circuits [93.95502541529115]
We introduce a general-purpose yet modular neural architecture called Neural Attentive Circuits (NACs).
NACs learn the parameterization and a sparse connectivity of neural modules without using domain knowledge.
NACs achieve an 8x speedup at inference time while losing less than 3% performance.
arXiv Detail & Related papers (2022-10-14T18:00:07Z)
- IMDeception: Grouped Information Distilling Super-Resolution Network [7.6146285961466]
Single-Image-Super-Resolution (SISR) is a classical computer vision problem that has benefited from the recent advancements in deep learning methods.
In this work, we propose the Global Progressive Refinement Module (GPRM) as a less parameter-demanding alternative to the IIC module for feature aggregation.
We also propose Grouped Information Distilling Blocks (GIDB) to further decrease the number of parameters and floating-point operations per second (FLOPS).
Experiments reveal that the proposed network performs on par with state-of-the-art models despite having a limited number of parameters and FLOPS.
arXiv Detail & Related papers (2022-04-25T06:43:45Z)
- Ghost-dil-NetVLAD: A Lightweight Neural Network for Visual Place Recognition [3.6249801498927923]
We propose a lightweight, weakly supervised, end-to-end neural network consisting of a front-end perception model called GhostCNN and a learnable VLAD layer as a back-end.
To enhance our proposed lightweight model further, we add dilated convolutions to the Ghost module to get features containing more spatial semantic information, improving accuracy.
arXiv Detail & Related papers (2021-12-22T06:05:02Z)
- MoEfication: Conditional Computation of Transformer Models for Efficient Inference [66.56994436947441]
Transformer-based pre-trained language models can achieve superior performance on most NLP tasks due to their large parameter capacity, but they also incur huge computational costs.
We explore accelerating large-model inference through conditional computation based on the sparse activation phenomenon.
We propose to transform a large model into its mixture-of-experts (MoE) version with equal model size, namely MoEfication.
arXiv Detail & Related papers (2021-10-05T02:14:38Z)
- Neural Calibration for Scalable Beamforming in FDD Massive MIMO with Implicit Channel Estimation [10.775558382613077]
Channel estimation and beamforming play critical roles in frequency-division duplexing (FDD) massive multiple-input multiple-output (MIMO) systems.
We propose a deep learning-based approach that directly optimizes the beamformers at the base station according to the received uplink pilots.
A neural calibration method is proposed to improve the scalability of the end-to-end design.
arXiv Detail & Related papers (2021-08-03T14:26:14Z)
- GhostSR: Learning Ghost Features for Efficient Image Super-Resolution [49.393251361038025]
Single image super-resolution (SISR) systems based on convolutional neural networks (CNNs) achieve impressive performance but require huge computational costs.
We propose to use shift operation to generate the redundant features (i.e., Ghost features) of SISR models.
We show that both the non-compact and lightweight SISR models embedded in our proposed module can achieve comparable performance to that of their baselines.
arXiv Detail & Related papers (2021-01-21T10:09:47Z)
- Optimization-driven Machine Learning for Intelligent Reflecting Surfaces Assisted Wireless Networks [82.33619654835348]
Intelligent reflecting surfaces (IRSs) have been employed to reshape wireless channels by controlling the phase shifts of individual scattering elements.
Due to the large number of scattering elements, passive beamforming is typically challenged by high computational complexity.
In this article, we focus on machine learning (ML) approaches for improving performance in IRS-assisted wireless networks.
arXiv Detail & Related papers (2020-08-29T08:39:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.