Spiking Vocos: An Energy-Efficient Neural Vocoder
- URL: http://arxiv.org/abs/2509.13049v1
- Date: Tue, 16 Sep 2025 13:09:13 GMT
- Title: Spiking Vocos: An Energy-Efficient Neural Vocoder
- Authors: Yukun Chen, Zhaoxi Mu, Andong Li, Peilin Li, Xinyu Yang,
- Abstract summary: Spiking Vocos is a novel spiking neural vocoder with ultra-low energy consumption.<n>To mitigate the inherent information bottleneck in SNNs, we design a Spiking ConvNeXt module.<n>A lightweight Temporal Shift Module is also integrated to enhance the model's ability to fuse information across the temporal dimension.
- Score: 20.806942393453145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the remarkable progress in the synthesis speed and fidelity of neural vocoders, their high energy consumption remains a critical barrier to practical deployment on computationally restricted edge devices. Spiking Neural Networks (SNNs), widely recognized for their high energy efficiency due to their event-driven nature, offer a promising solution for low-resource scenarios. In this paper, we propose Spiking Vocos, a novel spiking neural vocoder with ultra-low energy consumption, built upon the efficient Vocos framework. To mitigate the inherent information bottleneck in SNNs, we design a Spiking ConvNeXt module to reduce Multiply-Accumulate (MAC) operations and incorporate an amplitude shortcut path to preserve crucial signal dynamics. Furthermore, to bridge the performance gap with its Artificial Neural Network (ANN) counterpart, we introduce a self-architectural distillation strategy to effectively transfer knowledge. A lightweight Temporal Shift Module is also integrated to enhance the model's ability to fuse information across the temporal dimension with negligible computational overhead. Experiments demonstrate that our model achieves performance comparable to its ANN counterpart, with UTMOS and PESQ scores of 3.74 and 3.45 respectively, while consuming only 14.7% of the energy. The source code is available at https://github.com/pymaster17/Spiking-Vocos.
Related papers
- Energy-Efficient Neuromorphic Computing for Edge AI: A Framework with Adaptive Spiking Neural Networks and Hardware-Aware Optimization [0.0]
NeuEdge is a framework that combines adaptive SNN models with hardware-aware optimization for edge deployment.<n>NeuEdge achieves 91-96% accuracy with up to 2.3 ms inference latency on edge hardware and an estimated 847 GOp/s/W energy efficiency.
arXiv Detail & Related papers (2026-02-02T18:34:48Z) - SpikeX: Exploring Accelerator Architecture and Network-Hardware Co-Optimization for Sparse Spiking Neural Networks [3.758294848902233]
We propose a novel systolic-array SNN accelerator architecture, called SpikeX, to take on the challenges and opportunities stemming from unstructured sparsity.<n>SpikeX reduces memory access and increases data sharing and hardware utilization targeting computations spanning both time and space.
arXiv Detail & Related papers (2025-05-18T08:07:44Z) - TS-SNN: Temporal Shift Module for Spiking Neural Networks [12.35332483263129]
Spiking Neural Networks (SNNs) are recognized for their biological plausibility and energy efficiency.<n>We introduce Temporal Shift module for Spiking Neural Networks (TS-SNN), which incorporates a novel Temporal Shift (TS) module to integrate past, present, and future spike features.<n>TS-SNN achieves state-of-the-art performance on benchmarks like CIFAR-10 (96.72%), CIFAR-100 (80.28%), and ImageNet (70.61%) with fewer timesteps, while maintaining low energy consumption.
arXiv Detail & Related papers (2025-05-07T06:34:34Z) - Accelerating Linear Recurrent Neural Networks for the Edge with Unstructured Sparsity [39.483346492111515]
Linear recurrent neural networks enable powerful long-range sequence modeling with constant memory usage and time-per-token during inference.<n>Unstructured sparsity offers a compelling solution, enabling substantial reductions in compute and memory requirements when accelerated by compatible hardware platforms.<n>We find that highly sparse linear RNNs consistently achieve better efficiency-performance trade-offs than dense baselines.
arXiv Detail & Related papers (2025-02-03T13:09:21Z) - Neuromorphic Wireless Split Computing with Multi-Level Spikes [69.73249913506042]
Neuromorphic computing uses spiking neural networks (SNNs) to perform inference tasks.<n> embedding a small payload within each spike exchanged between spiking neurons can enhance inference accuracy without increasing energy consumption.<n> split computing - where an SNN is partitioned across two devices - is a promising solution.<n>This paper presents the first comprehensive study of a neuromorphic wireless split computing architecture that employs multi-level SNNs.
arXiv Detail & Related papers (2024-11-07T14:08:35Z) - sVAD: A Robust, Low-Power, and Light-Weight Voice Activity Detection
with Spiking Neural Networks [51.516451451719654]
Spiking Neural Networks (SNNs) are known to be biologically plausible and power-efficient.
This paper introduces a novel SNN-based Voice Activity Detection model, referred to as sVAD.
It provides effective auditory feature representation through SincNet and 1D convolution, and improves noise robustness with attention mechanisms.
arXiv Detail & Related papers (2024-03-09T02:55:44Z) - LitE-SNN: Designing Lightweight and Efficient Spiking Neural Network through Spatial-Temporal Compressive Network Search and Joint Optimization [48.41286573672824]
Spiking Neural Networks (SNNs) mimic the information-processing mechanisms of the human brain and are highly energy-efficient.
We propose a new approach named LitE-SNN that incorporates both spatial and temporal compression into the automated network design process.
arXiv Detail & Related papers (2024-01-26T05:23:11Z) - A Resource-efficient Spiking Neural Network Accelerator Supporting
Emerging Neural Encoding [6.047137174639418]
Spiking neural networks (SNNs) recently gained momentum due to their low-power multiplication-free computing.
SNNs require very long spike trains (up to 1000) to reach an accuracy similar to their artificial neural network (ANN) counterparts for large models.
We present a novel hardware architecture that can efficiently support SNN with emerging neural encoding.
arXiv Detail & Related papers (2022-06-06T10:56:25Z) - Event-based Video Reconstruction via Potential-assisted Spiking Neural
Network [48.88510552931186]
Bio-inspired neural networks can potentially lead to greater computational efficiency on event-driven hardware.
We propose a novel Event-based Video reconstruction framework based on a fully Spiking Neural Network (EVSNN)
We find that the spiking neurons have the potential to store useful temporal information (memory) to complete such time-dependent tasks.
arXiv Detail & Related papers (2022-01-25T02:05:20Z) - FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using in total around the 40% of the available hardware resources.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on the accuracy, if compared to its software, full precision counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z) - An Adaptive Device-Edge Co-Inference Framework Based on Soft
Actor-Critic [72.35307086274912]
High-dimension parameter model and large-scale mathematical calculation restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL)-Soft Actor Critic for discrete (SAC-d), which generates the emphexit point, emphexit point, and emphcompressing bits by soft policy iterations.
Based on the latency and accuracy aware reward design, such an computation can well adapt to the complex environment like dynamic wireless channel and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z) - Energy-Efficient Model Compression and Splitting for Collaborative
Inference Over Time-Varying Channels [52.60092598312894]
We propose a technique to reduce the total energy bill at the edge device by utilizing model compression and time-varying model split between the edge and remote nodes.
Our proposed solution results in minimal energy consumption and $CO$ emission compared to the considered baselines.
arXiv Detail & Related papers (2021-06-02T07:36:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.