A Model Compression Method with Matrix Product Operators for Speech Enhancement
- URL: http://arxiv.org/abs/2010.04950v1
- Date: Sat, 10 Oct 2020 08:53:25 GMT
- Title: A Model Compression Method with Matrix Product Operators for Speech Enhancement
- Authors: Xingwei Sun, Ze-Feng Gao, Zhong-Yi Lu, Junfeng Li, Yonghong Yan
- Abstract summary: We propose a model compression method based on matrix product operators (MPO) to substantially reduce the number of parameters in neural network models for speech enhancement.
Our proposal provides an effective model compression method for speech enhancement, especially for cloud-free applications.
- Score: 15.066942043773267
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural network (DNN) based speech enhancement approaches have
achieved promising performance. However, the number of parameters in these
models is usually too large for real-world speech enhancement on devices with
limited resources, which seriously restricts their deployment. To deal with
this issue, model compression techniques are being widely studied. In this
paper, we propose a model compression method based on matrix product operators
(MPO) to substantially reduce the number of parameters in DNN models for speech
enhancement. In this method, the weight matrices in the linear transformations
of the neural network model are replaced by their MPO decomposition format
before training. In our experiments, this process is applied to causal neural
network models, namely the feedforward multilayer perceptron (MLP) and the long
short-term memory (LSTM) models. Both the MLP and LSTM models, with and without
compression, are then used to estimate the ideal ratio mask for monaural speech
enhancement. The experimental results show that the proposed MPO-based method
outperforms the widely used pruning method for speech enhancement at various
compression rates, with further gains at low compression rates. Our proposal
provides an effective model compression method for speech enhancement,
especially for cloud-free applications.
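
The core operation in the abstract is replacing each dense weight matrix with its MPO decomposition format. In the paper the MPO cores are trained directly in place of the dense weights; the sketch below instead decomposes an existing matrix with sequential truncated SVDs (tensor-train style) purely to illustrate the format and the parameter saving. This is a minimal NumPy sketch: the function names, the (4, 8, 8, 4) factorisation, and the bond dimension of 16 are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def mpo_decompose(W, in_factors, out_factors, bond_dim):
    """Factor a dense weight matrix into a chain of MPO cores.

    Illustrative TT-SVD-style sketch (not the paper's training procedure):
    reshape W so each input/output factor pair sits on adjacent axes, then
    split off one core at a time with a truncated SVD. Core k has shape
    (d_prev, i_k, j_k, d_next), with bond dimensions capped at `bond_dim`.
    """
    n = len(in_factors)
    assert W.shape == (int(np.prod(in_factors)), int(np.prod(out_factors)))
    # (I, J) -> (i1, ..., in, j1, ..., jn) -> (i1, j1, i2, j2, ..., in, jn)
    T = W.reshape(tuple(in_factors) + tuple(out_factors))
    T = T.transpose([ax for k in range(n) for ax in (k, n + k)])
    cores, d_prev = [], 1
    rest = T.reshape(in_factors[0] * out_factors[0], -1)
    for k in range(n - 1):
        U, s, Vt = np.linalg.svd(rest, full_matrices=False)
        d_next = min(bond_dim, s.size)
        cores.append(U[:, :d_next].reshape(d_prev, in_factors[k],
                                           out_factors[k], d_next))
        rest = s[:d_next, None] * Vt[:d_next, :]   # absorb singular values
        d_prev = d_next
        rest = rest.reshape(d_prev * in_factors[k + 1] * out_factors[k + 1], -1)
    cores.append(rest.reshape(d_prev, in_factors[-1], out_factors[-1], 1))
    return cores

def mpo_to_matrix(cores, in_factors, out_factors):
    """Contract the MPO chain back to a dense matrix (to check the error)."""
    T = cores[0]
    for c in cores[1:]:
        T = np.tensordot(T, c, axes=1)    # contract adjacent bond axes
    T = T.squeeze(axis=(0, -1))           # drop the trivial boundary bonds
    n = len(cores)
    perm = [2 * k for k in range(n)] + [2 * k + 1 for k in range(n)]
    return T.transpose(perm).reshape(int(np.prod(in_factors)),
                                     int(np.prod(out_factors)))

# Example: a 1024x1024 layer, factored as (4,8,8,4) x (4,8,8,4), bond dim 16.
W = np.random.randn(1024, 1024)
cores = mpo_decompose(W, (4, 8, 8, 4), (4, 8, 8, 4), bond_dim=16)
n_mpo = sum(c.size for c in cores)
print(f"dense params: {W.size}, MPO params: {n_mpo}, "
      f"compression: {W.size / n_mpo:.1f}x")
```

With these example settings the MPO format stores about 33k parameters instead of roughly 1.05M, a ~31x reduction; a smaller bond dimension compresses harder at the cost of representational freedom.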
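
The compressed MLP and LSTM models are trained to estimate the ideal ratio mask (IRM), a standard time-frequency training target in monaural speech enhancement, commonly defined as IRM(t,f) = (S(t,f)^2 / (S(t,f)^2 + N(t,f)^2))^beta. A minimal sketch follows; the function name and the choice beta = 0.5 reflect the common convention in the mask-estimation literature and are assumed here rather than taken from the paper.

```python
import numpy as np

def ideal_ratio_mask(clean_mag, noise_mag, beta=0.5, eps=1e-8):
    """Ideal ratio mask per time-frequency bin.

    clean_mag, noise_mag: magnitude spectrograms (freq x frames) of the
    clean speech and the noise that make up the noisy mixture.
    beta=0.5 is the usual exponent in the mask-estimation literature.
    """
    s2 = clean_mag ** 2
    n2 = noise_mag ** 2
    return (s2 / (s2 + n2 + eps)) ** beta

# Enhancement applies the estimated mask to the noisy magnitude spectrogram
# and resynthesises the waveform using the noisy phase:
#   enhanced_mag = estimated_mask * noisy_mag
```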
Related papers
- Language Models as Zero-shot Lossless Gradient Compressors: Towards General Neural Parameter Prior Models [66.1595537904019]
Large language models (LLMs) can act as gradient priors in a zero-shot setting.
We introduce LM-GC, a novel method that integrates LLMs with arithmetic coding.
arXiv Detail & Related papers (2024-09-26T13:38:33Z)
- Modeling Time-Variant Responses of Optical Compressors with Selective State Space Models [0.0]
This paper presents a method for modeling optical dynamic range compressors using deep neural networks with Selective State Space models.
It features a refined technique integrating Feature-wise Linear Modulation and Gated Linear Units to adjust the network dynamically.
The proposed architecture is well-suited for low-latency and real-time applications, crucial in live audio processing.
arXiv Detail & Related papers (2024-08-22T17:03:08Z)
- MoDeGPT: Modular Decomposition for Large Language Model Compression [59.361006801465344]
This paper introduces Modular Decomposition (MoDeGPT), a novel structured compression framework.
MoDeGPT partitions the Transformer block into modules consisting of matrix pairs and reduces the hidden dimensions.
Our experiments show MoDeGPT, without backward propagation, matches or surpasses previous structured compression methods.
arXiv Detail & Related papers (2024-08-19T01:30:14Z)
- A Survey on Transformer Compression [84.18094368700379]
The Transformer plays a vital role in the realms of natural language processing (NLP) and computer vision (CV).
Model compression methods reduce the memory and computational cost of Transformer.
This survey provides a comprehensive review of recent compression methods, with a specific focus on their application to Transformer-based models.
arXiv Detail & Related papers (2024-02-05T12:16:28Z)
- CompactifAI: Extreme Compression of Large Language Models using Quantum-Inspired Tensor Networks [1.5199992713356987]
This paper introduces CompactifAI, an innovative compression approach using quantum-inspired tensor networks.
Our method is versatile and can be implemented with, or on top of, other compression techniques.
As a benchmark, we demonstrate that combining CompactifAI with quantization enables a 93% reduction in the memory size of LlaMA 7B.
arXiv Detail & Related papers (2024-01-25T11:45:21Z)
- Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models [9.91972450276408]
This paper introduces an innovative approach for the parametric and practical compression of Large Language Models (LLMs) based on reduced order modelling.
Our method represents a significant advancement in model compression by leveraging matrix decomposition, demonstrating superior efficacy compared to the prevailing state-of-the-art structured pruning method.
arXiv Detail & Related papers (2023-12-12T07:56:57Z)
- High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion Models [56.00939852727501]
Minimally-supervised speech synthesis decouples TTS by combining two types of discrete speech representations.
A non-autoregressive framework enhances controllability, and a duration diffusion model enables diversified prosodic expression.
arXiv Detail & Related papers (2023-09-27T09:27:03Z)
- Exploring Effective Mask Sampling Modeling for Neural Image Compression [171.35596121939238]
Most existing neural image compression methods rely on side information from hyperprior or context models to eliminate spatial redundancy.
Inspired by the mask sampling modeling in recent self-supervised learning methods for natural language processing and high-level vision, we propose a novel pretraining strategy for neural image compression.
Our method achieves competitive performance with lower computational complexity compared to state-of-the-art image compression methods.
arXiv Detail & Related papers (2023-06-09T06:50:20Z)
- Modality-Agnostic Variational Compression of Implicit Neural Representations [96.35492043867104]
We introduce a modality-agnostic neural compression algorithm based on a functional view of data and parameterised as an Implicit Neural Representation (INR).
Bridging the gap between latent coding and sparsity, we obtain compact latent representations non-linearly mapped to a soft gating mechanism.
After obtaining a dataset of such latent representations, we directly optimise the rate/distortion trade-off in a modality-agnostic space using neural compression.
arXiv Detail & Related papers (2023-01-23T15:22:42Z)
- Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators [31.461762905053426]
We present a novel pre-trained language model (PLM) compression approach based on the matrix product operator (MPO) from quantum many-body physics.
Our approach can be applied to the original or the compressed PLMs in a general way, which derives a lighter network and significantly reduces the parameters to be fine-tuned.
arXiv Detail & Related papers (2021-06-04T01:50:15Z)
- Compressing LSTM Networks by Matrix Product Operators [7.395226141345625]
Long Short-Term Memory (LSTM) models are the building blocks of many state-of-the-art natural language processing (NLP) and speech enhancement (SE) algorithms.
Here we introduce the MPO decomposition, which describes the local correlation of quantum states in quantum many-body physics.
We propose a matrix product operator (MPO) based neural network architecture to replace the LSTM model.
arXiv Detail & Related papers (2020-12-22T11:50:06Z)