Related papers: A Model Compression Method with Matrix Product Operators for Speech Enhancement

A Model Compression Method with Matrix Product Operators for Speech Enhancement

URL: http://arxiv.org/abs/2010.04950v1
Date: Sat, 10 Oct 2020 08:53:25 GMT
Title: A Model Compression Method with Matrix Product Operators for Speech Enhancement
Authors: Xingwei Sun, Ze-Feng Gao, Zhong-Yi Lu, Junfeng Li, Yonghong Yan
Abstract summary: We propose a model compression method based on matrix product operators (MPO) to substantially reduce the number of parameters in neural network models for speech enhancement. Our proposal provides an effective model compression method for speech enhancement, especially in cloud-free application.
Score: 15.066942043773267
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The deep neural network (DNN) based speech enhancement approaches have achieved promising performance. However, the number of parameters involved in these methods is usually enormous for the real applications of speech enhancement on the device with the limited resources. This seriously restricts the applications. To deal with this issue, model compression techniques are being widely studied. In this paper, we propose a model compression method based on matrix product operators (MPO) to substantially reduce the number of parameters in DNN models for speech enhancement. In this method, the weight matrices in the linear transformations of neural network model are replaced by the MPO decomposition format before training. In experiment, this process is applied to the causal neural network models, such as the feedforward multilayer perceptron (MLP) and long short-term memory (LSTM) models. Both MLP and LSTM models with/without compression are then utilized to estimate the ideal ratio mask for monaural speech enhancement. The experimental results show that our proposed MPO-based method outperforms the widely-used pruning method for speech enhancement under various compression rates, and further improvement can be achieved with respect to low compression rates. Our proposal provides an effective model compression method for speech enhancement, especially in cloud-free application.

Related papers

Radio: Rate-Distortion Optimization for Large Language Model Compression [6.719003232695071]
Compression of large language models (LLMs) has emerged as a key problem in facilitating LLM deployment on resource-limited devices.<n>We propose a quantization technique based on simple rate-distortion optimization.<n>Our technique scales to models containing hundreds of billions of weight parameters and offers users the flexibility to compress models, post-training, to a model size or accuracy specified by the user.
arXiv Detail & Related papers (2025-05-05T21:17:14Z)
Multi-Scale Invertible Neural Network for Wide-Range Variable-Rate Learned Image Compression [90.59962443790593]
In this paper, we present a variable-rate image compression model based on invertible transform to overcome limitations. Specifically, we design a lightweight multi-scale invertible neural network, which maps the input image into multi-scale latent representations. Experimental results demonstrate that the proposed method achieves state-of-the-art performance compared to existing variable-rate methods.
arXiv Detail & Related papers (2025-03-27T09:08:39Z)
LoRTA: Low Rank Tensor Adaptation of Large Language Models [70.32218116940393]
Low Rank Adaptation (LoRA) is a popular Efficient Fine Tuning (PEFT) method. We propose a higher-order Candecomp/Parafac (CP) decomposition, enabling a more compact and flexible representation. Our method can achieve a reduction in the number of parameters while maintaining comparable performance.
arXiv Detail & Related papers (2024-10-05T06:59:50Z)
Language Models as Zero-shot Lossless Gradient Compressors: Towards General Neural Parameter Prior Models [66.1595537904019]
Large language models (LLMs) can act as gradient priors in a zero-shot setting. We introduce LM-GC, a novel method that integrates LLMs with arithmetic coding.
arXiv Detail & Related papers (2024-09-26T13:38:33Z)
Modeling Time-Variant Responses of Optical Compressors with Selective State Space Models [0.0]
This paper presents a method for modeling optical dynamic range compressors using deep neural networks with Selective State Space models. It features a refined technique integrating Feature-wise Linear Modulation and Gated Linear Units to adjust the network dynamically. The proposed architecture is well-suited for low-latency and real-time applications, crucial in live audio processing.
arXiv Detail & Related papers (2024-08-22T17:03:08Z)
MoDeGPT: Modular Decomposition for Large Language Model Compression [59.361006801465344]
This paper introduces textbfModular bfDecomposition (MoDeGPT), a novel structured compression framework. MoDeGPT partitions the Transformer block into modules comprised of matrix pairs and reduces the hidden dimensions. Our experiments show MoDeGPT, without backward propagation, matches or surpasses previous structured compression methods.
arXiv Detail & Related papers (2024-08-19T01:30:14Z)
MCNC: Manifold-Constrained Reparameterization for Neural Compression [21.70510507535041]
We present a novel model compression method, which we term Manifold-Constrained Neural Compression (MCNC) By constraining the parameter space to our proposed manifold, we can identify high-quality solutions. Our method significantly outperforms state-of-the-art baselines in terms of compression, accuracy, and/or model reconstruction time.
arXiv Detail & Related papers (2024-06-27T16:17:26Z)
A Survey on Transformer Compression [84.18094368700379]
Transformer plays a vital role in the realms of natural language processing (NLP) and computer vision (CV) Model compression methods reduce the memory and computational cost of Transformer. This survey provides a comprehensive review of recent compression methods, with a specific focus on their application to Transformer-based models.
arXiv Detail & Related papers (2024-02-05T12:16:28Z)
CompactifAI: Extreme Compression of Large Language Models using Quantum-Inspired Tensor Networks [1.5199992713356987]
This paper introduces CompactifAI, an innovative compression approach using quantum-inspired networks. Our method is versatile and can be implemented with - or on top of - other compression techniques. As a benchmark, we demonstrate that a combination of CompactifAI with quantization allows to reduce a 93% memory size of LlaMA 7B.
arXiv Detail & Related papers (2024-01-25T11:45:21Z)
Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models [9.91972450276408]
This paper introduces an innovative approach for the parametric and practical compression of Large Language Models (LLMs) based on reduced order modelling. Our method represents a significant advancement in model compression by leveraging matrix decomposition, demonstrating superior efficacy compared to the prevailing state-of-the-art structured pruning method.
arXiv Detail & Related papers (2023-12-12T07:56:57Z)
High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion Models [56.00939852727501]
Minimally-supervised speech synthesis decouples TTS by combining two types of discrete speech representations. Non-autoregressive framework enhances controllability, and duration diffusion model enables diversified prosodic expression.
arXiv Detail & Related papers (2023-09-27T09:27:03Z)
Exploring Effective Mask Sampling Modeling for Neural Image Compression [171.35596121939238]
Most existing neural image compression methods rely on side information from hyperprior or context models to eliminate spatial redundancy. Inspired by the mask sampling modeling in recent self-supervised learning methods for natural language processing and high-level vision, we propose a novel pretraining strategy for neural image compression. Our method achieves competitive performance with lower computational complexity compared to state-of-the-art image compression methods.
arXiv Detail & Related papers (2023-06-09T06:50:20Z)
Modality-Agnostic Variational Compression of Implicit Neural Representations [96.35492043867104]
We introduce a modality-agnostic neural compression algorithm based on a functional view of data and parameterised as an Implicit Neural Representation (INR) Bridging the gap between latent coding and sparsity, we obtain compact latent representations non-linearly mapped to a soft gating mechanism. After obtaining a dataset of such latent representations, we directly optimise the rate/distortion trade-off in a modality-agnostic space using neural compression.
arXiv Detail & Related papers (2023-01-23T15:22:42Z)
Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators [31.461762905053426]
We present a novel pre-trained language models (PLM) compression approach based on the matrix product operator (short as MPO) from quantum many-body physics. Our approach can be applied to the original or the compressed PLMs in a general way, which derives a lighter network and significantly reduces the parameters to be fine-tuned.
arXiv Detail & Related papers (2021-06-04T01:50:15Z)
Compressing LSTM Networks by Matrix Product Operators [7.395226141345625]
Long Short Term Memory(LSTM) models are the building blocks of many state-of-the-art natural language processing(NLP) and speech enhancement(SE) algorithms. Here we introduce the MPO decomposition, which describes the local correlation of quantum states in quantum many-body physics. We propose a matrix product operator(MPO) based neural network architecture to replace the LSTM model.
arXiv Detail & Related papers (2020-12-22T11:50:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.