Compression Method for Deep Diagonal State Space Model Based on $H^2$ Optimal Reduction
- URL: http://arxiv.org/abs/2507.10078v2
- Date: Wed, 30 Jul 2025 11:57:54 GMT
- Title: Compression Method for Deep Diagonal State Space Model Based on $H^2$ Optimal Reduction
- Authors: Hiroki Sakamoto, Kazuhiro Sato
- Abstract summary: Deep learning models incorporating linear SSMs have gained attention for capturing long-range dependencies in sequential data. Large parameter sizes pose challenges for deployment on resource-constrained devices. We propose an efficient parameter reduction method for these models by applying $H^{2}$ model order reduction techniques.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Deep learning models incorporating linear state space models (SSMs) have gained attention for capturing long-range dependencies in sequential data. However, their large parameter sizes pose challenges for deployment on resource-constrained devices. In this study, we propose an efficient parameter reduction method for these models by applying $H^{2}$ model order reduction techniques from control theory to their linear SSM components. In experiments on the LRA benchmark, model compression based on our proposed method outperforms an existing method based on Balanced Truncation, while successfully reducing the number of parameters in the SSMs to $1/32$ without sacrificing the performance of the original models.
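The core idea is to shrink the diagonal state matrix of each SSM layer while preserving its input-output behavior in the $H^{2}$ norm. As a rough illustration only (the paper solves the $H^{2}$-optimal reduction problem; this sketch merely ranks modes by their individual $H^{2}$-norm contributions and truncates the rest, and all names and shapes are hypothetical):

```python
import numpy as np

def h2_mode_truncation(a, b, c, k):
    """Keep the k modes of a stable diagonal SSM with the largest
    individual H2-norm contributions.

    The SSM is dx/dt = diag(a) x + b u, y = c^T x with Re(a_i) < 0.
    Mode i has transfer function c_i b_i / (s - a_i), whose squared
    H2 norm is |c_i b_i|^2 / (-2 Re(a_i)).
    """
    contrib = np.abs(b * c) ** 2 / (-2.0 * a.real)
    keep = np.sort(np.argsort(contrib)[-k:])  # indices of top-k modes
    return a[keep], b[keep], c[keep]

# Toy example: compress an 8-mode diagonal SSM to 2 modes.
rng = np.random.default_rng(0)
a = -rng.uniform(0.1, 2.0, 8) + 1j * rng.normal(0.0, 1.0, 8)
b = rng.normal(size=8).astype(complex)
c = rng.normal(size=8).astype(complex)
a2, b2, c2 = h2_mode_truncation(a, b, c, k=2)
```

Because the state matrix is diagonal, each mode's contribution decouples, which is what makes diagonal SSMs especially amenable to this kind of per-mode analysis.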
Related papers
- Nonlinear Model Order Reduction of Dynamical Systems in Process Engineering: Review and Comparison [50.0791489606211]
We review state-of-the-art nonlinear model order reduction methods. We discuss both general-purpose methods and tailored approaches for (chemical) process systems.
arXiv Detail & Related papers (2025-06-15T11:39:12Z)
- DAPLSR: Data Augmentation Partial Least Squares Regression Model via Manifold Optimization [6.200365627295667]
This paper proposes a Data Augmentation Partial Least Squares Regression model via manifold optimization. The proposed DAPLSR model achieves superior classification performance and outstanding evaluation metrics on various datasets.
arXiv Detail & Related papers (2025-04-23T11:58:28Z)
- Training Deep Learning Models with Norm-Constrained LMOs [56.00317694850397]
We propose a new family of algorithms that uses the linear minimization oracle (LMO) to adapt to the geometry of the problem. We demonstrate significant speedups on nanoGPT training using our algorithm, Scion, without any reliance on Adam.
arXiv Detail & Related papers (2025-02-11T13:10:34Z)
- Optimizing Sequential Recommendation Models with Scaling Laws and Approximate Entropy [104.48511402784763]
The Performance Law for SR models aims to theoretically investigate and model the relationship between model performance and data quality. We propose Approximate Entropy (ApEn) to assess data quality, presenting a more nuanced approach compared to traditional data quantity metrics.
arXiv Detail & Related papers (2024-11-30T10:56:30Z)
- Parameter-Efficient Fine-Tuning of State Space Models [10.817729275974829]
Deep State Space Models (SSMs) have become powerful tools for language modeling, offering high performance and linear scalability with sequence length. This paper investigates the application of parameter-efficient fine-tuning (PEFT) methods to SSM-based models. We propose Sparse Dimension Tuning (SDT), a PEFT method tailored for SSM modules.
arXiv Detail & Related papers (2024-10-11T17:30:28Z)
- Zeroth-Order Fine-Tuning of LLMs in Random Subspaces [66.27334633749734]
As language models grow in size, memory demands for backpropagation increase.
Zeroth-order (ZO) optimization methods offer a memory-efficient alternative.
We show that SubZero enhances fine-tuning and achieves faster convergence compared to standard ZO approaches.
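Zeroth-order methods sidestep backpropagation by estimating gradients from forward evaluations alone. A minimal sketch of the generic two-point estimator that such methods build on (SubZero's random-subspace construction is more elaborate; this toy problem and all names are illustrative):

```python
import numpy as np

def zo_gradient(f, theta, eps=1e-3, rng=None):
    """Two-point zeroth-order gradient estimate along one random direction.

    Only two function evaluations are needed, so no backpropagation
    (and no activation memory) is required.
    """
    rng = np.random.default_rng() if rng is None else rng
    u = rng.standard_normal(theta.shape)
    # Central finite difference along u, scaled back onto u.
    return (f(theta + eps * u) - f(theta - eps * u)) / (2.0 * eps) * u

# Minimize f(x) = ||x||^2 with plain ZO-SGD on the estimator above.
f = lambda x: float(x @ x)
x = np.ones(4)
rng = np.random.default_rng(1)
for _ in range(1000):
    x -= 0.05 * zo_gradient(f, x, rng=rng)
```

The estimator is unbiased for the true gradient in expectation over the random direction, which is why plain SGD on it still converges, albeit with higher variance than backpropagated gradients.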
arXiv Detail & Related papers (2024-10-11T17:01:43Z)
- LoRTA: Low Rank Tensor Adaptation of Large Language Models [70.32218116940393]
Low-Rank Adaptation (LoRA) is a popular parameter-efficient fine-tuning (PEFT) method. We propose a higher-order Candecomp/Parafac (CP) decomposition, enabling a more compact and flexible representation. Our method can achieve a reduction in the number of parameters while maintaining comparable performance.
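A CP decomposition represents a 3-way update tensor (e.g. rows x columns x layers) with one factor matrix per mode, sharing parameters across layers instead of storing a separate low-rank pair per layer as LoRA does. A minimal sketch with hypothetical shapes (not LoRTA's actual parameterization):

```python
import numpy as np

# Hypothetical setting: adapt a stack of 12 layers' (64 x 64) projection
# weights with a single rank-4 CP-factored update tensor.
d1, d2, layers, rank = 64, 64, 12, 4
rng = np.random.default_rng(0)
U = rng.normal(size=(d1, rank))      # row-mode factors
V = rng.normal(size=(d2, rank))      # column-mode factors
W = rng.normal(size=(layers, rank))  # layer-mode factors

# CP reconstruction: delta[i, j, l] = sum_r U[i, r] * V[j, r] * W[l, r]
delta = np.einsum("ir,jr,lr->ijl", U, V, W)

cp_params = rank * (d1 + d2 + layers)    # shared across all layers
lora_params = layers * rank * (d1 + d2)  # a separate rank-4 LoRA per layer
```

Here the CP factorization stores 560 parameters versus 6144 for per-layer LoRA at the same rank, which is the kind of compactness the higher-order decomposition buys.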
arXiv Detail & Related papers (2024-10-05T06:59:50Z)
- Degradation-Guided One-Step Image Super-Resolution with Diffusion Priors [75.24313405671433]
Diffusion-based image super-resolution (SR) methods have achieved remarkable success by leveraging large pre-trained text-to-image diffusion models as priors.
We introduce a novel one-step SR model, which significantly addresses the efficiency issue of diffusion-based SR methods.
Unlike existing fine-tuning strategies, we designed a degradation-guided Low-Rank Adaptation (LoRA) module specifically for SR.
arXiv Detail & Related papers (2024-09-25T16:15:21Z)
- Model order reduction of deep structured state-space models: A system-theoretic approach [0.0]
Deep structured state-space models offer high predictive performance.
The learned representations often suffer from excessively large model orders, which render them unsuitable for control design purposes.
We introduce two regularization terms which can be incorporated into the training loss for improved model order reduction.
The presented regularizers yield more parsimonious representations and faster inference with the reduced-order models.
arXiv Detail & Related papers (2024-03-21T21:05:59Z)
- Data-free Weight Compress and Denoise for Large Language Models [96.68582094536032]
We propose a novel approach termed Data-free Joint Rank-k Approximation for compressing the parameter matrices. We achieve a model pruning of 80% of parameters while retaining 93.43% of the original performance without any calibration data.
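The building block behind rank-k weight compression is the truncated SVD, which gives the best rank-k approximation of a matrix in the Frobenius norm (Eckart-Young). A minimal sketch of that building block only (the paper's joint, data-free scheme is more involved):

```python
import numpy as np

def rank_k_approx(w, k):
    """Best rank-k approximation of a weight matrix in the Frobenius
    norm, via truncated SVD (Eckart-Young theorem)."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    return (u[:, :k] * s[:k]) @ vt[:k]

rng = np.random.default_rng(0)
w = rng.normal(size=(20, 10)) @ rng.normal(size=(10, 64))  # rank <= 10
w_hat = rank_k_approx(w, 10)  # exact when k >= rank(w)
```

Storing the truncated factors costs k*(m + n) numbers instead of m*n, which is where the parameter savings come from when k is small relative to the matrix dimensions.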
arXiv Detail & Related papers (2024-02-26T05:51:47Z)
- Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models [9.91972450276408]
This paper introduces an innovative approach for the parametric and practical compression of Large Language Models (LLMs) based on reduced order modelling.
Our method represents a significant advancement in model compression by leveraging matrix decomposition, demonstrating superior efficacy compared to the prevailing state-of-the-art structured pruning method.
arXiv Detail & Related papers (2023-12-12T07:56:57Z)
- An iterative multi-fidelity approach for model order reduction of multi-dimensional input parametric PDE systems [0.0]
We propose a parametric sampling strategy for the reduction of large-scale PDE systems with multidimensional input parametric spaces.
It is achieved by exploiting low-fidelity models throughout the parametric space to sample points using an efficient sampling strategy.
Since the proposed methodology leverages the use of low-fidelity models to assimilate the solution database, it significantly reduces the computational cost in the offline stage.
arXiv Detail & Related papers (2023-01-23T15:25:58Z)
- A Provably Efficient Model-Free Posterior Sampling Method for Episodic Reinforcement Learning [50.910152564914405]
Existing posterior sampling methods for reinforcement learning are limited by being model-based or lack worst-case theoretical guarantees beyond linear MDPs.
This paper proposes a new model-free formulation of posterior sampling that applies to more general episodic reinforcement learning problems with theoretical guarantees.
arXiv Detail & Related papers (2022-08-23T12:21:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.