Lossless compression with state space models using bits back coding
- URL: http://arxiv.org/abs/2103.10150v2
- Date: Fri, 19 Mar 2021 10:53:45 GMT
- Title: Lossless compression with state space models using bits back coding
- Authors: James Townsend, Iain Murray
- Abstract summary: We generalize the 'bits back with ANS' method to time-series models with a latent Markov structure.
We provide experimental evidence that our method is effective for small scale models, and discuss its applicability to larger scale settings such as video compression.
- Score: 17.625326990547332
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: We generalize the 'bits back with ANS' method to time-series models with a
latent Markov structure. This family of models includes hidden Markov models
(HMMs), linear Gaussian state space models (LGSSMs) and many more. We provide
experimental evidence that our method is effective for small scale models, and
discuss its applicability to larger scale settings such as video compression.
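To make the method concrete, the sketch below applies bits-back coding with ANS to a toy hidden Markov model. It is an illustration under simplifying assumptions, not the authors' code: the ANS state is an arbitrary-precision Python int rather than a streamed sequence of fixed-width words, the whole sequence is coded in three passes rather than the per-time-step interleaving the paper describes, and the frequency tables and the `prior` / `likelihood` / `posterior` helpers are made up for the example.

```python
"""Minimal bits-back-with-ANS sketch for a toy hidden Markov model.

Simplifications (assumptions, not the authors' implementation): the ANS
state is an arbitrary-precision int, the sequence is coded in three
whole-sequence passes, and every distribution is an integer frequency
table summing to PRECISION.
"""

PRECISION = 1 << 12  # total mass of every frequency table


def ans_encode(state, symbol, freqs):
    """Push `symbol` onto the ANS stack under the distribution `freqs`."""
    start = sum(freqs[:symbol])
    return (state // freqs[symbol]) * PRECISION + start + state % freqs[symbol]


def ans_decode(state, freqs):
    """Pop a symbol off the ANS stack under the distribution `freqs`."""
    cum = state % PRECISION
    symbol, start = 0, 0
    while start + freqs[symbol] <= cum:
        start += freqs[symbol]
        symbol += 1
    return (state // PRECISION) * freqs[symbol] + cum - start, symbol


def bb_encode(state, ys, prior, likelihood, posterior):
    """Bits-back encode ys under an HMM p(z_t | z_{t-1}) p(y_t | z_t)."""
    # 1. 'Get bits back': decode the latent chain z_1..z_T from the stack
    #    under the approximate posterior q(z_t | z_{t-1}, y_t).
    zs, z_prev = [], None
    for y in ys:
        state, z = ans_decode(state, posterior(z_prev, y))
        zs.append(z)
        z_prev = z
    # 2. Encode the observations under p(y_t | z_t), in reverse time order
    #    so the decoder recovers them forwards (ANS is a stack).
    for t in reversed(range(len(ys))):
        state = ans_encode(state, ys[t], likelihood(zs[t]))
    # 3. Encode the latents under the transition prior, also in reverse.
    for t in reversed(range(len(ys))):
        state = ans_encode(state, zs[t], prior(zs[t - 1] if t else None))
    return state


def bb_decode(state, length, prior, likelihood, posterior):
    """Invert bb_encode: recover ys and give the borrowed bits back."""
    zs, z_prev = [], None
    for _ in range(length):
        state, z = ans_decode(state, prior(z_prev))
        zs.append(z)
        z_prev = z
    ys = []
    for z in zs:
        state, y = ans_decode(state, likelihood(z))
        ys.append(y)
    # Return the bits borrowed in step 1 by re-encoding z under the posterior.
    for t in reversed(range(length)):
        state = ans_encode(state, zs[t], posterior(zs[t - 1] if t else None, ys[t]))
    return state, ys


if __name__ == "__main__":
    # Toy HMM: 2 latent states, 4 observation symbols (tables are assumptions).
    def prior(z_prev):               # p(z_1) and p(z_t | z_{t-1})
        if z_prev is None:
            return [2048, 2048]
        return [3584, 512] if z_prev == 0 else [512, 3584]

    def likelihood(z):               # p(y_t | z_t)
        return [3072, 512, 256, 256] if z == 0 else [256, 256, 512, 3072]

    def posterior(z_prev, y):        # crude q(z_t | z_{t-1}, y_t); ignores z_prev
        return [3072, 1024] if y < 2 else [1024, 3072]

    data = [0, 0, 1, 3, 3, 2, 0]
    state = 1 << 64                  # initial bits for the first bits-back decode
    state = bb_encode(state, data, prior, likelihood, posterior)
    state, recovered = bb_decode(state, len(data), prior, likelihood, posterior)
    assert recovered == data and state == 1 << 64  # lossless round trip
```

The round trip is exactly lossless because every decode is the inverse of the matching encode; the bits spent encoding each latent are largely recovered by the initial posterior decode, so the net rate approaches the model's negative evidence lower bound, which is the point of bits-back coding.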
Related papers
- Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective [52.778766190479374]
Latent-based image generative models have achieved notable success in image generation tasks.
Despite sharing the same latent space, autoregressive models significantly lag behind LDMs and MIMs in image generation.
We propose a simple but effective discrete image tokenizer to stabilize the latent space for image generative modeling.
arXiv Detail & Related papers (2024-10-16T12:13:17Z) - Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models [9.91972450276408]
This paper introduces an innovative approach for the parametric and practical compression of Large Language Models (LLMs) based on reduced order modelling.
Our method represents a significant advancement in model compression by leveraging matrix decomposition, demonstrating superior efficacy compared to the prevailing state-of-the-art structured pruning method.
arXiv Detail & Related papers (2023-12-12T07:56:57Z) - CAMERO: Consistency Regularized Ensemble of Perturbed Language Models
with Weight Sharing [83.63107444454938]
We propose a consistency-regularized ensemble learning approach based on perturbed models, named CAMERO.
Specifically, we share the weights of bottom layers across all models and apply different perturbations to the hidden representations for different models, which can effectively promote the model diversity.
Our experiments using large language models demonstrate that CAMERO significantly improves the generalization performance of the ensemble model.
arXiv Detail & Related papers (2022-04-13T19:54:51Z) - Riemannian Score-Based Generative Modeling [56.20669989459281]
Score-based generative models (SGMs) are a class of generative models demonstrating remarkable empirical performance.
Current SGMs make the underlying assumption that the data is supported on a Euclidean manifold with flat geometry.
This prevents the use of these models for applications in robotics, geoscience or protein modeling.
arXiv Detail & Related papers (2022-02-06T11:57:39Z) - Low-Rank Constraints for Fast Inference in Structured Models [110.38427965904266]
This work demonstrates a simple approach to reduce the computational and memory complexity of a large class of structured models.
Experiments with neural parameterized structured models for language modeling, polyphonic music modeling, unsupervised grammar induction, and video modeling show that our approach matches the accuracy of standard models at large state spaces.
arXiv Detail & Related papers (2022-01-08T00:47:50Z) - What do Compressed Large Language Models Forget? Robustness Challenges
in Model Compression [68.82486784654817]
We study two popular model compression techniques, knowledge distillation and pruning.
We show that compressed models are significantly less robust than their PLM counterparts on adversarial test sets.
We develop a regularization strategy for model compression based on sample uncertainty.
arXiv Detail & Related papers (2021-10-16T00:20:04Z) - Lossless Compression with Latent Variable Models [4.289574109162585]
We present a method for lossless compression with latent variable models, which we call 'bits back with asymmetric numeral systems' (BB-ANS).
The method involves interleaving encode and decode steps, and achieves an optimal rate when compressing batches of data.
We describe 'Craystack', a modular software framework which we have developed for rapid prototyping of compression using deep generative models.
arXiv Detail & Related papers (2021-04-21T14:03:05Z) - Scaling Hidden Markov Language Models [118.55908381553056]
This work revisits the challenge of scaling HMMs to language modeling datasets.
We propose methods for scaling HMMs to massive state spaces while maintaining efficient exact inference, a compact parameterization, and effective regularization.
arXiv Detail & Related papers (2020-11-09T18:51:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.