The Lossy Horizon: Error-Bounded Predictive Coding for Lossy Text Compression (Episode I)
- URL: http://arxiv.org/abs/2510.22207v1
- Date: Sat, 25 Oct 2025 08:18:31 GMT
- Title: The Lossy Horizon: Error-Bounded Predictive Coding for Lossy Text Compression (Episode I)
- Authors: Nnamdi Aghanya, Jun Li, Kewei Wang
- Abstract summary: This paper introduces Error-Bounded Predictive Coding (EPC), a lossy text codec that leverages a Masked Language Model (MLM) as a decompressor. Instead of storing a subset of original tokens, EPC allows the model to predict masked content and stores minimal, rank-based corrections only when the model's top prediction is incorrect. We demonstrate that EPC consistently dominates Predictive Masking, offering superior fidelity at a significantly lower bit rate by more efficiently utilising the model's intrinsic knowledge.
- Score: 6.453417258264177
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) can achieve near-optimal lossless compression by acting as powerful probability models. We investigate their use in the lossy domain, where reconstruction fidelity is traded for higher compression ratios. This paper introduces Error-Bounded Predictive Coding (EPC), a lossy text codec that leverages a Masked Language Model (MLM) as a decompressor. Instead of storing a subset of original tokens, EPC allows the model to predict masked content and stores minimal, rank-based corrections only when the model's top prediction is incorrect. This creates a residual channel that offers continuous rate-distortion control. We compare EPC to a simpler Predictive Masking (PM) baseline and a transform-based Vector Quantisation with a Residual Patch (VQ+RE) approach. Through an evaluation that includes precise bit accounting and rate-distortion analysis, we demonstrate that EPC consistently dominates PM, offering superior fidelity at a significantly lower bit rate by more efficiently utilising the model's intrinsic knowledge.
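To make the mechanism concrete, below is a minimal sketch of the encode/decode loop the abstract describes, assuming the MLM is exposed as a generic `rank_predictions(tokens, position)` callable that returns the full vocabulary sorted by model probability at a masked position. The function name, the fixed sequential decoding order, and the `rank_budget` knob used here for rate-distortion control are illustrative assumptions, not the paper's exact procedure.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

MASK = "[MASK]"
# Assumed interface: given a partially masked token sequence and a position,
# return the full vocabulary sorted by the MLM's probability at that position.
RankFn = Callable[[Sequence[str], int], List[str]]


def epc_encode(
    tokens: List[str],
    mask_positions: List[int],
    rank_predictions: RankFn,
    rank_budget: Optional[int] = None,
) -> List[Tuple[int, int]]:
    """Drop the tokens at `mask_positions` and keep a (position, rank) correction
    only where the MLM's top-1 guess is wrong. Rank-0 predictions cost nothing,
    which is where the savings over storing the raw tokens come from."""
    work = list(tokens)
    for pos in mask_positions:
        work[pos] = MASK
    corrections: List[Tuple[int, int]] = []
    for pos in mask_positions:  # same deterministic order as the decoder
        ranked = rank_predictions(work, pos)
        true_rank = ranked.index(tokens[pos])
        if true_rank > 0 and (rank_budget is None or true_rank <= rank_budget):
            corrections.append((pos, true_rank))
            work[pos] = tokens[pos]   # decoder will recover the true token
        else:
            work[pos] = ranked[0]     # already correct, or an accepted error (lossy)
    return corrections


def epc_decode(
    masked: List[str],
    mask_positions: List[int],
    corrections: List[Tuple[int, int]],
    rank_predictions: RankFn,
) -> List[str]:
    """Fill each masked slot with the MLM's top prediction unless a stored
    rank-based correction overrides it."""
    override: Dict[int, int] = dict(corrections)
    out = list(masked)
    for pos in mask_positions:
        ranked = rank_predictions(out, pos)
        out[pos] = ranked[override.get(pos, 0)]
    return out
```

In a real codec the correction stream (positions and ranks) and the identity of the masked positions would be entropy-coded, and those bits are what the paper's bit accounting would weigh against reconstruction distortion; in this sketch, widening the mask set or shrinking the rank budget moves the operating point along the rate-distortion curve.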
Related papers
- COMPOT: Calibration-Optimized Matrix Procrustes Orthogonalization for Transformers Compression [5.280540253822294]
Post-training compression of Transformer models commonly relies on truncated singular value decomposition (SVD). We propose COMPOT, a training-free compression framework that uses a small calibration dataset to estimate a sparse weight factorization. COMPOT consistently delivers a superior quality-compression trade-off over strong low-rank and sparse baselines.
arXiv Detail & Related papers (2026-02-16T21:31:34Z)
- Don't be so Stief! Learning KV Cache low-rank approximation over the Stiefel manifold [7.162701793686856]
StiefAttention is a KV-cache compression method that learns orthonormal projection bases by directly minimizing output reconstruction error. It outperforms EigenAttention by 11.9 points on C4 perplexity and 5.4% on 0-shot MMLU accuracy at iso-compression, with lower relative error and higher cosine similarity with respect to the original decoder-layer outputs.
arXiv Detail & Related papers (2026-01-29T13:19:24Z)
- A Model-Driven Lossless Compression Algorithm Resistant to Mismatch [2.7930955543692817]
We propose a new compression algorithm based on next-token prediction that is robust to arbitrarily large, but structured, prediction mismatches. Our results demonstrate reliable operation within the certified mismatch regime while achieving compression ratios that exceed those of commonly used compression methods.
arXiv Detail & Related papers (2026-01-25T04:07:21Z)
- Knowledge-Informed Neural Network for Complex-Valued SAR Image Recognition [51.03674130115878]
We introduce the Knowledge-Informed Neural Network (KINN), a lightweight framework built upon a novel "compression-aggregation-compression" architecture. KINN establishes a new state of the art in parameter-efficient recognition, offering exceptional generalization in data-scarce and out-of-distribution scenarios.
arXiv Detail & Related papers (2025-10-23T07:12:26Z)
- Activation-Informed Pareto-Guided Low-Rank Compression for Efficient LLM/VLM [11.762499172999886]
Large language models (LLMs) and vision-language models (VLMs) have achieved state-of-the-art performance, but they pose significant memory and compute challenges in deployment. We present a novel low-rank compression framework to address this challenge.
arXiv Detail & Related papers (2025-10-07T03:07:47Z)
- Accelerating Diffusion LLMs via Adaptive Parallel Decoding [50.9948753314669]
We introduce adaptive parallel decoding (APD), a novel method that dynamically adjusts the number of tokens sampled in parallel. APD provides markedly higher throughput with minimal quality degradation on downstream benchmarks.
arXiv Detail & Related papers (2025-05-31T06:10:10Z)
- Bridging Predictive Coding and MDL: A Two-Part Code Framework for Deep Learning [1.749935196721634]
We prove that layerwise PC performs block-coordinate descent on the minimum description length objective. We also prove that each PC sweep monotonically decreases the empirical two-part codelength, yielding tighter high-probability risk bounds. This is the first result offering formal generalization and convergence guarantees for PC-trained deep models.
arXiv Detail & Related papers (2025-05-20T17:25:16Z)
- Denoising Diffusion Probabilistic Model for Point Cloud Compression at Low Bit-Rates [22.076896401919683]
This paper proposes a Denoising Diffusion Probabilistic Model (DDPM) architecture for point cloud compression. A PointNet encoder produces the condition vector for the generation, which is then quantized via a learnable vector quantizer. Experiments on ShapeNet and ModelNet40 show improved rate-distortion at low rates compared to standardized and state-of-the-art approaches.
arXiv Detail & Related papers (2025-05-19T16:29:12Z)
- Choose Your Model Size: Any Compression of Large Language Models Without Re-Computation [10.376875638696504]
This work presents Any Compression via Iterative Pruning (ACIP), a novel algorithmic approach to determine a compression-performance trade-off. We use an SVD-reparametrization of linear layers and iteratively prune their singular values with a sparsity-inducing penalty. We show that ACIP seamlessly complements common quantization-based compression techniques.
arXiv Detail & Related papers (2025-02-03T18:40:58Z)
- CALLIC: Content Adaptive Learning for Lossless Image Compression [64.47244912937204]
CALLIC sets a new state of the art (SOTA) for learned lossless image compression. We propose a content-aware autoregressive self-attention mechanism that leverages convolutional gating operations. During encoding, we decompose pre-trained layers, including depth-wise convolutions, using low-rank matrices and then adapt the incremental weights on the test image by Rate-guided Progressive Fine-Tuning (RPFT). RPFT fine-tunes with gradually increasing patches, sorted in descending order by estimated entropy, optimizing the learning process and reducing adaptation time.
arXiv Detail & Related papers (2024-12-23T10:41:18Z)
- Autoregressive Speech Synthesis without Vector Quantization [135.4776759536272]
We present MELLE, a novel continuous-valued, token-based language modeling approach for text-to-speech synthesis (TTS). MELLE autoregressively generates continuous mel-spectrogram frames directly from the text condition. MELLE mitigates robustness issues by avoiding the inherent flaws of sampling vector-quantized codes.
arXiv Detail & Related papers (2024-07-11T14:36:53Z)
- Lightweight Attribute Localizing Models for Pedestrian Attribute Recognition [13.480231032159834]
We propose a novel approach for determining the optimal ranks of low-rank layers, ensuring that the gradient direction of the compressed model closely aligns with that of the original model. This means that the compressed model effectively preserves the update direction of the full model, enabling more efficient compression for Pedestrian Attribute Recognition tasks.
arXiv Detail & Related papers (2023-06-16T13:07:13Z)
- CrAM: A Compression-Aware Minimizer [103.29159003723815]
We propose a new compression-aware minimizer dubbed CrAM that modifies the optimization step in a principled way.
CrAM produces dense models that can be more accurate than the standard SGD/Adam-based baselines while remaining stable under weight pruning.
CrAM can produce sparse models which perform well for transfer learning, and it also works for semi-structured 2:4 pruning patterns supported by GPU hardware.
arXiv Detail & Related papers (2022-07-28T16:13:28Z)
- Modeling Lost Information in Lossy Image Compression [72.69327382643549]
Lossy image compression is one of the most commonly used operators for digital images.
We propose a novel invertible framework called Invertible Lossy Compression (ILC) to largely mitigate the information loss problem.
arXiv Detail & Related papers (2020-06-22T04:04:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.