The Lossy Horizon: Error-Bounded Predictive Coding for Lossy Text Compression (Episode I)
- URL: http://arxiv.org/abs/2510.22207v1
- Date: Sat, 25 Oct 2025 08:18:31 GMT
- Title: The Lossy Horizon: Error-Bounded Predictive Coding for Lossy Text Compression (Episode I)
- Authors: Nnamdi Aghanya, Jun Li, Kewei Wang
- Abstract summary: This paper introduces Error-Bounded Predictive Coding (EPC), a lossy text codec that leverages a Masked Language Model (MLM) as a decompressor. Instead of storing a subset of original tokens, EPC allows the model to predict masked content and stores minimal, rank-based corrections only when the model's top prediction is incorrect. We demonstrate that EPC consistently dominates Predictive Masking, offering superior fidelity at a significantly lower bit rate by more efficiently utilising the model's intrinsic knowledge.
- Score: 6.453417258264177
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) can achieve near-optimal lossless compression by acting as powerful probability models. We investigate their use in the lossy domain, where reconstruction fidelity is traded for higher compression ratios. This paper introduces Error-Bounded Predictive Coding (EPC), a lossy text codec that leverages a Masked Language Model (MLM) as a decompressor. Instead of storing a subset of original tokens, EPC allows the model to predict masked content and stores minimal, rank-based corrections only when the model's top prediction is incorrect. This creates a residual channel that offers continuous rate-distortion control. We compare EPC to a simpler Predictive Masking (PM) baseline and a transform-based Vector Quantisation with a Residual Patch (VQ+RE) approach. Through an evaluation that includes precise bit accounting and rate-distortion analysis, we demonstrate that EPC consistently dominates PM, offering superior fidelity at a significantly lower bit rate by more efficiently utilising the model's intrinsic knowledge.
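To make the mechanism concrete, below is a minimal sketch of the encode/decode loop the abstract describes, assuming the MLM is exposed as a generic `rank_predictions(tokens, position)` callable that returns the full vocabulary sorted by model probability at a masked position. The function name, the fixed sequential decoding order, and the `rank_budget` knob used here for rate-distortion control are illustrative assumptions, not the paper's exact procedure.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

MASK = "[MASK]"
# Assumed interface: given a partially masked token sequence and a position,
# return the full vocabulary sorted by the MLM's probability at that position.
RankFn = Callable[[Sequence[str], int], List[str]]


def epc_encode(
    tokens: List[str],
    mask_positions: List[int],
    rank_predictions: RankFn,
    rank_budget: Optional[int] = None,
) -> List[Tuple[int, int]]:
    """Drop the tokens at `mask_positions` and keep a (position, rank) correction
    only where the MLM's top-1 guess is wrong. Rank-0 predictions cost nothing,
    which is where the savings over storing the raw tokens come from."""
    work = list(tokens)
    for pos in mask_positions:
        work[pos] = MASK
    corrections: List[Tuple[int, int]] = []
    for pos in mask_positions:  # same deterministic order as the decoder
        ranked = rank_predictions(work, pos)
        true_rank = ranked.index(tokens[pos])
        if true_rank > 0 and (rank_budget is None or true_rank <= rank_budget):
            corrections.append((pos, true_rank))
            work[pos] = tokens[pos]   # decoder will recover the true token
        else:
            work[pos] = ranked[0]     # already correct, or an accepted error (lossy)
    return corrections


def epc_decode(
    masked: List[str],
    mask_positions: List[int],
    corrections: List[Tuple[int, int]],
    rank_predictions: RankFn,
) -> List[str]:
    """Fill each masked slot with the MLM's top prediction unless a stored
    rank-based correction overrides it."""
    override: Dict[int, int] = dict(corrections)
    out = list(masked)
    for pos in mask_positions:
        ranked = rank_predictions(out, pos)
        out[pos] = ranked[override.get(pos, 0)]
    return out
```

In a real codec the correction stream (positions and ranks) and the identity of the masked positions would be entropy-coded, and those bits are what the paper's bit accounting would weigh against reconstruction distortion; in this sketch, widening the mask set or shrinking the rank budget moves the operating point along the rate-distortion curve.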
Related papers
- COMPOT: Calibration-Optimized Matrix Procrustes Orthogonalization for Transformers Compression [5.280540253822294]
Post-training compression of Transformer models commonly relies on truncated singular value decomposition (SVD). We propose COMPOT, a training-free compression framework that uses a small calibration dataset to estimate a sparse weight factorization. COMPOT consistently delivers a superior quality-compression trade-off over strong low-rank and sparse baselines.
arXiv Detail & Related papers (2026-02-16T21:31:34Z)
- Don't be so Stief! Learning KV Cache low-rank approximation over the Stiefel manifold [7.162701793686856]
StiefAttention is a KV-cache compression method that learns orthonormal projection bases by directly minimizing output reconstruction error. It outperforms EigenAttention by 11.9 points on C4 perplexity and 5.4% on 0-shot MMLU accuracy at iso-compression, with lower relative error and higher cosine similarity with respect to the original decoder-layer outputs.
arXiv Detail & Related papers (2026-01-29T13:19:24Z)
- A Model-Driven Lossless Compression Algorithm Resistant to Mismatch [2.7930955543692817]
We propose a new compression algorithm based on next-token prediction that is robust to arbitrarily large, but structured, prediction mismatches. Our results demonstrate reliable operation within the certified mismatch regime while achieving compression ratios that exceed those of commonly used compression methods.
arXiv Detail & Related papers (2026-01-25T04:07:21Z)
- Knowledge-Informed Neural Network for Complex-Valued SAR Image Recognition [51.03674130115878]
We introduce the Knowledge-Informed Neural Network (KINN), a lightweight framework built upon a novel "compression-aggregation-compression" architecture. KINN establishes a new state of the art in parameter-efficient recognition, offering exceptional generalization in data-scarce and out-of-distribution scenarios.
arXiv Detail & Related papers (2025-10-23T07:12:26Z)
- Activation-Informed Pareto-Guided Low-Rank Compression for Efficient LLM/VLM [11.762499172999886]
Large language models (LLMs) and vision-language models (VLMs) have achieved state-of-the-art performance, but they pose significant memory and compute challenges in deployment. We present a novel low-rank compression framework to address this challenge.
arXiv Detail & Related papers (2025-10-07T03:07:47Z)
- Accelerating Diffusion LLMs via Adaptive Parallel Decoding [50.9948753314669]
We introduce adaptive parallel decoding (APD), a novel method that dynamically adjusts the number of tokens sampled in parallel. APD provides markedly higher throughput with minimal quality degradation on downstream benchmarks.
arXiv Detail & Related papers (2025-05-31T06:10:10Z)
- Bridging Predictive Coding and MDL: A Two-Part Code Framework for Deep Learning [1.749935196721634]
We prove that layerwise PC performs block-coordinate descent on the minimum description length objective. We also prove that each PC sweep monotonically decreases the empirical two-part codelength, yielding tighter high-probability risk bounds. This is the first result offering formal generalization and convergence guarantees for PC-trained deep models.
arXiv Detail & Related papers (2025-05-20T17:25:16Z)
- Denoising Diffusion Probabilistic Model for Point Cloud Compression at Low Bit-Rates [22.076896401919683]
This paper proposes a Denoising Diffusion Probabilistic Model (DDPM) architecture for point cloud compression. A PointNet encoder produces the condition vector for the generation, which is then quantized via a learnable vector quantizer. Experiments on ShapeNet and ModelNet40 show improved rate-distortion at low rates compared to standardized and state-of-the-art approaches.
arXiv Detail & Related papers (2025-05-19T16:29:12Z)
- Choose Your Model Size: Any Compression of Large Language Models Without Re-Computation [10.376875638696504]
This work presents Any Compression via Iterative Pruning (ACIP), a novel algorithmic approach to determine a compression-performance trade-off. We use an SVD-reparametrization of linear layers and iteratively prune their singular values with a sparsity-inducing penalty. We show that ACIP seamlessly complements common quantization-based compression techniques.
arXiv Detail & Related papers (2025-02-03T18:40:58Z)
- CALLIC: Content Adaptive Learning for Lossless Image Compression [64.47244912937204]
CALLIC sets a new state of the art (SOTA) for learned lossless image compression. We propose a content-aware autoregressive self-attention mechanism that leverages convolutional gating operations. During encoding, we decompose pre-trained layers, including depth-wise convolutions, using low-rank matrices and then adapt the incremental weights on the test image by Rate-guided Progressive Fine-Tuning (RPFT). RPFT fine-tunes with gradually increasing patches, sorted in descending order by estimated entropy, optimizing the learning process and reducing adaptation time.
arXiv Detail & Related papers (2024-12-23T10:41:18Z)
- Autoregressive Speech Synthesis without Vector Quantization [135.4776759536272]
We present MELLE, a novel continuous-valued, token-based language modeling approach for text-to-speech synthesis (TTS). MELLE autoregressively generates continuous mel-spectrogram frames directly from the text condition. MELLE mitigates robustness issues by avoiding the inherent flaws of sampling vector-quantized codes.
arXiv Detail & Related papers (2024-07-11T14:36:53Z)
- Lightweight Attribute Localizing Models for Pedestrian Attribute Recognition [13.480231032159834]
We propose a novel approach for determining the optimal ranks of low-rank layers, ensuring that the gradient direction of the compressed model closely aligns with that of the original model. This means that the compressed model effectively preserves the update direction of the full model, enabling more efficient compression for Pedestrian Attribute Recognition tasks.
arXiv Detail & Related papers (2023-06-16T13:07:13Z)
- CrAM: A Compression-Aware Minimizer [103.29159003723815]
We propose a new compression-aware minimizer dubbed CrAM that modifies the optimization step in a principled way.
CrAM produces dense models that can be more accurate than the standard SGD/Adam-based baselines while remaining stable under weight pruning.
CrAM can produce sparse models which perform well for transfer learning, and it also works for semi-structured 2:4 pruning patterns supported by GPU hardware.
arXiv Detail & Related papers (2022-07-28T16:13:28Z)
- Modeling Lost Information in Lossy Image Compression [72.69327382643549]
Lossy image compression is one of the most commonly used operators for digital images.
We propose a novel invertible framework called Invertible Lossy Compression (ILC) to largely mitigate the information loss problem.
arXiv Detail & Related papers (2020-06-22T04:04:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.