Learned Compression of Nonlinear Time Series With Random Access
- URL: http://arxiv.org/abs/2412.16266v1
- Date: Fri, 20 Dec 2024 10:30:06 GMT
- Title: Learned Compression of Nonlinear Time Series With Random Access
- Authors: Andrea Guerra, Giorgio Vinciguerra, Antonio Boffa, Paolo Ferragina
- Abstract summary: Time series play a crucial role in many fields, including finance, healthcare, industry, and environmental monitoring.
We introduce NeaTS, a randomly-accessible compression scheme that approximates the time series with a sequence of nonlinear functions.
Our experiments show that NeaTS improves the compression ratio of the state-of-the-art lossy compressors by up to 14%.
- Score: 2.564905016909138
- License:
- Abstract: Time series play a crucial role in many fields, including finance, healthcare, industry, and environmental monitoring. The storage and retrieval of time series can be challenging due to their unstoppable growth. In fact, these applications often sacrifice precious historical data to make room for new data. General-purpose compressors can mitigate this problem with their good compression ratios, but they lack efficient random access on compressed data, thus preventing real-time analyses. Ad-hoc streaming solutions, instead, typically optimise only for compression and decompression speed, while giving up compression effectiveness and random access functionality. Furthermore, all these methods lack awareness of certain special regularities of time series, whose trends over time can often be described by some linear and nonlinear functions. To address these issues, we introduce NeaTS, a randomly-accessible compression scheme that approximates the time series with a sequence of nonlinear functions of different kinds and shapes, carefully selected and placed by a partitioning algorithm to minimise the space. The approximation residuals are bounded, which allows storing them in little space and thus recovering the original data losslessly, or simply discarding them to obtain a lossy time series representation with maximum error guarantees. Our experiments show that NeaTS improves the compression ratio of the state-of-the-art lossy compressors that use linear or nonlinear functions (or both) by up to 14%. Compared to lossless compressors, NeaTS emerges as the only approach to date providing, simultaneously, compression ratios close to or better than the best existing compressors, a much faster decompression speed, and orders of magnitude more efficient random access, thus enabling the storage and real-time analysis of massive and ever-growing amounts of (historical) time series data.
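The bounded-residual idea at the heart of NeaTS can be illustrated with a short sketch. The Python below is illustrative only: the greedy segmentation, the two candidate function families, and names such as `EPS`, `compress`, and `access` are assumptions, whereas the paper selects among several nonlinear functions of different kinds and shapes and places them with a partitioning algorithm that minimises the space.

```python
import numpy as np

EPS = 0.5  # hypothetical maximum absolute error per sample (an assumption, not a value from the paper)

def fit_linear(t, y):
    a, b = np.polyfit(t, y, 1)
    return ("linear", (a, b)), a * t + b

def fit_quadratic(t, y):
    c = np.polyfit(t, y, 2)
    return ("quadratic", tuple(c)), np.polyval(c, t)

# (minimum segment length, fitter); the paper uses several nonlinear families of
# different kinds and shapes, not just these two.
CANDIDATES = [(2, fit_linear), (3, fit_quadratic)]

def compress(y):
    """Greedy segmentation sketch: grow each segment while some candidate function
    keeps every residual within EPS (the paper instead places the pieces with a
    partitioning algorithm that minimises the compressed size)."""
    pieces, start, n = [], 0, len(y)
    while start < n:
        end, best = start + 2, None
        while end <= n:
            t = np.arange(start, end)
            fits = [f(t, y[start:end]) for k, f in CANDIDATES if end - start >= k]
            ok = [fit for fit in fits
                  if np.max(np.abs(y[start:end] - fit[1])) <= EPS]
            if not ok:
                break
            best = min(ok, key=lambda fit: np.max(np.abs(y[start:end] - fit[1])))
            end += 1
        if best is None:  # degenerate tail of a single sample: store it as a constant piece
            c = float(y[start])
            best = (("linear", (0.0, c)), np.full(n - start, c))
        params, approx = best
        length = len(approx)
        pieces.append((start, length, params, y[start:start + length] - approx))
        start += length
    return pieces

def access(pieces, i):
    """Random access: find the piece covering position i, evaluate its function
    at i, and add back the stored residual (skip it for the lossy mode)."""
    for start, length, (kind, coef), residuals in pieces:
        if start <= i < start + length:
            value = coef[0] * i + coef[1] if kind == "linear" else np.polyval(coef, i)
            return float(value + residuals[i - start])
    raise IndexError(i)

y = np.cumsum(np.random.default_rng(0).normal(size=200))  # toy time series
pieces = compress(y)
print(len(pieces), "pieces; error:", abs(access(pieces, 123) - y[123]))
```

Dropping the stored residuals yields a lossy representation with the same maximum-error guarantee; keeping them (they fall in the small range [-EPS, EPS], so they take little space) allows lossless recovery, and decoding one element only requires evaluating the piece that covers it, which is what enables random access.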
Related papers
- CAMEO: Autocorrelation-Preserving Line Simplification for Lossy Time Series Compression [7.938342455750219]
We propose a new lossy compression method that provides guarantees on the autocorrelation and partial-autocorrelation functions of a time series.
Our method improves compression ratios by 2x on average and up to 54x on selected datasets.
arXiv Detail & Related papers (2025-01-24T11:59:51Z)
- Fast Feedforward 3D Gaussian Splatting Compression [55.149325473447384]
FCGS is an optimization-free model that compresses 3D Gaussian Splatting (3DGS) representations rapidly in a single feed-forward pass.
FCGS achieves a compression ratio of over 20X while maintaining fidelity, surpassing most per-scene SOTA optimization-based methods.
arXiv Detail & Related papers (2024-10-10T15:13:08Z)
- LoRC: Low-Rank Compression for LLMs KV Cache with a Progressive Compression Strategy [59.1298692559785]
The Key-Value (KV) cache is a crucial component in serving transformer-based autoregressive large language models (LLMs).
Existing approaches to mitigate this issue include: (1) efficient attention variants integrated in upcycling stages and (2) KV cache compression at test time.
We propose a low-rank approximation of KV weight matrices, allowing plug-in integration with existing transformer-based LLMs without model retraining (a minimal sketch of the low-rank idea appears after this list).
Our method is designed to function without model tuning in upcycling stages or task-specific profiling in test stages.
arXiv Detail & Related papers (2024-10-04T03:10:53Z)
- Hyper-Compression: Model Compression via Hyperfunction [20.47369296713829]
We propose the so-called hyper-compression, inspired by the parsimonious relationship between genotype and phenotype.
It compresses LLaMA2-7B in an hour and achieves close-to-int4-quantization performance, without retraining.
Our work can help reconcile the scaling law with the stagnation of hardware upgrades.
arXiv Detail & Related papers (2024-09-01T02:57:41Z)
- In-Context Former: Lightning-fast Compressing Context for Large Language Model [48.831304302467004]
In this paper, we propose a new approach to compress the long input contexts of Transformer-based large language models (LLMs).
We use the cross-attention mechanism and a small number of learnable digest tokens to condense information from the contextual word embeddings.
Experimental results indicate that our method requires only 1/32 of the floating-point operations of the baseline during compression and improves processing speed by 68 to 112 times.
arXiv Detail & Related papers (2024-06-19T15:14:55Z)
- What Operations can be Performed Directly on Compressed Arrays, and with What Error? [1.3307486544794784]
We develop a lossy compressor that allows a dozen fairly fundamental operations directly on compressed data.
We evaluate it on three non-trivial applications, choosing different number systems for internal representation.
arXiv Detail & Related papers (2024-06-17T05:01:09Z)
- Deep Dict: Deep Learning-based Lossy Time Series Compressor for IoT Data [15.97162100346596]
Deep Dict is a lossy time series compressor designed to achieve a high compression ratio while maintaining decompression error within a predefined range.
The Bernoulli transformer autoencoder (BTAE) extracts Bernoulli representations from time series data, reducing the size of the representations compared to conventional autoencoders.
To address the limitations of common regression losses such as L1/L2, we introduce a novel loss function called quantized entropy loss (QEL).
arXiv Detail & Related papers (2024-01-18T22:10:21Z)
- DiffRate: Differentiable Compression Rate for Efficient Vision Transformers [98.33906104846386]
Token compression aims to speed up large-scale vision transformers (e.g. ViTs) by pruning (dropping) or merging tokens.
DiffRate is a novel token compression method with several appealing properties that prior art lacks.
arXiv Detail & Related papers (2023-05-29T10:15:19Z)
- Latent Discretization for Continuous-time Sequence Compression [21.062288207034968]
In this work, we treat data sequences as observations from an underlying continuous-time process.
We show that our approaches can automatically achieve reductions in bit rates by learning how to discretize.
arXiv Detail & Related papers (2022-12-28T01:15:27Z)
- Once-for-All Sequence Compression for Self-Supervised Speech Models [62.60723685118747]
We introduce a once-for-all sequence compression framework for self-supervised speech models.
The framework is evaluated on various tasks, showing marginal degradation compared to the fixed compressing rate variants.
We also explore adaptive compressing rate learning, demonstrating the ability to select task-specific preferred frame periods without needing a grid search.
arXiv Detail & Related papers (2022-11-04T09:19:13Z)
- Deep Lossy Plus Residual Coding for Lossless and Near-lossless Image Compression [85.93207826513192]
We propose a unified and powerful deep lossy plus residual (DLPR) coding framework for both lossless and near-lossless image compression.
We solve the joint lossy and residual compression problem within the framework of variational autoencoders (VAEs).
In the near-lossless mode, we quantize the original residuals to satisfy a given $\ell_\infty$ error bound.
arXiv Detail & Related papers (2022-09-11T12:11:56Z)
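As referenced in the LoRC entry above, here is a minimal sketch of the generic low-rank idea it summarises. Everything in it is an assumption for illustration (the truncated-SVD factorisation, the toy matrix sizes, and the name `low_rank_factorise`); the paper pairs the factorisation with a progressive compression strategy and targets the KV cache of real LLMs.

```python
import numpy as np

def low_rank_factorise(W, rank):
    """Return A (d_out x rank) and B (rank x d_in) such that A @ B approximates W."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :rank] * S[:rank], Vt[:rank, :]  # singular values folded into A

rng = np.random.default_rng(0)
W_k = rng.normal(size=(512, 512)) / 512 ** 0.5  # toy stand-in for a key-projection weight
A, B = low_rank_factorise(W_k, rank=64)         # 512*512 floats -> 2*512*64 floats

x = rng.normal(size=(10, 512))                  # 10 token embeddings
k_full = x @ W_k.T                              # keys from the original projection
k_low = x @ B.T @ A.T                           # keys from the rank-64 projection
print("relative error:", np.linalg.norm(k_full - k_low) / np.linalg.norm(k_full))
```

A random toy matrix like `W_k` has a nearly flat spectrum, so the reported relative error is large; trained projection weights typically have faster-decaying singular values, which is what makes rank reduction pay off in practice.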