CAMEO: Autocorrelation-Preserving Line Simplification for Lossy Time Series Compression
- URL: http://arxiv.org/abs/2501.14432v1
- Date: Fri, 24 Jan 2025 11:59:51 GMT
- Title: CAMEO: Autocorrelation-Preserving Line Simplification for Lossy Time Series Compression
- Authors: Carlos Enrique Muñiz-Cuza, Matthias Boehm, Torben Bach Pedersen
- Abstract summary: We propose a new lossy compression method that provides guarantees on the autocorrelation and partial-autocorrelation functions of a time series.
Our method improves compression ratios by 2x on average and up to 54x on selected datasets.
- Score: 7.938342455750219
- Abstract: Time series data from a variety of sensors and IoT devices need effective compression to reduce storage and I/O bandwidth requirements. While most time series databases and systems rely on lossless compression, lossy techniques offer even greater space-saving with a small loss in precision. However, the unknown impact on downstream analytics applications requires a semi-manual trial-and-error exploration. We initiate work on lossy compression that provides guarantees on complex statistical features (which are strongly correlated with the accuracy of the downstream analytics). Specifically, we propose a new lossy compression method that provides guarantees on the autocorrelation and partial-autocorrelation functions (ACF/PACF) of a time series. Our method leverages line simplification techniques as well as incremental maintenance of aggregates, blocking, and parallelization strategies for effective and efficient compression. The results show that our method improves compression ratios by 2x on average and up to 54x on selected datasets, compared to previous lossy and lossless compression methods. Moreover, we maintain -- and sometimes even improve -- the forecasting accuracy by preserving the autocorrelation properties of the time series. Our framework is extensible to multivariate time series and other statistical features of the time series.
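The key mechanism combines line simplification (dropping points that linear interpolation between retained points can approximately recover) with a hard bound on how far the ACF of the reconstruction may drift from the ACF of the original series. Below is a minimal illustrative sketch of that idea in Python; the function names are invented for illustration, and the naive full recomputation here is exactly the cost that the authors' incremental maintenance of aggregates, blocking, and parallelization strategies are designed to avoid.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation function for lags 1..max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[:-k], x[k:]) / denom for k in range(1, max_lag + 1)])

def simplify_with_acf_guarantee(t, v, max_lag=20, eps=0.01):
    """Greedily drop the interior point whose removal perturbs the ACF least,
    stopping before any removal would push the max ACF deviation above eps.
    Naive O(n^2 * max_lag) recomputation -- illustrative only."""
    t, v = np.asarray(t, dtype=float), np.asarray(v, dtype=float)
    ref = acf(v, max_lag)
    keep = list(range(len(v)))
    while len(keep) > max_lag + 2:
        best_i, best_err = None, np.inf
        for i in range(1, len(keep) - 1):           # endpoints are always kept
            cand = keep[:i] + keep[i + 1:]
            recon = np.interp(t, t[cand], v[cand])  # piecewise-linear decode
            err = np.max(np.abs(acf(recon, max_lag) - ref))
            if err < best_err:
                best_i, best_err = i, err
        if best_err > eps:                          # guarantee would break
            break
        del keep[best_i]
    return keep
```

For instance, `simplify_with_acf_guarantee(np.arange(500.0), v)` returns the indices of the retained points; storing only those (timestamp, value) pairs gives the compressed representation, and linear interpolation reconstructs the rest.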
Related papers
- Learned Compression of Nonlinear Time Series With Random Access [2.564905016909138]
Time series play a crucial role in many fields, including finance, healthcare, industry, and environmental monitoring.
We introduce NeaTS, a randomly-accessible compression scheme that approximates the time series with a sequence of nonlinear functions.
Our experiments show that NeaTS improves the compression ratio of the state-of-the-art lossy compressors by up to 14%.
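As a loose sketch of the general idea (the function family, greedy fitting, and encoding below are assumptions for illustration, not NeaTS's actual design): grow each segment for as long as a nonlinear fit, here a quadratic, stays within a pointwise error bound, and record segment start positions so any single value can be decompressed via binary search without touching the rest of the stream.

```python
import bisect
import numpy as np

def neats_like_compress(v, eps=0.5):
    """Cover v with quadratic segments, each grown greedily while the
    max pointwise error stays within eps. Returns (start_index, coeffs)."""
    v = np.asarray(v, dtype=float)
    segments, i = [], 0
    while i < len(v):
        j = min(i + 2, len(v) - 1)                  # start from >= 3 points
        coeffs = None
        while j < len(v):
            x = np.arange(i, j + 1)
            c = np.polyfit(x, v[i:j + 1], deg=min(2, j - i))
            if np.max(np.abs(np.polyval(c, x) - v[i:j + 1])) > eps:
                break
            coeffs, j = c, j + 1
        if coeffs is None:                          # degenerate tail segment
            coeffs = np.polyfit([i], [v[i]], deg=0)
            j = i + 1
        segments.append((i, coeffs))
        i = j
    return segments

def random_access(segments, k):
    """Decompress only position k via binary search over segment starts."""
    starts = [s for s, _ in segments]
    _, coeffs = segments[bisect.bisect_right(starts, k) - 1]
    return np.polyval(coeffs, k)
```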
arXiv Detail & Related papers (2024-12-20T10:30:06Z)
- LoRC: Low-Rank Compression for LLMs KV Cache with a Progressive Compression Strategy [59.1298692559785]
The Key-Value (KV) cache is a crucial component in serving transformer-based autoregressive large language models (LLMs), but its memory footprint grows with sequence length.
Existing approaches to mitigate this issue include (1) efficient attention variants integrated in upcycling stages and (2) KV cache compression at test time.
We propose a low-rank approximation of KV weight matrices, allowing plug-in integration with existing transformer-based LLMs without model retraining.
Our method is designed to function without model tuning in upcycling stages or task-specific profiling in test stages.
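A hedged sketch of the core low-rank idea (ranks, shapes, and the caching choice are illustrative assumptions, not LoRC's exact scheme): truncated SVD factors a key/value projection matrix into two thin matrices, and the narrow intermediate activations can be cached instead of full-width keys or values.

```python
import torch

def low_rank_factors(W: torch.Tensor, rank: int):
    """Factor W (d_out x d_in) into thin A (d_out x r) @ B (r x d_in)."""
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]      # absorb singular values into A
    B = Vh[:rank, :]
    return A, B

W_k = torch.randn(4096, 4096)       # a key-projection weight (illustrative)
A, B = low_rank_factors(W_k, rank=256)

x = torch.randn(1, 4096)            # one token's hidden state
cached = x @ B.T                    # (1, 256): the narrow activation to cache
k = cached @ A.T                    # (1, 4096): keys reconstructed on demand
print(torch.norm(k - x @ W_k.T) / torch.norm(x @ W_k.T))  # relative error
```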
arXiv Detail & Related papers (2024-10-04T03:10:53Z)
- In-Context Former: Lightning-fast Compressing Context for Large Language Model [48.831304302467004]
In this paper, we propose a new approach to compress the long input contexts of Transformer-based large language models (LLMs).
We use the cross-attention mechanism and a small number of learnable digest tokens to condense information from the contextual word embeddings.
Experimental results indicate that our method requires only 1/32 of the floating-point operations of the baseline during compression and improves processing speed by 68 to 112 times.
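An illustrative sketch of the digest-token mechanism (module layout, sizes, and the class name are assumptions, not the paper's exact architecture): a small set of learnable query vectors cross-attends over the context embeddings, so compression cost scales with the number of digest tokens rather than with full self-attention over the context.

```python
import torch
import torch.nn as nn

class DigestCompressor(nn.Module):
    """Condense a long context into a fixed number of digest vectors."""
    def __init__(self, d_model=768, n_digest=16, n_heads=8):
        super().__init__()
        self.digest = nn.Parameter(torch.randn(n_digest, d_model) * 0.02)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, context_emb):              # (batch, seq_len, d_model)
        b = context_emb.size(0)
        q = self.digest.unsqueeze(0).expand(b, -1, -1)
        out, _ = self.attn(q, context_emb, context_emb)  # cross-attention
        return out                               # (batch, n_digest, d_model)

ctx = torch.randn(2, 1024, 768)                  # a long context, embedded
print(DigestCompressor()(ctx).shape)             # torch.Size([2, 16, 768])
```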
arXiv Detail & Related papers (2024-06-19T15:14:55Z)
- Compression of Structured Data with Autoencoders: Provable Benefit of Nonlinearities and Depth [83.15263499262824]
We prove that gradient descent converges to a solution that completely disregards the sparse structure of the input.
We show how to improve upon Gaussian performance for the compression of sparse data by adding a denoising function to a shallow architecture.
We validate our findings on image datasets, such as CIFAR-10 and MNIST.
arXiv Detail & Related papers (2024-02-07T16:32:29Z)
- Deep Dict: Deep Learning-based Lossy Time Series Compressor for IoT Data [15.97162100346596]
Deep Dict is a lossy time series compressor designed to achieve a high compression ratio while maintaining decompression error within a predefined range.
A Bernoulli transformer autoencoder (BTAE) extracts Bernoulli representations from time series data, reducing the size of the representations compared to conventional autoencoders.
To address the limitations of common regression losses such as L1/L2, we introduce a novel loss function called quantized entropy loss (QEL).
arXiv Detail & Related papers (2024-01-18T22:10:21Z)
- Lossy and Lossless (L$^2$) Post-training Model Size Compression [12.926354646945397]
We propose a post-training model size compression method that combines lossy and lossless compression in a unified way.
Our method can achieve a stable $10\times$ compression ratio without sacrificing accuracy and a $20\times$ compression ratio with minor accuracy loss in a short time.
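A generic two-stage sketch of the lossy-plus-lossless pattern (uniform 8-bit quantization followed by entropy coding via zlib; the paper's unified, jointly optimized formulation is not reproduced here):

```python
import zlib
import numpy as np

def compress_weights(w: np.ndarray, n_bits: int = 8):
    """Lossy stage: uniform quantization. Lossless stage: zlib."""
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / (2 ** n_bits - 1) or 1.0   # avoid div-by-zero
    q = np.round((w - lo) / scale).astype(np.uint8)
    return zlib.compress(q.tobytes(), level=9), lo, scale, w.shape

def decompress_weights(blob, lo, scale, shape):
    q = np.frombuffer(zlib.decompress(blob), dtype=np.uint8)
    return (q.astype(np.float32) * scale + lo).reshape(shape)

w = np.random.randn(512, 512).astype(np.float32)
blob, lo, scale, shape = compress_weights(w)
print("ratio:", w.nbytes / len(blob))              # end-to-end size reduction
print("max err:", np.abs(decompress_weights(blob, lo, scale, shape) - w).max())
```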
arXiv Detail & Related papers (2023-08-08T14:10:16Z)
- Learning Accurate Performance Predictors for Ultrafast Automated Model Compression [86.22294249097203]
We propose an ultrafast automated model compression framework called SeerNet for flexible network deployment.
Our method achieves competitive accuracy-complexity trade-offs with significant reduction of the search cost.
arXiv Detail & Related papers (2023-04-13T10:52:49Z)
- Unrolled Compressed Blind-Deconvolution [77.88847247301682]
Sparse multichannel blind deconvolution (S-MBD) arises frequently in many engineering applications such as radar/sonar/ultrasound imaging.
We propose a compression method that enables blind recovery from far fewer measurements than the full received signal in time.
arXiv Detail & Related papers (2022-09-28T15:16:58Z)
- Practical Network Acceleration with Tiny Sets [38.742142493108744]
Network compression is effective in accelerating the inference of deep neural networks, but it often requires fine-tuning with all the training data to recover from the accuracy loss.
We propose a method named PRACTISE to accelerate the network with tiny sets of training images.
arXiv Detail & Related papers (2022-02-16T05:04:38Z) - Compressing gradients by exploiting temporal correlation in momentum-SGD [17.995905582226463]
We analyze compression methods that exploit temporal correlation in systems with and without error-feedback.
Experiments with the ImageNet dataset demonstrate that our proposed methods offer significant reduction in the rate of communication.
We prove the convergence of SGD under an expected error assumption by establishing a bound for the minimum gradient norm.
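A minimal sketch of error feedback with top-k sparsification (a standard scheme; the temporal-correlation predictors for momentum-SGD studied in the paper are not modeled here): whatever the compressor drops is added back into the next step's gradient, so compression error does not accumulate.

```python
import numpy as np

class ErrorFeedbackCompressor:
    """Top-k gradient sparsifier with an error-feedback residual."""
    def __init__(self, shape, k):
        self.residual = np.zeros(shape)      # error carried to the next step
        self.k = k

    def step(self, grad):
        corrected = grad + self.residual
        flat = corrected.ravel()
        idx = np.argsort(np.abs(flat))[-self.k:]   # keep k largest entries
        sparse = np.zeros_like(flat)
        sparse[idx] = flat[idx]
        sparse = sparse.reshape(corrected.shape)
        self.residual = corrected - sparse   # feed back what was dropped
        return sparse                        # what actually gets communicated

comp = ErrorFeedbackCompressor(shape=(1000,), k=10)
sent = comp.step(np.random.randn(1000))      # 10 nonzeros out of 1000
```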
arXiv Detail & Related papers (2021-08-17T18:04:06Z) - Towards Compact CNNs via Collaborative Compression [166.86915086497433]
We propose a Collaborative Compression scheme, which jointly applies channel pruning and tensor decomposition to compress CNN models.
We achieve 52.9% FLOPs reduction by removing 48.4% parameters on ResNet-50 with only a Top-1 accuracy drop of 0.56% on ImageNet 2012.
arXiv Detail & Related papers (2021-05-24T12:07:38Z)