Lossless Compression: A New Benchmark for Time Series Model Evaluation
- URL: http://arxiv.org/abs/2509.21002v1
- Date: Thu, 25 Sep 2025 10:52:48 GMT
- Title: Lossless Compression: A New Benchmark for Time Series Model Evaluation
- Authors: Meng Wan, Benxi Tian, Jue Wang, Cui Hui, Ningming Nie, Tiantian Liu, Zongguo Wang, Cao Rongqiang, Peng Shi, Yangang Wang
- Abstract summary: We introduce lossless compression as a new paradigm for evaluating time series models. This perspective establishes a direct equivalence between optimal compression length and the negative log-likelihood. We propose and open-source a comprehensive evaluation framework, TSCom-Bench.
- Score: 20.540426615530556
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The evaluation of time series models has traditionally focused on four canonical tasks: forecasting, imputation, anomaly detection, and classification. While these tasks have driven significant progress, they primarily assess task-specific performance and do not rigorously measure whether a model captures the full generative distribution of the data. We introduce lossless compression as a new paradigm for evaluating time series models, grounded in Shannon's source coding theorem. This perspective establishes a direct equivalence between optimal compression length and the negative log-likelihood, providing a strict and unified information-theoretic criterion for modeling capacity. We then define a standardized evaluation protocol and metrics. We further propose and open-source a comprehensive evaluation framework, TSCom-Bench, which enables the rapid adaptation of time series models as backbones for lossless compression. Experiments with state-of-the-art models, including TimeXer, iTransformer, and PatchTST, across diverse datasets demonstrate that compression reveals distributional weaknesses overlooked by classic benchmarks. These findings position lossless compression as a principled task that complements and extends existing evaluation for time series modeling.
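The compression-likelihood equivalence at the heart of the abstract can be sketched in a few lines: under an ideal entropy coder, the code length a probabilistic model assigns to a sequence equals its negative log-likelihood in bits. The toy uniform predictive model below is a hypothetical stand-in for illustration only, not the paper's TSCom-Bench backbone or protocol.

```python
import math

def nll_bits(sequence, model):
    """Negative log2-likelihood of a sequence under a predictive model.

    Equals the ideal (entropy-coded) compressed length in bits:
    sum over t of -log2 P(x_t | x_<t).
    """
    total = 0.0
    for t, symbol in enumerate(sequence):
        p = model(sequence[:t])[symbol]  # P(x_t | x_<t)
        total += -math.log2(p)
    return total

def uniform_model(context):
    # Assumed toy model: uniform over a 4-symbol quantized alphabet,
    # ignoring the context entirely.
    return {s: 0.25 for s in range(4)}

seq = [0, 3, 1, 2]
bits = nll_bits(seq, uniform_model)
# Uniform over 4 symbols gives 2 bits per symbol, so 8.0 bits total.
```

A sharper model concentrates probability on the symbols that actually occur, lowering the negative log-likelihood and hence the achievable compressed length; this is what lets compressed size serve as a strict measure of distributional fit.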
Related papers
- Are We Using the Right Benchmark: An Evaluation Framework for Visual Token Compression Methods [54.4711434793961]
We show that simple image downsampling consistently outperforms many advanced compression methods across multiple widely used benchmarks. Motivated by these findings, we introduce VTC-Bench, an evaluation framework that incorporates a data filtering mechanism to denoise existing benchmarks.
arXiv Detail & Related papers (2025-10-08T15:44:28Z) - ARIES: Relation Assessment and Model Recommendation for Deep Time Series Forecasting [54.57031153712623]
ARIES is a framework for assessing the relationship between time series properties and modeling strategies. We propose the first deep forecasting model recommender, capable of providing interpretable suggestions for real-world time series.
arXiv Detail & Related papers (2025-09-07T13:57:14Z) - Enhancing Transformer-Based Foundation Models for Time Series Forecasting via Bagging, Boosting and Statistical Ensembles [7.787518725874443]
Time series foundation models (TSFMs) have shown strong generalization and zero-shot capabilities for time series forecasting, anomaly detection, classification, and imputation. This paper investigates a suite of statistical and ensemble-based enhancement techniques to improve robustness and accuracy.
arXiv Detail & Related papers (2025-08-18T04:06:26Z) - Sundial: A Family of Highly Capable Time Series Foundation Models [64.6322079384575]
We introduce Sundial, a family of native, flexible, and scalable time series foundation models. Our models are pre-trained without specifying any prior distribution and can generate multiple probable predictions. Sundial achieves state-of-the-art results on both point and probabilistic forecasting benchmarks with a just-in-time inference speed.
arXiv Detail & Related papers (2025-02-02T14:52:50Z) - Towards Pattern-aware Data Augmentation for Temporal Knowledge Graph Completion [18.51546761241817]
We introduce Booster, the first data augmentation strategy for temporal knowledge graphs. We propose a hierarchical scoring algorithm based on triadic closures within TKGs. We also propose a two-stage training approach to identify samples that deviate from the model's preferred patterns.
arXiv Detail & Related papers (2024-12-31T03:47:19Z) - Recurrent Neural Goodness-of-Fit Test for Time Series [8.22915954499148]
Time series data are crucial across diverse domains such as finance and healthcare. Traditional evaluation metrics fall short due to the temporal dependencies and potential high dimensionality of the features. We propose the REcurrent NeurAL (RENAL) Goodness-of-Fit test, a novel and statistically rigorous framework for evaluating generative time series models.
arXiv Detail & Related papers (2024-10-17T19:32:25Z) - TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models [52.454274602380124]
Diffusion models heavily depend on the time-step $t$ to achieve satisfactory multi-round denoising.
We propose a Temporal Feature Maintenance Quantization (TFMQ) framework building upon a Temporal Information Block.
Powered by the pioneering block design, we devise temporal information aware reconstruction (TIAR) and finite set calibration (FSC) to align the full-precision temporal features.
arXiv Detail & Related papers (2023-11-27T12:59:52Z) - OrionBench: Benchmarking Time Series Generative Models in the Service of the End-User [8.05635934199494]
OrionBench is a continuous benchmarking framework for unsupervised time series anomaly detection models.
We show how to use OrionBench, and the performance of pipelines across 17 releases published over the course of four years.
arXiv Detail & Related papers (2023-10-26T19:43:16Z) - SynBench: Task-Agnostic Benchmarking of Pretrained Representations using Synthetic Data [78.21197488065177]
Recent success in fine-tuning large models, that are pretrained on broad data at scale, on downstream tasks has led to a significant paradigm shift in deep learning.
This paper proposes a new task-agnostic framework, SynBench, to measure the quality of pretrained representations using synthetic data.
arXiv Detail & Related papers (2022-10-06T15:25:00Z) - What do Compressed Large Language Models Forget? Robustness Challenges in Model Compression [68.82486784654817]
We study two popular model compression techniques including knowledge distillation and pruning.
We show that compressed models are significantly less robust than their PLM counterparts on adversarial test sets.
We develop a regularization strategy for model compression based on sample uncertainty.
arXiv Detail & Related papers (2021-10-16T00:20:04Z) - Model Compression for Dynamic Forecast Combination [9.281199058905017]
We show that compressing dynamic forecasting ensembles into an individual model leads to a comparable predictive performance.
We also show that the compressed individual model with best average rank is a rule-based regression model.
arXiv Detail & Related papers (2021-04-05T09:55:35Z) - Anomaly Detection of Time Series with Smoothness-Inducing Sequential Variational Auto-Encoder [59.69303945834122]
We present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of time series.
Our model parameterizes mean and variance for each time-stamp with flexible neural networks.
We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.
arXiv Detail & Related papers (2021-02-02T06:15:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.