How Patterns Dictate Learnability in Sequential Data
- URL: http://arxiv.org/abs/2510.10744v1
- Date: Sun, 12 Oct 2025 18:31:39 GMT
- Title: How Patterns Dictate Learnability in Sequential Data
- Authors: Mario Morawski, Anais Despres, Rémi Rehm
- Abstract summary: We introduce a framework based on predictive information, defined as the mutual information between the past and the future. We show that the presence or absence of temporal patterns fundamentally constrains the learnability of sequential models.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sequential data - ranging from financial time series to natural language - has driven the growing adoption of autoregressive models. However, these algorithms rely on the presence of underlying patterns in the data, and their identification often depends heavily on human expertise. Misinterpreting these patterns can lead to model misspecification, resulting in increased generalization error and degraded performance. The recently proposed evolving pattern (EvoRate) metric addresses this by using the mutual information between the next data point and its past to guide regression order estimation and feature selection. Building on this idea, we introduce a general framework based on predictive information, defined as the mutual information between the past and the future, $I(X_{\text{past}}; X_{\text{future}})$. This quantity naturally defines an information-theoretic learning curve, which quantifies the amount of predictive information available as the observation window grows. Using this formalism, we show that the presence or absence of temporal patterns fundamentally constrains the learnability of sequential models: even an optimal predictor cannot outperform the intrinsic information limit imposed by the data. We validate our framework through experiments on synthetic data, demonstrating its ability to assess model adequacy, quantify the inherent complexity of a dataset, and reveal interpretable structure in sequential data.
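To make the learning-curve idea concrete, here is a minimal sketch (not the authors' code) that estimates predictive information on a synthetic binary Markov chain with a plug-in mutual-information estimator, using the next symbol as the "future" and a growing window as the "past"; function names such as `learning_curve` are illustrative.

```python
import numpy as np
from collections import Counter

def simulate_markov_chain(n, p_stay=0.9, seed=0):
    """Binary first-order Markov chain: stay in the current state w.p. p_stay."""
    rng = np.random.default_rng(seed)
    x = np.empty(n, dtype=int)
    x[0] = rng.integers(2)
    flips = rng.random(n - 1) > p_stay  # flip w.p. 1 - p_stay
    for t in range(1, n):
        x[t] = x[t - 1] ^ int(flips[t - 1])
    return x

def plug_in_mi(pairs):
    """Plug-in mutual information (in bits) from a list of (past, next) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    past = Counter(p for p, _ in pairs)
    nxt = Counter(f for _, f in pairs)
    mi = 0.0
    for (p, f), c in joint.items():
        pj = c / n
        mi += pj * np.log2(pj / ((past[p] / n) * (nxt[f] / n)))
    return mi

def learning_curve(x, max_window=8):
    """I(past window of length w ; next symbol) for w = 1..max_window."""
    curve = []
    for w in range(1, max_window + 1):
        pairs = [(tuple(x[t - w:t]), x[t]) for t in range(w, len(x))]
        curve.append(plug_in_mi(pairs))
    return curve

x = simulate_markov_chain(200_000)
for w, mi in enumerate(learning_curve(x), start=1):
    print(f"w={w}: I(past; next) ~ {mi:.4f} bits")
```

For a first-order chain the curve should plateau after a window of one symbol, since extra history adds no predictive information beyond the last state; that plateau is the intrinsic information limit the abstract refers to.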
Related papers
- From Entropy to Epiplexity: Rethinking Information for Computationally Bounded Intelligence [91.54446789584826]
Epiplexity is a formalization of information capturing what computationally bounded observers can learn from data. We show how information can be created with computation, how it depends on the ordering of the data, and how likelihood modeling can produce more complex programs than present in the data generating process itself.
arXiv Detail & Related papers (2026-01-06T18:04:03Z)
- Supervised learning pays attention [42.97070083645048]
In-context learning with attention enables large neural networks to make context-specific predictions by selectively focusing on relevant examples. The method fits a local model for each test observation by weighting the training data according to attention, a supervised similarity measure, so that it (1) flexibly fits personalized models for each prediction point and (2) retains simplicity and interpretability (a minimal sketch follows this entry).
arXiv Detail & Related papers (2025-12-10T18:43:46Z)
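The local-model idea in the entry above can be sketched in a few lines. This is a plausible reconstruction, not the paper's algorithm: a softmax over negative squared distances stands in for the learned, supervised attention measure, and the helper names are hypothetical.

```python
import numpy as np

def attention_weights(x_test, X_train, temperature=1.0):
    # Softmax over negative squared distances: a simple stand-in for a
    # learned, supervised attention/similarity measure.
    d2 = ((X_train - x_test) ** 2).sum(axis=1)
    logits = -d2 / temperature
    w = np.exp(logits - logits.max())
    return w / w.sum()

def local_linear_predict(x_test, X_train, y_train, temperature=1.0):
    # Fit a weighted least-squares linear model around this one test point.
    w = attention_weights(x_test, X_train, temperature)
    A = np.hstack([X_train, np.ones((len(X_train), 1))])  # add intercept
    W = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(A * W, y_train * W.ravel(), rcond=None)
    return np.append(x_test, 1.0) @ coef

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)  # nonlinear target
x_new = np.array([0.3, -1.2])
print(local_linear_predict(x_new, X, y, temperature=0.5))
```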
- Estimating Time Series Foundation Model Transferability via In-Context Learning [74.65355820906355]
Time series foundation models (TSFMs) offer strong zero-shot forecasting via large-scale pre-training. Fine-tuning remains critical for boosting performance in domains with limited public data. We introduce TimeTic, a transferability estimation framework that recasts model selection as an in-context-learning problem.
arXiv Detail & Related papers (2025-09-28T07:07:13Z)
- Lost in Retraining: Roaming the Parameter Space of Exponential Families Under Closed-Loop Learning [0.0]
We study closed-loop learning for models that belong to exponential families. We show that maximum likelihood estimation of the parameters endows the sufficient statistics with the martingale property (see the sketch after this entry). We show that this outcome may be prevented if the data contains at least one data point generated from a ground-truth model.
arXiv Detail & Related papers (2025-06-25T17:12:22Z)
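One standard way to read the martingale claim above, sketched from textbook exponential-family identities (a plausible reconstruction, not the paper's proof):

```latex
% Exponential family: p_\theta(x) = h(x)\exp(\theta^\top T(x) - A(\theta))
\begin{align*}
\nabla A(\hat\theta) &= \mathbb{E}_{\hat\theta}[T(X)]
  = \frac{1}{n}\sum_{i=1}^{n} T(x_i) =: \bar{T}
  && \text{(MLE moment matching)} \\
\mathbb{E}\big[\bar{T}_{t+1} \,\big|\, \bar{T}_t\big]
  &= \mathbb{E}_{\hat\theta_t}[T(X)] = \bar{T}_t
  && \text{(generation } t+1 \text{ sampled from } p_{\hat\theta_t})
\end{align*}
```

That is, when each generation is retrained by maximum likelihood on samples from the previous model, the empirical sufficient statistic is a martingale across retraining rounds.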
- Learning Decision Trees as Amortized Structure Inference [59.65621207449269]
We propose a hybrid amortized structure inference approach to learn predictive decision tree ensembles given data. We show that our approach, DT-GFN, outperforms state-of-the-art decision tree and deep learning methods on standard classification benchmarks.
arXiv Detail & Related papers (2025-03-10T07:05:07Z)
- Enhancing Foundation Models for Time Series Forecasting via Wavelet-based Tokenization [74.3339999119713]
We develop a wavelet-based tokenizer that allows models to learn complex representations directly in the space of time-localized frequencies. Our method first scales and decomposes the input time series, then thresholds and quantizes the wavelet coefficients, and finally pre-trains an autoregressive model to forecast coefficients over the forecast horizon (a sketch of the tokenization steps follows this entry).
arXiv Detail & Related papers (2024-12-06T18:22:59Z)
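The tokenization pipeline described above (scale, decompose, threshold, quantize) can be approximated with off-the-shelf tools. A minimal sketch assuming the PyWavelets package (`pywt`); the wavelet choice, threshold, and quantization step are illustrative, not the paper's settings.

```python
import numpy as np
import pywt  # PyWavelets

def wavelet_tokenize(series, wavelet="db4", level=3, thresh=0.1, step=0.05):
    # 1) Scale the series to zero mean, unit variance.
    x = (series - series.mean()) / (series.std() + 1e-8)
    # 2) Multi-level discrete wavelet decomposition.
    coeffs = pywt.wavedec(x, wavelet, level=level)
    tokens = []
    for c in coeffs:
        # 3) Soft-threshold small coefficients toward zero.
        c = pywt.threshold(c, thresh, mode="soft")
        # 4) Uniform quantization -> integer token ids per coefficient.
        tokens.append(np.round(c / step).astype(int))
    return tokens  # one integer array per frequency band

rng = np.random.default_rng(0)
t = np.linspace(0, 8 * np.pi, 1024)
series = np.sin(t) + 0.3 * rng.normal(size=t.size)
for band, tok in enumerate(wavelet_tokenize(series)):
    print(f"band {band}: {tok.size} tokens, e.g. {tok[:5]}")
```

An autoregressive model would then be pre-trained on these token sequences to forecast future coefficients, per the entry's description.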
- Stochastic Diffusion: A Diffusion Based Model for Stochastic Time Series Forecasting [8.232475807691255]
We propose a novel Stochastic Diffusion (StochDiff) model which learns data-driven prior knowledge at each time step. The learnt prior knowledge helps the model to capture complex temporal dynamics and the inherent uncertainty of the data.
arXiv Detail & Related papers (2024-06-05T00:13:38Z)
- A Survey on Diffusion Models for Time Series and Spatio-Temporal Data [92.1255811066468]
We review the use of diffusion models in time series and spatio-temporal data, categorizing them by model, task type, data modality, and practical application domain.
We categorize diffusion models into unconditioned and conditioned types, and discuss time series and spatio-temporal data separately.
Our survey covers their applications extensively in various fields, including healthcare, recommendation, climate, energy, audio, and transportation.
arXiv Detail & Related papers (2024-04-29T17:19:40Z)
- ChiroDiff: Modelling chirographic data with Diffusion Models [132.5223191478268]
We introduce a powerful model class, namely Denoising Diffusion Probabilistic Models (DDPMs), for chirographic data.
Our model, named ChiroDiff, is non-autoregressive: it learns to capture holistic concepts and therefore remains resilient to higher temporal sampling rates.
arXiv Detail & Related papers (2023-04-07T15:17:48Z)
- Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z)
- Learning Consistent Deep Generative Models from Sparse Data via Prediction Constraints [16.48824312904122]
We develop a new framework for learning variational autoencoders and other deep generative models.
We show that these two contributions, prediction constraints and consistency constraints, lead to promising image classification performance.
arXiv Detail & Related papers (2020-12-12T04:18:50Z)
- Conditional Mutual Information-based Contrastive Loss for Financial Time Series Forecasting [12.0855096102517]
We present a representation learning framework for financial time series forecasting. We propose to first learn compact representations from the time series data, then use the learned representations to train a simpler model that predicts time series movements (a sketch of this two-stage design follows this entry).
arXiv Detail & Related papers (2020-02-18T15:24:33Z)
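The two-stage design above can be illustrated with standard tools. In this sketch, PCA stands in for the contrastively learned encoder purely for illustration; the paper's actual first stage uses a conditional-mutual-information-based contrastive loss, and all data here is synthetic.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Toy "price" series -> sliding windows of returns and next-step direction.
prices = np.cumsum(rng.normal(size=2000)) + 100
returns = np.diff(prices)
W = 32
X = np.stack([returns[i:i + W] for i in range(len(returns) - W)])
y = (returns[W:] > 0).astype(int)  # next movement: up (1) / down (0)

# Stage 1: compact representation (PCA as a stand-in for the learned encoder).
z = PCA(n_components=8).fit_transform(X)
# Stage 2: a simpler model trained on the learned representation.
clf = LogisticRegression(max_iter=1000).fit(z[:-200], y[:-200])
print("held-out accuracy:", clf.score(z[-200:], y[-200:]))
```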
- Predicting Multidimensional Data via Tensor Learning [0.0]
We develop a model that retains the intrinsic multidimensional structure of the dataset.
To estimate the model parameters, an Alternating Least Squares (ALS) algorithm is developed (a generic sketch follows this entry).
The proposed model outperforms benchmark models from the forecasting literature.
arXiv Detail & Related papers (2020-02-11T11:57:07Z)
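Alternating Least Squares, as mentioned above, has a simple closed-form core. Below is a generic sketch for a rank-r matrix factorization Y ~ A B^T; the paper's tensor variant generalizes this idea, and the ridge regularizer and iteration count here are illustrative choices, not the authors'.

```python
import numpy as np

def als_factorize(Y, rank=3, n_iters=50, reg=1e-3, seed=0):
    """Alternate closed-form least-squares updates for Y ~ A @ B.T."""
    rng = np.random.default_rng(seed)
    n, m = Y.shape
    A = rng.normal(size=(n, rank))
    B = rng.normal(size=(m, rank))
    I = reg * np.eye(rank)
    for _ in range(n_iters):
        # Fix B, solve the ridge-regularized least squares for A, then swap.
        A = Y @ B @ np.linalg.inv(B.T @ B + I)
        B = Y.T @ A @ np.linalg.inv(A.T @ A + I)
    return A, B

rng = np.random.default_rng(1)
A_true = rng.normal(size=(40, 3))
B_true = rng.normal(size=(30, 3))
Y = A_true @ B_true.T + 0.01 * rng.normal(size=(40, 30))
A, B = als_factorize(Y, rank=3)
print("relative error:", np.linalg.norm(Y - A @ B.T) / np.linalg.norm(Y))
```

Each update is the exact minimizer of the regularized squared error with the other factor held fixed, which is why the objective decreases monotonically across sweeps.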