The Effectiveness of Discretization in Forecasting: An Empirical Study
on Neural Time Series Models
- URL: http://arxiv.org/abs/2005.10111v1
- Date: Wed, 20 May 2020 15:09:28 GMT
- Authors: Stephan Rabanser, Tim Januschowski, Valentin Flunkert, David Salinas,
Jan Gasthaus
- Abstract summary: We investigate the effect of data input and output transformations on the predictive performance of neural forecasting architectures.
We find that binning almost always improves performance compared to using normalized real-valued inputs.
- Score: 15.281725756608981
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Time series modeling techniques based on deep learning have seen many
advancements in recent years, especially in data-abundant settings and with the
central aim of learning global models that can extract patterns across multiple
time series. While the crucial importance of appropriate data pre-processing
and scaling has often been noted in prior work, most studies focus on improving
model architectures. In this paper we empirically investigate the effect of
data input and output transformations on the predictive performance of several
neural forecasting architectures. In particular, we investigate the
effectiveness of several forms of data binning, i.e. converting real-valued
time series into categorical ones, when combined with feed-forward, recurrent
neural networks, and convolution-based sequence models. In many non-forecasting
applications where these models have been very successful, the model inputs and
outputs are categorical (e.g. words from a fixed vocabulary in natural language
processing applications or quantized pixel color intensities in computer
vision). For forecasting applications, where the time series are typically
real-valued, various ad-hoc data transformations have been proposed, but have
not been systematically compared. To remedy this, we evaluate the forecasting
accuracy of instances of the aforementioned model classes when combined with
different types of data scaling and binning. We find that binning almost always
improves performance (compared to using normalized real-valued inputs), but
that the particular type of binning chosen is of lesser importance.
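
The binning idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the exact scaling or binning schemes, so everything here is an assumption made for illustration: the function names (`mean_scale`, `equal_width_edges`, `quantile_edges`, `to_bins`) are hypothetical, mean-absolute-value scaling stands in for "data scaling", and equal-width and quantile binning stand in for the "several forms of data binning".

```python
from bisect import bisect_right
from statistics import mean


def mean_scale(series):
    """Divide by the mean absolute value (an assumed scaling scheme)."""
    scale = mean(abs(x) for x in series) or 1.0  # avoid dividing by zero
    return [x / scale for x in series], scale


def equal_width_edges(values, num_bins):
    """Interior edges splitting [min, max] into num_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    return [lo + width * i for i in range(1, num_bins)]


def quantile_edges(values, num_bins):
    """Interior edges chosen so each bin holds roughly the same number of points."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * i) // num_bins] for i in range(1, num_bins)]


def to_bins(values, edges):
    """Map each real value to a categorical bin index in 0..len(edges)."""
    return [bisect_right(edges, v) for v in values]


if __name__ == "__main__":
    series = [2.0, 4.0, 6.0, 8.0, 10.0, 40.0]  # toy data with one outlier
    scaled, _ = mean_scale(series)
    print(to_bins(scaled, equal_width_edges(scaled, 3)))  # [0, 0, 0, 0, 0, 2]
    print(to_bins(scaled, quantile_edges(scaled, 3)))     # [0, 0, 1, 1, 2, 2]
```

On this toy series the outlier pushes almost every point into the first equal-width bin, whereas quantile binning spreads the points evenly. The resulting bin indices could then be fed to a model as categorical tokens, for example through an embedding layer with a softmax output over bins.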
Related papers
- VSFormer: Value and Shape-Aware Transformer with Prior-Enhanced Self-Attention for Multivariate Time Series Classification [47.92529531621406]
We propose a novel method, VSFormer, that incorporates both discriminative patterns (shape) and numerical information (value).
In addition, we extract class-specific prior information derived from supervised information to enrich the positional encoding.
Extensive experiments on all 30 UEA archived datasets demonstrate the superior performance of our method compared to SOTA models.
arXiv Detail & Related papers (2024-12-21T07:31:22Z)
- Enhancing Foundation Models for Time Series Forecasting via Wavelet-based Tokenization [74.3339999119713]
We develop a wavelet-based tokenizer that allows models to learn complex representations directly in the space of time-localized frequencies.
Our method first scales and decomposes the input time series, then thresholds and quantizes the wavelet coefficients, and finally pre-trains an autoregressive model to forecast coefficients for the forecast horizon.
arXiv Detail & Related papers (2024-12-06T18:22:59Z)
- TimeSieve: Extracting Temporal Dynamics through Information Bottlenecks [31.10683149519954]
We propose an innovative time series forecasting model TimeSieve.
Our approach employs wavelet transforms to preprocess time series data, effectively capturing multi-scale features.
Our results validate the effectiveness of our approach in addressing the key challenges in time series forecasting.
arXiv Detail & Related papers (2024-06-07T15:58:12Z)
- Scaling Law for Time Series Forecasting [8.967263259533036]
A scaling law that rewards large datasets, complex models, and enhanced data granularity has been observed in various fields of deep learning.
Yet, studies on time series forecasting have cast doubt on scaling behaviors of deep learning methods for time series forecasting.
We propose a theory for scaling law for time series forecasting that can explain these seemingly abnormal behaviors.
arXiv Detail & Related papers (2024-05-24T00:46:27Z)
- Unified Training of Universal Time Series Forecasting Transformers [104.56318980466742]
We present a Masked Encoder-based Universal Time Series Forecasting Transformer (Moirai).
Moirai is trained on our newly introduced Large-scale Open Time Series Archive (LOTSA) featuring over 27B observations across nine domains.
Moirai achieves competitive or superior performance as a zero-shot forecaster when compared to full-shot models.
arXiv Detail & Related papers (2024-02-04T20:00:45Z)
- Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting [54.04430089029033]
We present Lag-Llama, a general-purpose foundation model for time series forecasting based on a decoder-only transformer architecture.
Lag-Llama is pretrained on a large corpus of diverse time series data from several domains, and demonstrates strong zero-shot generalization capabilities.
When fine-tuned on relatively small fractions of such previously unseen datasets, Lag-Llama achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-10-12T12:29:32Z)
- Pushing the Limits of Pre-training for Time Series Forecasting in the CloudOps Domain [54.67888148566323]
We introduce three large-scale time series forecasting datasets from the cloud operations domain.
We show that our pre-trained method is a strong zero-shot baseline and benefits from further scaling, both in model and dataset size.
Accompanying these datasets and results is a suite of comprehensive benchmark results comparing classical and deep learning baselines to our pre-trained method.
arXiv Detail & Related papers (2023-10-08T08:09:51Z)
- Time Series is a Special Sequence: Forecasting with Sample Convolution and Interaction [9.449017120452675]
Time series is a special type of sequence data, a set of observations collected at even intervals of time and ordered chronologically.
Existing deep learning techniques use generic sequence models for time series analysis, which ignore some of its unique properties.
We propose a novel neural network architecture and apply it to the time series forecasting problem, conducting sample convolution and interaction at multiple resolutions for temporal modeling.
arXiv Detail & Related papers (2021-06-17T08:15:04Z)
- Improved Predictive Deep Temporal Neural Networks with Trend Filtering [22.352437268596674]
We propose a new prediction framework based on deep neural networks and trend filtering.
We show that the predictive performance of deep temporal neural networks improves when the training data is temporally processed by trend filtering.
arXiv Detail & Related papers (2020-10-16T08:29:36Z)
- A Multi-Channel Neural Graphical Event Model with Negative Evidence [76.51278722190607]
Event datasets are sequences of events of various types occurring irregularly over the time-line.
We propose a non-parametric deep neural network approach in order to estimate the underlying intensity functions.
arXiv Detail & Related papers (2020-02-21T23:10:50Z)