Mlinear: Rethink the Linear Model for Time-series Forecasting
- URL: http://arxiv.org/abs/2305.04800v2
- Date: Thu, 3 Aug 2023 16:11:29 GMT
- Title: Mlinear: Rethink the Linear Model for Time-series Forecasting
- Authors: Wei Li, Xiangxu Meng, Chuhao Chen and Jianing Chen
- Abstract summary: Mlinear is a simple yet effective method based mainly on linear layers.
We introduce a new loss function that significantly outperforms the widely used mean squared error (MSE) on multiple datasets.
Our method significantly outperforms PatchTST with a ratio of 21:3 at 336 sequence length input and 29:10 at 512 sequence length input.
- Score: 9.841293660201261
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, significant advancements have been made in time-series forecasting
research, with an increasing focus on analyzing the nature of time-series data,
e.g., channel-independence (CI) and channel-dependence (CD), rather than solely
focusing on designing sophisticated forecasting models. However, current
research has primarily focused on either CI or CD in isolation, and the
challenge of effectively combining these two opposing properties to achieve a
synergistic effect remains an unresolved issue. In this paper, we carefully
examine the opposing properties of CI and CD, and raise a practical question
that has not been effectively answered: "How to effectively mix the CI and
CD properties of time series to achieve better predictive performance?" To
answer this question, we propose Mlinear (MIX-Linear), a simple yet effective
method based mainly on linear layers. The design philosophy of Mlinear mainly
includes two aspects: (1) dynamically tuning the CI and CD properties based on
the time semantics of different input time series, and (2) providing deep
supervision to adjust the individual performance of the "CI predictor" and "CD
predictor". In addition, empirically, we introduce a new loss function that
significantly outperforms the widely used mean squared error (MSE) on multiple
datasets. Experiments on widely used time-series datasets covering multiple
fields demonstrate the superiority of our method over PatchTST, the latest
Transformer-based method, in terms of the MSE and MAE metrics on 7 datasets
with identical sequence inputs (336 or 512). Specifically, our
method significantly outperforms PatchTST with a ratio of 21:3 at 336 sequence
length input and 29:10 at 512 sequence length input. Additionally, our approach
has a 10 $\times$ efficiency advantage at the unit level, taking into account
both training and inference times.
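As a concrete illustration of the mixing idea, below is a minimal PyTorch sketch of how a CI predictor, a CD predictor, a learned gate, and deep supervision could fit together. The abstract does not disclose the implementation, so the module layout, the gating design, and the auxiliary loss weight here are assumptions, not the authors' architecture.

```python
# Hedged sketch of the CI/CD mixing idea from the abstract (PyTorch).
# The gating design and loss weights are illustrative assumptions,
# not the authors' exact Mlinear implementation.
import torch
import torch.nn as nn

class MixLinear(nn.Module):
    def __init__(self, seq_len: int, pred_len: int, n_channels: int):
        super().__init__()
        # CI predictor: one linear map over time, shared across channels.
        self.ci = nn.Linear(seq_len, pred_len)
        # CD predictor: mixes channels first, then projects over time.
        self.mix = nn.Linear(n_channels, n_channels)
        self.cd = nn.Linear(seq_len, pred_len)
        # Gate weighing CI vs. CD per input window (hypothetical design).
        self.gate = nn.Sequential(nn.Linear(seq_len, 1), nn.Sigmoid())

    def forward(self, x):                       # x: (batch, channels, seq_len)
        y_ci = self.ci(x)                       # channel-independent forecast
        x_cd = self.mix(x.transpose(1, 2)).transpose(1, 2)
        y_cd = self.cd(x_cd)                    # channel-dependent forecast
        g = self.gate(x)                        # (batch, channels, 1) in [0, 1]
        return g * y_ci + (1 - g) * y_cd, y_ci, y_cd

def deep_supervised_loss(y_mix, y_ci, y_cd, target, aux=0.3):
    # Deep supervision: each head receives its own loss term; plain MSE
    # stands in here for the paper's custom loss function.
    mse = nn.functional.mse_loss
    return mse(y_mix, target) + aux * (mse(y_ci, target) + mse(y_cd, target))
```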
Related papers
- TS-TCD: Triplet-Level Cross-Modal Distillation for Time-Series Forecasting Using Large Language Models [15.266543423942617]
We present a novel framework, TS-TCD, which introduces a comprehensive three-tiered cross-modal knowledge distillation mechanism.
Unlike prior work that focuses on isolated alignment techniques, our framework systematically integrates alignment across all three tiers.
Experiments on benchmark time-series demonstrate that TS-TCD achieves state-of-the-art results, outperforming traditional methods in both accuracy and robustness.
arXiv Detail & Related papers (2024-09-23T12:57:24Z) - Test Time Learning for Time Series Forecasting [1.4605709124065924]
Test-Time Training (TTT) modules consistently outperform state-of-the-art models, including the Mamba-based TimeMachine.
Our results show significant improvements in Mean Squared Error (MSE) and Mean Absolute Error (MAE).
This work sets a new benchmark for time-series forecasting and lays the groundwork for future research in scalable, high-performance forecasting models.
arXiv Detail & Related papers (2024-09-21T04:40:08Z) - MGCP: A Multi-Grained Correlation based Prediction Network for Multivariate Time Series [54.91026286579748]
We propose a Multi-Grained Correlations-based Prediction Network.
It simultaneously considers correlations at three levels to enhance prediction performance.
It employs adversarial training with an attention mechanism-based predictor and conditional discriminator to optimize prediction results at the coarse-grained level.
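The coarse-grained adversarial component described above can be pictured with a short PyTorch sketch: an attention-based predictor generates forecasts, and a conditional discriminator scores a forecast jointly with the history it extends. Both architectures below are illustrative assumptions, not MGCP's actual networks.

```python
# Hedged sketch of adversarial forecast refinement in the spirit of the
# MGCP summary; both architectures are illustrative assumptions.
import torch
import torch.nn as nn

class AttnPredictor(nn.Module):
    def __init__(self, n_channels, seq_len, pred_len, d=32):
        super().__init__()
        self.inp = nn.Linear(n_channels, d)
        self.attn = nn.MultiheadAttention(d, num_heads=4, batch_first=True)
        self.out = nn.Linear(seq_len * d, pred_len * n_channels)
        self.pred_len, self.n_channels = pred_len, n_channels

    def forward(self, x):                        # x: (batch, seq_len, channels)
        h = self.inp(x)
        h, _ = self.attn(h, h, h)                # self-attention over time
        y = self.out(h.flatten(1))
        return y.view(-1, self.pred_len, self.n_channels)

class CondDiscriminator(nn.Module):
    # Conditional: scores a forecast given the history it extends.
    def __init__(self, n_channels, seq_len, pred_len, d=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear((seq_len + pred_len) * n_channels, d), nn.ReLU(),
            nn.Linear(d, 1))

    def forward(self, history, forecast):        # both (batch, T, channels)
        pair = torch.cat([history.flatten(1), forecast.flatten(1)], dim=1)
        return self.net(pair)                    # real/fake logit
```

In a standard adversarial setup, the discriminator would learn to separate real continuations from predicted ones, while the predictor minimizes a forecasting loss plus the adversarial term.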
arXiv Detail & Related papers (2024-05-30T03:32:44Z) - TimeSiam: A Pre-Training Framework for Siamese Time-Series Modeling [67.02157180089573]
Time series pre-training has recently garnered wide attention for its potential to reduce labeling expenses and benefit various downstream tasks.
This paper proposes TimeSiam as a simple but effective self-supervised pre-training framework for time series based on Siamese networks.
arXiv Detail & Related papers (2024-02-04T13:10:51Z) - MPR-Net: Multi-Scale Pattern Reproduction Guided Universality Time Series
Interpretable Forecasting [13.790498420659636]
Time series forecasting has received wide research interest due to its broad applications and inherent challenges.
This paper proposes a forecasting model, MPR-Net. It first adaptively decomposes multi-scale historical series patterns using a convolution operation, then constructs a pattern extension forecasting method based on the prior knowledge of pattern reproduction, and finally reconstructs future patterns into future series using a deconvolution operation.
By leveraging the temporal dependencies present in the time series, MPR-Net not only achieves linear time complexity, but also makes the forecasting process interpretable.
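A rough sketch of the decompose/extend/reconstruct pipeline that summary describes is given below; the kernel sizes, channel widths, and the averaging across scales are assumptions, and the sketch requires the input and forecast lengths to be divisible by each scale.

```python
# Hedged sketch of a conv-decompose / pattern-extend / deconv-reconstruct
# pipeline in the spirit of the MPR-Net summary; all shapes are assumptions.
import torch
import torch.nn as nn

class PatternNet(nn.Module):
    def __init__(self, seq_len=96, pred_len=24, scales=(4, 8), hidden=16):
        super().__init__()
        # One convolution per scale decomposes the history into patterns.
        self.enc = nn.ModuleList(
            nn.Conv1d(1, hidden, kernel_size=s, stride=s) for s in scales)
        # Linear "pattern extension" from history to forecast horizon,
        # applied in the downsampled pattern space.
        self.ext = nn.ModuleList(
            nn.Linear(seq_len // s, pred_len // s) for s in scales)
        # Deconvolution reconstructs each extended pattern into a series.
        self.dec = nn.ModuleList(
            nn.ConvTranspose1d(hidden, 1, kernel_size=s, stride=s) for s in scales)

    def forward(self, x):                        # x: (batch, 1, seq_len)
        outs = []
        for enc, ext, dec in zip(self.enc, self.ext, self.dec):
            p = enc(x)                           # (batch, hidden, seq_len // s)
            p = ext(p)                           # extend to forecast horizon
            outs.append(dec(p))                  # (batch, 1, pred_len)
        return torch.stack(outs).mean(0)         # average multi-scale forecasts
```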
arXiv Detail & Related papers (2023-07-13T13:16:01Z) - The Capacity and Robustness Trade-off: Revisiting the Channel
Independent Strategy for Multivariate Time Series Forecasting [50.48888534815361]
We show that models trained with the Channel Independent (CI) strategy outperform those trained with the Channel Dependent (CD) strategy.
Our results show that the CD approach has higher capacity but often lacks the robustness to accurately predict distributionally drifted time series.
We propose a modified CD method called Predict Residuals with Regularization (PRReg) that can surpass the CI strategy.
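A hedged sketch of what predicting residuals with regularization might look like is shown below: a channel-dependent linear model forecasts the residual against a simple baseline (the last observed value here), with an L2 penalty added to the objective. The baseline choice and penalty form are assumptions, not the paper's exact recipe.

```python
# Hedged sketch of "predict residuals with regularization": the baseline
# choice and penalty form below are assumptions, not the paper's recipe.
import torch
import torch.nn as nn

class PRRegSketch(nn.Module):
    def __init__(self, seq_len: int, pred_len: int, n_channels: int):
        super().__init__()
        # Channel-dependent: one linear map over all flattened channels.
        self.proj = nn.Linear(seq_len * n_channels, pred_len * n_channels)
        self.pred_len, self.n_channels = pred_len, n_channels

    def forward(self, x):                        # x: (batch, channels, seq_len)
        baseline = x[:, :, -1:].expand(-1, -1, self.pred_len)
        res = self.proj(x.flatten(1)).view(-1, self.n_channels, self.pred_len)
        return baseline + res                    # forecast = baseline + residual

def prreg_loss(model, y_hat, y, weight_decay=1e-3):
    # MSE on the forecast plus an L2 penalty on the model weights.
    l2 = sum(p.pow(2).sum() for p in model.parameters())
    return nn.functional.mse_loss(y_hat, y) + weight_decay * l2
```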
arXiv Detail & Related papers (2023-04-11T13:15:33Z) - FormerTime: Hierarchical Multi-Scale Representations for Multivariate
Time Series Classification [53.55504611255664]
FormerTime is a hierarchical representation model for improving classification capacity on the multivariate time series classification task.
It exhibits three merits: (1) learning hierarchical multi-scale representations from time series data, (2) inheriting the strengths of both transformers and convolutional networks, and (3) tackling the efficiency challenges incurred by the self-attention mechanism.
arXiv Detail & Related papers (2023-02-20T07:46:14Z) - Generative Time Series Forecasting with Diffusion, Denoise, and
Disentanglement [51.55157852647306]
Time series forecasting has been a widely explored task of great importance in many applications.
Real-world time series data are often recorded over a short period, which creates a large gap between the capacity of deep models and the limited, noisy data available.
We propose to address the time series forecasting problem with generative modeling, introducing a bidirectional variational auto-encoder equipped with diffusion, denoise, and disentanglement.
arXiv Detail & Related papers (2023-01-08T12:20:46Z) - Enhancing Transformer Efficiency for Multivariate Time Series
Classification [12.128991867050487]
We propose a methodology to investigate the relationship between model efficiency and accuracy, as well as model complexity.
Comprehensive experiments on benchmark MTS datasets illustrate the effectiveness of our method.
arXiv Detail & Related papers (2022-03-28T03:25:19Z) - TACTiS: Transformer-Attentional Copulas for Time Series [76.71406465526454]
Estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance.
We propose a versatile method that estimates joint distributions using an attention-based decoder.
We show that our model produces state-of-the-art predictions on several real-world datasets.
arXiv Detail & Related papers (2022-02-07T21:37:29Z) - Time Series Forecasting with Ensembled Stochastic Differential Equations
Driven by Lévy Noise [2.3076895420652965]
We use a collection of SDEs equipped with neural networks to predict the long-term trend of noisy time series.
Our contributions are, first, we use the phase space reconstruction method to extract the intrinsic dimension of the time series data.
Second, we explore SDEs driven by $\alpha$-stable Lévy motion to model the time series data and solve the problem through neural network approximation.
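For intuition, a small numerical sketch follows: an Euler-type scheme for an SDE driven by symmetric α-stable Lévy noise, with increments drawn via the Chambers-Mallows-Stuck sampler. The toy drift is an assumption standing in for the paper's learned networks.

```python
# Hedged numerical sketch: Euler-type simulation of an SDE driven by
# symmetric alpha-stable Levy noise (CMS sampler, valid for alpha != 1).
# The drift is a toy stand-in for the paper's neural-network approximation.
import numpy as np

def alpha_stable(alpha, size, rng):
    # Chambers-Mallows-Stuck sampler for symmetric alpha-stable variates.
    u = rng.uniform(-np.pi / 2, np.pi / 2, size)
    w = rng.exponential(1.0, size)
    return (np.sin(alpha * u) / np.cos(u) ** (1 / alpha)
            * (np.cos(u - alpha * u) / w) ** ((1 - alpha) / alpha))

def simulate(drift, x0=0.0, alpha=1.5, dt=1e-3, steps=1000, seed=0):
    rng = np.random.default_rng(seed)
    x = np.empty(steps + 1)
    x[0] = x0
    # Stable increments scale as dt**(1/alpha) rather than sqrt(dt).
    jumps = dt ** (1 / alpha) * alpha_stable(alpha, steps, rng)
    for t in range(steps):
        x[t + 1] = x[t] + drift(x[t]) * dt + jumps[t]
    return x

path = simulate(drift=lambda x: -x)  # mean-reverting toy drift
```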
arXiv Detail & Related papers (2021-11-25T16:49:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.