Transformer Multivariate Forecasting: Less is More?
- URL: http://arxiv.org/abs/2401.00230v2
- Date: Thu, 7 Mar 2024 10:25:20 GMT
- Title: Transformer Multivariate Forecasting: Less is More?
- Authors: Jingjing Xu, Caesar Wu, Yuan-Fang Li, Pascal Bouvry
- Abstract summary: The paper focuses on reducing redundant information to elevate forecasting accuracy while optimizing runtime efficiency.
The framework is evaluated with five state-of-the-art (SOTA) models on four diverse real-world datasets.
From the model perspective, one of the PCA-enhanced models, PCA+Crossformer, reduces mean squared error (MSE) by 33.3% and decreases runtime by 49.2% on average.
- Score: 42.558736426375056
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the domain of multivariate forecasting, transformer models stand out as
powerful tools, displaying exceptional capabilities in handling messy
datasets from real-world contexts. However, the inherent complexity of these
datasets, characterized by numerous variables and lengthy temporal sequences,
poses challenges, including increased noise and extended model runtime. This
paper focuses on reducing redundant information to elevate forecasting accuracy
while optimizing runtime efficiency. We propose a novel transformer forecasting
framework enhanced by Principal Component Analysis (PCA) to tackle this
challenge. The framework is evaluated with five state-of-the-art (SOTA) models
on four diverse real-world datasets. Our experimental results demonstrate the
framework's ability to minimize prediction errors across all models and
datasets while significantly reducing runtime. From the model perspective, one
of the PCA-enhanced models, PCA+Crossformer, reduces mean squared error (MSE)
by 33.3% and decreases runtime by 49.2% on average. From the dataset
perspective, the framework delivers 14.3% MSE and 76.6% runtime reduction on
Electricity datasets, as well as 4.8% MSE and 86.9% runtime reduction on
Traffic datasets. This study aims to advance various SOTA models and enhance
transformer-based time series forecasting for intricate data. Code is available
at: https://github.com/jingjing-unilu/PCA_Transformer.
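The abstract describes the PCA-transformer combination only at a high level. Below is a minimal sketch of one plausible arrangement of the idea, assuming PCA is fit on the training split and used to compress the variable dimension of each input window before forecasting; the synthetic data, window sizes, component count, and the LinearRegression stand-in for the paper's transformer backbones (e.g. Crossformer) are illustrative assumptions, not the authors' implementation (see the linked repository for that).

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

def make_windows(series, lookback, horizon):
    """Slice a (time, variables) array into (input window, target window) pairs."""
    X, Y = [], []
    for t in range(len(series) - lookback - horizon + 1):
        X.append(series[t:t + lookback])
        Y.append(series[t + lookback:t + lookback + horizon])
    return np.stack(X), np.stack(Y)

# Toy stand-in for a messy multivariate series: 1000 steps, 20 correlated variables.
rng = np.random.default_rng(0)
latent = rng.standard_normal((1000, 3)).cumsum(axis=0)
series = latent @ rng.standard_normal((3, 20)) + 0.1 * rng.standard_normal((1000, 20))

lookback, horizon, n_train = 96, 24, 600

# 1) Fit PCA on the training portion only, then compress the variable dimension.
pca = PCA(n_components=5).fit(series[:n_train + lookback])
reduced = pca.transform(series)          # (time, 5) channels instead of (time, 20)

# 2) Supervised windows: compressed inputs, targets kept in the original space.
X_red, _ = make_windows(reduced, lookback, horizon)
_, Y = make_windows(series, lookback, horizon)

# 3) Forecast from the compressed inputs (a linear model stands in here for the
#    transformer backbones such as Crossformer used in the paper).
model = LinearRegression().fit(
    X_red[:n_train].reshape(n_train, -1), Y[:n_train].reshape(n_train, -1)
)
pred = model.predict(X_red[n_train:].reshape(len(X_red) - n_train, -1))
mse = np.mean((pred - Y[n_train:].reshape(len(Y) - n_train, -1)) ** 2)
print(f"test MSE: {mse:.4f}")
```

Because the forecaster then sees far fewer input channels per time step, both the noise it has to fit and its runtime shrink, which matches the abstract's motivation of removing redundant information to improve accuracy and runtime together.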
Related papers
- Less is more: Embracing sparsity and interpolation with Esiformer for time series forecasting [19.8447763392479]
Time series data generated from real-world applications always exhibits high variance and substantial noise.
We propose the Esiformer, which applies interpolation to the original data, decreasing its overall variance and alleviating the influence of noise.
Our method outperforms the leading model PatchTST, reducing MSE by 6.5% and MAE by 5.8%.
arXiv Detail & Related papers (2024-10-08T06:45:47Z) - Learning Augmentation Policies from A Model Zoo for Time Series Forecasting [58.66211334969299]
We introduce AutoTSAug, a learnable data augmentation method based on reinforcement learning.
By augmenting the marginal samples with a learnable policy, AutoTSAug substantially improves forecasting performance.
arXiv Detail & Related papers (2024-09-10T07:34:19Z) - Are Self-Attentions Effective for Time Series Forecasting? [4.990206466948269]
Time series forecasting is crucial for applications across multiple domains and various scenarios.
Recent findings have indicated that simpler linear models might outperform complex Transformer-based approaches.
We introduce a new architecture, the Cross-Attention-only Time Series transformer (CATS).
Our model achieves superior performance with the lowest mean squared error and uses fewer parameters compared to existing models.
arXiv Detail & Related papers (2024-05-27T06:49:39Z) - Attention as Robust Representation for Time Series Forecasting [23.292260325891032]
Time series forecasting is essential for many practical applications.
Transformers' key feature, the attention mechanism, dynamically fuses embeddings to enhance data representation, often relegating attention weights to a byproduct role.
Our approach instead elevates attention weights to the primary representation for time series, capitalizing on the temporal relationships among data points to improve forecasting accuracy.
arXiv Detail & Related papers (2024-02-08T03:00:50Z) - A Transformer-based Framework For Multi-variate Time Series: A Remaining Useful Life Prediction Use Case [4.0466311968093365]
This work proposes an encoder-transformer architecture-based framework for time series prediction.
We validate the effectiveness of the proposed framework on all four sets of the C-MAPSS benchmark dataset.
To make the model aware of the initial stages of the machine's life and its degradation path, a novel expanding window method is proposed.
arXiv Detail & Related papers (2023-08-19T02:30:35Z) - A Meta-Learning Approach to Predicting Performance and Data Requirements [163.4412093478316]
We propose an approach to estimate the number of samples required for a model to reach a target performance.
We find that the power law, the de facto principle for estimating model performance, leads to large errors when using a small dataset.
We introduce a novel piecewise power law (PPL) that handles the two data regimes differently.
arXiv Detail & Related papers (2023-03-02T21:48:22Z) - Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning.
Transformer models have been adopted to deliver high prediction capacity thanks to the self-attention mechanism, despite its high computational cost.
We propose an efficient Transformer-based model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z) - Towards Data-Efficient Detection Transformers [77.43470797296906]
We show that most detection transformers suffer significant performance drops on small-size datasets.
We empirically analyze the factors that affect data efficiency, through a step-by-step transition from a data-efficient RCNN variant to the representative DETR.
We introduce a simple yet effective label augmentation method to provide richer supervision and improve data efficiency.
arXiv Detail & Related papers (2022-03-17T17:56:34Z) - Towards physically consistent data-driven weather forecasting: Integrating data assimilation with equivariance-preserving deep spatial transformers [2.7998963147546148]
We propose three components to integrate with commonly used data-driven weather prediction models.
These components are 1) a deep spatial transformer added to the latent space of U-Nets to preserve equivariance, 2) a data-assimilation algorithm to ingest noisy observations and improve the initial conditions for the next forecasts, and 3) a multi-time-step algorithm that improves the accuracy of forecasts at short intervals.
arXiv Detail & Related papers (2021-03-16T23:15:00Z) - Transformer Hawkes Process [79.16290557505211]
We propose a Transformer Hawkes Process (THP) model, which leverages the self-attention mechanism to capture long-term dependencies.
THP outperforms existing models in terms of both likelihood and event prediction accuracy by a notable margin.
We provide a concrete example, where THP achieves improved prediction performance for learning multiple point processes when incorporating their relational information.
arXiv Detail & Related papers (2020-02-21T13:48:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.