Knowledge-enhanced Transformer for Multivariate Long Sequence Time-series Forecasting
- URL: http://arxiv.org/abs/2411.11046v1
- Date: Sun, 17 Nov 2024 11:53:54 GMT
- Title: Knowledge-enhanced Transformer for Multivariate Long Sequence Time-series Forecasting
- Authors: Shubham Tanaji Kakde, Rony Mitra, Jasashwi Mandal, Manoj Kumar Tiwari
- Abstract summary: We introduce a novel approach that encapsulates conceptual relationships among variables within a well-defined knowledge graph.
We investigate the influence of this integration into seminal architectures such as PatchTST, Autoformer, Informer, and Vanilla Transformer.
This enhancement empowers transformer-based architectures to address the inherent structural relation between variables.
- Score: 4.645182684813973
- Abstract: Multivariate Long Sequence Time-series Forecasting (LSTF) has been a critical task across various real-world applications. Recent advancements focus on the application of transformer architectures attributable to their ability to capture temporal patterns effectively over extended periods. However, these approaches often overlook the inherent relationships and interactions between the input variables that could be drawn from their characteristic properties. In this paper, we aim to bridge this gap by integrating information-rich Knowledge Graph Embeddings (KGE) with state-of-the-art transformer-based architectures. We introduce a novel approach that encapsulates conceptual relationships among variables within a well-defined knowledge graph, forming dynamic and learnable KGEs for seamless integration into the transformer architecture. We investigate the influence of this integration into seminal architectures such as PatchTST, Autoformer, Informer, and Vanilla Transformer. Furthermore, we thoroughly investigate the performance of these knowledge-enhanced architectures along with their original implementations for long forecasting horizons and demonstrate significant improvement in the benchmark results. This enhancement empowers transformer-based architectures to address the inherent structural relation between variables. Our knowledge-enhanced approach improves the accuracy of multivariate LSTF by capturing complex temporal and relational dynamics across multiple domains. To substantiate the validity of our model, we conduct comprehensive experiments using Weather and Electric Transformer Temperature (ETT) datasets.
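The abstract does not spell out how the knowledge graph embeddings enter the architecture, so the following is only a minimal PyTorch sketch of the general idea: each variable gets a learnable embedding, one message-passing step over an assumed relation graph mixes those embeddings, and the result is added to the per-variable value tokens before a standard Transformer encoder. All names (`KGEFusion`, `adj`, `d_model`) and the fusion-by-addition choice are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): fuse learnable knowledge-graph
# embeddings (KGE) with per-variable inputs before a Transformer encoder.
import torch
import torch.nn as nn


class KGEFusion(nn.Module):
    def __init__(self, num_vars: int, adj: torch.Tensor, d_model: int = 64):
        super().__init__()
        self.node_emb = nn.Embedding(num_vars, d_model)  # learnable KGE, one per variable
        # row-normalized relation graph between variables (assumed given)
        self.register_buffer("adj", adj / adj.sum(-1, keepdim=True).clamp(min=1))
        self.value_proj = nn.Linear(1, d_model)          # lift scalar readings to d_model
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
            num_layers=2,
        )
        self.head = nn.Linear(d_model, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, num_vars) multivariate series
        b, t, v = x.shape
        # one message-passing step over the knowledge graph relations
        kge = self.adj @ self.node_emb.weight            # (num_vars, d_model)
        tokens = self.value_proj(x.unsqueeze(-1))        # (b, t, v, d_model)
        tokens = tokens + kge                            # inject the relational prior
        tokens = tokens.reshape(b, t * v, -1)            # flatten time x variable tokens
        out = self.encoder(tokens).reshape(b, t, v, -1)
        return self.head(out).squeeze(-1)                # per-step, per-variable outputs


# Usage: 7 weather-style variables with a fully connected relation graph
adj = torch.ones(7, 7)
model = KGEFusion(num_vars=7, adj=adj)
y = model(torch.randn(8, 96, 7))
```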
Related papers
- TEAFormers: TEnsor-Augmented Transformers for Multi-Dimensional Time Series Forecasting [14.43696537295348]
Multi-dimensional time series data are increasingly prevalent in fields such as economics, finance, and climate science.
Traditional Transformer models, though adept with sequential data, do not effectively preserve these multi-dimensional structures.
We introduce the Tensor-Augmented Transformer (TEAFormer), a novel method that incorporates tensor expansion and compression within the Transformer framework.
arXiv Detail & Related papers (2024-10-27T13:32:12Z)
- What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis [8.008567379796666]
The Transformer architecture has inarguably revolutionized deep learning.
At its core, the attention block differs in form and functionality from most other architectural components in deep learning.
The root causes behind these outward manifestations, and the precise mechanisms that govern them, remain poorly understood.
arXiv Detail & Related papers (2024-10-14T18:15:02Z)
- PRformer: Pyramidal Recurrent Transformer for Multivariate Time Series Forecasting [82.03373838627606]
Self-attention mechanism in Transformer architecture requires positional embeddings to encode temporal order in time series prediction.
We argue that this reliance on positional embeddings restricts the Transformer's ability to effectively represent temporal sequences.
We present a model integrating PRE with a standard Transformer encoder, demonstrating state-of-the-art performance on various real-world datasets.
arXiv Detail & Related papers (2024-08-20T01:56:07Z)
- Skip-Layer Attention: Bridging Abstract and Detailed Dependencies in Transformers [56.264673865476986]
This paper introduces Skip-Layer Attention (SLA) to enhance Transformer models.
SLA improves the model's ability to capture dependencies between high-level abstract features and low-level details.
Our implementation extends the Transformer's functionality by enabling queries in a given layer to interact with keys and values from both the current layer and one preceding layer.
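A minimal sketch of the skip-layer idea as summarized above, assuming a standard encoder block: queries come from the current layer, while keys and values are concatenated from the current layer and one preceding layer. Class and variable names are illustrative; this is not the paper's code.

```python
# Hedged sketch of skip-layer attention: queries of the current layer attend
# over keys/values drawn from the current layer and one preceding layer.
import torch
import torch.nn as nn


class SkipLayerAttentionBlock(nn.Module):
    def __init__(self, d_model: int = 64, nhead: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                nn.Linear(4 * d_model, d_model))

    def forward(self, x: torch.Tensor, prev: torch.Tensor) -> torch.Tensor:
        # x: input to the current layer, prev: representation from a preceding layer
        kv = torch.cat([x, prev], dim=1)             # keys/values from both layers
        h, _ = self.attn(query=x, key=kv, value=kv)  # queries stay in the current layer
        h = self.norm1(x + h)
        return self.norm2(h + self.ff(h))


# Two stacked blocks: the second reuses the first layer's input as the extra source
x0 = torch.randn(2, 32, 64)
layer1 = SkipLayerAttentionBlock()
layer2 = SkipLayerAttentionBlock()
x1 = layer1(x0, prev=x0)   # first layer has no predecessor, so it reuses its own input
x2 = layer2(x1, prev=x0)
```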
arXiv Detail & Related papers (2024-06-17T07:24:38Z)
- A Systematic Review for Transformer-based Long-term Series Forecasting [7.414422194379818]
The Transformer architecture has proven to be the most successful solution for extracting semantic correlations.
Various variants have enabled transformer architecture to handle long-term time series forecasting tasks.
arXiv Detail & Related papers (2023-10-31T06:37:51Z)
- Full Stack Optimization of Transformer Inference: a Survey [58.55475772110702]
Transformer models achieve superior accuracy across a wide range of applications.
The amount of compute and bandwidth required for inference of recent Transformer models is growing at a significant rate.
There has been an increased focus on making Transformer models more efficient.
arXiv Detail & Related papers (2023-02-27T18:18:13Z)
- Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning.
Transformer models have been adopted to deliver high prediction capacity, owing to their computationally intensive self-attention mechanism.
We propose an efficient Transformer-based model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z)
- Rich CNN-Transformer Feature Aggregation Networks for Super-Resolution [50.10987776141901]
Recent vision transformers along with self-attention have achieved promising results on various computer vision tasks.
We introduce an effective hybrid architecture for super-resolution (SR) tasks, which leverages local features from CNNs and long-range dependencies captured by transformers.
Our proposed method achieves state-of-the-art SR results on numerous benchmark datasets.
arXiv Detail & Related papers (2022-03-15T06:52:25Z)
- Multi-Exit Vision Transformer for Dynamic Inference [88.17413955380262]
We propose seven different architectures for early exit branches that can be used for dynamic inference in Vision Transformer backbones.
We show that each one of our proposed architectures could prove useful in the trade-off between accuracy and speed.
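As a rough illustration only (the paper proposes seven specific branch designs that are not detailed here), a generic early-exit branch attached to intermediate Transformer layers might look like the sketch below, with a confidence threshold deciding when to stop; every name and the threshold rule are assumptions.

```python
# Illustrative sketch of dynamic inference via early-exit branches; the branch
# design and threshold are assumptions, not the paper's seven architectures.
import torch
import torch.nn as nn


class EarlyExitBranch(nn.Module):
    def __init__(self, d_model: int = 64, num_classes: int = 10):
        super().__init__()
        self.head = nn.Sequential(nn.LayerNorm(d_model), nn.Linear(d_model, num_classes))

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        return self.head(tokens[:, 0])  # classify from the first ([CLS]) token


def dynamic_inference(layers, branches, x, threshold=0.9):
    # Run layers sequentially; stop at the first branch whose confidence clears the threshold.
    for layer, branch in zip(layers, branches):
        x = layer(x)
        logits = branch(x)
        if logits.softmax(-1).max() >= threshold:
            return logits
    return logits


layers = nn.ModuleList(nn.TransformerEncoderLayer(64, nhead=4, batch_first=True) for _ in range(4))
branches = nn.ModuleList(EarlyExitBranch() for _ in range(4))
out = dynamic_inference(layers, branches, torch.randn(1, 197, 64))
```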
arXiv Detail & Related papers (2021-06-29T09:01:13Z)