Distributed Lag Transformer based on Time-Variable-Aware Learning for Explainable Multivariate Time Series Forecasting
- URL: http://arxiv.org/abs/2408.16896v2
- Date: Wed, 13 Aug 2025 04:58:21 GMT
- Title: Distributed Lag Transformer based on Time-Variable-Aware Learning for Explainable Multivariate Time Series Forecasting
- Authors: Younghwi Kim, Dohee Kim, Joongrock Kim, Sunghyun Sim,
- Abstract summary: Distributed Lag Transformer (DLFormer) is a novel Transformer architecture for explainable and scalable time series models.<n>DLFormer integrates a distributed lag embedding and a time variable aware learning (TVAL) mechanism to structurally model both local and global temporal dependencies.<n>Experiments show that DLFormer achieves predictive accuracy while offering robust, interpretable insights into variable wise and temporal dynamics.
- Score: 4.572747329528556
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Time series data is a key element of big data analytics, commonly found in domains such as finance, healthcare, climate forecasting, and transportation. In large scale real world settings, such data is often high dimensional and multivariate, requiring advanced forecasting methods that are both accurate and interpretable. Although Transformer based models perform well in multivariate time series forecasting (MTSF), their lack of explainability limits their use in critical applications. To overcome this, we propose Distributed Lag Transformer (DLFormer), a novel Transformer architecture for explainable and scalable MTSF. DLFormer integrates a distributed lag embedding and a time variable aware learning (TVAL) mechanism to structurally model both local and global temporal dependencies and explicitly capture the influence of past variables on future outcomes. Experiments on ten benchmark and real world datasets show that DLFormer achieves state of the art predictive accuracy while offering robust, interpretable insights into variable wise and temporal dynamics. These results highlight ability of DLFormer to bridge the gap between performance and explainability, making it highly suitable for practical big data forecasting tasks.
Related papers
- DiTS: Multimodal Diffusion Transformers Are Time Series Forecasters [50.43534351968113]
Existing generative time series models do not address the multi-dimensional properties of time series data well.<n>Inspired by Multimodal Diffusion Transformers that integrate textual guidance into video generation, we propose Diffusion Transformers for Time Series (DiTS)
arXiv Detail & Related papers (2026-02-06T10:48:13Z) - Explainable time-series forecasting with sampling-free SHAP for Transformers [3.515829683606796]
We introduce SHAPformer, an accurate, fast and sampling-free explainable time-series forecasting model based on the Transformer architecture.<n>It generates explanations in under one second, several orders of magnitude faster than the SHAP Permutation Explainer.<n>On synthetic data with ground truth explanations, SHAPformer provides explanations that are true to the data.
arXiv Detail & Related papers (2025-12-23T17:02:35Z) - IndexNet: Timestamp and Variable-Aware Modeling for Time Series Forecasting [35.17464235813366]
IndexNet is a vectors-based augmented framework with an Index Embedding (IE) module.<n>IE transforms timestamps into embedding and injects them into the input sequence, thereby improving the model's ability to capture long-term complex periodic patterns.<n>In parallel, CE assigns each variable a unique and trainable identity embedding based on its index, allowing the model to explicitly distinguish between heterogeneous variables.
arXiv Detail & Related papers (2025-09-28T11:30:17Z) - In Shift and In Variance: Assessing the Robustness of HAR Deep Learning Models against Variability [4.330123738563178]
Human Activity Recognition (HAR) using wearable inertial measurement unit (IMU) sensors can revolutionize healthcare by enabling continual health monitoring, disease prediction, and routine recognition.
Despite the high accuracy of Deep Learning (DL) HAR models, their robustness to real-world variabilities remains untested.
We isolate subject, device, position, and orientation variability to determine their effect on DL HAR models and assess the robustness of these models in real-world conditions.
arXiv Detail & Related papers (2025-03-14T14:53:56Z) - Powerformer: A Transformer with Weighted Causal Attention for Time-series Forecasting [50.298817606660826]
We introduce Powerformer, a novel Transformer variant that replaces noncausal attention weights with causal weights that are reweighted according to a smooth heavy-tailed decay.
Our empirical results demonstrate that Powerformer achieves state-of-the-art accuracy on public time-series benchmarks.
Our analyses show that the model's locality bias is amplified during training, demonstrating an interplay between time-series data and power-law-based attention.
arXiv Detail & Related papers (2025-02-10T04:42:11Z) - EDformer: Embedded Decomposition Transformer for Interpretable Multivariate Time Series Predictions [4.075971633195745]
This paper introduces an embedded transformer, 'EDformer', for time series forecasting tasks.<n>Without altering the fundamental elements, we reuse the Transformer architecture and consider the capable functions of its constituent parts.<n>The model obtains state-of-the-art predicting results in terms of accuracy and efficiency on complex real-world time series datasets.
arXiv Detail & Related papers (2024-12-16T11:13:57Z) - Tackling Data Heterogeneity in Federated Time Series Forecasting [61.021413959988216]
Time series forecasting plays a critical role in various real-world applications, including energy consumption prediction, disease transmission monitoring, and weather forecasting.
Most existing methods rely on a centralized training paradigm, where large amounts of data are collected from distributed devices to a central cloud server.
We propose a novel framework, Fed-TREND, to address data heterogeneity by generating informative synthetic data as auxiliary knowledge carriers.
arXiv Detail & Related papers (2024-11-24T04:56:45Z) - ExoTST: Exogenous-Aware Temporal Sequence Transformer for Time Series Prediction [11.511830352094353]
We propose ExoTST, a transformer-based framework for time series prediction.
To integrate past and current variables, ExoTST introduces a novel cross-temporal modality fusion module.
Experiments on real-world carbon flux datasets and time series benchmarks demonstrate ExoTST's superior performance.
arXiv Detail & Related papers (2024-10-16T03:04:37Z) - Learning Graph Structures and Uncertainty for Accurate and Calibrated Time-series Forecasting [65.40983982856056]
We introduce STOIC, that leverages correlations between time-series to learn underlying structure between time-series and to provide well-calibrated and accurate forecasts.
Over a wide-range of benchmark datasets STOIC provides 16% more accurate and better-calibrated forecasts.
arXiv Detail & Related papers (2024-07-02T20:14:32Z) - Adapting to Length Shift: FlexiLength Network for Trajectory Prediction [53.637837706712794]
Trajectory prediction plays an important role in various applications, including autonomous driving, robotics, and scene understanding.
Existing approaches mainly focus on developing compact neural networks to increase prediction precision on public datasets, typically employing a standardized input duration.
We introduce a general and effective framework, the FlexiLength Network (FLN), to enhance the robustness of existing trajectory prediction against varying observation periods.
arXiv Detail & Related papers (2024-03-31T17:18:57Z) - TimeXer: Empowering Transformers for Time Series Forecasting with Exogenous Variables [75.83318701911274]
TimeXer ingests external information to enhance the forecasting of endogenous variables.
TimeXer achieves consistent state-of-the-art performance on twelve real-world forecasting benchmarks.
arXiv Detail & Related papers (2024-02-29T11:54:35Z) - Discovering Predictable Latent Factors for Time Series Forecasting [39.08011991308137]
We develop a novel framework for inferring the intrinsic latent factors implied by the observable time series.
We introduce three characteristics, i.e., predictability, sufficiency, and identifiability, and model these characteristics via the powerful deep latent dynamics models.
Empirical results on multiple real datasets show the efficiency of our method for different kinds of time series forecasting.
arXiv Detail & Related papers (2023-03-18T14:37:37Z) - Towards Long-Term Time-Series Forecasting: Feature, Pattern, and
Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning.
Transformer models have been adopted to deliver high prediction capacity because of the high computational self-attention mechanism.
We propose an efficient Transformerbased model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z) - Mitigating Data Redundancy to Revitalize Transformer-based Long-Term Time Series Forecasting System [46.39662315849883]
We introduce CLMFormer, a novel framework that mitigates redundancy through curriculum learning and a memory-driven decoder.<n>CLMFormer consistently improves Transformer-based models by up to 30%, demonstrating its effectiveness in long-horizon forecasting.
arXiv Detail & Related papers (2022-07-16T04:05:15Z) - DRAformer: Differentially Reconstructed Attention Transformer for
Time-Series Forecasting [7.805077630467324]
Time-series forecasting plays an important role in many real-world scenarios, such as equipment life cycle forecasting, weather forecasting, and traffic flow forecasting.
It can be observed from recent research that a variety of transformer-based models have shown remarkable results in time-series forecasting.
However, there are still some issues that limit the ability of transformer-based models on time-series forecasting tasks.
arXiv Detail & Related papers (2022-06-11T10:34:29Z) - Multi-scale Attention Flow for Probabilistic Time Series Forecasting [68.20798558048678]
We propose a novel non-autoregressive deep learning model, called Multi-scale Attention Normalizing Flow(MANF)
Our model avoids the influence of cumulative error and does not increase the time complexity.
Our model achieves state-of-the-art performance on many popular multivariate datasets.
arXiv Detail & Related papers (2022-05-16T07:53:42Z) - TACTiS: Transformer-Attentional Copulas for Time Series [76.71406465526454]
estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance.
We propose a versatile method that estimates joint distributions using an attention-based decoder.
We show that our model produces state-of-the-art predictions on several real-world datasets.
arXiv Detail & Related papers (2022-02-07T21:37:29Z) - Model-Attentive Ensemble Learning for Sequence Modeling [86.4785354333566]
We present Model-Attentive Ensemble learning for Sequence modeling (MAES)
MAES is a mixture of time-series experts which leverages an attention-based gating mechanism to specialize the experts on different sequence dynamics and adaptively weight their predictions.
We demonstrate that MAES significantly out-performs popular sequence models on datasets subject to temporal shift.
arXiv Detail & Related papers (2021-02-23T05:23:35Z) - Spatiotemporal Attention for Multivariate Time Series Prediction and
Interpretation [17.568599402858037]
temporal attention mechanism (STAM) for simultaneous learning of the most important time steps and variables.
Results: STAM maintains state-of-the-art prediction accuracy while offering the benefit of accurate interpretability.
arXiv Detail & Related papers (2020-08-11T17:34:55Z) - Transformer Hawkes Process [79.16290557505211]
We propose a Transformer Hawkes Process (THP) model, which leverages the self-attention mechanism to capture long-term dependencies.
THP outperforms existing models in terms of both likelihood and event prediction accuracy by a notable margin.
We provide a concrete example, where THP achieves improved prediction performance for learning multiple point processes when incorporating their relational information.
arXiv Detail & Related papers (2020-02-21T13:48:13Z) - Multivariate Probabilistic Time Series Forecasting via Conditioned
Normalizing Flows [8.859284959951204]
Time series forecasting is fundamental to scientific and engineering problems.
Deep learning methods are well suited for this problem.
We show that it improves over the state-of-the-art for standard metrics on many real-world data sets.
arXiv Detail & Related papers (2020-02-14T16:16:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.