Explainable time-series forecasting with sampling-free SHAP for Transformers
- URL: http://arxiv.org/abs/2512.20514v1
- Date: Tue, 23 Dec 2025 17:02:35 GMT
- Title: Explainable time-series forecasting with sampling-free SHAP for Transformers
- Authors: Matthias Hertel, Sebastian Pütz, Ralf Mikut, Veit Hagenmeyer, Benjamin Schäfer
- Abstract summary: We introduce SHAPformer, an accurate, fast and sampling-free explainable time-series forecasting model based on the Transformer architecture. It generates explanations in under one second, several orders of magnitude faster than the SHAP Permutation Explainer. On synthetic data with ground truth explanations, SHAPformer provides explanations that are true to the data.
- Score: 3.515829683606796
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Time-series forecasts are essential for planning and decision-making in many domains. Explainability is key to building user trust and meeting transparency requirements. Shapley Additive Explanations (SHAP) is a popular explainable AI framework, but it lacks efficient implementations for time series and often assumes feature independence when sampling counterfactuals. We introduce SHAPformer, an accurate, fast and sampling-free explainable time-series forecasting model based on the Transformer architecture. It leverages attention manipulation to make predictions based on feature subsets. SHAPformer generates explanations in under one second, several orders of magnitude faster than the SHAP Permutation Explainer. On synthetic data with ground truth explanations, SHAPformer provides explanations that are true to the data. Applied to real-world electrical load data, it achieves competitive predictive performance and delivers meaningful local and global insights, such as identifying the past load as the key predictor and revealing a distinct model behavior during the Christmas period.
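The abstract's core idea is that a model which can score arbitrary feature subsets directly (here reportedly via attention manipulation) admits exact Shapley values without sampling counterfactuals. A minimal sketch of that exact computation, on a toy additive stand-in model; `subset_forecast` and all names are hypothetical illustrations, not SHAPformer's API:

```python
# Exact Shapley attribution for a model that can evaluate any feature subset.
# The subset_forecast stand-in (a plain sum) is an assumed toy model, used
# only to make the enumeration concrete; it is not the paper's forecaster.
from itertools import combinations
from math import factorial

def subset_forecast(values, subset):
    # Toy masked forecast: the prediction uses only the visible features.
    return sum(values[i] for i in subset)

def exact_shapley(values, n):
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(len(others) + 1):
            for S in combinations(others, r):
                # Classic Shapley weight |S|! (n - |S| - 1)! / n!
                weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                marginal = subset_forecast(values, S + (i,)) - subset_forecast(values, S)
                phi[i] += weight * marginal
    return phi

# For an additive model, each phi_i recovers the feature's own contribution.
phi = exact_shapley([2.0, -1.0, 0.5], 3)
```

Enumerating all 2^n subsets is only feasible for small n; the appeal of a sampling-free model is that each subset evaluation is a single cheap forward pass rather than a retrained or imputed counterfactual.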
Related papers
- Fast Mining and Dynamic Time-to-Event Prediction over Multi-sensor Data Streams [15.63942084384363]
This work aims to continuously forecast the timing of future events by analyzing multi-sensor data streams. A key characteristic of real-world data streams is their dynamic nature, where the underlying patterns evolve over time. We present TimeCast, a dynamic prediction framework designed to adapt to these changes and provide accurate, real-time predictions of future event times.
arXiv Detail & Related papers (2026-01-08T09:05:57Z)
- A Unified Frequency Domain Decomposition Framework for Interpretable and Robust Time Series Forecasting [81.73338008264115]
Current approaches for time series forecasting, whether in the time or frequency domain, predominantly use deep learning models based on linear layers or transformers. We propose FIRE, a unified frequency domain decomposition framework that provides a mathematical abstraction for diverse types of time series. FIRE consistently outperforms state-of-the-art models on long-term forecasting benchmarks.
arXiv Detail & Related papers (2025-10-11T09:59:25Z)
- ProtoTS: Learning Hierarchical Prototypes for Explainable Time Series Forecasting [25.219624871510376]
We propose ProtoTS, a novel interpretable forecasting framework that achieves both high accuracy and transparent decision-making. ProtoTS computes instance-prototype similarity based on a denoised representation that preserves abundant heterogeneous information. Experiments on multiple realistic benchmarks, including a newly released LOF dataset, show that ProtoTS not only exceeds existing methods in forecast accuracy but also delivers expert-steerable interpretations.
arXiv Detail & Related papers (2025-09-27T07:10:21Z)
- Powerformer: A Transformer with Weighted Causal Attention for Time-series Forecasting [50.298817606660826]
We introduce Powerformer, a novel Transformer variant that replaces noncausal attention weights with causal weights that are reweighted according to a smooth heavy-tailed decay. Our empirical results demonstrate that Powerformer achieves state-of-the-art accuracy on public time-series benchmarks. Our analyses show that the model's locality bias is amplified during training, demonstrating an interplay between time-series data and power-law-based attention.
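The reweighting idea above can be sketched as causal softmax attention whose weights decay with temporal distance under a power law; the exact decay form and the exponent `alpha` are illustrative assumptions here, not the paper's formulation:

```python
# Hedged sketch: causal attention down-weighted by a smooth heavy-tailed
# (power-law) decay in query-key distance. Decay form and alpha are assumed.
from math import exp

def powerlaw_causal_attention(scores, alpha=1.0):
    """scores[t][s] is the raw score of query t for key s. Returns a
    row-stochastic T x T matrix: future positions are masked to zero and
    past positions are scaled by (distance + 1) ** -alpha before
    renormalization."""
    T = len(scores)
    weights = []
    for t in range(T):
        raw = [exp(scores[t][s]) * (t - s + 1) ** (-alpha) for s in range(t + 1)]
        z = sum(raw)
        weights.append([r / z for r in raw] + [0.0] * (T - t - 1))
    return weights

# With uniform scores, weight concentrates on keys closest to the query,
# giving the locality bias the blurb describes.
w = powerlaw_causal_attention([[0.0] * 3 for _ in range(3)], alpha=1.0)
```

A larger `alpha` sharpens the locality bias; `alpha = 0` recovers plain causal softmax attention.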
arXiv Detail & Related papers (2025-02-10T04:42:11Z) - Sundial: A Family of Highly Capable Time Series Foundation Models [47.27032162475962]
We introduce Sundial, a family of native, flexible, and scalable time series foundation models. Our models are pre-trained without specifying any prior distribution and can generate multiple probable predictions. Sundial achieves state-of-the-art results on both point and probabilistic forecasting benchmarks with a just-in-time inference speed.
arXiv Detail & Related papers (2025-02-02T14:52:50Z)
- Unifying Prediction and Explanation in Time-Series Transformers via Shapley-based Pretraining [0.0]
ShapTST is a framework that enables time-series transformers to efficiently generate Shapley-value-based explanations alongside predictions in a single forward pass. Our framework unifies explanation and prediction during training through a novel Shapley-based pre-training design.
arXiv Detail & Related papers (2025-01-25T04:24:35Z)
- Tackling Data Heterogeneity in Federated Time Series Forecasting [61.021413959988216]
Time series forecasting plays a critical role in various real-world applications, including energy consumption prediction, disease transmission monitoring, and weather forecasting.
Most existing methods rely on a centralized training paradigm, where large amounts of data are collected from distributed devices to a central cloud server.
We propose a novel framework, Fed-TREND, to address data heterogeneity by generating informative synthetic data as auxiliary knowledge carriers.
arXiv Detail & Related papers (2024-11-24T04:56:45Z)
- Learning Pattern-Specific Experts for Time Series Forecasting Under Patch-level Distribution Shift [51.01356105618118]
Time series often exhibit complex non-uniform distributions with varying patterns across segments, such as season, operating condition, or semantic meaning. Existing approaches, which typically train a single model to capture all these diverse patterns, often struggle with the pattern drifts between patches. We propose TFPS, a novel architecture that leverages pattern-specific experts for more accurate and adaptable time series forecasting.
arXiv Detail & Related papers (2024-10-13T13:35:29Z)
- Distributed Lag Transformer based on Time-Variable-Aware Learning for Explainable Multivariate Time Series Forecasting [4.572747329528556]
Distributed Lag Transformer (DLFormer) is a novel Transformer architecture for explainable and scalable time series models. DLFormer integrates a distributed lag embedding and a time-variable-aware learning (TVAL) mechanism to structurally model both local and global temporal dependencies. Experiments show that DLFormer achieves predictive accuracy while offering robust, interpretable insights into variable-wise and temporal dynamics.
arXiv Detail & Related papers (2024-08-29T20:39:54Z)
- DAM: Towards A Foundation Model for Time Series Forecasting [0.8231118867997028]
We propose a neural model that takes randomly sampled histories and outputs an adjustable basis composition as a continuous function of time.
It involves three key components: (1) a flexible approach for using randomly sampled histories from a long-tail distribution; (2) a transformer backbone that is trained on these actively sampled histories to produce, as representational output, (3) the basis coefficients of a continuous function of time.
arXiv Detail & Related papers (2024-07-25T08:48:07Z)
- Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning.
Transformer models have been adopted to deliver high prediction capacity because of their self-attention mechanism, which is, however, computationally expensive. We propose an efficient Transformer-based model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z)
- Transformer Hawkes Process [79.16290557505211]
We propose a Transformer Hawkes Process (THP) model, which leverages the self-attention mechanism to capture long-term dependencies.
THP outperforms existing models in terms of both likelihood and event prediction accuracy by a notable margin.
We provide a concrete example, where THP achieves improved prediction performance for learning multiple point processes when incorporating their relational information.
arXiv Detail & Related papers (2020-02-21T13:48:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.