Addressing Distribution Shift in Time Series Forecasting with Instance Normalization Flows
- URL: http://arxiv.org/abs/2401.16777v1
- Date: Tue, 30 Jan 2024 06:35:52 GMT
- Title: Addressing Distribution Shift in Time Series Forecasting with Instance Normalization Flows
- Authors: Wei Fan, Shun Zheng, Pengyang Wang, Rui Xie, Jiang Bian, Yanjie Fu
- Abstract summary: We propose a general decoupled formulation for time series forecasting.
We formalize this formulation as a bi-level optimization problem.
Our method consistently outperforms state-of-the-art baselines on both synthetic and real-world data.
- Score: 36.956983415564274
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the non-stationarity of time series, the distribution shift problem
largely hinders the performance of time series forecasting. Existing solutions
either fail to handle shifts beyond simple statistics or suffer from limited
compatibility with forecasting models. In this paper, we propose a general
decoupled formulation for time series forecasting, with no reliance on fixed
statistics and no restriction on forecasting architectures. We then formalize
this formulation as a bi-level optimization problem, enabling joint learning of
the transformation (outer loop) and the forecasting model (inner loop).
Moreover, the transformation's special requirements of expressiveness and
bi-direction (it must map inputs forward and map forecasts back) motivate us to
propose instance normalization flows (IN-Flow), a novel invertible network for
time series transformation. Extensive experiments demonstrate that our method
consistently outperforms state-of-the-art baselines on both synthetic and
real-world data.
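The decoupled formulation above can be pictured as an invertible transformation wrapped around an arbitrary forecasting backbone: the forecaster is fitted in the transformed space (inner loop) while the transformation is updated end-to-end (outer loop). The PyTorch sketch below is a minimal, hypothetical rendering of that pattern; the class names, tensor shapes, the toy affine transform, and the simple alternating update are assumptions made for illustration and do not reproduce the paper's actual IN-Flow architecture or bi-level optimizer.

```python
# Minimal sketch (assumed names/shapes): invertible transform + arbitrary forecaster,
# trained with an alternating, approximately bi-level scheme. Not the authors' code.
import torch
import torch.nn as nn


class LearnableAffineTransform(nn.Module):
    """Toy invertible map x -> (x - shift) * exp(-log_scale).
    IN-Flow itself stacks flow layers; an affine map keeps the sketch short."""

    def __init__(self, n_features: int):
        super().__init__()
        self.log_scale = nn.Parameter(torch.zeros(n_features))
        self.shift = nn.Parameter(torch.zeros(n_features))

    def forward(self, x):                      # x: (batch, time, features)
        return (x - self.shift) * torch.exp(-self.log_scale)

    def inverse(self, z):                      # exact inverse of forward
        return z * torch.exp(self.log_scale) + self.shift


class Forecaster(nn.Module):
    """Placeholder backbone: a linear map from the lookback window to the horizon."""

    def __init__(self, lookback: int, horizon: int):
        super().__init__()
        self.proj = nn.Linear(lookback, horizon)

    def forward(self, z):                      # (batch, lookback, F) -> (batch, horizon, F)
        return self.proj(z.transpose(1, 2)).transpose(1, 2)


def bilevel_step(transform, forecaster, opt_outer, opt_inner, x, y, inner_steps=1):
    """Inner loop fits the forecaster in the transformed space;
    outer loop updates the transformation through the full pipeline."""
    loss_fn = nn.MSELoss()
    for _ in range(inner_steps):               # inner loop: only the forecaster is updated
        opt_inner.zero_grad()
        pred = transform.inverse(forecaster(transform(x).detach()))
        loss_fn(pred, y).backward()
        opt_inner.step()
    opt_outer.zero_grad()                      # outer loop: only the transformation is updated
    outer_loss = loss_fn(transform.inverse(forecaster(transform(x))), y)
    outer_loss.backward()
    opt_outer.step()
    return outer_loss.item()


if __name__ == "__main__":
    batch, lookback, horizon, feats = 32, 96, 24, 7
    transform = LearnableAffineTransform(feats)
    forecaster = Forecaster(lookback, horizon)
    opt_outer = torch.optim.Adam(transform.parameters(), lr=1e-3)
    opt_inner = torch.optim.Adam(forecaster.parameters(), lr=1e-3)
    x = torch.randn(batch, lookback, feats)
    y = torch.randn(batch, horizon, feats)
    print(bilevel_step(transform, forecaster, opt_outer, opt_inner, x, y))
```

The property the sketch preserves is the bi-directional role of the transformation: the same module maps the input window into the transformed space and maps the forecast back to the original scale.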
Related papers
- Bridging Distribution Gaps in Time Series Foundation Model Pretraining with Prototype-Guided Normalization [29.082583523943157]
We propose a domain-aware adaptive normalization strategy within the Transformer architecture.
We replace the traditional LayerNorm with a prototype-guided dynamic normalization mechanism (ProtoNorm).
Our method significantly outperforms conventional pretraining techniques across both classification and forecasting tasks.
arXiv Detail & Related papers (2025-04-15T06:23:00Z)
- Flow-based Conformal Prediction for Multi-dimensional Time Series [9.900139803164372]
We propose a novel conformal prediction method to address two key challenges by integrating Transformer and Normalizing Flow.
The Transformer encodes the historical context of time series, and normalizing flow learns the transformation from the base distribution to the distribution of non-conformity scores conditioned on the encoded historical context.
Through comprehensive experiments on simulated and real-world time series datasets, we demonstrate that our proposed method achieves smaller prediction regions than the baselines while satisfying the desired coverage (a generic split-conformal sketch appears after this list).
arXiv Detail & Related papers (2025-02-08T22:04:05Z)
- Timer-XL: Long-Context Transformers for Unified Time Series Forecasting [67.83502953961505]
We present Timer-XL, a generative Transformer for unified time series forecasting.
Timer-XL achieves state-of-the-art performance across challenging forecasting benchmarks through a unified approach.
arXiv Detail & Related papers (2024-10-07T07:27:39Z)
- Frequency Adaptive Normalization For Non-stationary Time Series Forecasting [7.881136718623066]
Time series forecasting needs to address non-stationary data with evolving trend and seasonal patterns.
To address the non-stationarity, instance normalization has recently been proposed to alleviate the impact of the trend using simple statistical measures (a generic sketch of this normalize-forecast-denormalize pattern appears after this list).
This paper proposes a new instance normalization solution, called frequency adaptive normalization (FAN), which extends instance normalization to handle both dynamic trend and seasonal patterns.
arXiv Detail & Related papers (2024-09-30T15:07:16Z)
- Evolving Multi-Scale Normalization for Time Series Forecasting under Distribution Shifts [20.02869280775877]
We propose a novel model-agnostic Evolving Multi-Scale Normalization (EvoMSN) framework to tackle the distribution shift problem.
We evaluate the effectiveness of EvoMSN in improving the performance of five mainstream forecasting methods on benchmark datasets.
arXiv Detail & Related papers (2024-09-29T14:26:22Z)
- Robust Multivariate Time Series Forecasting against Intra- and Inter-Series Transitional Shift [40.734564394464556]
We present a unified Probabilistic Graphical Model (JointPGM) to jointly capture intra-/inter-series correlations and model the time-variant transitional distribution.
We validate the effectiveness and efficiency of JointPGM through extensive experiments on six highly non-stationary MTS datasets.
arXiv Detail & Related papers (2024-07-18T06:16:03Z)
- Marginalization Consistent Mixture of Separable Flows for Probabilistic Irregular Time Series Forecasting [4.714246221974192]
We develop a novel probabilistic irregular time series forecasting model, Marginalization Consistent Mixtures of Separable Flows (moses).
moses outperforms other state-of-the-art marginalization-consistent models and performs on par with ProFITi but, unlike ProFITi, guarantees marginalization consistency.
arXiv Detail & Related papers (2024-06-11T13:28:43Z)
- Unified Training of Universal Time Series Forecasting Transformers [104.56318980466742]
We present a Masked Encoder-based Universal Time Series Forecasting Transformer (Moirai).
Moirai is trained on our newly introduced Large-scale Open Time Series Archive (LOTSA) featuring over 27B observations across nine domains.
Moirai achieves competitive or superior performance as a zero-shot forecaster when compared to full-shot models.
arXiv Detail & Related papers (2024-02-04T20:00:45Z)
- Diffusion models for probabilistic programming [56.47577824219207]
Diffusion Model Variational Inference (DMVI) is a novel method for automated approximate inference in probabilistic programming languages (PPLs).
DMVI is easy to implement, allows hassle-free inference in PPLs without the drawbacks of, e.g., variational inference using normalizing flows, and does not impose any constraints on the underlying neural network model.
arXiv Detail & Related papers (2023-11-01T12:17:05Z)
- A novel decomposed-ensemble time series forecasting framework: capturing underlying volatility information [6.590038231008498]
We propose a novel time series forecasting paradigm that integrates decomposition with the capability to capture the underlying fluctuation information of the series.
Both the numerical data and the volatility information of each sub-mode are used to train a neural network.
This network predicts each sub-mode, and we aggregate the predictions of all sub-modes to generate the final output.
arXiv Detail & Related papers (2023-10-13T01:50:43Z)
- TACTiS-2: Better, Faster, Simpler Attentional Copulas for Multivariate Time Series [57.4208255711412]
Building on copula theory, we propose a simplified objective for the recently introduced transformer-based attentional copulas (TACTiS).
We show that the resulting model has significantly better training dynamics and achieves state-of-the-art performance across diverse real-world forecasting tasks.
arXiv Detail & Related papers (2023-10-02T16:45:19Z)
- Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning.
Transformer models have been adopted to deliver high prediction capacity, owing to the self-attention mechanism despite its high computational cost.
We propose an efficient Transformer-based model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z)
- End-to-End Modeling Hierarchical Time Series Using Autoregressive Transformer and Conditional Normalizing Flow based Reconciliation [13.447952588934337]
We propose a novel end-to-end hierarchical time series forecasting model based on an autoregressive Transformer with conditional normalizing flow-based reconciliation.
Unlike other state-of-the-art methods, we achieve the forecasting and reconciliation simultaneously without requiring any explicit post-processing step.
arXiv Detail & Related papers (2022-12-28T05:43:57Z)
- Non-stationary Transformers: Exploring the Stationarity in Time Series Forecasting [86.33543833145457]
We propose Non-stationary Transformers as a generic framework with two interdependent modules: Series Stationarization and De-stationary Attention.
Our framework consistently boosts mainstream Transformers by a large margin, which reduces MSE by 49.43% on Transformer, 47.34% on Informer, and 46.89% on Reformer.
arXiv Detail & Related papers (2022-05-28T12:27:27Z)
- Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z)
- TACTiS: Transformer-Attentional Copulas for Time Series [76.71406465526454]
Estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance.
We propose a versatile method that estimates joint distributions using an attention-based decoder.
We show that our model produces state-of-the-art predictions on several real-world datasets.
arXiv Detail & Related papers (2022-02-07T21:37:29Z)
- Causally-motivated Shortcut Removal Using Auxiliary Labels [63.686580185674195]
A key challenge to learning such risk-invariant predictors is shortcut learning.
We propose a flexible, causally-motivated approach to address this challenge.
We show both theoretically and empirically that this causally-motivated regularization scheme yields robust predictors.
arXiv Detail & Related papers (2021-05-13T16:58:45Z)
- Generalized Entropy Regularization or: There's Nothing Special about Label Smoothing [83.78668073898001]
We introduce a family of entropy regularizers, which includes label smoothing as a special case.
We find that variance in model performance can be explained largely by the resulting entropy of the model.
We advise the use of other entropy regularization methods in its place.
arXiv Detail & Related papers (2020-05-02T12:46:28Z)
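For the Flow-based Conformal Prediction entry above, the coverage guarantee it targets can be grounded with the standard split conformal construction. The snippet below is a generic textbook sketch with assumed variable names; it uses a fixed absolute-residual score rather than the Transformer-plus-normalizing-flow non-conformity model that the listed paper proposes.

```python
# Generic split conformal prediction (illustrative, not the listed paper's method):
# calibrate one interval half-width from held-out residuals so that intervals
# around new point forecasts cover the truth with probability about (1 - alpha).
import numpy as np


def split_conformal_interval(cal_residuals, test_point_forecasts, alpha=0.1):
    n = len(cal_residuals)
    # Finite-sample-corrected quantile level, clipped for very small calibration sets.
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    q = np.quantile(np.abs(cal_residuals), level)
    forecasts = np.asarray(test_point_forecasts)
    return forecasts - q, forecasts + q


# Example: residuals from any calibration split, intervals for three new forecasts.
rng = np.random.default_rng(0)
lower, upper = split_conformal_interval(rng.normal(size=200), np.array([1.2, -0.3, 0.7]))
```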
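Several entries above, notably Frequency Adaptive Normalization and the Series Stationarization module of Non-stationary Transformers, build on the same instance-normalization pattern that IN-Flow generalizes: normalize each input window by its own statistics, forecast in the normalized space, then restore those statistics on the output. The sketch below is a generic, assumed implementation of that wrapper (the names InstanceNormWrapper and LinearBackbone are illustrative), not code from any of the listed papers.

```python
# Generic instance-normalization wrapper (RevIN-style pattern, illustrative only):
# normalize each window by its own mean/std, forecast, then de-normalize.
import torch
import torch.nn as nn


class InstanceNormWrapper(nn.Module):
    def __init__(self, forecaster: nn.Module, eps: float = 1e-5):
        super().__init__()
        self.forecaster = forecaster
        self.eps = eps

    def forward(self, x):                      # x: (batch, lookback, features)
        mean = x.mean(dim=1, keepdim=True)     # per-instance, per-feature statistics
        std = x.std(dim=1, keepdim=True) + self.eps
        y_norm = self.forecaster((x - mean) / std)
        return y_norm * std + mean             # restore the instance statistics


class LinearBackbone(nn.Module):
    """Placeholder forecaster: linear map from lookback to horizon."""

    def __init__(self, lookback: int, horizon: int):
        super().__init__()
        self.proj = nn.Linear(lookback, horizon)

    def forward(self, x):                      # (batch, lookback, F) -> (batch, horizon, F)
        return self.proj(x.transpose(1, 2)).transpose(1, 2)


model = InstanceNormWrapper(LinearBackbone(lookback=96, horizon=24))
out = model(torch.randn(8, 96, 7))             # -> (8, 24, 7)
```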