Noise or Signal? Deconstructing Contradictions and An Adaptive Remedy for Reversible Normalization in Time Series Forecasting
- URL: http://arxiv.org/abs/2510.04667v1
- Date: Mon, 06 Oct 2025 10:22:20 GMT
- Title: Noise or Signal? Deconstructing Contradictions and An Adaptive Remedy for Reversible Normalization in Time Series Forecasting
- Authors: Fanzhe Fu, Yang Yang
- Abstract summary: RevIN is a key technique enabling simple linear models to achieve state-of-the-art performance in time series forecasting. This paper deconstructs the perplexing performance of various normalization strategies by identifying four underlying theoretical contradictions.
- Score: 4.212879006865343
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reversible Instance Normalization (RevIN) is a key technique enabling simple linear models to achieve state-of-the-art performance in time series forecasting. While replacing its non-robust statistics with robust counterparts (termed R²-IN) seems like a straightforward improvement, our findings reveal a far more complex reality. This paper deconstructs the perplexing performance of various normalization strategies by identifying four underlying theoretical contradictions. Our experiments provide two crucial findings. First, the standard RevIN catastrophically fails on datasets with extreme outliers, where its MSE surges by a staggering 683%. Second, while the simple R²-IN prevents this failure and unexpectedly emerges as the best overall performer, our adaptive model (A-IN), designed to test a diagnostics-driven heuristic, suffers a complete and systemic failure. This surprising outcome uncovers a critical, overlooked pitfall in time series analysis: the instability introduced by a simple or counter-intuitive heuristic can be more damaging than the statistical issues it aims to solve. The core contribution of this work is thus a new, cautionary paradigm for time series normalization: a shift from a blind search for complexity to a diagnostics-driven analysis that reveals not only the surprising power of simple baselines but also the perilous nature of naive adaptation.
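To make the contrast in the abstract concrete, below is a minimal sketch of RevIN-style instance normalization next to a robust variant. This is an illustration, not the paper's code: the abstract does not specify which robust estimators R²-IN uses, so median and interquartile range stand in as an assumption, and RevIN's optional learnable affine parameters are omitted.

```python
# Minimal sketch of instance normalization for forecasting. Assumption:
# median/IQR as R^2-IN's robust statistics (the abstract does not say),
# and no learnable affine parameters (RevIN optionally includes them).
import numpy as np

EPS = 1e-8

def instance_normalize(x, robust=False):
    """x: array of shape (batch, time, channels). Returns the normalized
    window plus the statistics needed to invert the transform later."""
    if robust:
        # Robust location/scale: per-instance median and interquartile range.
        loc = np.median(x, axis=1, keepdims=True)
        q75 = np.percentile(x, 75, axis=1, keepdims=True)
        q25 = np.percentile(x, 25, axis=1, keepdims=True)
        scale = (q75 - q25) + EPS
    else:
        # Standard RevIN: per-instance mean and standard deviation.
        loc = x.mean(axis=1, keepdims=True)
        scale = x.std(axis=1, keepdims=True) + EPS
    return (x - loc) / scale, (loc, scale)

def instance_denormalize(y_hat, stats):
    """Map a forecast made in normalized space back to the original scale."""
    loc, scale = stats
    return y_hat * scale + loc

# One extreme outlier inflates the mean/std statistics but barely moves
# the median/IQR, which is the failure mode the abstract describes.
x = np.random.default_rng(0).standard_normal((1, 96, 1))
x[0, 50, 0] = 100.0
_, (_, scale_std) = instance_normalize(x)
_, (_, scale_rob) = instance_normalize(x, robust=True)
print(f"std scale: {scale_std.item():.2f}, robust scale: {scale_rob.item():.2f}")
```

The usage at the bottom illustrates the reported failure mode: a single extreme outlier distorts the mean/std statistics, and therefore every denormalized forecast, while the robust statistics barely move.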
Related papers
- Observationally Informed Adaptive Causal Experimental Design [55.998153710215654]
We propose Active Residual Learning, a new paradigm that leverages the observational model as a foundational prior. This approach shifts the experimental focus from learning target causal quantities from scratch to efficiently estimating the residuals required to correct observational bias. Experiments on synthetic and semi-synthetic benchmarks demonstrate that R-Design significantly outperforms baselines.
arXiv Detail & Related papers (2026-03-04T06:52:37Z)
- The Procrustean Bed of Time Series: The Optimization Bias of Point-wise Loss [53.542743390809356]
This paper aims to provide a first-principles analysis of the Expectation of Optimization Bias (EOB). Our analysis reveals a fundamental paradox: the more deterministic and structured the time series, the more severe the bias induced by point-wise loss functions. We present a concrete solution that simultaneously achieves both principles via DFT or DWT.
arXiv Detail & Related papers (2025-12-21T06:08:22Z)
- Labels Matter More Than Models: Quantifying the Benefit of Supervised Time Series Anomaly Detection [56.302586730134806]
Time series anomaly detection (TSAD) is a critical data mining task often constrained by label scarcity. Current research predominantly focuses on Unsupervised Time-series Anomaly Detection. This paper challenges the premise that architectural complexity is the optimal path for TSAD.
arXiv Detail & Related papers (2025-11-20T08:32:49Z)
- RI-Loss: A Learnable Residual-Informed Loss for Time Series Forecasting [13.117430904377905]
Time series forecasting relies on predicting future values from historical data. MSE has two fundamental weaknesses: its point-wise error fails to capture temporal relationships, and it does not account for inherent noise in the data. We introduce the Residual-Informed Loss (RI-Loss), a novel objective function based on the Hilbert-Schmidt Independence Criterion (HSIC).
arXiv Detail & Related papers (2025-11-13T09:36:00Z)
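The RI-Loss entry above names the Hilbert-Schmidt Independence Criterion but not how it is wired into the training objective, so the sketch below shows only the standard biased empirical HSIC estimator, HSIC(X, Y) ≈ tr(KHLH)/(n-1)², with Gaussian kernels; the bandwidth is an illustrative default.

```python
# Biased empirical HSIC estimator with Gaussian kernel Gram matrices K, L
# and centering matrix H. Nothing beyond the estimator itself is claimed
# about RI-Loss; sigma=1.0 is an illustrative default bandwidth.
import numpy as np

def gaussian_gram(z, sigma=1.0):
    """Gram matrix G[i, j] = exp(-||z_i - z_j||^2 / (2 * sigma**2))."""
    sq = np.sum(z**2, axis=1)
    dist2 = sq[:, None] + sq[None, :] - 2.0 * z @ z.T
    return np.exp(-dist2 / (2.0 * sigma**2))

def hsic(x, y, sigma=1.0):
    """Biased HSIC estimate for paired samples x, y of shape (n, d)."""
    n = x.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    k, l = gaussian_gram(x, sigma), gaussian_gram(y, sigma)
    return float(np.trace(k @ h @ l @ h)) / (n - 1) ** 2

# Dependent pairs score well above independent ones.
rng = np.random.default_rng(0)
a = rng.standard_normal((200, 1))
print(hsic(a, a**2), hsic(a, rng.standard_normal((200, 1))))
```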
- Revisiting Multivariate Time Series Forecasting with Missing Values [74.56971641937771]
Missing values are common in real-world time series. Current approaches have developed an imputation-then-prediction framework that uses imputation modules to fill in missing values, followed by forecasting on the imputed data. This framework overlooks a critical issue: there is no ground truth for the missing values, making the imputation process susceptible to errors that can degrade prediction accuracy. We introduce Consistency-Regularized Information Bottleneck (CRIB), a novel framework built on the Information Bottleneck principle.
arXiv Detail & Related papers (2025-09-27T20:57:48Z)
- Beyond Model Ranking: Predictability-Aligned Evaluation for Time Series Forecasting [18.018179328110048]
We introduce a predictability-aligned diagnostic framework grounded in spectral coherence. We provide the first systematic evidence of "predictability drift", demonstrating that a task's forecasting difficulty varies sharply over time. Our evaluation reveals a key architectural trade-off: complex models are superior for low-predictability data, whereas linear models are highly effective on more predictable tasks.
arXiv Detail & Related papers (2025-09-27T02:56:06Z)
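The Beyond Model Ranking entry says only that its diagnostic is grounded in spectral coherence, so the following is one plausible, hypothetical reading rather than the paper's method: magnitude-squared coherence between a series and a horizon-shifted copy of itself, computed with SciPy's scipy.signal.coherence, as a rough per-frequency predictability score.

```python
# Hypothetical predictability proxy: coherence between a series and a copy
# of itself shifted by a forecasting horizon. The horizon of 96 steps and
# the nperseg window are arbitrary illustrative choices.
import numpy as np
from scipy.signal import coherence

rng = np.random.default_rng(0)
t = np.arange(4096)
# A predictable seasonal component buried in noise.
x = np.sin(2 * np.pi * t / 64) + 0.5 * rng.standard_normal(t.size)

horizon = 96
f, cxy = coherence(x[:-horizon], x[horizon:], fs=1.0, nperseg=256)
print(f"peak coherence {cxy.max():.2f} at frequency {f[cxy.argmax()]:.4f}")
```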
- Towards Foundation Models for Zero-Shot Time Series Anomaly Detection: Leveraging Synthetic Data and Relative Context Discrepancy [33.68487894996624]
Time series anomaly detection (TSAD) is a critical task, but developing models that generalize to unseen data remains a major challenge. We introduce TimeRCD, a novel foundation model for TSAD built upon a new pre-training paradigm: Relative Context Discrepancy (RCD). We show that TimeRCD significantly outperforms existing general-purpose and anomaly-specific foundation models in zero-shot TSAD.
arXiv Detail & Related papers (2025-09-25T14:05:15Z)
- Bridging Internal Probability and Self-Consistency for Effective and Efficient LLM Reasoning [53.25336975467293]
We present the first theoretical error decomposition analysis of methods such as perplexity and self-consistency. Our analysis reveals a fundamental trade-off: perplexity methods suffer from substantial model error due to the absence of a proper consistency function. We propose Reasoning-Pruning Perplexity Consistency (RPC), which integrates perplexity with self-consistency, and Reasoning Pruning, which eliminates low-probability reasoning paths.
arXiv Detail & Related papers (2025-02-01T18:09:49Z)
- Benign Overfitting in Out-of-Distribution Generalization of Linear Models [19.203753135860016]
We take an initial step towards understanding benign overfitting in the Out-of-Distribution (OOD) regime. We provide non-asymptotic guarantees proving that benign overfitting occurs in standard ridge regression. We also present theoretical results for a more general family of target covariance matrices.
arXiv Detail & Related papers (2024-12-19T02:47:39Z)
- High-dimensional logistic regression with missing data: Imputation, regularization, and universality [7.167672851569787]
We study high-dimensional, ridge-regularized logistic regression.
We provide exact characterizations of both the prediction error and the estimation error.
arXiv Detail & Related papers (2024-10-01T21:41:21Z)
- Understanding, Predicting and Better Resolving Q-Value Divergence in Offline-RL [86.0987896274354]
We first identify a fundamental pattern, self-excitation, as the primary cause of Q-value estimation divergence in offline RL.
We then propose a novel Self-Excite Eigenvalue Measure (SEEM) metric to track the evolving properties of the Q-network during training.
For the first time, our theory can reliably decide whether the training will diverge at an early stage.
arXiv Detail & Related papers (2023-10-06T17:57:44Z)
- Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness [68.97830259849086]
Most datasets only capture a simpler subproblem and likely suffer from spurious features.
We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features.
Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound.
Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
arXiv Detail & Related papers (2021-10-21T07:28:11Z) - The Risks of Invariant Risk Minimization [52.7137956951533]
Invariant Risk Minimization (IRM) is an objective based on the idea of learning deep, invariant features of data.
We present the first analysis of classification under the IRM objective, as well as recently proposed alternatives, under a fairly natural and general model.
We show that IRM can fail catastrophically unless the test data are sufficiently similar to the training distribution; this is precisely the issue it was intended to solve.
arXiv Detail & Related papers (2020-10-12T14:54:32Z)