time2time: Causal Intervention in Hidden States to Simulate Rare Events in Time Series Foundation Models
- URL: http://arxiv.org/abs/2509.05801v2
- Date: Sat, 04 Oct 2025 15:13:16 GMT
- Title: time2time: Causal Intervention in Hidden States to Simulate Rare Events in Time Series Foundation Models
- Authors: Debdeep Sanyal, Aaryan Nagpal, Dhruv Kumar, Murari Mandal, Saurabh Deshpande,
- Abstract summary: We introduce activation transplantation, a causal intervention that manipulates hidden states by imposing the statistical moments of one event onto another. We find that models encode a graded notion of event severity, with the latent vector norm directly correlating with the magnitude of systemic shocks. Our findings provide evidence for a latent concept space that governs model predictions, shifting interpretability from post-hoc attribution to direct causal intervention.
- Score: 7.400285974510125
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While transformer-based foundation models excel at forecasting routine patterns, two questions remain: do they internalize semantic concepts such as market regimes, or merely fit curves? And can their internal representations be leveraged to simulate rare, high-stakes events such as market crashes? To investigate this, we introduce activation transplantation, a causal intervention that manipulates hidden states by imposing the statistical moments of one event (e.g., a historical crash) onto another (e.g., a calm period) during the forward pass. This procedure deterministically steers forecasts: injecting crash semantics induces downturn predictions, while injecting calm semantics suppresses crashes and restores stability. Beyond binary control, we find that models encode a graded notion of event severity, with the latent vector norm directly correlating with the magnitude of systemic shocks. Validated across two architecturally distinct TSFMs, Toto (decoder only) and Chronos (encoder-decoder), our results demonstrate that steerable, semantically grounded representations are a robust property of large time series transformers. Our findings provide evidence for a latent concept space that governs model predictions, shifting interpretability from post-hoc attribution to direct causal intervention, and enabling semantic "what-if" analysis for strategic stress-testing.
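The moment-imposition step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract only states that the statistical moments of one event are imposed on another, so the choice of first and second moments (mean and standard deviation), the per-feature axis, the function name `transplant_moments`, and the synthetic "calm" and "crash" arrays are all assumptions made for illustration. In a real model the arrays would be hidden states captured at some layer during the forward pass.

```python
import numpy as np

def transplant_moments(target_h, donor_h, eps=1e-6):
    """Re-standardize target hidden states to the donor's per-feature mean and std.

    Both inputs have shape (tokens, d_model). Hypothetical helper: which
    moments are matched, and along which axis, is an assumption here.
    """
    t_mean = target_h.mean(axis=0, keepdims=True)
    t_std = target_h.std(axis=0, keepdims=True)
    d_mean = donor_h.mean(axis=0, keepdims=True)
    d_std = donor_h.std(axis=0, keepdims=True)
    # Standardize the target, then rescale and shift to the donor's statistics.
    return (target_h - t_mean) / (t_std + eps) * d_std + d_mean

# Synthetic stand-ins for hidden states (assumed shapes and distributions).
rng = np.random.default_rng(0)
calm = rng.normal(0.0, 1.0, size=(64, 16))    # "calm period" activations
crash = rng.normal(-2.0, 3.0, size=(48, 16))  # "historical crash" activations

steered = transplant_moments(calm, crash)
```

After the call, `steered` keeps the token-level structure of `calm` but carries the per-feature mean and spread of `crash`; swapping the two arguments would sketch the reverse direction, in which calm statistics are imposed on crash activations.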
Related papers
- Real-Time Proactive Anomaly Detection via Forward and Backward Forecast Modeling [0.0]
We introduce two proactive anomaly detection frameworks: the Forward Forecasting Model (FFM) and the Backward Reconstruction Model (BRM). FFM forecasts future sequences to anticipate disruptions, while BRM reconstructs recent history from future context to uncover early precursors. Our models support both continuous and discrete multivariate features, enabling robust performance in real-world settings.
arXiv Detail & Related papers (2026-02-12T03:57:41Z) - TRACE: Scalable Amortized Causal Discovery from Single Sequences via Autoregressive Density Estimation [14.409508347156397]
We study causal discovery from a single observed sequence of discrete events generated by a process. We introduce TRACE, a scalable framework that repurposes autoregressive models as pretrained density estimators for conditional mutual information estimation.
arXiv Detail & Related papers (2026-02-01T10:18:27Z) - From Observations to States: Latent Time Series Forecasting [65.98504021691666]
We propose Latent Time Series Forecasting (LatentTSF), a novel paradigm that shifts TSF from observation regression to latent state prediction. Specifically, LatentTSF employs an AutoEncoder to project observations at each time step into a higher-dimensional latent state space. Our proposed latent objectives implicitly maximize mutual information between predicted latent states and ground-truth states and observations.
arXiv Detail & Related papers (2026-01-30T20:39:44Z) - EVEREST: An Evidential, Tail-Aware Transformer for Rare-Event Time-Series Forecasting [4.551615447454767]
EVEREST is a transformer-based architecture for probabilistic rare-event forecasting. It delivers calibrated predictions and tail-aware risk estimation. It is applicable to high-stakes domains such as industrial monitoring, weather, and satellite diagnostics.
arXiv Detail & Related papers (2026-01-26T23:15:20Z) - SynCast: Synergizing Contradictions in Precipitation Nowcasting via Diffusion Sequential Preference Optimization [62.958457694151384]
We introduce preference optimization into precipitation nowcasting for the first time, motivated by the success of reinforcement learning from human feedback in large language models. In the first stage, the framework focuses on reducing FAR, training the model to effectively suppress false alarms.
arXiv Detail & Related papers (2025-10-22T16:11:22Z) - Drift No More? Context Equilibria in Multi-Turn LLM Interactions [58.69551510148673]
Context drift is the gradual divergence of a model's outputs from goal-consistent behavior across turns. Unlike single-turn errors, drift unfolds temporally and is poorly captured by static evaluation metrics. We show that multi-turn drift can be understood as a controllable equilibrium phenomenon rather than as inevitable decay.
arXiv Detail & Related papers (2025-10-09T04:48:49Z) - Revisiting Multivariate Time Series Forecasting with Missing Values [65.30332997607141]
Missing values are common in real-world time series. Current approaches have developed an imputation-then-prediction framework that uses imputation modules to fill in missing values, followed by forecasting on the imputed data. This framework overlooks a critical issue: there is no ground truth for the missing values, making the imputation process susceptible to errors that can degrade prediction accuracy. We introduce Consistency-Regularized Information Bottleneck (CRIB), a novel framework built on the Information Bottleneck principle.
arXiv Detail & Related papers (2025-09-27T20:57:48Z) - Adaptive Conformal Prediction Intervals Over Trajectory Ensembles [50.31074512684758]
Future trajectories play an important role across domains such as autonomous driving, hurricane forecasting, and epidemic modeling. We propose a unified framework based on conformal prediction that transforms sampled trajectories into calibrated prediction intervals with theoretical coverage guarantees.
arXiv Detail & Related papers (2025-08-18T21:14:07Z) - SurvSurf: a partially monotonic neural network for first-hitting time prediction of intermittently observed discrete and continuous sequential events [7.861592120016206]
We propose a neural-network based survival model (SurvSurf) specifically designed for direct and simultaneous probabilistic prediction of the first hitting time of sequential events from baseline. SurvSurf is theoretically guaranteed to never violate the monotonic relationship between the cumulative incidence functions of sequential events, while allowing nonlinear influence from predictors.
arXiv Detail & Related papers (2025-04-07T12:24:59Z) - Lightweight Channel-wise Dynamic Fusion Model: Non-stationary Time Series Forecasting via Entropy Analysis [25.291749176117662]
We show that variance can be a valid and interpretable proxy for non-stationarity of time series. We propose a novel lightweight Channel-wise Dynamic Fusion Model (CDFM). Comprehensive experiments on seven time series datasets demonstrate the superiority and generalization capabilities of CDFM.
arXiv Detail & Related papers (2025-03-04T13:29:42Z) - Spatiotemporal Prediction of Secondary Crashes by Rebalancing Dynamic and Static Data with Generative Adversarial Networks [6.571659350175123]
Secondary crashes significantly exacerbate traffic congestion and increase the severity of incidents. Existing methods fail to fully address the complexity of traffic crash data, particularly the coexistence of dynamic and static features. This study proposes a hybrid model named VarFusiGAN-Transformer, aimed at improving the fidelity of secondary crash data generation.
arXiv Detail & Related papers (2025-01-17T08:56:49Z) - TrajGPT: Controlled Synthetic Trajectory Generation Using a Multitask Transformer-Based Spatiotemporal Model [5.5481213807584995]
We introduce TrajGPT, a transformer-based, multi-task, joint generative model to address these issues.
Taking inspiration from large language models, TrajGPT poses the problem of controlled trajectory generation as that of text infilling in natural language.
Our experiments on public and private datasets demonstrate that TrajGPT not only excels in controlled synthetic visit generation but also outperforms competing models in next-location prediction tasks.
arXiv Detail & Related papers (2024-11-07T02:47:50Z) - When Does Confidence-Based Cascade Deferral Suffice? [69.28314307469381]
Cascades are a classical strategy to enable inference cost to vary adaptively across samples.
A deferral rule determines whether to invoke the next classifier in the sequence, or to terminate prediction.
Despite being oblivious to the structure of the cascade, confidence-based deferral often works remarkably well in practice.
arXiv Detail & Related papers (2023-07-06T04:13:57Z) - Self-Interpretable Time Series Prediction with Counterfactual Explanations [4.658166900129066]
Interpretable time series prediction is crucial for safety-critical areas such as healthcare and autonomous driving.
Most existing methods focus on interpreting predictions by assigning importance scores to segments of time series.
We develop a self-interpretable model, dubbed Counterfactual Time Series (CounTS), which generates counterfactual and actionable explanations for time series predictions.
arXiv Detail & Related papers (2023-06-09T16:42:52Z) - Uncovering the Missing Pattern: Unified Framework Towards Trajectory Imputation and Prediction [60.60223171143206]
Trajectory prediction is a crucial undertaking in understanding entity movement or human behavior from observed sequences.
Current methods often assume that the observed sequences are complete while ignoring the potential for missing values.
This paper presents a unified framework, the Graph-based Conditional Variational Recurrent Neural Network (GC-VRNN), which can perform trajectory imputation and prediction simultaneously.
arXiv Detail & Related papers (2023-03-28T14:27:27Z) - Tracking the risk of a deployed model and detecting harmful distribution shifts [105.27463615756733]
In practice, it may make sense to ignore benign shifts, under which the performance of a deployed model does not degrade substantially.
We argue that a sensible method for firing off a warning has to both (a) detect harmful shifts while ignoring benign ones, and (b) allow continuous monitoring of model performance without increasing the false alarm rate.
arXiv Detail & Related papers (2021-10-12T17:21:41Z) - Stochastically forced ensemble dynamic mode decomposition for forecasting and analysis of near-periodic systems [65.44033635330604]
We introduce a novel load forecasting method in which observed dynamics are modeled as a forced linear system.
We show that its use of intrinsic linear dynamics offers a number of desirable properties in terms of interpretability and parsimony.
Results are presented for a test case using load data from an electrical grid.
arXiv Detail & Related papers (2020-10-08T20:25:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.