AltTS: A Dual-Path Framework with Alternating Optimization for Multivariate Time Series Forecasting
- URL: http://arxiv.org/abs/2602.11533v1
- Date: Thu, 12 Feb 2026 03:45:00 GMT
- Title: AltTS: A Dual-Path Framework with Alternating Optimization for Multivariate Time Series Forecasting
- Authors: Zhihang Yuan, Zhiyuan Liu, Mahesh K. Marina
- Abstract summary: We propose ALTTS, a dual-path framework that explicitly decouples autoregression and cross-relation modeling. We show that ALTTS consistently outperforms prior methods, with the most pronounced improvements on long-horizon forecasting.
- Score: 27.971282358985604
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multivariate time series forecasting involves two qualitatively distinct factors: (i) stable within-series autoregressive (AR) dynamics, and (ii) intermittent cross-dimension interactions that can become spurious over long horizons. We argue that fitting a single model to capture both effects creates an optimization conflict: the high-variance updates needed for cross-dimension modeling can corrupt the gradients that support autoregression, resulting in brittle training and degraded long-horizon accuracy. To address this, we propose ALTTS, a dual-path framework that explicitly decouples autoregression and cross-relation (CR) modeling. In ALTTS, the AR path is instantiated with a linear predictor, while the CR path uses a Transformer equipped with Cross-Relation Self-Attention (CRSA); the two branches are coordinated via alternating optimization to isolate gradient noise and reduce cross-block interference. Extensive experiments on multiple benchmarks show that ALTTS consistently outperforms prior methods, with the most pronounced improvements on long-horizon forecasting. Overall, our results suggest that carefully designed optimization strategies, rather than ever more complex architectures, can be a key driver of progress in multivariate time series forecasting.
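The alternating scheme described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the AR path is a single linear map shared across channels, and the CR path is a plain linear cross-channel mixer standing in for the Transformer with CRSA, whose details are not given here. All names (`forward`, `train_alternating`), the sizes, the synthetic data, and the learning rate are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes: N windows, C channels, lookback L, horizon H.
N, C, L, H = 256, 3, 16, 4
OFF_DIAG = 1.0 - np.eye(C)  # CR path may only use *other* channels

# Synthetic data: targets mix an AR component and a cross-channel component.
X = rng.normal(size=(N, C, L))
W_true = rng.normal(size=(L, H)) / np.sqrt(L)
A_true = 0.3 * rng.normal(size=(C, C)) * OFF_DIAG
Y = (X @ W_true
     + np.einsum("cd,ndh->nch", A_true, X[:, :, -H:])
     + 0.01 * rng.normal(size=(N, C, H)))


def forward(X, W, A):
    """Sum of the AR path (per-channel linear) and the CR path (channel mixing)."""
    ar = X @ W                                        # (N, C, H)
    cr = np.einsum("cd,ndh->nch", A, X[:, :, -H:])    # (N, C, H)
    return ar + cr


def mse(X, Y, W, A):
    return float(np.mean((forward(X, W, A) - Y) ** 2))


def train_alternating(X, Y, rounds=50, steps=5, lr=0.5):
    """Alternate gradient steps: update one block while the other is frozen."""
    W = np.zeros((L, H))   # AR parameters
    A = np.zeros((C, C))   # CR parameters (assumed stand-in, not CRSA)
    S = X[:, :, -H:]
    history = [mse(X, Y, W, A)]
    for _ in range(rounds):
        for _ in range(steps):                 # AR phase, CR frozen
            R = forward(X, W, A) - Y
            W -= lr * 2.0 * np.einsum("ncl,nch->lh", X, R) / R.size
        for _ in range(steps):                 # CR phase, AR frozen
            R = forward(X, W, A) - Y
            grad_A = 2.0 * np.einsum("nch,ndh->cd", R, S) / R.size
            A -= lr * grad_A * OFF_DIAG
        history.append(mse(X, Y, W, A))
    return W, A, history


W_hat, A_hat, losses = train_alternating(X, Y)
print(f"initial MSE {losses[0]:.4f} -> final MSE {losses[-1]:.6f}")
```

In this sketch each block only ever sees gradients computed with the other block held fixed, which is the separation the abstract attributes to alternating optimization. Whether that separation helps in practice depends on the real CR model and data, which this toy example does not capture.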
Related papers
- TopoCurate:Modeling Interaction Topology for Tool-Use Agent Training [53.93696896939915]
Training tool-use agents typically relies on Supervised Fine-Tuning (SFT) on successful trajectories and Reinforcement Learning (RL) on pass-rate-selected tasks. We propose TopoCurate, an interaction-aware framework that projects multi-trial rollouts from the same task into a unified semantic quotient topology. TopoCurate achieves consistent gains of 4.2% (SFT) and 6.9% (RL) over state-of-the-art baselines.
arXiv Detail & Related papers (2026-03-02T10:38:54Z) - Improving Day-Ahead Grid Carbon Intensity Forecasting by Joint Modeling of Local-Temporal and Cross-Variable Dependencies Across Different Frequencies [3.1953619915392246]
Accurate forecasting of the grid carbon intensity factor (CIF) is critical for enabling demand-side management and reducing emissions in modern electricity systems. Despite advances in deep learning-based methods, it remains challenging to capture the fine-grained local-temporal dependencies. We propose a novel model that integrates two parallel modules.
arXiv Detail & Related papers (2026-01-10T11:20:55Z) - TS2Vec-Ensemble: An Enhanced Self-Supervised Framework for Time Series Forecasting [1.2461503242570642]
This paper introduces TS2Vec-Ensemble, a novel hybrid framework for time series forecasting. Our approach enhances the powerful, implicitly learned dynamics from a pretrained TS2Vec encoder by fusing them with explicit, engineered time features that encode periodic cycles. The results demonstrate that TS2Vec-Ensemble consistently and significantly outperforms the standard TS2Vec baseline and other state-of-the-art models.
arXiv Detail & Related papers (2025-11-27T12:19:18Z) - Adapformer: Adaptive Channel Management for Multivariate Time Series Forecasting [49.40321003932633]
Adapformer is an advanced Transformer-based framework that merges the benefits of CI and CD methodologies through effective channel management. Adapformer achieves superior performance over existing models, enhancing both predictive accuracy and computational efficiency.
arXiv Detail & Related papers (2025-11-18T16:24:05Z) - DSAT-HD: Dual-Stream Adaptive Transformer with Hybrid Decomposition for Multivariate Time Series Forecasting [14.708544628811381]
Time series forecasting is crucial for various applications, such as weather, traffic, electricity, and energy predictions. Existing approaches primarily model limited time series or fixed scales, making it more challenging to capture diverse features across different ranges. We propose the Hybrid Decomposition Dual-Stream Adaptive Transformer (DSAT-HD), which integrates three key innovations to address the limitations of existing methods.
arXiv Detail & Related papers (2025-09-29T13:50:56Z) - PatchTraj: Unified Time-Frequency Representation Learning via Dynamic Patches for Trajectory Prediction [14.48846131633279]
We propose a dynamic patch-based framework that integrates time-frequency joint modeling for trajectory prediction. Specifically, we decompose the trajectory into raw time sequences and frequency components, and employ dynamic patch partitioning to perform multi-scale segmentation. The resulting enhanced embeddings exhibit strong expressive power, enabling accurate predictions even when using a vanilla architecture.
arXiv Detail & Related papers (2025-07-25T09:55:33Z) - AutoHFormer: Efficient Hierarchical Autoregressive Transformer for Time Series Prediction [36.239648954658534]
Time series forecasting requires architectures that simultaneously achieve three competing objectives. We introduce AutoHFormer, a hierarchical autoregressive transformer that addresses these challenges. Comprehensive experiments demonstrate that AutoHFormer achieves 10.76X faster training and 6.06X memory reduction compared to PatchTST on P08.
arXiv Detail & Related papers (2025-06-19T03:47:04Z) - Nesterov Method for Asynchronous Pipeline Parallel Optimization [59.79227116582264]
We introduce a variant of Nesterov Accelerated Gradient (NAG) for asynchronous optimization in Pipeline Parallelism. Specifically, we modify the look-ahead step in NAG to effectively address the staleness in gradients. We theoretically prove that our approach converges at a sublinear rate in the presence of fixed delay in gradients.
arXiv Detail & Related papers (2025-05-02T08:23:29Z) - Timer-XL: Long-Context Transformers for Unified Time Series Forecasting [67.83502953961505]
We present Timer-XL, a causal Transformer for unified time series forecasting. Based on large-scale pre-training, Timer-XL achieves state-of-the-art zero-shot performance.
arXiv Detail & Related papers (2024-10-07T07:27:39Z) - Adaptive Multi-Scale Decomposition Framework for Time Series Forecasting [26.141054975797868]
We propose a novel Adaptive Multi-Scale Decomposition (AMD) framework for time series forecasting. Our framework decomposes time series into distinct temporal patterns at multiple scales, leveraging the Multi-Scale Decomposable Mixing (MDM) block. Our approach effectively models both temporal and channel dependencies and utilizes autocorrelation to refine multi-scale data integration.
arXiv Detail & Related papers (2024-06-06T05:27:33Z) - Parsimony or Capability? Decomposition Delivers Both in Long-term Time Series Forecasting [46.63798583414426]
Long-term time series forecasting (LTSF) represents a critical frontier in time series analysis.
Our study demonstrates, through both analytical and empirical evidence, that decomposition is key to containing excessive model inflation.
Remarkably, by tailoring decomposition to the intrinsic dynamics of time series data, our proposed model outperforms existing benchmarks.
arXiv Detail & Related papers (2024-01-22T13:15:40Z) - TACTiS-2: Better, Faster, Simpler Attentional Copulas for Multivariate Time Series [57.4208255711412]
Building on copula theory, we propose a simplified objective for the recently introduced transformer-based attentional copulas (TACTiS).
We show that the resulting model has significantly better training dynamics and achieves state-of-the-art performance across diverse real-world forecasting tasks.
arXiv Detail & Related papers (2023-10-02T16:45:19Z) - CARD: Channel Aligned Robust Blend Transformer for Time Series Forecasting [50.23240107430597]
We design a special Transformer, i.e., Channel Aligned Robust Blend Transformer (CARD for short), that addresses key shortcomings of CI type Transformer in time series forecasting.
First, CARD introduces a channel-aligned attention structure that allows it to capture both temporal correlations among signals and dynamical dependence among multiple variables over time.
Second, in order to efficiently utilize the multi-scale knowledge, we design a token blend module to generate tokens with different resolutions.
Third, we introduce a robust loss function for time series forecasting to alleviate the potential overfitting issue.
arXiv Detail & Related papers (2023-05-20T05:16:31Z) - AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation [80.33846577924363]
We present All-Pairs Multi-Field Transforms (AMT), a new network architecture for video frame interpolation.
It is based on two essential designs. First, we build bidirectional volumes for all pairs of pixels, and use the predicted bilateral flows to retrieve correlations.
Second, we derive multiple groups of fine-grained flow fields from one pair of updated coarse flows for performing backward warping on the input frames separately.
arXiv Detail & Related papers (2023-04-19T16:18:47Z) - Haar Wavelet based Block Autoregressive Flows for Trajectories [129.37479472754083]
Prediction of trajectories such as that of pedestrians is crucial to the performance of autonomous agents.
We introduce a novel Haar wavelet based block autoregressive model leveraging split couplings.
We illustrate the advantages of our approach for generating diverse and accurate trajectories on two real-world datasets.
arXiv Detail & Related papers (2020-09-21T13:57:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.