vLinear: A Powerful Linear Model for Multivariate Time Series Forecasting
- URL: http://arxiv.org/abs/2601.13768v1
- Date: Tue, 20 Jan 2026 09:23:10 GMT
- Title: vLinear: A Powerful Linear Model for Multivariate Time Series Forecasting
- Authors: Wenzhen Yue, Ruohao Guo, Ji Shi, Zihan Hao, Shiyu Hu, Xianghua Ying,
- Abstract summary: vecTrans is a lightweight module that utilizes a learnable vector to model multivariate correlations. WFMLoss is an effective plug-and-play objective, consistently improving existing forecasters.
- Score: 28.587343014443576
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we present \textbf{vLinear}, an effective yet efficient \textbf{linear}-based multivariate time series forecaster featuring two components: the \textbf{v}ecTrans module and the WFMLoss objective. Many state-of-the-art forecasters rely on self-attention or its variants to capture multivariate correlations, typically incurring $\mathcal{O}(N^2)$ computational complexity with respect to the number of variates $N$. To address this, we propose vecTrans, a lightweight module that utilizes a learnable vector to model multivariate correlations, reducing the complexity to $\mathcal{O}(N)$. Notably, vecTrans can be seamlessly integrated into Transformer-based forecasters, delivering up to 5$\times$ inference speedups and consistent performance gains. Furthermore, we introduce WFMLoss (Weighted Flow Matching Loss) as the objective. In contrast to typical \textbf{velocity-oriented} flow matching objectives, we demonstrate that a \textbf{final-series-oriented} formulation yields significantly superior forecasting accuracy. WFMLoss also incorporates path- and horizon-weighted strategies to focus learning on more reliable paths and horizons. Empirically, vLinear achieves state-of-the-art performance across 22 benchmarks and 124 forecasting settings. Moreover, WFMLoss serves as an effective plug-and-play objective, consistently improving existing forecasters. The code is available at https://anonymous.4open.science/r/vLinear.
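The abstract does not spell out the exact form of vecTrans, but its key claim, that a single learnable vector over the $N$ variates can replace $\mathcal{O}(N^2)$ attention with an $\mathcal{O}(N)$ mixing step, can be illustrated with a rough sketch. The function `vec_trans` below is hypothetical (not the paper's published implementation): it forms one global summary of all variates weighted by the learnable vector, then broadcasts that summary back to each variate, so the cost is linear in $N$.

```python
import numpy as np

def vec_trans(x, v):
    """Hypothetical sketch of an O(N) channel mixer driven by one learnable vector.

    x : (N, D) array of N variate embeddings.
    v : (N,) learnable vector over the variates.

    A softmax over v weights the variates into a single (D,) summary,
    which is then broadcast back to every variate. Both steps cost
    O(N * D), versus O(N^2 * D) for full pairwise attention.
    """
    weights = np.exp(v - v.max())      # numerically stable softmax over variates
    weights /= weights.sum()
    summary = weights @ x              # (D,) shared summary of all variates
    return x + np.outer(v, summary)    # per-variate residual update

# Toy usage: with v = 0 the residual term vanishes and x passes through unchanged.
x = np.random.randn(8, 16)
out = vec_trans(x, np.zeros(8))
```

This is only meant to show why the complexity drops to $\mathcal{O}(N)$: all cross-variate interaction is routed through one shared summary instead of $N \times N$ pairwise scores.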
Related papers
- Trajectory Consistency for One-Step Generation on Euler Mean Flows [24.038760671907024]
We propose Euler Mean Flows (EMF), a flow-based generative framework for one-step and few-step generation. EMF enforces long-range trajectory consistency with minimal sampling cost.
arXiv Detail & Related papers (2026-01-31T04:32:32Z) - Efficiency vs. Fidelity: A Comparative Analysis of Diffusion Probabilistic Models and Flow Matching on Low-Resource Hardware [0.0]
Denoising Diffusion Probabilistic Models (DDPMs) have established a new state of the art in generative image synthesis. This study presents a comparative analysis of DDPMs against the emerging Flow Matching paradigm.
arXiv Detail & Related papers (2025-11-24T18:19:42Z) - Test time training enhances in-context learning of nonlinear functions [51.56484100374058]
Test-time training (TTT) enhances model performance by explicitly updating designated parameters prior to each prediction. We investigate the combination of TTT with in-context learning (ICL), where the model is given a few examples from the target distribution at inference time.
arXiv Detail & Related papers (2025-09-30T03:56:44Z) - PT$^2$-LLM: Post-Training Ternarization for Large Language Models [52.4629647715623]
Large Language Models (LLMs) have shown impressive capabilities across diverse tasks, but their large memory and compute demands hinder deployment. We propose PT$^2$-LLM, a post-training ternarization framework tailored for LLMs. At its core is an Asymmetric Ternary Quantizer equipped with a two-stage refinement pipeline.
arXiv Detail & Related papers (2025-09-27T03:01:48Z) - MPQ-DMv2: Flexible Residual Mixed Precision Quantization for Low-Bit Diffusion Models with Temporal Distillation [74.34220141721231]
We present MPQ-DMv2, an improved Mixed Precision Quantization framework for extremely low-bit Diffusion Models.
arXiv Detail & Related papers (2025-07-06T08:16:50Z) - Multi-Scale Finetuning for Encoder-based Time Series Foundation Models [67.95907033226585]
Time series foundation models (TSFMs) demonstrate impressive zero-shot performance for time series forecasting. While naive finetuning can yield performance gains, we argue that it falls short of fully leveraging TSFMs' capabilities. We propose Multiscale finetuning (MSFT), a simple yet general framework that explicitly integrates multi-scale modeling into the finetuning process.
arXiv Detail & Related papers (2025-06-17T01:06:01Z) - MaskPro: Linear-Space Probabilistic Learning for Strict (N:M)-Sparsity on Large Language Models [53.36415620647177]
Semi-structured sparsity offers a promising solution by strategically retaining $N$ elements out of every $M$ weights. Existing (N:M)-compatible approaches typically fall into two categories: rule-based layerwise greedy search, which suffers from considerable errors, and gradient-driven learning, which incurs prohibitive training costs. We propose a novel linear-space probabilistic framework named MaskPro, which aims to learn a prior categorical distribution for every $M$ consecutive weights and subsequently leverages this distribution to generate the (N:M)-sparsity throughout an $N$-way sampling
arXiv Detail & Related papers (2025-06-15T15:02:59Z) - Sonnet: Spectral Operator Neural Network for Multivariable Time Series Forecasting [0.34530027457862006]
We propose a novel architecture, namely the Spectral Operator Neural Network (Sonnet). Sonnet applies learnable wavelet transformations to the input and incorporates spectral analysis using the Koopman operator. Our empirical analysis shows that Sonnet yields the best performance on $34$ out of $47$ forecasting tasks.
arXiv Detail & Related papers (2025-05-21T09:43:12Z) - OLinear: A Linear Model for Time Series Forecasting in Orthogonally Transformed Domain [24.24834151329251]
OLinear is a linear forecaster that operates in an orthogonally transformed domain. We introduce a customized linear layer, NormLin, which employs a normalized weight matrix to capture multivariate dependencies. Experiments on 24 benchmarks and 140 forecasting tasks demonstrate that OLinear consistently achieves state-of-the-art performance with high efficiency.
arXiv Detail & Related papers (2025-05-12T10:39:37Z) - Over-the-Air Fair Federated Learning via Multi-Objective Optimization [52.295563400314094]
We propose an over-the-air fair federated learning algorithm (OTA-FFL) to train fair FL models. Experiments demonstrate the superiority of OTA-FFL in achieving fairness and robust performance.
arXiv Detail & Related papers (2025-01-06T21:16:51Z) - VE: Modeling Multivariate Time Series Correlation with Variate Embedding [0.4893345190925178]
Current channel-independent (CI) models and models with a CI final projection layer are unable to capture correlations.
We present the variate embedding (VE) pipeline, which learns a unique and consistent embedding for each variate.
The VE pipeline can be integrated into any model with a CI final projection layer to improve multivariate forecasting.
arXiv Detail & Related papers (2024-09-10T02:49:30Z) - DeGMix: Efficient Multi-Task Dense Prediction with Deformable and Gating Mixer [129.61363098633782]
We present DeGMix, an efficient multi-task dense prediction framework built on a deformable and gating mixer. The proposed DeGMix uses fewer GFLOPs and significantly outperforms current Transformer-based and CNN-based competitive models.
arXiv Detail & Related papers (2023-08-10T17:37:49Z) - CARD: Channel Aligned Robust Blend Transformer for Time Series Forecasting [50.23240107430597]
We design a special Transformer, i.e., Channel Aligned Robust Blend Transformer (CARD for short), that addresses key shortcomings of CI type Transformer in time series forecasting.
First, CARD introduces a channel-aligned attention structure that allows it to capture temporal correlations among signals.
Second, in order to efficiently utilize the multi-scale knowledge, we design a token blend module to generate tokens with different resolutions.
Third, we introduce a robust loss function for time series forecasting to alleviate the potential overfitting issue.
arXiv Detail & Related papers (2023-05-20T05:16:31Z) - Transformers meet Stochastic Block Models: Attention with Data-Adaptive Sparsity and Cost [53.746169882193456]
Recent works have proposed various sparse attention modules to overcome the quadratic cost of self-attention.
We propose a model that resolves both problems by endowing each attention head with a mixed-membership Block Model.
Our model outperforms previous efficient variants as well as the original Transformer with full attention.
arXiv Detail & Related papers (2022-10-27T15:30:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.