Enhancing LLMs for Time Series Forecasting via Structure-Guided Cross-Modal Alignment
- URL: http://arxiv.org/abs/2505.13175v1
- Date: Mon, 19 May 2025 14:30:41 GMT
- Title: Enhancing LLMs for Time Series Forecasting via Structure-Guided Cross-Modal Alignment
- Authors: Siming Sun, Kai Zhang, Xuejun Jiang, Wenchao Meng, Qinmin Yang
- Abstract summary: We propose a framework that exploits and aligns the state-transition graph structures shared by time-series and linguistic data as sequential modalities. Experiments on multiple benchmarks demonstrate that SGCMA achieves state-of-the-art performance.
- Score: 12.319685395140862
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The emerging paradigm of leveraging pretrained large language models (LLMs) for time series forecasting has predominantly employed linguistic-temporal modality alignment strategies through token-level or layer-wise feature mapping. However, these approaches fundamentally neglect a critical insight: the core competency of LLMs resides not merely in processing localized token features but in their inherent capacity to model holistic sequence structures. This paper posits that effective cross-modal alignment necessitates structural consistency at the sequence level. We propose Structure-Guided Cross-Modal Alignment (SGCMA), a framework that fully exploits and aligns the state-transition graph structures shared by time-series and linguistic data as sequential modalities, thereby endowing time series with language-like properties and delivering stronger generalization after modality alignment. SGCMA consists of two key components: Structure Alignment and Semantic Alignment. In Structure Alignment, a state-transition matrix is learned from text data through Hidden Markov Models (HMMs), and a shallow Transformer-based Maximum Entropy Markov Model (MEMM) receives this hot-start transition matrix and annotates each temporal patch with a state probability distribution, ensuring that the temporal representation sequence inherits language-like sequential dynamics. In Semantic Alignment, cross-attention is applied between temporal patches and the top-k tokens within each state, and the final temporal embeddings are derived as the expectation of these state-wise embeddings, weighted by the state probabilities. Experiments on multiple benchmarks demonstrate that SGCMA achieves state-of-the-art performance, offering a novel approach to cross-modal alignment in time series forecasting.
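To make the Semantic Alignment step concrete, here is a minimal PyTorch sketch of the mechanism the abstract describes: each temporal patch cross-attends to a per-state set of top-k token embeddings, and the final embedding is the expectation over states under the patch's state probabilities. The module name, the random placeholder for the per-state token embeddings, and all dimensions are illustrative assumptions, not the authors' implementation.
```python
# Hypothetical sketch of the Semantic Alignment step described in the abstract.
import torch
import torch.nn as nn

class SemanticAlignment(nn.Module):
    def __init__(self, d_model: int, num_states: int, top_k: int, n_heads: int = 4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Placeholder for the top-k token embeddings of each state; in SGCMA these
        # would come from the LLM vocabulary rather than random initialization.
        self.state_tokens = nn.Parameter(
            torch.randn(num_states, top_k, d_model), requires_grad=False
        )

    def forward(self, patches: torch.Tensor, state_probs: torch.Tensor) -> torch.Tensor:
        # patches:     (batch, num_patches, d_model)   temporal patch embeddings
        # state_probs: (batch, num_patches, num_states) from the state annotator
        B, P, D = patches.shape
        S, K, _ = self.state_tokens.shape
        per_state = []
        for s in range(S):
            keys = self.state_tokens[s].unsqueeze(0).expand(B, K, D)
            attended, _ = self.cross_attn(patches, keys, keys)   # (B, P, D)
            per_state.append(attended)
        per_state = torch.stack(per_state, dim=2)                # (B, P, S, D)
        # Expected embedding of each patch under its state distribution.
        return (state_probs.unsqueeze(-1) * per_state).sum(dim=2)

# Toy usage:
# align = SemanticAlignment(d_model=64, num_states=8, top_k=16)
# out = align(torch.randn(2, 32, 64), torch.softmax(torch.randn(2, 32, 8), dim=-1))
```
In the full framework, the state probabilities would be produced by the shallow Transformer-based MEMM warm-started with the HMM transition matrix learned from text.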
Related papers
- Reprogramming Vision Foundation Models for Spatio-Temporal Forecasting [12.591771385493509]
We present ST-VFM, a framework that systematically reprograms Vision Foundation Models (VFMs) for general-purpose spatio-temporal forecasting. The framework integrates raw inputs with an auxiliary ST flow branch, where the flow encodes lightweight temporal difference signals interpretable as dynamic cues. The pre-VFM reprogramming stage applies a Temporal-Aware Token to align both branches into VFM-compatible feature spaces, and the post-VFM reprogramming stage introduces a Bilateral Cross-Prompt Coordination module, enabling dynamic interaction between branches.
arXiv Detail & Related papers (2025-07-14T08:33:34Z)
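As a loose illustration of the ST-VFM summary above, the sketch below computes lightweight temporal-difference signals as an auxiliary flow branch and projects both branches into a hypothetical VFM-compatible embedding space; the class name, projections, and shapes are assumptions for illustration only.
```python
# Illustrative two-branch reprogramming: raw inputs plus a temporal-difference "flow".
import torch
import torch.nn as nn

class TwoBranchReprogrammer(nn.Module):
    def __init__(self, in_dim: int, vfm_dim: int):
        super().__init__()
        self.raw_proj = nn.Linear(in_dim, vfm_dim)   # project raw observations
        self.flow_proj = nn.Linear(in_dim, vfm_dim)  # project the ST flow branch

    def forward(self, x: torch.Tensor):
        # x: (batch, time, in_dim) spatio-temporal observations
        flow = x[:, 1:] - x[:, :-1]                  # lightweight temporal differences
        flow = torch.cat([torch.zeros_like(x[:, :1]), flow], dim=1)
        return self.raw_proj(x), self.flow_proj(flow)

# tokens_raw, tokens_flow = TwoBranchReprogrammer(16, 768)(torch.randn(2, 8, 16))
```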
- SEED: A Structural Encoder for Embedding-Driven Decoding in Time Series Prediction with LLMs [3.036179638516407]
We introduce SEED, a structural encoder for embedding-driven decoding, which integrates four stages, including a token-aware encoder for patch extraction, a projection module that aligns patches with language model embeddings, and a semantic reprogramming mechanism that maps patches to task-aware prototypes. This modular architecture decouples representation learning from inference, enabling efficient alignment between numerical patterns and semantic reasoning.
arXiv Detail & Related papers (2025-06-25T06:40:14Z)
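A hypothetical sketch of the kind of pipeline the SEED summary above outlines: patch extraction, projection into a language-model embedding space, and a soft mapping of patches onto task-aware prototypes. The stage boundaries, the similarity-softmax prototype mixing, and every dimension here are assumptions rather than SEED's actual design.
```python
# Hypothetical patch -> projection -> prototype pipeline; not SEED's actual code.
import torch
import torch.nn as nn

class PatchPrototypeEncoder(nn.Module):
    def __init__(self, patch_len: int, lm_dim: int, num_prototypes: int):
        super().__init__()
        self.patch_len = patch_len
        self.proj = nn.Linear(patch_len, lm_dim)              # align patches with LM embeddings
        self.prototypes = nn.Parameter(torch.randn(num_prototypes, lm_dim))

    def forward(self, series: torch.Tensor) -> torch.Tensor:
        # series: (batch, length); length assumed divisible by patch_len
        B, L = series.shape
        patches = series.reshape(B, L // self.patch_len, self.patch_len)
        z = self.proj(patches)                                # (B, num_patches, lm_dim)
        # Soft "semantic reprogramming": mix prototypes by similarity to each patch.
        weights = torch.softmax(z @ self.prototypes.T, dim=-1)
        return weights @ self.prototypes                      # (B, num_patches, lm_dim)

# emb = PatchPrototypeEncoder(patch_len=16, lm_dim=768, num_prototypes=32)(torch.randn(4, 96))
```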
- Sequential-Parallel Duality in Prefix Scannable Models [68.39855814099997]
Recent developments have given rise to various models, such as Gated Linear Attention (GLA) and Mamba. This raises a natural question: can we characterize the full class of neural sequence models that support near-constant-time parallel evaluation and linear-time, constant-space sequential inference?
arXiv Detail & Related papers (2025-06-12T17:32:02Z)
- Unify and Anchor: A Context-Aware Transformer for Cross-Domain Time Series Forecasting [26.59526791215]
We identify two key challenges in cross-domain time series forecasting: the complexity of temporal patterns and semantic misalignment. We propose the "Unify and Anchor" transfer paradigm, which disentangles frequency components for a unified perspective. We introduce ContexTST, a Transformer-based model that employs a time series coordinator for structured representation.
arXiv Detail & Related papers (2025-03-03T04:11:14Z)
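As a generic illustration of disentangling frequency components, as mentioned in the Unify-and-Anchor summary above, the snippet below splits a series into low- and high-frequency parts with an FFT; the band cutoff and the function itself are arbitrary assumptions, not ContexTST's coordinator.
```python
# Generic frequency disentanglement via FFT band-splitting; illustrative only.
import torch

def split_frequency_components(x: torch.Tensor, cutoff: int):
    """x: (batch, length) real-valued series; cutoff: number of low-frequency bins kept."""
    spec = torch.fft.rfft(x, dim=-1)
    low = spec.clone()
    low[..., cutoff:] = 0                      # keep only the slow components
    high = spec - low                          # the remaining fast components
    n = x.shape[-1]
    return torch.fft.irfft(low, n=n, dim=-1), torch.fft.irfft(high, n=n, dim=-1)

# trend_like, detail_like = split_frequency_components(torch.randn(2, 96), cutoff=8)
```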
- PICASO: Permutation-Invariant Context Composition with State Space Models [98.91198288025117]
State Space Models (SSMs) offer a promising solution by allowing a database of contexts to be mapped onto fixed-dimensional states. We propose a simple mathematical relation derived from SSM dynamics to compose multiple states into one that efficiently approximates the effect of concatenating raw context tokens. We evaluate the resulting method on WikiText and MSMARCO in both zero-shot and fine-tuned settings, and show that we can match the strongest-performing baseline while enjoying an average 5.4x speedup.
arXiv Detail & Related papers (2025-02-24T19:48:00Z)
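The PICASO summary above hinges on composing SSM states so that they approximate concatenating raw context tokens. As a toy illustration of why such composition is possible for linear SSM dynamics, and only as an assumption-laden stand-in for the paper's actual relation, the script below represents each context's effect on the state as an affine map and checks that composing two maps matches processing the concatenated contexts.
```python
# Toy illustration (not PICASO's actual rule): for a linear SSM, a context acts on
# the state as an affine map (M, v); concatenating contexts composes these maps.
import torch

def context_map(A: torch.Tensor, B: torch.Tensor, tokens: torch.Tensor):
    """Return (M, v) such that processing `tokens` maps state h -> M @ h + v."""
    d = A.shape[0]
    M, v = torch.eye(d), torch.zeros(d)
    for u in tokens:                       # h <- A h + B u, tracked symbolically
        M, v = A @ M, A @ v + B @ u
    return M, v

def compose(m1, m2):
    """Map for context c1 followed by context c2."""
    (M1, v1), (M2, v2) = m1, m2
    return M2 @ M1, M2 @ v1 + v2

torch.manual_seed(0)
d, k = 4, 3
A, B = 0.5 * torch.randn(d, d), torch.randn(d, k)
c1, c2 = torch.randn(5, k), torch.randn(7, k)

M12, v12 = compose(context_map(A, B, c1), context_map(A, B, c2))
M_cat, v_cat = context_map(A, B, torch.cat([c1, c2]))
assert torch.allclose(M12, M_cat, atol=1e-4) and torch.allclose(v12, v_cat, atol=1e-4)
```
The takeaway is that per-context state summaries can be cached and combined on demand without re-reading tokens, which is the flavor of efficiency gain the PICASO abstract reports.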
- Hierarchical Multimodal LLMs with Semantic Space Alignment for Enhanced Time Series Classification [4.5939667818289385]
HiTime is a hierarchical multi-modal model that seamlessly integrates temporal information into large language models.
Our findings highlight the potential of integrating temporal features into LLMs, paving the way for advanced time series analysis.
arXiv Detail & Related papers (2024-10-24T12:32:19Z)
- Multi-Patch Prediction: Adapting LLMs for Time Series Representation Learning [22.28251586213348]
aLLM4TS is an innovative framework that adapts Large Language Models (LLMs) for time-series representation learning.
A distinctive element of our framework is the patch-wise decoding layer, which departs from previous methods reliant on sequence-level decoding.
arXiv Detail & Related papers (2024-02-07T13:51:26Z)
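To illustrate the distinction drawn in the aLLM4TS summary above, the snippet below contrasts a patch-wise decoding head, which maps each patch representation to its own output segment with a shared linear layer, against a sequence-level head that flattens all patch representations before decoding; both heads and their sizes are generic assumptions, not aLLM4TS's layers.
```python
# Generic contrast between patch-wise and sequence-level decoding heads.
import torch
import torch.nn as nn

d_model, num_patches, patch_len = 64, 12, 8
reps = torch.randn(2, num_patches, d_model)        # per-patch representations

# Patch-wise decoding: one shared linear head applied to every patch independently.
patch_head = nn.Linear(d_model, patch_len)
y_patchwise = patch_head(reps)                     # (2, 12, 8), reshape to (2, 96) if needed

# Sequence-level decoding: flatten all patches, decode the whole horizon at once.
seq_head = nn.Linear(num_patches * d_model, num_patches * patch_len)
y_sequence = seq_head(reps.reshape(2, -1))         # (2, 96)
```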
- Disentangling Structured Components: Towards Adaptive, Interpretable and Scalable Time Series Forecasting [52.47493322446537]
We develop an adaptive, interpretable, and scalable forecasting framework, which seeks to individually model each component of the spatial-temporal patterns.
SCNN works with a pre-defined generative process of multivariate time series (MTS), which arithmetically characterizes the latent structure of the spatial-temporal patterns.
Extensive experiments are conducted to demonstrate that SCNN can achieve superior performance over state-of-the-art models on three real-world datasets.
arXiv Detail & Related papers (2023-05-22T13:39:44Z)
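As a rough, hypothetical illustration of modeling structured components individually, in the spirit of the SCNN summary above, the function below splits a series into a moving-average trend, a periodic component, and a residual; SCNN's actual pre-defined generative process is specified in the paper, and the window and period here are arbitrary assumptions.
```python
# Generic structured decomposition of a univariate series; illustrative only.
import torch
import torch.nn.functional as F

def decompose(x: torch.Tensor, trend_window: int = 25, period: int = 24):
    """x: (batch, length). Returns (trend, seasonal, residual) with x = sum of parts."""
    pad = trend_window // 2
    # Moving-average trend (long-term component).
    trend = F.avg_pool1d(
        F.pad(x.unsqueeze(1), (pad, pad), mode="replicate"),
        kernel_size=trend_window, stride=1,
    ).squeeze(1)
    detrended = x - trend
    # Seasonal component: average the detrended series over each phase of the period.
    B, L = x.shape
    idx = torch.arange(L) % period
    phase_mean = torch.stack([detrended[:, idx == p].mean(dim=1) for p in range(period)], dim=1)
    seasonal = phase_mean[:, idx]
    return trend, seasonal, detrended - seasonal

# trend, seasonal, residual = decompose(torch.randn(2, 168))
```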
- FormerTime: Hierarchical Multi-Scale Representations for Multivariate Time Series Classification [53.55504611255664]
FormerTime is a hierarchical representation model that improves classification capacity for multivariate time series.
It exhibits three merits: (1) learning hierarchical multi-scale representations from time series data, (2) inheriting the strengths of both transformers and convolutional networks, and (3) tackling the efficiency challenges incurred by the self-attention mechanism.
arXiv Detail & Related papers (2023-02-20T07:46:14Z)
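A hedged sketch of a hierarchical multi-scale encoder in the spirit of the FormerTime summary above: strided convolutions produce progressively coarser scales and a small Transformer encoder processes each scale, combining convolutional and attention-based strengths. Layer sizes, depths, and pooling are assumptions; this is not FormerTime's architecture.
```python
# Hypothetical multi-scale conv + Transformer encoder; not FormerTime's actual model.
import torch
import torch.nn as nn

class MultiScaleEncoder(nn.Module):
    def __init__(self, in_channels: int, d_model: int = 64, num_scales: int = 3):
        super().__init__()
        self.downsamplers = nn.ModuleList(
            [nn.Conv1d(in_channels if i == 0 else d_model, d_model,
                       kernel_size=3, stride=2, padding=1) for i in range(num_scales)]
        )
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoders = nn.ModuleList(
            [nn.TransformerEncoder(layer, num_layers=1) for _ in range(num_scales)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, length); returns pooled features from every scale.
        feats = []
        for down, enc in zip(self.downsamplers, self.encoders):
            x = down(x)                              # halve the temporal resolution
            feats.append(enc(x.transpose(1, 2)).mean(dim=1))
        return torch.cat(feats, dim=-1)              # (batch, num_scales * d_model)

# features = MultiScaleEncoder(in_channels=5)(torch.randn(2, 5, 128))
```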
- Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach to model structures as sequences of actions in an autoregressive manner with PLMs.
Our approach achieves the new state-of-the-art on all the structured prediction tasks we looked at.
arXiv Detail & Related papers (2022-10-26T13:27:26Z)
- Guiding the PLMs with Semantic Anchors as Intermediate Supervision: Towards Interpretable Semantic Parsing [57.11806632758607]
We propose to couple current pretrained language models with a hierarchical decoder network.
By taking the first-principle structures as the semantic anchors, we propose two novel intermediate supervision tasks.
We conduct intensive experiments on several semantic parsing benchmarks and demonstrate that our approach can consistently outperform the baselines.
arXiv Detail & Related papers (2022-10-04T07:27:29Z)
- Structure-aware Fine-tuning of Sequence-to-sequence Transformers for Transition-based AMR Parsing [20.67024416678313]
We explore the integration of general pre-trained sequence-to-sequence language models and a structure-aware transition-based approach.
We propose a simplified transition set, designed to better exploit pre-trained language models for structured fine-tuning.
We show that the proposed parsing architecture retains the desirable properties of previous transition-based approaches, while being simpler and reaching the new state of the art for AMR 2.0, without the need for graph re-categorization.
arXiv Detail & Related papers (2021-10-29T04:36:31Z)
- Tree-structured Attention with Hierarchical Accumulation [103.47584968330325]
"Hierarchical Accumulation" encodes parse tree structures into self-attention at constant time complexity.
Our approach outperforms SOTA methods in four IWSLT translation tasks and the WMT'14 English-German translation task.
arXiv Detail & Related papers (2020-02-19T08:17:00Z)