TRACE: Grounding Time Series in Context for Multimodal Embedding and Retrieval
- URL: http://arxiv.org/abs/2506.09114v1
- Date: Tue, 10 Jun 2025 17:59:56 GMT
- Title: TRACE: Grounding Time Series in Context for Multimodal Embedding and Retrieval
- Authors: Jialin Chen, Ziyu Zhao, Gaukhar Nurbek, Aosong Feng, Ali Maatouk, Leandros Tassiulas, Yifeng Gao, Rex Ying,
- Abstract summary: TRACE is a generic multimodal retriever that grounds time-series embeddings in aligned textual context.<n>It supports flexible cross-modal retrieval modes, including Text-to-Timeseries and Timeseries-to-Text.<n>TRACE also serves as a powerful standalone encoder, with lightweight task-specific tuning that refines context-aware representations.
- Score: 22.412759868927118
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The ubiquity of dynamic data in domains such as weather, healthcare, and energy underscores a growing need for effective interpretation and retrieval of time-series data. These data are inherently tied to domain-specific contexts, such as clinical notes or weather narratives, making cross-modal retrieval essential not only for downstream tasks but also for developing robust time-series foundation models by retrieval-augmented generation (RAG). Despite the increasing demand, time-series retrieval remains largely underexplored. Existing methods often lack semantic grounding, struggle to align heterogeneous modalities, and have limited capacity for handling multi-channel signals. To address this gap, we propose TRACE, a generic multimodal retriever that grounds time-series embeddings in aligned textual context. TRACE enables fine-grained channel-level alignment and employs hard negative mining to facilitate semantically meaningful retrieval. It supports flexible cross-modal retrieval modes, including Text-to-Timeseries and Timeseries-to-Text, effectively linking linguistic descriptions with complex temporal patterns. By retrieving semantically relevant pairs, TRACE enriches downstream models with informative context, leading to improved predictive accuracy and interpretability. Beyond a static retrieval engine, TRACE also serves as a powerful standalone encoder, with lightweight task-specific tuning that refines context-aware representations while maintaining strong cross-modal alignment. These representations achieve state-of-the-art performance on downstream forecasting and classification tasks. Extensive experiments across multiple domains highlight its dual utility, as both an effective encoder for downstream applications and a general-purpose retriever to enhance time-series models.
Related papers
- Multivariate Long-term Time Series Forecasting with Fourier Neural Filter [55.09326865401653]
We introduce FNF as the backbone and DBD as architecture to provide excellent learning capabilities and optimal learning pathways for spatial-temporal modeling.<n>We show that FNF unifies local time-domain and global frequency-domain information processing within a single backbone that extends naturally to spatial modeling.
arXiv Detail & Related papers (2025-06-10T18:40:20Z) - FreRA: A Frequency-Refined Augmentation for Contrastive Learning on Time Series Classification [56.925103708982164]
We present a novel perspective from the frequency domain and identify three advantages for downstream classification: global, independent, and compact.<n>We propose the lightweight yet effective Frequency Refined Augmentation (FreRA) tailored for time series contrastive learning on classification tasks.<n>FreRA consistently outperforms ten leading baselines on time series classification, anomaly detection, and transfer learning tasks.
arXiv Detail & Related papers (2025-05-29T07:18:28Z) - STRAP: Spatio-Temporal Pattern Retrieval for Out-of-Distribution Generalization [34.53308463024231]
We propose an innovative Spatio-Temporal Retrieval-Augmented Pattern Learning framework, STRAP.<n>During inference, STRAP retrieves relevant patterns from this library based on similarity to the current input and injects them into the model via a plug-and-play prompting mechanism.<n>Experiments across multiple real-world streaming graph datasets show that STRAP consistently outperforms state-of-the-art STGNN baselines on STOOD tasks.
arXiv Detail & Related papers (2025-05-26T06:11:05Z) - Unify and Anchor: A Context-Aware Transformer for Cross-Domain Time Series Forecasting [26.59526791215]
We identify two key challenges in cross-domain time series forecasting: the complexity of temporal patterns and semantic misalignment.<n>We propose the Unify and Anchor" transfer paradigm, which disentangles frequency components for a unified perspective.<n>We introduce ContexTST, a Transformer-based model that employs a time series coordinator for structured representation.
arXiv Detail & Related papers (2025-03-03T04:11:14Z) - Understanding Synthetic Context Extension via Retrieval Heads [51.8869530817334]
We investigate fine-tuning on synthetic data for three long-context tasks that require retrieval and reasoning.<n>We find that models trained on synthetic data fall short of the real data, but surprisingly, the mismatch can be interpreted.<n>Our results shed light on how to interpret synthetic data fine-tuning performance and how to approach creating better data for learning real-world capabilities over long contexts.
arXiv Detail & Related papers (2024-10-29T17:55:00Z) - Hierarchical Multimodal LLMs with Semantic Space Alignment for Enhanced Time Series Classification [4.5939667818289385]
HiTime is a hierarchical multi-modal model that seamlessly integrates temporal information into large language models.
Our findings highlight the potential of integrating temporal features into LLMs, paving the way for advanced time series analysis.
arXiv Detail & Related papers (2024-10-24T12:32:19Z) - Analyzing Temporal Complex Events with Large Language Models? A Benchmark towards Temporal, Long Context Understanding [57.62275091656578]
We refer to the complex events composed of many news articles over an extended period as Temporal Complex Event (TCE)
This paper proposes a novel approach using Large Language Models (LLMs) to systematically extract and analyze the event chain within TCE.
arXiv Detail & Related papers (2024-06-04T16:42:17Z) - TimeGraphs: Graph-based Temporal Reasoning [64.18083371645956]
TimeGraphs is a novel approach that characterizes dynamic interactions as a hierarchical temporal graph.
Our approach models the interactions using a compact graph-based representation, enabling adaptive reasoning across diverse time scales.
We evaluate TimeGraphs on multiple datasets with complex, dynamic agent interactions, including a football simulator, the Resistance game, and the MOMA human activity dataset.
arXiv Detail & Related papers (2024-01-06T06:26:49Z) - Unsupervised Representation Learning for Time Series with Temporal
Neighborhood Coding [8.45908939323268]
We propose a self-supervised framework for learning generalizable representations for non-stationary time series.
Our motivation stems from the medical field, where the ability to model the dynamic nature of time series data is especially valuable.
arXiv Detail & Related papers (2021-06-01T19:53:24Z) - Interpretable Time-series Representation Learning With Multi-Level
Disentanglement [56.38489708031278]
Disentangle Time Series (DTS) is a novel disentanglement enhancement framework for sequential data.
DTS generates hierarchical semantic concepts as the interpretable and disentangled representation of time-series.
DTS achieves superior performance in downstream applications, with high interpretability of semantic concepts.
arXiv Detail & Related papers (2021-05-17T22:02:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.