TS-TCD: Triplet-Level Cross-Modal Distillation for Time-Series Forecasting Using Large Language Models
- URL: http://arxiv.org/abs/2409.14978v1
- Date: Mon, 23 Sep 2024 12:57:24 GMT
- Title: TS-TCD: Triplet-Level Cross-Modal Distillation for Time-Series Forecasting Using Large Language Models
- Authors: Pengfei Wang, Huanran Zheng, Silong Dai, Wenjing Yue, Wei Zhu, Xiaoling Wang,
- Abstract summary: We present a novel framework, TS-TCD, which introduces a comprehensive three-tiered cross-modal knowledge distillation mechanism.
Unlike prior work that focuses on isolated alignment techniques, our framework systematically integrates.
Experiments on benchmark time-series demonstrate that TS-TCD achieves state-of-the-art results, outperforming traditional methods in both accuracy and robustness.
- Score: 15.266543423942617
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, large language models (LLMs) have shown great potential in time-series analysis by capturing complex dependencies and improving predictive performance. However, existing approaches often struggle with modality alignment, leading to suboptimal results. To address these challenges, we present a novel framework, TS-TCD, which introduces a comprehensive three-tiered cross-modal knowledge distillation mechanism. Unlike prior work that focuses on isolated alignment techniques, our framework systematically integrates: 1) Dynamic Adaptive Gating for Input Encoding and Alignment}, ensuring coherent alignment between time-series tokens and QR-decomposed textual embeddings; 2) Layer-Wise Contrastive Learning}, aligning intermediate representations across modalities to reduce feature-level discrepancies; and 3) Optimal Transport-Driven Output Alignment}, which ensures consistent output predictions through fine-grained cross-modal alignment. Extensive experiments on benchmark time-series datasets demonstrate that TS-TCD achieves state-of-the-art results, outperforming traditional methods in both accuracy and robustness.
Related papers
- Adaptive Multi-Scale Decomposition Framework for Time Series Forecasting [26.141054975797868]
We propose a novel Adaptive Multi-Scale Decomposition (AMD) framework for time series forecasting (TSF)
Our framework decomposes time series into distinct temporal patterns at multiple scales, leveraging the Multi-Scale Decomposable Mixing (MDM) block.
Our approach effectively models both temporal and channel dependencies and utilizes autocorrelation to refine multi-scale data integration.
arXiv Detail & Related papers (2024-06-06T05:27:33Z) - Distillation Enhanced Time Series Forecasting Network with Momentum Contrastive Learning [7.4106801792345705]
We propose DE-TSMCL, an innovative distillation enhanced framework for long sequence time series forecasting.
Specifically, we design a learnable data augmentation mechanism which adaptively learns whether to mask a timestamp.
Then, we propose a contrastive learning task with momentum update to explore inter-sample and intra-temporal correlations of time series.
By developing model loss from multiple tasks, we can learn effective representations for downstream forecasting task.
arXiv Detail & Related papers (2024-01-31T12:52:10Z) - Mlinear: Rethink the Linear Model for Time-series Forecasting [9.841293660201261]
Mlinear is a simple yet effective method based mainly on linear layers.
We introduce a new loss function that significantly outperforms the widely used mean squared error (MSE) on multiple datasets.
Our method significantly outperforms PatchTST with a ratio of 21:3 at 336 sequence length input and 29:10 at 512 sequence length input.
arXiv Detail & Related papers (2023-05-08T15:54:18Z) - Multi-Modal Continual Test-Time Adaptation for 3D Semantic Segmentation [26.674085603033742]
Continual Test-Time Adaptation (CTTA) generalizes conventional Test-Time Adaptation (TTA) by assuming that the target domain is dynamic over time rather than stationary.
In this paper, we explore Multi-Modal Continual Test-Time Adaptation (MM-CTTA) as a new extension of CTTA for 3D semantic segmentation.
arXiv Detail & Related papers (2023-03-18T16:51:19Z) - Understanding and Constructing Latent Modality Structures in Multi-modal
Representation Learning [53.68371566336254]
We argue that the key to better performance lies in meaningful latent modality structures instead of perfect modality alignment.
Specifically, we design 1) a deep feature separation loss for intra-modality regularization; 2) a Brownian-bridge loss for inter-modality regularization; and 3) a geometric consistency loss for both intra- and inter-modality regularization.
arXiv Detail & Related papers (2023-03-10T14:38:49Z) - Gait Recognition in the Wild with Multi-hop Temporal Switch [81.35245014397759]
gait recognition in the wild is a more practical problem that has attracted the attention of the community of multimedia and computer vision.
This paper presents a novel multi-hop temporal switch method to achieve effective temporal modeling of gait patterns in real-world scenes.
arXiv Detail & Related papers (2022-09-01T10:46:09Z) - Contrastive predictive coding for Anomaly Detection in Multi-variate
Time Series Data [6.463941665276371]
We propose a Time-series Representational Learning through Contrastive Predictive Coding (TRL-CPC) towards anomaly detection in MVTS data.
First, we jointly optimize an encoder, an auto-regressor and a non-linear transformation function to effectively learn the representations of the MVTS data sets.
arXiv Detail & Related papers (2022-02-08T04:25:29Z) - Progressively Guide to Attend: An Iterative Alignment Framework for
Temporal Sentence Grounding [53.377028000325424]
We propose an Iterative Alignment Network (IA-Net) for temporal sentence grounding task.
We pad multi-modal features with learnable parameters to alleviate the nowhere-to-attend problem of non-matched frame-word pairs.
We also devise a calibration module following each attention module to refine the alignment knowledge.
arXiv Detail & Related papers (2021-09-14T02:08:23Z) - Self-Supervised Multi-Frame Monocular Scene Flow [61.588808225321735]
We introduce a multi-frame monocular scene flow network based on self-supervised learning.
We observe state-of-the-art accuracy among monocular scene flow methods based on self-supervised learning.
arXiv Detail & Related papers (2021-05-05T17:49:55Z) - Consistency Guided Scene Flow Estimation [159.24395181068218]
CGSF is a self-supervised framework for the joint reconstruction of 3D scene structure and motion from stereo video.
We show that the proposed model can reliably predict disparity and scene flow in challenging imagery.
It achieves better generalization than the state-of-the-art, and adapts quickly and robustly to unseen domains.
arXiv Detail & Related papers (2020-06-19T17:28:07Z) - Bottom-Up Temporal Action Localization with Mutual Regularization [107.39785866001868]
State-of-the-art solutions for TAL involve evaluating the frame-level probabilities of three action-indicating phases.
We introduce two regularization terms to mutually regularize the learning procedure.
Experiments are performed on two popular TAL datasets, THUMOS14 and ActivityNet1.3.
arXiv Detail & Related papers (2020-02-18T03:59:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.