Beyond Prompting: Time2Lang -- Bridging Time-Series Foundation Models and Large Language Models for Health Sensing
- URL: http://arxiv.org/abs/2502.07608v2
- Date: Wed, 12 Feb 2025 02:52:28 GMT
- Title: Beyond Prompting: Time2Lang -- Bridging Time-Series Foundation Models and Large Language Models for Health Sensing
- Authors: Arvind Pillai, Dimitris Spathis, Subigya Nepal, Amanda C Collins, Daniel M Mackin, Michael V Heinz, Tess Z Griffin, Nicholas C Jacobson, Andrew Campbell,
- Abstract summary: Large language models (LLMs) show promise for health applications when combined with behavioral sensing data.
Traditional approaches convert sensor data into text prompts, but this process is prone to errors, computationally expensive, and requires domain expertise.
Here, we present Time2Lang, a framework that directly maps TFM outputs to LLM representations without intermediate text conversion.
- Score: 3.2688127177376227
- License:
- Abstract: Large language models (LLMs) show promise for health applications when combined with behavioral sensing data. Traditional approaches convert sensor data into text prompts, but this process is prone to errors, computationally expensive, and requires domain expertise. These challenges are particularly acute when processing extended time series data. While time series foundation models (TFMs) have recently emerged as powerful tools for learning representations from temporal data, bridging TFMs and LLMs remains challenging. Here, we present Time2Lang, a framework that directly maps TFM outputs to LLM representations without intermediate text conversion. Our approach first trains on synthetic data using periodicity prediction as a pretext task, followed by evaluation on mental health classification tasks. We validate Time2Lang on two longitudinal wearable and mobile sensing datasets: daily depression prediction using step count data (17,251 days from 256 participants) and flourishing classification based on conversation duration (46 participants over 10 weeks). Time2Lang maintains near constant inference times regardless of input length, unlike traditional prompting methods. The generated embeddings preserve essential time-series characteristics such as auto-correlation. Our results demonstrate that TFMs and LLMs can be effectively integrated while minimizing information loss and enabling performance transfer across these distinct modeling paradigms. To our knowledge, we are the first to integrate a TFM and an LLM for health, thus establishing a foundation for future research combining general-purpose large models for complex healthcare tasks.
Related papers
- TimeCAP: Learning to Contextualize, Augment, and Predict Time Series Events with Large Language Model Agents [52.13094810313054]
TimeCAP is a time-series processing framework that creatively employs Large Language Models (LLMs) as contextualizers of time series data.
TimeCAP incorporates two independent LLM agents: one generates a textual summary capturing the context of the time series, while the other uses this enriched summary to make more informed predictions.
Experimental results on real-world datasets demonstrate that TimeCAP outperforms state-of-the-art methods for time series event prediction.
arXiv Detail & Related papers (2025-02-17T04:17:27Z) - Time Series Language Model for Descriptive Caption Generation [11.796431549951055]
We introduce TSLM, a novel time series language model designed specifically for time series captioning.
TSLM operates as an encoder-decoder model, leveraging both text prompts and time series data representations.
We show that TSLM outperforms existing state-of-the-art approaches from multiple data modalities by a significant margin.
arXiv Detail & Related papers (2025-01-03T14:34:30Z) - Rethinking Time Series Forecasting with LLMs via Nearest Neighbor Contrastive Learning [1.7892194562398749]
We propose NNCL-TLLM: Nearest Neighbor Contrastive Learning for Time series forecasting via Large Language Models.
First, we generate time series compatible text prototypes such that each text prototype represents both word token embeddings in its neighborhood and time series characteristics.
We then fine-tune the layer normalization and positional embeddings of the LLM, keeping the other layers intact, reducing the trainable parameters and decreasing the computational cost.
arXiv Detail & Related papers (2024-12-06T06:32:47Z) - Metadata Matters for Time Series: Informative Forecasting with Transformers [70.38241681764738]
We propose a Metadata-informed Time Series Transformer (MetaTST) for time series forecasting.
To tackle the unstructured nature of metadata, MetaTST formalizes them into natural languages by pre-designed templates.
A Transformer encoder is employed to communicate series and metadata tokens, which can extend series representations by metadata information.
arXiv Detail & Related papers (2024-10-04T11:37:55Z) - An Evaluation of Standard Statistical Models and LLMs on Time Series Forecasting [16.583730806230644]
This study highlights the key challenges that large language models encounter in the context of time series prediction.
The empirical results indicate that while large language models can perform well in zero-shot forecasting for certain datasets, their predictive accuracy diminishes notably when confronted with diverse time series data and traditional signals.
arXiv Detail & Related papers (2024-08-09T05:13:03Z) - SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z) - AutoTimes: Autoregressive Time Series Forecasters via Large Language Models [67.83502953961505]
AutoTimes projects time series into the embedding space of language tokens and autoregressively generates future predictions with arbitrary lengths.
We formulate time series as prompts, extending the context for prediction beyond the lookback window.
AutoTimes achieves state-of-the-art with 0.1% trainable parameters and over $5times$ training/inference speedup.
arXiv Detail & Related papers (2024-02-04T06:59:21Z) - Time-LLM: Time Series Forecasting by Reprogramming Large Language Models [110.20279343734548]
Time series forecasting holds significant importance in many real-world dynamic systems.
We present Time-LLM, a reprogramming framework to repurpose large language models for time series forecasting.
Time-LLM is a powerful time series learner that outperforms state-of-the-art, specialized forecasting models.
arXiv Detail & Related papers (2023-10-03T01:31:25Z) - Simultaneous Machine Translation with Large Language Models [51.470478122113356]
We investigate the possibility of applying Large Language Models to SimulMT tasks.
We conducted experiments using the textttLlama2-7b-chat model on nine different languages from the MUST-C dataset.
The results show that LLM outperforms dedicated MT models in terms of BLEU and LAAL metrics.
arXiv Detail & Related papers (2023-09-13T04:06:47Z) - Extend and Explain: Interpreting Very Long Language Models [0.0]
We introduce a novel Masked Sampling Procedure (MSP) to identify the text blocks that contribute to a prediction.
MSP identifies 1.7x more clinically informative text blocks than the previous state-of-the-art, runs up to 100x faster, and is tractable for generating important phrase pairs.
arXiv Detail & Related papers (2022-09-02T17:15:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.