Forecasting from Clinical Textual Time Series: Adaptations of the Encoder and Decoder Language Model Families
- URL: http://arxiv.org/abs/2504.10340v2
- Date: Sun, 20 Apr 2025 19:03:57 GMT
- Title: Forecasting from Clinical Textual Time Series: Adaptations of the Encoder and Decoder Language Model Families
- Authors: Shahriar Noroozizadeh, Sayantan Kumar, Jeremy C. Weiss,
- Abstract summary: We introduce the forecasting problem from textual time series, where timestamped clinical findings serve as the primary input for prediction.<n>We evaluate a diverse suite of models, including fine-tuned decoder-based large language models and encoder-based transformers.
- Score: 6.882042556551609
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Clinical case reports encode rich, temporal patient trajectories that are often underexploited by traditional machine learning methods relying on structured data. In this work, we introduce the forecasting problem from textual time series, where timestamped clinical findings -- extracted via an LLM-assisted annotation pipeline -- serve as the primary input for prediction. We systematically evaluate a diverse suite of models, including fine-tuned decoder-based large language models and encoder-based transformers, on tasks of event occurrence prediction, temporal ordering, and survival analysis. Our experiments reveal that encoder-based models consistently achieve higher F1 scores and superior temporal concordance for short- and long-horizon event forecasting, while fine-tuned masking approaches enhance ranking performance. In contrast, instruction-tuned decoder models demonstrate a relative advantage in survival analysis, especially in early prognosis settings. Our sensitivity analyses further demonstrate the importance of time ordering, which requires clinical time series construction, as compared to text ordering, the format of the text inputs that LLMs are classically trained on. This highlights the additional benefit that can be ascertained from time-ordered corpora, with implications for temporal tasks in the era of widespread LLM use.
Related papers
- Enhancing Time Series Forecasting via Multi-Level Text Alignment with LLMs [6.612196783595362]
We propose a multi-level text alignment framework for time series forecasting using large language models (LLMs)
Our method decomposes time series into trend, seasonal, and residual components, which are then reprogrammed into component-specific text representations.
Experiments on multiple datasets demonstrate that our method outperforms state-of-the-art models in accuracy while providing good interpretability.
arXiv Detail & Related papers (2025-04-10T01:02:37Z) - LLM-PS: Empowering Large Language Models for Time Series Forecasting with Temporal Patterns and Semantics [56.99021951927683]
Time Series Forecasting (TSF) is critical in many real-world domains like financial planning and health monitoring.
Existing Large Language Models (LLMs) usually perform suboptimally because they neglect the inherent characteristics of time series data.
We propose LLM-PS to empower the LLM for TSF by learning the fundamental textitPatterns and meaningful textitSemantics from time series data.
arXiv Detail & Related papers (2025-03-12T11:45:11Z) - Explainable Multi-modal Time Series Prediction with LLM-in-the-Loop [63.34626300024294]
TimeXL is a multi-modal prediction framework that integrates a prototype-based time series encoder.<n>It produces more accurate predictions and interpretable explanations.<n> Empirical evaluations on four real-world datasets demonstrate that TimeXL achieves up to 8.9% improvement in AUC.
arXiv Detail & Related papers (2025-03-02T20:40:53Z) - TimeCAP: Learning to Contextualize, Augment, and Predict Time Series Events with Large Language Model Agents [52.13094810313054]
TimeCAP is a time-series processing framework that creatively employs Large Language Models (LLMs) as contextualizers of time series data.<n>TimeCAP incorporates two independent LLM agents: one generates a textual summary capturing the context of the time series, while the other uses this enriched summary to make more informed predictions.<n> Experimental results on real-world datasets demonstrate that TimeCAP outperforms state-of-the-art methods for time series event prediction.
arXiv Detail & Related papers (2025-02-17T04:17:27Z) - Enhancing Foundation Models for Time Series Forecasting via Wavelet-based Tokenization [74.3339999119713]
We develop a wavelet-based tokenizer that allows models to learn complex representations directly in the space of time-localized frequencies.<n>Our method first scales and decomposes the input time series, then thresholds and quantizes the wavelet coefficients, and finally pre-trains an autoregressive model to forecast coefficients for the forecast horizon.
arXiv Detail & Related papers (2024-12-06T18:22:59Z) - Metadata Matters for Time Series: Informative Forecasting with Transformers [70.38241681764738]
We propose a Metadata-informed Time Series Transformer (MetaTST) for time series forecasting.
To tackle the unstructured nature of metadata, MetaTST formalizes them into natural languages by pre-designed templates.
A Transformer encoder is employed to communicate series and metadata tokens, which can extend series representations by metadata information.
arXiv Detail & Related papers (2024-10-04T11:37:55Z) - An Evaluation of Standard Statistical Models and LLMs on Time Series Forecasting [16.583730806230644]
This study highlights the key challenges that large language models encounter in the context of time series prediction.
The empirical results indicate that while large language models can perform well in zero-shot forecasting for certain datasets, their predictive accuracy diminishes notably when confronted with diverse time series data and traditional signals.
arXiv Detail & Related papers (2024-08-09T05:13:03Z) - Multi-Patch Prediction: Adapting LLMs for Time Series Representation
Learning [22.28251586213348]
aLLM4TS is an innovative framework that adapts Large Language Models (LLMs) for time-series representation learning.
A distinctive element of our framework is the patch-wise decoding layer, which departs from previous methods reliant on sequence-level decoding.
arXiv Detail & Related papers (2024-02-07T13:51:26Z) - Time-LLM: Time Series Forecasting by Reprogramming Large Language Models [110.20279343734548]
Time series forecasting holds significant importance in many real-world dynamic systems.
We present Time-LLM, a reprogramming framework to repurpose large language models for time series forecasting.
Time-LLM is a powerful time series learner that outperforms state-of-the-art, specialized forecasting models.
arXiv Detail & Related papers (2023-10-03T01:31:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.