Enhancing Zero-Shot Time Series Forecasting in Off-the-Shelf LLMs via Noise Injection
- URL: http://arxiv.org/abs/2512.20140v1
- Date: Tue, 23 Dec 2025 08:02:33 GMT
- Title: Enhancing Zero-Shot Time Series Forecasting in Off-the-Shelf LLMs via Noise Injection
- Authors: Xingyou Yin, Ceyao Zhang, Min Hu, Kai Chen,
- Abstract summary: Large Language Models (LLMs) have demonstrated effectiveness as zero-shot time series (TS) forecasters.<n>The key challenge lies in tokenizing TS data into textual representations that align with LLMs' pre-trained knowledge.<n>We introduce two novel TS datasets that fall outside all utilized LLMs' pre-training scopes, and consistently observe improved performance.
- Score: 18.267727687739853
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) have demonstrated effectiveness as zero-shot time series (TS) forecasters. The key challenge lies in tokenizing TS data into textual representations that align with LLMs' pre-trained knowledge. While existing work often relies on fine-tuning specialized modules to bridge this gap, a distinct, yet challenging, paradigm aims to leverage truly off-the-shelf LLMs without any fine-tuning whatsoever, relying solely on strategic tokenization of numerical sequences. The performance of these fully frozen models is acutely sensitive to the textual representation of the input data, as their parameters cannot adapt to distribution shifts. In this paper, we introduce a simple yet highly effective strategy to overcome this brittleness: injecting noise into the raw time series before tokenization. This non-invasive intervention acts as a form of inference-time augmentation, compelling the frozen LLM to extrapolate based on robust underlying temporal patterns rather than superficial numerical artifacts. We theoretically analyze this phenomenon and empirically validate its effectiveness across diverse benchmarks. Notably, to fully eliminate potential biases from data contamination during LLM pre-training, we introduce two novel TS datasets that fall outside all utilized LLMs' pre-training scopes, and consistently observe improved performance. This study provides a further step in directly leveraging off-the-shelf LLMs for time series forecasting.
Related papers
- T-LLM: Teaching Large Language Models to Forecast Time Series via Temporal Distillation [7.6933817667680096]
Time series forecasting plays a critical role in decision-making across many real-world applications.<n>We propose T-LLM, a temporal distillation framework that equips general-purpose language models with time series forecasting capability.
arXiv Detail & Related papers (2026-02-02T10:40:27Z) - Is More Context Always Better? Examining LLM Reasoning Capability for Time Interval Prediction [15.45305246863211]
Large Language Models (LLMs) have demonstrated impressive capabilities in reasoning and prediction across different domains.<n>This paper presents a systematic study investigating whether LLMs can predict time intervals between recurring user actions.<n>We benchmark state-of-the-art LLMs in zero-shot settings against both statistical and machine-learning models.
arXiv Detail & Related papers (2026-01-15T07:18:40Z) - Semantic-Enhanced Time-Series Forecasting via Large Language Models [20.383296465541758]
Time series forecasting plays a significant role in finance, energy, meteorology, and IoT applications.<n>Recent studies have leveraged the generalization capabilities of large language models (LLMs) to adapt to time series forecasting, achieving promising performance.<n>We propose a novel Semantic-Enhanced LLM (SE-LLM) that explores the inherent periodicity and anomalous characteristics of time series to embed into the semantic space.
arXiv Detail & Related papers (2025-08-11T07:19:21Z) - Revisiting LLMs as Zero-Shot Time-Series Forecasters: Small Noise Can Break Large Models [32.30528039193554]
Large Language Models (LLMs) have shown remarkable performance across diverse tasks without domain-specific training.<n>Recent studies suggest that LLMs lack inherent effectiveness in forecasting.<n>Our experiments show that LLM-based zero-shot forecasters often struggle to achieve high accuracy due to their sensitivity to noise.
arXiv Detail & Related papers (2025-05-31T08:24:01Z) - LLM-PS: Empowering Large Language Models for Time Series Forecasting with Temporal Patterns and Semantics [56.99021951927683]
Time Series Forecasting (TSF) is critical in many real-world domains like financial planning and health monitoring.<n>Existing Large Language Models (LLMs) usually perform suboptimally because they neglect the inherent characteristics of time series data.<n>We propose LLM-PS to empower the LLM for TSF by learning the fundamental textitPatterns and meaningful textitSemantics from time series data.
arXiv Detail & Related papers (2025-03-12T11:45:11Z) - END: Early Noise Dropping for Efficient and Effective Context Denoising [60.24648712022382]
Large Language Models (LLMs) have demonstrated remarkable performance across a wide range of natural language processing tasks.<n>They are often distracted by irrelevant or noisy context in input sequences that degrades output quality.<n>We introduce Early Noise Dropping (textscEND), a novel approach to mitigate this issue without requiring fine-tuning the LLMs.
arXiv Detail & Related papers (2025-02-26T08:07:17Z) - Mitigating Forgetting in LLM Fine-Tuning via Low-Perplexity Token Learning [65.23593936798662]
We show that fine-tuning with LLM-generated data improves target task performance and reduces non-target task degradation.<n>This is the first work to provide an empirical explanation based on token perplexity reduction to mitigate catastrophic forgetting in LLMs after fine-tuning.
arXiv Detail & Related papers (2025-01-24T08:18:56Z) - Temporal Scaling Law for Large Language Models [70.74571133406958]
We propose the novel concept of Temporal Scaling Law, studying how the test loss of an LLM evolves as the training steps scale up.<n>In contrast to modeling the test loss as a whole in a coarse-grained manner, we break it down and dive into the fine-grained test loss of each token position.<n>We derive the much more precise temporal scaling law by studying the temporal patterns of the parameters in the dynamic hyperbolic-law.
arXiv Detail & Related papers (2024-04-27T05:49:11Z) - CALF: Aligning LLMs for Time Series Forecasting via Cross-modal Fine-Tuning [59.88924847995279]
We propose a novel Cross-Modal LLM Fine-Tuning (CALF) framework for MTSF.<n>To reduce the distribution discrepancy, we develop the cross-modal match module.<n>CALF establishes state-of-the-art performance for both long-term and short-term forecasting tasks.
arXiv Detail & Related papers (2024-03-12T04:04:38Z) - Time Series Forecasting with LLMs: Understanding and Enhancing Model Capabilities [46.02234423159257]
Large language models (LLMs) have been applied in many fields and have developed rapidly in recent years.<n>Recent works treat large language models as emphzero-shot time series reasoners without further fine-tuning.<n>Our study shows that LLMs perform well in predicting time series with clear patterns and trends, but face challenges with datasets lacking periodicity.
arXiv Detail & Related papers (2024-02-16T17:15:28Z) - Time-LLM: Time Series Forecasting by Reprogramming Large Language Models [110.20279343734548]
Time series forecasting holds significant importance in many real-world dynamic systems.
We present Time-LLM, a reprogramming framework to repurpose large language models for time series forecasting.
Time-LLM is a powerful time series learner that outperforms state-of-the-art, specialized forecasting models.
arXiv Detail & Related papers (2023-10-03T01:31:25Z) - To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis [50.31589712761807]
Large language models (LLMs) are notoriously token-hungry during pre-training, and high-quality text data on the web is approaching its scaling limit for LLMs.
We investigate the consequences of repeating pre-training data, revealing that the model is susceptible to overfitting.
Second, we examine the key factors contributing to multi-epoch degradation, finding that significant factors include dataset size, model parameters, and training objectives.
arXiv Detail & Related papers (2023-05-22T17:02:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.