Related papers: TimeSeriesExam: A time series understanding exam

TimeSeriesExam: A time series understanding exam

URL: http://arxiv.org/abs/2410.14752v1
Date: Fri, 18 Oct 2024 02:37:14 GMT
Title: TimeSeriesExam: A time series understanding exam
Authors: Yifu Cai, Arjun Choudhry, Mononito Goswami, Artur Dubrawski,
Abstract summary: TimeSeriesExam comprises of over 700 questions, procedurally generated using 104 carefully curated templates. We test 7 state-of-the-art LLMs on the TimeSeriesExam and provide the first comprehensive evaluation of their time series understanding abilities.
Score: 18.06147400795917
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) have recently demonstrated a remarkable ability to model time series data. These capabilities can be partly explained if LLMs understand basic time series concepts. However, our knowledge of what these models understand about time series data remains relatively limited. To address this gap, we introduce TimeSeriesExam, a configurable and scalable multiple-choice question exam designed to assess LLMs across five core time series understanding categories: pattern recognition, noise understanding, similarity analysis, anomaly detection, and causality analysis. TimeSeriesExam comprises of over 700 questions, procedurally generated using 104 carefully curated templates and iteratively refined to balance difficulty and their ability to discriminate good from bad models. We test 7 state-of-the-art LLMs on the TimeSeriesExam and provide the first comprehensive evaluation of their time series understanding abilities. Our results suggest that closed-source models such as GPT-4 and Gemini understand simple time series concepts significantly better than their open-source counterparts, while all models struggle with complex concepts such as causality analysis. We believe that the ability to programatically generate questions is fundamental to assessing and improving LLM's ability to understand and reason about time series data.

Related papers

LLM-PS: Empowering Large Language Models for Time Series Forecasting with Temporal Patterns and Semantics [56.99021951927683]
Time Series Forecasting (TSF) is critical in many real-world domains like financial planning and health monitoring. Existing Large Language Models (LLMs) usually perform suboptimally because they neglect the inherent characteristics of time series data. We propose LLM-PS to empower the LLM for TSF by learning the fundamental textitPatterns and meaningful textitSemantics from time series data.
arXiv Detail & Related papers (2025-03-12T11:45:11Z)
Time-MQA: Time Series Multi-Task Question Answering with Context Enhancement [55.2439260314328]
Time Series Multi-Task Question Answering (Time-MQA) is a unified framework that enables natural language queries across multiple time series tasks. Central to Time-MQA is the TSQA dataset, a large-scale dataset containing $sim $200k question-answer pairs.
arXiv Detail & Related papers (2025-02-26T13:47:13Z)
General Time-series Model for Universal Knowledge Representation of Multivariate Time-Series data [61.163542597764796]
We show that time series with different time granularities (or corresponding frequency resolutions) exhibit distinct joint distributions in the frequency domain. A novel Fourier knowledge attention mechanism is proposed to enable learning time-aware representations from both the temporal and frequency domains. An autoregressive blank infilling pre-training framework is incorporated to time series analysis for the first time, leading to a generative tasks agnostic pre-training strategy.
arXiv Detail & Related papers (2025-02-05T15:20:04Z)
Can Multimodal LLMs do Visual Temporal Understanding and Reasoning? The answer is No! [22.75945626401567]
We propose a challenging evaluation benchmark named TemporalVQA. The first part requires MLLMs to determine the sequence of events by analyzing temporally consecutive video frames. The second part presents image pairs with varying time differences, framed as multiple-choice questions, asking MLLMs to estimate the time-lapse between images with options ranging from seconds to years. Our evaluations of advanced MLLMs, including models like GPT-4o and Gemini-1.5-Pro, reveal significant challenges.
arXiv Detail & Related papers (2025-01-18T06:41:48Z)
ChatTS: Aligning Time Series with LLMs via Synthetic Data for Enhanced Understanding and Reasoning [10.854285913078257]
This paper introduces ChatTS, a novel MLLM designed for time series analysis. ChatTS treats time series as a modality, similar to how vision MLLMs process images. Time Series Evol-Instruct generates diverse time series Q&As, enhancing the model's reasoning capabilities.
arXiv Detail & Related papers (2024-12-04T08:06:15Z)
Can LLMs Understand Time Series Anomalies? [20.848375315326305]
Large Language Models (LLMs) have gained popularity in time series forecasting, but their potential for anomaly detection remains largely unexplored. Our study investigates whether LLMs can understand and detect anomalies in time series data, focusing on zero-shot and few-shot scenarios. Our results suggest that while LLMs can understand time series anomalies, many common conjectures based on their reasoning capabilities do not hold.
arXiv Detail & Related papers (2024-10-07T19:16:02Z)
Can LLMs Serve As Time Series Anomaly Detectors? [33.28502093260832]
An emerging topic in large language models (LLMs) is their application to time series forecasting. In this paper, we investigate the capabilities of LLMs, specifically GPT-4 and LLaMA3, in detecting and explaining anomalies in time series.
arXiv Detail & Related papers (2024-08-06T23:14:39Z)
Deep Time Series Models: A Comprehensive Survey and Benchmark [74.28364194333447]
Time series data is of great significance in real-world scenarios. Recent years have witnessed remarkable breakthroughs in the time series community. We release Time Series Library (TSLib) as a fair benchmark of deep time series models for diverse analysis tasks.
arXiv Detail & Related papers (2024-07-18T08:31:55Z)
Evaluating Large Language Models on Time Series Feature Understanding: A Comprehensive Taxonomy and Benchmark [13.490168087823992]
Large Language Models (LLMs) offer the potential for automatic time series analysis and reporting. We introduce a comprehensive taxonomy of time series features, a critical framework that delineates various characteristics inherent in time series data. This dataset acts as a solid foundation for assessing the proficiency of LLMs in comprehending time series.
arXiv Detail & Related papers (2024-04-25T12:24:37Z)
Chronos: Learning the Language of Time Series [79.38691251254173]
Chronos is a framework for pretrained probabilistic time series models. We show that Chronos models can leverage time series data from diverse domains to improve zero-shot accuracy on unseen forecasting tasks.
arXiv Detail & Related papers (2024-03-12T16:53:54Z)
MOMENT: A Family of Open Time-series Foundation Models [19.0845213853369]
We introduce MOMENT, a family of open-source foundation models for general-purpose time series analysis. We compile a collection of public time series, called the Time series Pile, and systematically tackle time series-specific challenges. We build on recent work to design a benchmark to evaluate time series foundation models on diverse tasks and datasets in limited supervision settings.
arXiv Detail & Related papers (2024-02-06T10:48:46Z)
Empowering Time Series Analysis with Large Language Models: A Survey [24.202539098675953]
We provide a systematic overview of methods that leverage large language models for time series analysis. Specifically, we first state the challenges and motivations of applying language models in the context of time series. Next, we categorize existing methods into different groups (i.e., direct query, tokenization, prompt design, fine-tune, and model integration) and highlight the key ideas within each group.
arXiv Detail & Related papers (2024-02-05T16:46:35Z)
Position: What Can Large Language Models Tell Us about Time Series Analysis [69.70906014827547]
We argue that current large language models (LLMs) have the potential to revolutionize time series analysis. Such advancement could unlock a wide range of possibilities, including time series modality switching and question answering.
arXiv Detail & Related papers (2024-02-05T04:17:49Z)
Large Language Models Are Zero-Shot Time Series Forecasters [48.73953666153385]
By encoding time series as a string of numerical digits, we can frame time series forecasting as next-token prediction in text. We find that large language models (LLMs) such as GPT-3 and LLaMA-2 can surprisingly zero-shot extrapolate time series at a level comparable to or exceeding the performance of purpose-built time series models trained on the downstream tasks.
arXiv Detail & Related papers (2023-10-11T19:01:28Z)
Time-LLM: Time Series Forecasting by Reprogramming Large Language Models [110.20279343734548]
Time series forecasting holds significant importance in many real-world dynamic systems. We present Time-LLM, a reprogramming framework to repurpose large language models for time series forecasting. Time-LLM is a powerful time series learner that outperforms state-of-the-art, specialized forecasting models.
arXiv Detail & Related papers (2023-10-03T01:31:25Z)
Self-Supervised Learning for Time Series Analysis: Taxonomy, Progress, and Prospects [84.6945070729684]
Self-supervised learning (SSL) has recently achieved impressive performance on various time series tasks. This article reviews current state-of-the-art SSL methods for time series data.
arXiv Detail & Related papers (2023-06-16T18:23:10Z)

This list is automatically generated from the titles and abstracts of the papers in this site.