Domain-Independent Automatic Generation of Descriptive Texts for Time-Series Data
- URL: http://arxiv.org/abs/2409.16647v1
- Date: Wed, 25 Sep 2024 06:04:03 GMT
- Title: Domain-Independent Automatic Generation of Descriptive Texts for Time-Series Data
- Authors: Kota Dohi, Aoi Ito, Harsh Purohit, Tomoya Nishida, Takashi Endo, Yohei Kawaguchi
- Abstract summary: We propose a method to generate domain-independent descriptive texts from time-series data.
By implementing the novel backward approach, we create the Temporal Automated Captions for Observations (TACO) dataset.
Experimental results demonstrate that a contrastive learning-based model trained on the TACO dataset can generate descriptive texts for time-series data in novel domains.
- Score: 5.264562311559751
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the scarcity of time-series data annotated with descriptive texts, training a model to generate descriptive texts for time-series data is challenging. In this study, we propose a method to systematically generate domain-independent descriptive texts from time-series data. We identify two distinct approaches for creating pairs of time-series data and descriptive texts: the forward approach and the backward approach. By implementing the novel backward approach, we create the Temporal Automated Captions for Observations (TACO) dataset. Experimental results demonstrate that a contrastive learning-based model trained using the TACO dataset is capable of generating descriptive texts for time-series data in novel domains.
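The backward approach fixes the description first and then synthesizes a series that satisfies it, which sidesteps the annotation bottleneck. Below is a minimal Python sketch of that idea; the caption templates and synthesis rules are invented for illustration and are not the actual TACO generation procedure.

```python
import numpy as np

def make_pair(kind: str, length: int = 128, rng=None):
    """Backward-style pair generation: pick the caption first, then synthesize
    a series that matches it. Templates and rules here are invented for
    illustration; this is not the actual TACO generation procedure."""
    rng = rng or np.random.default_rng()
    t = np.linspace(0.0, 1.0, length)
    noise = 0.05 * rng.standard_normal(length)
    if kind == "increasing":
        caption = "The value increases steadily over time."
        series = t + noise
    elif kind == "peak":
        caption = "The value rises to a peak and then falls back."
        series = np.exp(-((t - 0.5) ** 2) / 0.02) + noise
    else:
        caption = "The value stays roughly constant."
        series = noise.copy()
    return series, caption

series, caption = make_pair("peak")
print(caption, series.shape)
```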
Related papers
- TimeCAP: Learning to Contextualize, Augment, and Predict Time Series Events with Large Language Model Agents [52.13094810313054]
TimeCAP is a time-series processing framework that creatively employs Large Language Models (LLMs) as contextualizers of time series data.
TimeCAP incorporates two independent LLM agents: one generates a textual summary capturing the context of the time series, while the other uses this enriched summary to make more informed predictions.
Experimental results on real-world datasets demonstrate that TimeCAP outperforms state-of-the-art methods for time series event prediction.
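The two-agent flow can be pictured as two chained LLM calls, one producing the contextual summary and one consuming it. The sketch below uses a generic `call_llm` placeholder; the prompts and interfaces are illustrative, not TimeCAP's actual ones.

```python
def call_llm(prompt: str) -> str:
    # Placeholder for whatever LLM backend is used; swap in a real API call.
    raise NotImplementedError

def predict_event(series: list[float]) -> str:
    # Agent 1: contextualize the raw numbers into a textual summary.
    summary = call_llm(
        "Summarize the context of this time series: "
        + ", ".join(f"{x:.2f}" for x in series)
    )
    # Agent 2: predict the event from the enriched summary rather than
    # from the raw series alone.
    return call_llm("Given this context, predict the next event: " + summary)
```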
arXiv Detail & Related papers (2025-02-17T04:17:27Z)
- Language in the Flow of Time: Time-Series-Paired Texts Weaved into a Unified Temporal Narrative [65.84249211767921]
Texts as Time Series (TaTS) treats time-series-paired texts as auxiliary variables of the time series.
TaTS can be plugged into any existing numerical-only time series models and enable them to handle time series data with paired texts effectively.
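One plausible reading of "auxiliary variables" is to embed the paired text and append it as extra input channels, so an unchanged numerical model sees both modalities. The sketch below is a hedged illustration of that plug-in idea, not TaTS's actual fusion mechanism.

```python
import numpy as np

def with_text_channels(series: np.ndarray, text_embedding: np.ndarray) -> np.ndarray:
    """Append a text embedding as extra channels at every time step.

    series: (T, C) numeric series; text_embedding: (D,) vector from any
    text encoder. Returns (T, C + D), which a numerical-only forecasting
    model can consume without architectural changes.
    """
    repeated = np.tile(text_embedding, (series.shape[0], 1))  # (T, D)
    return np.concatenate([series, repeated], axis=1)

x = with_text_channels(np.random.randn(96, 3), np.random.randn(16))
print(x.shape)  # (96, 19)
```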
arXiv Detail & Related papers (2025-02-13T03:43:27Z)
- Time Series Language Model for Descriptive Caption Generation [11.796431549951055]
We introduce TSLM, a novel time series language model designed specifically for time series captioning.
TSLM operates as an encoder-decoder model, leveraging both text prompts and time series data representations.
We show that TSLM outperforms existing state-of-the-art approaches from multiple data modalities by a significant margin.
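As a rough picture of a series-to-text encoder-decoder of this kind, the PyTorch sketch below projects time steps into a Transformer's encoder and decodes caption tokens; the architecture is a generic stand-in (untrained, positional encodings omitted), not TSLM itself.

```python
import torch
import torch.nn as nn

class SeriesCaptioner(nn.Module):
    """Generic series-to-text encoder-decoder; a stand-in, not TSLM itself."""

    def __init__(self, vocab_size: int = 1000, d_model: int = 128):
        super().__init__()
        self.series_proj = nn.Linear(1, d_model)   # embed each scalar time step
        self.token_emb = nn.Embedding(vocab_size, d_model)
        # Positional encodings omitted for brevity.
        self.transformer = nn.Transformer(d_model=d_model, batch_first=True)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, series: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
        # series: (B, T, 1) values; tokens: (B, L) caption tokens so far.
        memory = self.series_proj(series)
        target = self.token_emb(tokens)
        return self.out(self.transformer(memory, target))

model = SeriesCaptioner()
logits = model(torch.randn(2, 64, 1), torch.randint(0, 1000, (2, 12)))
print(logits.shape)  # torch.Size([2, 12, 1000])
```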
arXiv Detail & Related papers (2025-01-03T14:34:30Z)
- Text2Freq: Learning Series Patterns from Text via Frequency Domain [8.922661807801227]
Text2Freq is a cross-modality model that integrates text and time series data via the frequency domain.
Our experiments on paired datasets of real-world stock prices and synthetic texts show that Text2Freq achieves state-of-the-art performance.
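The frequency-domain pairing can be pictured as aligning a text representation with the series' spectrum rather than its raw values. A toy sketch, with the alignment target left as a comment since Text2Freq's actual model is more involved:

```python
import numpy as np

def spectrum_features(series: np.ndarray, k: int = 8) -> np.ndarray:
    """Magnitudes of the k lowest non-DC frequency components."""
    return np.abs(np.fft.rfft(series))[1 : k + 1]

# A cross-modal objective could then align a text embedding (projected to k
# dims) with these spectral features, e.g. via cosine similarity.
series = np.sin(2 * np.pi * 3 * np.linspace(0, 1, 256))
print(spectrum_features(series))  # energy concentrated near the 3-cycle bin
```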
arXiv Detail & Related papers (2024-11-01T16:11:02Z)
- Metadata Matters for Time Series: Informative Forecasting with Transformers [70.38241681764738]
We propose a Metadata-informed Time Series Transformer (MetaTST) for time series forecasting.
To tackle the unstructured nature of metadata, MetaTST formalizes it into natural language using pre-designed templates.
A Transformer encoder then jointly processes series and metadata tokens, enriching the series representations with metadata information.
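The templating step amounts to filling metadata fields into a fixed natural-language frame before tokenization. The template wording below is invented for illustration; MetaTST's actual templates are pre-designed per setting.

```python
# Hypothetical template; the actual pre-designed templates differ per setting.
TEMPLATE = "This series records {variable} for {entity}, sampled every {freq}."

def metadata_to_text(meta: dict) -> str:
    return TEMPLATE.format(**meta)

text = metadata_to_text(
    {"variable": "electricity load", "entity": "station 12", "freq": "15 minutes"}
)
print(text)
# The sentence is then tokenized, and its tokens are fed to the Transformer
# encoder alongside the series tokens.
```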
arXiv Detail & Related papers (2024-10-04T11:37:55Z)
- Dataset Condensation for Time Series Classification via Dual Domain Matching [12.317728375957717]
We propose a novel framework, Dataset Condensation for Time Series Classification via Dual Domain Matching.
Our proposed framework aims to generate a condensed dataset that matches the surrogate objectives in both the time and frequency domains.
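A toy version of a dual-domain objective is shown below, measuring how far a synthetic set drifts from the real one in both the time and frequency domains; the actual framework matches learned surrogate objectives rather than this simple statistic.

```python
import numpy as np

def dual_domain_gap(real: np.ndarray, synthetic: np.ndarray) -> float:
    """Mean feature gap between two sets of series in time and frequency domains.

    real, synthetic: (N, T) and (M, T) batches of series. This simple statistic
    only illustrates the dual-domain idea; the framework matches learned
    surrogate objectives instead.
    """
    time_gap = np.linalg.norm(real.mean(axis=0) - synthetic.mean(axis=0))
    freq_gap = np.linalg.norm(
        np.abs(np.fft.rfft(real, axis=1)).mean(axis=0)
        - np.abs(np.fft.rfft(synthetic, axis=1)).mean(axis=0)
    )
    return float(time_gap + freq_gap)

print(dual_domain_gap(np.random.randn(100, 64), np.random.randn(10, 64)))
```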
arXiv Detail & Related papers (2024-03-12T02:05:06Z)
- Curriculum-Based Self-Training Makes Better Few-Shot Learners for Data-to-Text Generation [56.98033565736974]
We propose Curriculum-Based Self-Training (CBST) to leverage unlabeled data in a rearranged order determined by the difficulty of text generation.
Our method can outperform fine-tuning and task-adaptive pre-training methods, and achieve state-of-the-art performance in the few-shot setting of data-to-text generation.
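The loop below sketches curriculum-based self-training in the abstract: pseudo-label unlabeled inputs, order them easy-to-hard, and grow the training set each round. The difficulty scorer and the model's `fit`/`generate` interface are placeholders, not CBST's exact procedure.

```python
def curriculum_self_train(model, labeled, unlabeled, difficulty, rounds=3):
    """Schematic easy-to-hard self-training loop; not CBST's exact procedure.

    model: anything with fit(pairs) and generate(x); difficulty: scores an
    input (lower = easier). Both are placeholders for illustration.
    """
    for r in range(rounds):
        model.fit(labeled)
        # Pseudo-label the unlabeled inputs with the current model.
        pseudo = [(x, model.generate(x)) for x in unlabeled]
        # Rearrange by estimated generation difficulty: easiest first.
        pseudo.sort(key=lambda pair: difficulty(pair[0]))
        # Admit a growing easy-to-hard prefix into the training set each round.
        cutoff = int(len(pseudo) * (r + 1) / rounds)
        labeled = labeled + pseudo[:cutoff]
    return model
```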
arXiv Detail & Related papers (2022-06-06T16:11:58Z)
- Data-to-text Generation with Variational Sequential Planning [74.3955521225497]
We consider the task of data-to-text generation, which aims to create textual output from non-linguistic input.
We propose a neural model enhanced with a planning component responsible for organizing high-level information in a coherent and meaningful way.
We infer latent plans sequentially with a structured variational model, while interleaving the steps of planning and generation.
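The interleaving of planning and generation can be sketched as alternating calls to a plan sampler and a sentence decoder; both are placeholders here, standing in for the structured variational model and the neural generator.

```python
def plan_and_generate(records, sample_plan, decode_sentence, n_sentences=5):
    """Alternate planning and generation; both callables are placeholders
    standing in for the structured variational plan model and the decoder."""
    text, plan = [], None
    for _ in range(n_sentences):
        # Sample the next latent plan step, conditioned on the input records,
        # the previous plan step, and the text generated so far.
        plan = sample_plan(records, plan, text)
        # Realize that plan step as one sentence, then continue planning.
        text.append(decode_sentence(records, plan, text))
    return " ".join(text)
```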
arXiv Detail & Related papers (2022-02-28T13:17:59Z)
- Partially-Aligned Data-to-Text Generation with Distant Supervision [69.15410325679635]
We propose a new generation task called Partially-Aligned Data-to-Text Generation (PADTG).
It is more practical since it utilizes automatically annotated data for training and thus considerably expands the application domains.
Our framework outperforms all baseline models, and the results verify the feasibility of utilizing partially-aligned data.
arXiv Detail & Related papers (2020-10-03T03:18:52Z)
- Human-like Time Series Summaries via Trend Utility Estimation [13.560018516096754]
We propose a model to create human-like text descriptions for time series.
Our system finds patterns in time series data and ranks these patterns based on empirical observations of human behavior.
The output of our system is a natural language description of time series that attempts to match a human's summary of the same data.
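A toy rendering of the pipeline: detect simple patterns, rank them by a utility table, and emit a templated sentence. The utility weights below are hypothetical stubs; the paper estimates them from empirical observations of human summaries.

```python
import numpy as np

# Hypothetical utility weights; the paper estimates these from empirical
# observations of how humans summarize the same data.
HUMAN_UTILITY = {"an upward trend": 1.0, "a downward trend": 0.9, "high variance": 0.5}

def describe(series: np.ndarray) -> str:
    t = np.arange(len(series))
    slope, intercept = np.polyfit(t, series, 1)
    residual = series - (slope * t + intercept)
    found = []
    if slope > 0.01:
        found.append("an upward trend")
    elif slope < -0.01:
        found.append("a downward trend")
    if residual.std() > 0.5:
        found.append("high variance")
    if not found:
        return "The series shows no notable pattern."
    # Mention the highest-utility patterns first, as a human summary might.
    found.sort(key=HUMAN_UTILITY.get, reverse=True)
    return "The series shows " + " and ".join(found) + "."

print(describe(np.linspace(0.0, 2.0, 50)))  # The series shows an upward trend.
```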
arXiv Detail & Related papers (2020-01-16T06:09:50Z)