Domain-Independent Automatic Generation of Descriptive Texts for Time-Series Data
- URL: http://arxiv.org/abs/2409.16647v1
- Date: Wed, 25 Sep 2024 06:04:03 GMT
- Title: Domain-Independent Automatic Generation of Descriptive Texts for Time-Series Data
- Authors: Kota Dohi, Aoi Ito, Harsh Purohit, Tomoya Nishida, Takashi Endo, Yohei Kawaguchi
- Abstract summary: We propose a method to generate domain-independent descriptive texts from time-series data.
By implementing the novel backward approach, we create the Temporal Automated Captions for Observations (TACO) dataset.
Experimental results demonstrate that a contrastive learning-based model trained on the TACO dataset can generate descriptive texts for time-series data in novel domains.
- Score: 5.264562311559751
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the scarcity of time-series data annotated with descriptive texts, training a model to generate descriptive texts for time-series data is challenging. In this study, we propose a method to systematically generate domain-independent descriptive texts from time-series data. We identify two distinct approaches for creating pairs of time-series data and descriptive texts: the forward approach and the backward approach. By implementing the novel backward approach, we create the Temporal Automated Captions for Observations (TACO) dataset. Experimental results demonstrate that a contrastive learning-based model trained using the TACO dataset is capable of generating descriptive texts for time-series data in novel domains.
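To make the backward direction concrete, here is a minimal sketch of the idea: start from a caption and synthesize a series that satisfies it, so every pair is correct by construction. The caption set and synthesizers below are illustrative assumptions, not the actual TACO generation procedure.

```python
import numpy as np

# Hypothetical sketch of the "backward" approach: start from a description,
# then synthesize a time series that matches it (names are illustrative).

def synthesize_from_caption(caption: str, length: int = 100) -> np.ndarray:
    """Generate a toy series whose shape matches the given caption."""
    t = np.linspace(0.0, 1.0, length)
    if caption == "the series increases steadily":
        series = t
    elif caption == "the series rises, then plateaus":
        series = np.minimum(2.0 * t, 1.0)
    elif caption == "the series oscillates around a constant level":
        series = 0.5 + 0.1 * np.sin(10.0 * np.pi * t)
    else:
        raise ValueError(f"no synthesizer for caption: {caption!r}")
    return series + np.random.normal(0.0, 0.02, size=length)  # observation noise

captions = [
    "the series increases steadily",
    "the series rises, then plateaus",
    "the series oscillates around a constant level",
]
# Each (series, caption) pair is a training example whose caption is
# guaranteed to be accurate by construction.
dataset = [(synthesize_from_caption(c), c) for c in captions]
```

Pairs produced this way come labeled by construction, which is what makes the backward direction attractive when annotated time series are scarce.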
Related papers
- Text2Freq: Learning Series Patterns from Text via Frequency Domain [8.922661807801227]
Text2Freq is a cross-modality model that integrates text and time series data via the frequency domain.
Our experiments on paired datasets of real-world stock prices and synthetic texts show that Text2Freq achieves state-of-the-art performance.
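As a rough illustration of working in the frequency domain, the sketch below extracts low-frequency magnitude features from a series via the FFT; how Text2Freq actually aligns such features with text embeddings is not covered here, and the feature choice is an assumption.

```python
import numpy as np

def frequency_features(series: np.ndarray, k: int = 8) -> np.ndarray:
    """Return magnitudes of the k lowest-frequency rFFT components."""
    spectrum = np.fft.rfft(series - series.mean())  # remove DC offset first
    return np.abs(spectrum[:k])

series = np.sin(np.linspace(0, 4 * np.pi, 256)) + 0.1 * np.random.randn(256)
print(frequency_features(series))  # low-frequency energy dominates here
```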
arXiv Detail & Related papers (2024-11-01T16:11:02Z)
- Metadata Matters for Time Series: Informative Forecasting with Transformers [70.38241681764738]
We propose a Metadata-informed Time Series Transformer (MetaTST) for time series forecasting.
To tackle the unstructured nature of metadata, MetaTST formalizes it into natural language using pre-designed templates.
A Transformer encoder is employed to communicate series and metadata tokens, extending the series representations with metadata information.
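A minimal sketch of template-based metadata verbalization might look like the following; the template wording and field names are assumptions, not MetaTST's actual templates.

```python
# Hedged sketch: verbalize structured metadata with a fixed template so it
# can be tokenized like ordinary text.
TEMPLATE = (
    "This series records {variable} from the {domain} domain, "
    "sampled every {frequency}."
)

def verbalize_metadata(meta: dict) -> str:
    return TEMPLATE.format(**meta)

meta = {"variable": "electricity load", "domain": "energy", "frequency": "hour"}
text = verbalize_metadata(meta)
# The resulting sentence is tokenized, and its tokens are processed alongside
# the embedded series tokens by the Transformer encoder.
print(text)
```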
arXiv Detail & Related papers (2024-10-04T11:37:55Z)
- Dataset Condensation for Time Series Classification via Dual Domain Matching [12.317728375957717]
We propose a novel framework named Dataset Condensation for Time Series Classification via Dual Domain Matching.
Our proposed framework aims to generate a condensed dataset that matches the surrogate objectives in both the time and frequency domains.
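A toy version of dual-domain matching could compare batch statistics of real and condensed series in both domains; the paper's surrogate objectives are likely more elaborate, so treat this as a sketch.

```python
import torch

# Toy dual-domain matching objective: match batch statistics of real and
# condensed (synthetic) series in both the time and frequency domains.
def dual_domain_loss(real: torch.Tensor, syn: torch.Tensor) -> torch.Tensor:
    time_loss = (real.mean(dim=0) - syn.mean(dim=0)).pow(2).sum()
    real_spec = torch.fft.rfft(real, dim=-1).abs()
    syn_spec = torch.fft.rfft(syn, dim=-1).abs()
    freq_loss = (real_spec.mean(dim=0) - syn_spec.mean(dim=0)).pow(2).sum()
    return time_loss + freq_loss

real = torch.randn(64, 128)                    # real training series
syn = torch.randn(8, 128, requires_grad=True)  # learnable condensed set
loss = dual_domain_loss(real, syn)
loss.backward()  # gradients flow into the condensed series themselves
```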
arXiv Detail & Related papers (2024-03-12T02:05:06Z)
- Stylized Data-to-Text Generation: A Case Study in the E-Commerce Domain [53.22419717434372]
We propose a new task, namely stylized data-to-text generation, whose aim is to generate coherent text according to a specific style.
This task is non-trivial due to three challenges: the logic of the generated text, the unstructured style reference, and biased training samples.
We propose a novel stylized data-to-text generation model, named StyleD2T, comprising three components: logic planning-enhanced data embedding, mask-based style embedding, and unbiased stylized text generation.
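As an illustration of the mask-based style idea, the sketch below hides content-bearing tokens (here, those that appear in the data records) so that mostly stylistic tokens remain; the masking rule is an assumption for illustration, not StyleD2T's actual procedure.

```python
# Rough sketch of mask-based style extraction: mask tokens that carry record
# content so the remaining tokens mostly reflect writing style.
def mask_content(reference_tokens, record_values):
    content = {v.lower() for v in record_values}
    return [
        "[MASK]" if tok.lower() in content else tok
        for tok in reference_tokens
    ]

tokens = "Grab this cozy red sweater before it sells out !".split()
values = {"red", "sweater"}
print(mask_content(tokens, values))
# ['Grab', 'this', 'cozy', '[MASK]', '[MASK]', 'before', 'it', 'sells', 'out', '!']
```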
arXiv Detail & Related papers (2023-05-05T03:02:41Z)
- Time-aware Prompting for Text Generation [17.58231642569116]
We study the effects of incorporating timestamps, such as document creation dates, into generation systems.
Two types of time-aware prompts are investigated: (1) textual prompts that encode document timestamps in natural language sentences; and (2) linear prompts that convert timestamps into continuous vectors.
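The two prompt types might be sketched as follows; the textual format and the 64-dimensional projection are assumptions for illustration.

```python
import numpy as np

# (1) Textual prompt: state the timestamp in natural language and prepend it.
def textual_prompt(date: str, document: str) -> str:
    return f"This article was written on {date}. {document}"

# (2) Linear prompt: map the timestamp to a continuous vector used as a
#     prefix embedding (a random projection stands in for the learned one).
rng = np.random.default_rng(0)
W = rng.normal(size=(1, 64))  # learned projection in the real system

def linear_prompt(year: float) -> np.ndarray:
    return np.array([[year / 2022.0]]) @ W  # normalized scalar -> 64-d vector

print(textual_prompt("2011-03-05", "Local officials announced ..."))
print(linear_prompt(2011).shape)  # (1, 64)
```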
arXiv Detail & Related papers (2022-11-03T22:10:25Z)
- Curriculum-Based Self-Training Makes Better Few-Shot Learners for Data-to-Text Generation [56.98033565736974]
We propose Curriculum-Based Self-Training (CBST) to leverage unlabeled data in a rearranged order determined by the difficulty of text generation.
Our method can outperform fine-tuning and task-adaptive pre-training methods, and achieve state-of-the-art performance in the few-shot setting of data-to-text generation.
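A minimal sketch of the self-training loop follows, using the model's confidence in its own pseudo-labels as a stand-in difficulty measure; the paper defines difficulty in terms of text generation, so this proxy is an assumption.

```python
import random

# Sketch of curriculum-based self-training: pseudo-label unlabeled inputs,
# then fine-tune on them in easy-to-hard order over several rounds.
def curriculum_self_training(generate, fine_tune, unlabeled, n_rounds=3):
    # Pseudo-label every unlabeled input with the current model.
    pseudo = [(x, *generate(x)) for x in unlabeled]  # (input, output, confidence)
    pseudo.sort(key=lambda item: item[2], reverse=True)  # easiest first
    chunk = max(1, len(pseudo) // n_rounds)
    for r in range(n_rounds):
        # Each round adds the next-harder slice of pseudo-labeled data.
        fine_tune([(x, y) for x, y, _ in pseudo[: (r + 1) * chunk]])

# Toy stand-ins so the sketch runs end to end.
generate = lambda x: (f"text for {x}", random.random())
fine_tune = lambda batch: print(f"fine-tuning on {len(batch)} examples")
curriculum_self_training(generate, fine_tune, [f"record{i}" for i in range(9)])
```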
arXiv Detail & Related papers (2022-06-06T16:11:58Z)
- Data-to-text Generation with Variational Sequential Planning [74.3955521225497]
We consider the task of data-to-text generation, which aims to create textual output from non-linguistic input.
We propose a neural model enhanced with a planning component responsible for organizing high-level information in a coherent and meaningful way.
We infer latent plans sequentially with a structured variational model, while interleaving the steps of planning and generation.
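A toy rendering of interleaved planning and generation: before each sentence, choose what to talk about (the plan), then realize it. In the paper a structured variational model learns this choice; uniform random sampling stands in for it here.

```python
import random

records = {"team": "Hawks", "points": "101", "city": "Atlanta"}

def plan_and_generate(records):
    remaining = list(records)
    sentences = []
    while remaining:
        field = random.choice(remaining)  # "plan" step: pick the next content
        remaining.remove(field)
        sentences.append(f"The {field} is {records[field]}.")  # "generate" step
    return " ".join(sentences)

print(plan_and_generate(records))
```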
arXiv Detail & Related papers (2022-02-28T13:17:59Z)
- Data-to-Text Generation with Iterative Text Editing [3.42658286826597]
We present a novel approach to data-to-text generation based on iterative text editing.
We first transform data items to text using trivial templates, and then iteratively improve the resulting text with a neural model trained for the sentence fusion task.
The output of the model is filtered by a simple heuristic and reranked with an off-the-shelf pre-trained language model.
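The pipeline might be sketched as below, with trivial stand-ins for the neural fusion model, the filtering heuristic, and the language-model reranker:

```python
def template(item):
    # Trivial template: verbalize one data triple as a sentence.
    return f"{item['subj']} {item['pred']} {item['obj']}."

def fuse(a, b):
    # Stand-in for the sentence-fusion model: propose naive combinations.
    return [f"{a[:-1]} and {b[0].lower()}{b[1:]}", f"{a} {b}"]

def lm_score(text):
    return -text.count(".")  # stand-in reranker: prefer fewer sentences

data = [
    {"subj": "The Eagle", "pred": "is a", "obj": "coffee shop"},
    {"subj": "The Eagle", "pred": "is located in", "obj": "the city centre"},
]
text = template(data[0])
for item in data[1:]:
    candidates = fuse(text, template(item))
    candidates = [c for c in candidates if c.endswith(".")]  # simple filter
    text = max(candidates, key=lm_score)                     # rerank
print(text)  # the fused single-sentence candidate wins here
```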
arXiv Detail & Related papers (2020-11-03T13:32:38Z)
- Partially-Aligned Data-to-Text Generation with Distant Supervision [69.15410325679635]
We propose a new generation task called Partially-Aligned Data-to-Text Generation (PADTG).
It is more practical since it utilizes automatically annotated data for training and thus considerably expands the application domains.
Our framework outperforms all baseline models, and the results verify the feasibility of utilizing partially-aligned data.
arXiv Detail & Related papers (2020-10-03T03:18:52Z)
- Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation [50.01708049531156]
We focus on a new practical task, document-scale text content manipulation, which is the opposite of text style transfer.
In detail, the input is a set of structured records and a reference text for describing another recordset.
The output is a summary that accurately describes the partial content in the source recordset in the same writing style as the reference.
arXiv Detail & Related papers (2020-02-24T12:52:10Z)
- Human-like Time Series Summaries via Trend Utility Estimation [13.560018516096754]
We propose a model to create human-like text descriptions for time series.
Our system finds patterns in time series data and ranks these patterns based on empirical observations of human behavior.
The output of our system is a natural language description of time series that attempts to match a human's summary of the same data.
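One way such a system might work is sketched below: segment the series, describe each segment's trend, and order the descriptions by a crude salience score standing in for the empirically estimated trend utility.

```python
import numpy as np

def describe_trends(series, n_segments=4):
    segments = np.array_split(np.asarray(series, dtype=float), n_segments)
    trends = []
    for i, seg in enumerate(segments):
        slope = np.polyfit(np.arange(len(seg)), seg, 1)[0]  # per-step slope
        word = "rises" if slope > 0.01 else "falls" if slope < -0.01 else "is flat"
        trends.append((abs(slope), f"segment {i + 1} {word}"))
    trends.sort(reverse=True)  # mention the most salient trend first
    return "; ".join(desc for _, desc in trends)

t = np.linspace(0, 1, 80)
series = np.concatenate([t[:40] * 2, np.full(40, 1.0)])  # rise, then plateau
print(describe_trends(series))
```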
arXiv Detail & Related papers (2020-01-16T06:09:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.