Formulation Comparison for Timeline Construction using LLMs
- URL: http://arxiv.org/abs/2403.00990v1
- Date: Fri, 1 Mar 2024 21:24:24 GMT
- Title: Formulation Comparison for Timeline Construction using LLMs
- Authors: Kimihiro Hasegawa, Nikhil Kandukuri, Susan Holm, Yukari Yamakawa,
Teruko Mitamura
- Abstract summary: We develop a new evaluation dataset, TimeSET, consisting of single-document timelines with document-level order annotation.
TimeSET features saliency-based event selection and partial ordering, which enable a practical annotation workload.
Our experiments show that (1) the NLI formulation with Flan-T5 demonstrates strong performance compared with the other formulations, while (2) timeline construction and event temporal ordering remain challenging tasks for few-shot LLMs.
- Score: 6.827174240679527
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Constructing a timeline requires identifying the chronological order of
events in an article. In prior timeline construction datasets, temporal orders
are typically annotated by either event-to-time anchoring or event-to-event
pairwise ordering, both of which suffer from missing temporal information. To
mitigate the issue, we develop a new evaluation dataset, TimeSET, consisting of
single-document timelines with document-level order annotation. TimeSET
features saliency-based event selection and partial ordering, which enable a
practical annotation workload. Aiming to build better automatic timeline
construction systems, we propose a novel evaluation framework to compare
multiple task formulations with TimeSET by prompting open LLMs, i.e., Llama 2
and Flan-T5. Considering that identifying temporal orders of events is a core
subtask in timeline construction, we further benchmark open LLMs on existing
event temporal ordering datasets to gain a robust understanding of their
capabilities. Our experiments show that (1) the NLI formulation with Flan-T5
demonstrates strong performance compared with the other formulations, while (2) timeline
construction and event temporal ordering remain challenging tasks for few-shot LLMs. Our
code and data are available at https://github.com/kimihiroh/timeset.
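As an illustration of the NLI formulation mentioned in the abstract, the sketch below casts pairwise event temporal ordering as an entailment-style prompt for Flan-T5. The prompt template, label set, and model size here are assumptions for illustration only and may differ from the paper's exact setup; the official implementation is in the linked repository.

```python
# Minimal sketch of an NLI-style prompt for pairwise event ordering with Flan-T5.
# The template and labels are illustrative assumptions, not the paper's exact setup.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "google/flan-t5-large"  # any Flan-T5 size works for the sketch
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def order_events_nli(context: str, event_a: str, event_b: str) -> str:
    """Ask the model whether event_a happened before event_b, phrased as entailment."""
    prompt = (
        f"Premise: {context}\n"
        f'Hypothesis: The event "{event_a}" happened before the event "{event_b}".\n'
        "Does the premise entail the hypothesis? Answer yes, no, or unknown."
    )
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    outputs = model.generate(**inputs, max_new_tokens=5)
    return tokenizer.decode(outputs[0], skip_special_tokens=True).strip()

# Example usage with a hypothetical article snippet:
context = "The company announced the merger on Monday after regulators approved the deal."
print(order_events_nli(context, "approved", "announced"))  # expected: "yes"
```

Iterating such pairwise judgments over the salient event pairs of a document, and resolving them into a partial order, is one simple way such NLI-style predictions could be assembled into a timeline.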
Related papers
- TimeCAP: Learning to Contextualize, Augment, and Predict Time Series Events with Large Language Model Agents [52.13094810313054]
TimeCAP is a time-series processing framework that creatively employs Large Language Models (LLMs) as contextualizers of time series data.
TimeCAP incorporates two independent LLM agents: one generates a textual summary capturing the context of the time series, while the other uses this enriched summary to make more informed predictions.
Experimental results on real-world datasets demonstrate that TimeCAP outperforms state-of-the-art methods for time series event prediction.
arXiv Detail & Related papers (2025-02-17T04:17:27Z)
- TempoGPT: Enhancing Temporal Reasoning via Quantizing Embedding [13.996105878417204]
We propose a multi-modal time series data construction approach and a multi-modal time series language model (TLM), TempoGPT.
We construct multi-modal data for complex reasoning tasks by analyzing the variable-system relationships within a white-box system.
Extensive experiments demonstrate that TempoGPT accurately perceives temporal information, logically infers conclusions, and achieves state-of-the-art in the constructed complex time series reasoning tasks.
arXiv Detail & Related papers (2025-01-13T13:47:05Z)
- Unfolding the Headline: Iterative Self-Questioning for News Retrieval and Timeline Summarization [93.56166917491487]
This paper proposes CHRONOS - Causal Headline Retrieval for Open-domain News Timeline SummarizatiOn via Iterative Self-Questioning.
Our experiments indicate that CHRONOS is not only adept at open-domain timeline summarization, but it also rivals the performance of existing state-of-the-art systems designed for closed-domain applications.
arXiv Detail & Related papers (2025-01-01T16:28:21Z)
- Just What You Desire: Constrained Timeline Summarization with Self-Reflection for Enhanced Relevance [22.53244715723573]
We introduce a novel task, called Constrained Timeline Summarization (CTLS), in which a timeline is generated whose events all satisfy a specified constraint.
We propose an approach that employs a large language model (LLM) to summarize news articles according to a specified constraint and cluster them to identify key events to include in a constrained timeline.
arXiv Detail & Related papers (2024-12-23T09:17:06Z)
- Hierarchical Multimodal LLMs with Semantic Space Alignment for Enhanced Time Series Classification [4.5939667818289385]
HiTime is a hierarchical multi-modal model that seamlessly integrates temporal information into large language models.
Our findings highlight the potential of integrating temporal features into LLMs, paving the way for advanced time series analysis.
arXiv Detail & Related papers (2024-10-24T12:32:19Z)
- Temporally Grounding Instructional Diagrams in Unconstrained Videos [51.85805768507356]
We study the challenging problem of simultaneously localizing a sequence of queries in instructional diagrams in a video.
Most existing methods focus on grounding one query at a time, ignoring the inherent structures among queries.
We propose composite queries constructed by exhaustively pairing up the visual content features of the step diagrams.
We demonstrate the effectiveness of our approach on the IAW dataset for grounding step diagrams and the YouCook2 benchmark for grounding natural language queries.
arXiv Detail & Related papers (2024-07-16T05:44:30Z)
- UnSeenTimeQA: Time-Sensitive Question-Answering Beyond LLMs' Memorization [34.257914212541394]
This paper introduces UnSeenTimeQA, a novel data contamination-free time-sensitive question-answering benchmark.
It differs from existing TSQA benchmarks by avoiding web-searchable queries grounded in the real world.
We present a series of time-sensitive event scenarios based on synthetically generated facts.
arXiv Detail & Related papers (2024-07-03T22:02:07Z)
- Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning? [70.19200858203388]
Temporal reasoning is fundamental for large language models to comprehend the world.
CoTempQA is a benchmark containing four co-temporal scenarios.
Our experiments reveal a significant gap between the performance of current LLMs and human-level reasoning.
arXiv Detail & Related papers (2024-06-13T12:56:21Z)
- TLEX: An Efficient Method for Extracting Exact Timelines from TimeML Temporal Graphs [3.06868287890455]
We develop an exact, end-to-end solution which we call TLEX (TimeLine EXtraction) for extracting timelines from TimeML annotated texts.
TLEX transforms TimeML annotations into a collection of timelines arranged in a trunk-and-branch structure.
We show that 123 of the texts are inconsistent, 181 of them have more than one "real world" or main timeline, and there are 2,541 indeterminate sections across all four corpora.
arXiv Detail & Related papers (2024-06-07T21:20:32Z)
- Analyzing Temporal Complex Events with Large Language Models? A Benchmark towards Temporal, Long Context Understanding [57.62275091656578]
We refer to complex events composed of many news articles over an extended period as Temporal Complex Events (TCEs).
This paper proposes a novel approach using Large Language Models (LLMs) to systematically extract and analyze the event chain within TCE.
arXiv Detail & Related papers (2024-06-04T16:42:17Z)
- Follow the Timeline! Generating Abstractive and Extractive Timeline Summary in Chronological Order [78.46986998674181]
We propose a Unified Timeline Summarizer (UTS) that can generate abstractive and extractive timeline summaries in time order.
We augment the previous Chinese large-scale timeline summarization dataset and collect a new English timeline dataset.
UTS achieves state-of-the-art performance in terms of both automatic and human evaluations.
arXiv Detail & Related papers (2023-01-02T20:29:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.