Analyzing Temporal Complex Events with Large Language Models? A Benchmark towards Temporal, Long Context Understanding
- URL: http://arxiv.org/abs/2406.02472v1
- Date: Tue, 4 Jun 2024 16:42:17 GMT
- Title: Analyzing Temporal Complex Events with Large Language Models? A Benchmark towards Temporal, Long Context Understanding
- Authors: Zhihan Zhang, Yixin Cao, Chenchen Ye, Yunshan Ma, Lizi Liao, Tat-Seng Chua,
- Abstract summary: We refer to the complex events composed of many news articles over an extended period as Temporal Complex Event (TCE)
This paper proposes a novel approach using Large Language Models (LLMs) to systematically extract and analyze the event chain within TCE.
- Score: 57.62275091656578
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The digital landscape is rapidly evolving with an ever-increasing volume of online news, emphasizing the need for swift and precise analysis of complex events. We refer to the complex events composed of many news articles over an extended period as Temporal Complex Event (TCE). This paper proposes a novel approach using Large Language Models (LLMs) to systematically extract and analyze the event chain within TCE, characterized by their key points and timestamps. We establish a benchmark, named TCELongBench, to evaluate the proficiency of LLMs in handling temporal dynamics and understanding extensive text. This benchmark encompasses three distinct tasks - reading comprehension, temporal sequencing, and future event forecasting. In the experiment, we leverage retrieval-augmented generation (RAG) method and LLMs with long context window to deal with lengthy news articles of TCE. Our findings indicate that models with suitable retrievers exhibit comparable performance with those utilizing long context window.
Related papers
- Efficient Retrieval of Temporal Event Sequences from Textual Descriptions [0.0]
TPP-LLM-Embedding is a unified model for embedding and retrieving event sequences based on natural language descriptions.
Our model encodes both event types and times, generating a sequence-level representation through pooling.
TPP-LLM-Embedding enables efficient retrieval and demonstrates superior performance compared to baseline models across diverse datasets.
arXiv Detail & Related papers (2024-10-17T21:35:55Z) - Grounding Partially-Defined Events in Multimodal Data [61.0063273919745]
We introduce a multimodal formulation for partially-defined events and cast the extraction of these events as a three-stage span retrieval task.
We propose a benchmark for this task, MultiVENT-G, that consists of 14.5 hours of densely annotated current event videos and 1,168 text documents, containing 22.8K labeled event-centric entities.
Results illustrate the challenges that abstract event understanding poses and demonstrates promise in event-centric video-language systems.
arXiv Detail & Related papers (2024-10-07T17:59:48Z) - From News to Forecast: Integrating Event Analysis in LLM-Based Time Series Forecasting with Reflection [16.47323362700347]
We introduce a novel approach to enhance time series forecasting by reasoning across both text and time series data.
With language as a medium, our method adaptively integrates social events into forecasting models, aligning news content with time series fluctuations to provide richer insights.
Specifically, we utilize LLM-based agents to iteratively filter out irrelevant news and employ human-like reasoning to evaluate predictions.
arXiv Detail & Related papers (2024-09-26T03:50:22Z) - Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA [71.04146366608904]
Long-context modeling capabilities have garnered widespread attention, leading to the emergence of Large Language Models (LLMs) with ultra-context windows.
We propose a novel long-context benchmark, Loong, aligning with realistic scenarios through extended multi-document question answering (QA)
Loong introduces four types of tasks with a range of context lengths: Spotlight Locating, Comparison, Clustering, and Chain of Reasoning.
arXiv Detail & Related papers (2024-06-25T09:42:56Z) - Large Language Models as Event Forecasters [10.32127659470566]
Key elements of human events are extracted as quadruples that consist of subject, relation, object, and timestamp.
These quadruples or quintuples, when organized within a specific domain, form a temporal knowledge graph (TKG)
arXiv Detail & Related papers (2024-06-15T04:09:31Z) - Quest: Query-centric Data Synthesis Approach for Long-context Scaling of Large Language Model [22.07414287186125]
Quest is a query-centric data method aggregating semantically relevant yet diverse documents.
It uses a generative model to predict potential queries for each document, grouping documents with similar queries and keywords.
Experiments demonstrate Quest's superior performance on long-context tasks, achieving remarkable results with context lengths of up to 1M tokens.
arXiv Detail & Related papers (2024-05-30T08:50:55Z) - XL$^2$Bench: A Benchmark for Extremely Long Context Understanding with Long-range Dependencies [45.31042312867939]
Large Language Models (LLMs) have demonstrated remarkable performance across diverse tasks but are constrained by their small context window sizes.
Various efforts have been proposed to expand the context window to accommodate even up to 200K input tokens.
We introduce a benchmark for extremely long context understanding with long-range dependencies, XL$2$Bench.
arXiv Detail & Related papers (2024-04-08T12:29:07Z) - Evaluating Very Long-Term Conversational Memory of LLM Agents [95.84027826745609]
We introduce a machine-human pipeline to generate high-quality, very long-term dialogues.
We equip each agent with the capability of sharing and reacting to images.
The generated conversations are verified and edited by human annotators for long-range consistency.
arXiv Detail & Related papers (2024-02-27T18:42:31Z) - Tracking Objects and Activities with Attention for Temporal Sentence
Grounding [51.416914256782505]
Temporal sentence (TSG) aims to localize the temporal segment which is semantically aligned with a natural language query in an untrimmed segment.
We propose a novel Temporal Sentence Tracking Network (TSTNet), which contains (A) a Cross-modal Targets Generator to generate multi-modal and search space, and (B) a Temporal Sentence Tracker to track multi-modal targets' behavior and to predict query-related segment.
arXiv Detail & Related papers (2023-02-21T16:42:52Z) - Temporal Common Sense Acquisition with Minimal Supervision [77.8308414884754]
This work proposes a novel sequence modeling approach that exploits explicit and implicit mentions of temporal common sense.
Our method is shown to give quality predictions of various dimensions of temporal common sense.
It also produces representations of events for relevant tasks such as duration comparison, parent-child relations, event coreference and temporal QA.
arXiv Detail & Related papers (2020-05-08T22:20:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.