TComQA: Extracting Temporal Commonsense from Text
- URL: http://arxiv.org/abs/2508.15274v1
- Date: Thu, 21 Aug 2025 06:07:40 GMT
- Title: TComQA: Extracting Temporal Commonsense from Text
- Authors: Lekshmi R Nair, Arun Sankar, Koninika Pal
- Abstract summary: Large language models (LLMs) struggle to generate text that requires reasoning with temporal commonsense because it is rarely stated explicitly in text. We propose a temporal commonsense extraction pipeline that leverages LLMs to automatically mine temporal commonsense and use it to construct TComQA. TComQA has been validated through crowdsourcing and achieves over 80% precision in extracting temporal commonsense.
- Score: 0.9339914898177187
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding events necessitates grasping their temporal context, which is often not explicitly stated in natural language. For example, it is not trivial for a machine to infer that a museum tour may last a few hours but cannot take months. Recent studies indicate that even advanced large language models (LLMs) struggle to generate text that requires reasoning with temporal commonsense, because such knowledge is rarely mentioned explicitly in text. Automatically mining temporal commonsense for events therefore enables the creation of robust language models. In this work, we investigate the capacity of LLMs to extract temporal commonsense from text and evaluate multiple experimental setups to assess their effectiveness. We propose a temporal commonsense extraction pipeline that leverages LLMs to automatically mine temporal commonsense and use it to construct TComQA, a dataset derived from the SAMSum and RealNews corpora. TComQA has been validated through crowdsourcing and achieves over 80% precision in extracting temporal commonsense. A model trained with TComQA also outperforms an LLM fine-tuned on an existing temporal question answering dataset.
Related papers
- TimeSense: Making Large Language Models Proficient in Time-Series Analysis [26.44226032396234]
In the time-series domain, an increasing number of works combine text with temporal data to leverage the reasoning capabilities of large language models.
We propose TimeSense, a framework that makes LLMs proficient in time-series analysis by balancing textual reasoning with a preserved temporal sense.
TimeSense achieves state-of-the-art performance across multiple tasks, and it particularly outperforms existing methods on complex multi-dimensional time-series reasoning tasks.
arXiv Detail & Related papers (2025-11-09T12:00:18Z)
- Who Gets Cited Most? Benchmarking Long-Context Language Models on Scientific Articles [81.89404347890662]
SciTrek is a novel question-answering benchmark designed to evaluate the long-context reasoning capabilities of large language models (LLMs) using scientific articles.
Our analysis reveals systematic shortcomings in models' abilities to perform basic numerical operations and accurately locate specific information in long contexts.
arXiv Detail & Related papers (2025-09-25T11:36:09Z)
- A Semantic Parsing Framework for End-to-End Time Normalization [10.472379345636845]
Time normalization is the task of converting natural language temporal expressions into machine-readable representations.
Traditional systems based on the ISO-TimeML schema limit expressivity.
We introduce a novel formulation of time normalization as a code generation task grounded in the SCATE framework.
arXiv Detail & Related papers (2025-07-08T23:30:11Z)
- Chat-TS: Enhancing Multi-Modal Reasoning Over Time-Series and Natural Language Data [22.274663165215237]
Time-series analysis is critical for a wide range of fields such as healthcare, finance, transportation, and energy.
Current time-series models are limited in their ability to perform reasoning that involves both time-series and their textual content.
Chat-TS integrates time-series tokens into LLMs' vocabulary, enhancing its reasoning ability over both modalities.
arXiv Detail & Related papers (2025-03-13T21:05:11Z)
- Time2Lang: Bridging Time-Series Foundation Models and Large Language Models for Health Sensing Beyond Prompting [3.2688127177376227]
Large language models (LLMs) show promise for health applications when combined with behavioral sensing data.
Traditional approaches convert sensor data into text prompts, but this process is prone to errors, computationally expensive, and requires domain expertise.
Here, we present Time2Lang, a framework that directly maps time-series foundation model (TFM) outputs to LLM representations without intermediate text conversion.
arXiv Detail & Related papers (2025-02-11T14:58:54Z)
- Context is Key: A Benchmark for Forecasting with Essential Textual Information [87.3175915185287]
"Context is Key" (CiK) is a forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context.
We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters.
We propose a simple yet effective LLM prompting method that outperforms all other tested methods on our benchmark.
arXiv Detail & Related papers (2024-10-24T17:56:08Z)
- Analyzing Temporal Complex Events with Large Language Models? A Benchmark towards Temporal, Long Context Understanding [57.62275091656578]
We refer to a complex event composed of many news articles over an extended period as a Temporal Complex Event (TCE).
This paper proposes a novel approach using Large Language Models (LLMs) to systematically extract and analyze the event chain within TCE.
arXiv Detail & Related papers (2024-06-04T16:42:17Z)
- Timeline-based Sentence Decomposition with In-Context Learning for Temporal Fact Extraction [19.96263282146533]
In this paper, we specifically address the extraction of temporal facts from natural language text.
We propose a timeline-based sentence decomposition strategy using large language models (LLMs) with in-context learning.
Our experiments show that TSDRE achieves state-of-the-art results on both HyperRED-Temporal and ComplexTRED datasets.
arXiv Detail & Related papers (2024-05-16T17:48:21Z)
- Large Language Models Can Learn Temporal Reasoning [11.599570446840547]
We propose TG-LLM, a novel framework towards language-based temporal reasoning.
Instead of reasoning over the original context, we adopt a latent representation, the temporal graph (TG).
A synthetic dataset (TGQA) is fully controllable and requires minimal supervision.
arXiv Detail & Related papers (2024-01-12T19:00:26Z)
- Generative Context-aware Fine-tuning of Self-supervised Speech Models [54.389711404209415]
We study the use of context information generated by generative large language models (LLMs).
We propose an approach to distill the generated information during fine-tuning of self-supervised speech models.
We evaluate the proposed approach using the SLUE and Libri-light benchmarks for several downstream tasks: automatic speech recognition, named entity recognition, and sentiment analysis.
arXiv Detail & Related papers (2023-12-15T15:46:02Z)
- MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning [63.80739044622555]
We introduce MuSR, a dataset for evaluating language models on soft reasoning tasks specified in a natural language narrative.
This dataset has two crucial features. First, it is created through a novel neurosymbolic synthetic-to-natural generation algorithm.
Second, our dataset instances are free text narratives corresponding to real-world domains of reasoning.
arXiv Detail & Related papers (2023-10-24T17:59:20Z)
- Temporal Common Sense Acquisition with Minimal Supervision [77.8308414884754]
This work proposes a novel sequence modeling approach that exploits explicit and implicit mentions of temporal common sense.
Our method is shown to give quality predictions of various dimensions of temporal common sense.
It also produces representations of events for relevant tasks such as duration comparison, parent-child relations, event coreference and temporal QA.
arXiv Detail & Related papers (2020-05-08T22:20:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.