Related papers: Let the Agent Search: Autonomous Exploration Beats Rigid Workflows in Temporal Question Answering

URL: http://arxiv.org/abs/2603.01853v1
Date: Mon, 02 Mar 2026 13:33:39 GMT
Title: Let the Agent Search: Autonomous Exploration Beats Rigid Workflows in Temporal Question Answering
Authors: Xufei Lv, Jiahui Yang, Yifu Gao, Linbo Qiao, Houde Liu
Abstract summary: Temporal Knowledge Graph Question Answering (TKGQA) demands multi-hop reasoning under temporal constraints. We show that granting an off-the-shelf LLM autonomy, that is, letting it decide what to do next, already yields substantial gains. We propose AT2QA, an autonomous, training-free agent for temporal question answering.
Score: 12.204337131764852
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Temporal Knowledge Graph Question Answering (TKGQA) demands multi-hop reasoning under temporal constraints. Prior approaches based on large language models (LLMs) typically rely on rigid, hand-crafted retrieval workflows or costly supervised fine-tuning. We show that simply granting an off-the-shelf LLM autonomy, that is, letting it decide what to do next, already yields substantial gains even in a strict zero-shot setting. Building on this insight, we propose AT2QA, an autonomous, training-free agent for temporal question answering that iteratively interacts with the temporal knowledge graph via a general search tool for dynamic retrieval. Experiments on MultiTQ demonstrate large improvements: AT2QA achieves 88.7% Hits@1 (+10.7% over prior SOTA), including a +20.1% gain on challenging multi-target queries, showing that agentic autonomy can decisively outperform fine-tuning for temporal question answering. Code and the full set of sampled trajectories are available on https://github.com/AT2QA-Official-Code/AT2QA-Official-Code
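The abstract reports results as Hits@1. As a minimal sketch, assuming the usual definition (a question counts as correct when the top-ranked prediction is one of its gold answers, which also covers multi-target queries), the metric can be computed as below. The function name `hits_at_1` and the toy data are hypothetical and are not taken from the paper or from MultiTQ.

```python
from typing import Sequence, Set


def hits_at_1(predictions: Sequence[str], gold_answers: Sequence[Set[str]]) -> float:
    """Fraction of questions whose top-ranked prediction is in the gold answer set."""
    if len(predictions) != len(gold_answers):
        raise ValueError("predictions and gold_answers must have the same length")
    if not predictions:
        return 0.0
    # A multi-target question has several gold answers; any one of them counts as a hit.
    correct = sum(1 for pred, gold in zip(predictions, gold_answers) if pred in gold)
    return correct / len(predictions)


# Toy data, invented for illustration only (not taken from MultiTQ).
preds = ["2012-05", "France", "Germany", "2009"]
gold = [{"2012-05"}, {"France", "Italy"}, {"Japan"}, {"2009"}]
score = hits_at_1(preds, gold)
print(f"Hits@1 = {score:.2%}")
```

Under this definition only the single top prediction per question is scored, so a system that ranks a correct answer second receives no credit for that question.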

Related papers

Temp-R1: A Unified Autonomous Agent for Complex Temporal KGQA via Reverse Curriculum Reinforcement Learning [51.79753403262177]
Temporal Knowledge Graph Question Answering (TKGQA) is inherently challenging, as it requires sophisticated reasoning over dynamic facts with multi-hop dependencies and complex temporal constraints. We propose Temp-R1, the first autonomous end-to-end agent for TKGQA trained through reinforcement learning. Our 8B-parameter Temp-R1 achieves state-of-the-art performance on MultiTQ and TimelineKGQA, improving 19.8% over strong baselines on complex questions.
arXiv Detail & Related papers (2026-01-26T09:23:53Z)
The benefits of query-based KGQA systems for complex and temporal questions in LLM era [55.20230501807337]
Large language models excel in question-answering (QA) yet still struggle with multi-hop reasoning and temporal questions. Query-based knowledge graph QA (KGQA) offers a modular alternative by generating executable queries instead of direct answers. We explore a multi-stage query-based framework for WikiData QA, proposing a multi-stage approach that enhances performance on challenging multi-hop and temporal benchmarks.
arXiv Detail & Related papers (2025-07-16T06:41:03Z)
Question-Aware Gaussian Experts for Audio-Visual Question Answering [8.377705744753047]
Audio-Visual Question Answering (AVQA) requires question-based multimodal reasoning and precise temporal grounding. This paper proposes QA-TIGER, a novel framework that explicitly incorporates question information and models continuous temporal dynamics.
arXiv Detail & Related papers (2025-03-06T14:11:46Z)
Time-MQA: Time Series Multi-Task Question Answering with Context Enhancement [55.2439260314328]
Time Series Multi-Task Question Answering (Time-MQA) is a unified framework that enables natural language queries across multiple time series tasks. Central to Time-MQA is the TSQA dataset, a large-scale dataset containing ~200k question-answer pairs.
arXiv Detail & Related papers (2025-02-26T13:47:13Z)
TimeLogic: A Temporal Logic Benchmark for Video QA [64.32208175236323]
We introduce the TimeLogic QA (TLQA) framework to automatically generate temporal logical questions. We leverage 4 datasets, STAR, Breakfast, AGQA, and CrossTask, and generate 2k and 10k QA pairs for each category. We assess the VideoQA model's temporal reasoning performance on 16 categories of temporal logic with varying temporal complexity.
arXiv Detail & Related papers (2025-01-13T11:12:59Z)
AutoReason: Automatic Few-Shot Reasoning Decomposition [0.0]
Chain of Thought (CoT) was introduced in recent research as a method for improving step-by-step reasoning in Large Language Models. We propose a system to automatically generate rationales using CoT. Our method improves multi-step implicit reasoning capabilities by decomposing the implicit query into several explicit questions.
arXiv Detail & Related papers (2024-12-09T20:35:39Z)
Self-Improvement Programming for Temporal Knowledge Graph Question Answering [31.33908040172437]
Temporal Knowledge Graph Question Answering (TKGQA) aims to answer questions with temporal intent over Temporal Knowledge Graphs (TKGs). Existing end-to-end methods implicitly model the time constraints by learning time-aware embeddings of questions and candidate answers. We introduce a novel self-improvement Programming method for TKGQA (Prog-TQA).
arXiv Detail & Related papers (2024-04-02T08:14:27Z)
NExT-QA:Next Phase of Question-Answering to Explaining Temporal Actions [80.60423934589515]
We introduce NExT-QA, a rigorously designed video question answering (VideoQA) benchmark. We set up multi-choice and open-ended QA tasks targeting causal action reasoning, temporal action reasoning, and common scene comprehension. We find that top-performing methods excel at shallow scene descriptions but are weak in causal and temporal action reasoning.
arXiv Detail & Related papers (2021-05-18T04:56:46Z)
Harvesting and Refining Question-Answer Pairs for Unsupervised QA [95.9105154311491]
We introduce two approaches to improve unsupervised Question Answering (QA). First, we harvest lexically and syntactically divergent questions from Wikipedia to automatically construct a corpus of question-answer pairs (named RefQA). Second, we take advantage of the QA model to extract more appropriate answers, which iteratively refines data over RefQA.
arXiv Detail & Related papers (2020-05-06T15:56:06Z)
