Related papers: Re3: Learning to Balance Relevance & Recency for Temporal Information Retrieval

URL: http://arxiv.org/abs/2509.01306v1
Date: Mon, 01 Sep 2025 09:44:01 GMT
Title: Re3: Learning to Balance Relevance & Recency for Temporal Information Retrieval
Authors: Jiawei Cao, Jie Ouyang, Zhaomeng Zhou, Mingyue Cheng, Yupeng Li, Jiaxian Yan, Qi Liu
Abstract summary: Temporal Information Retrieval is a critical yet unresolved task for modern search systems. Re3 is a framework that balances semantic and temporal information through a query-aware gating mechanism. On Re2Bench, Re3 achieves state-of-the-art results, leading in R@1 across all three subsets.
Score: 10.939002113975706
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Temporal Information Retrieval (TIR) is a critical yet unresolved task for modern search systems, retrieving documents that not only satisfy a query's information need but also adhere to its temporal constraints. This task is shaped by two challenges: Relevance, ensuring alignment with the query's explicit temporal requirements, and Recency, selecting the freshest document among multiple versions. Existing methods often address the two challenges in isolation, relying on brittle heuristics that fail in scenarios where temporal requirements and staleness resistance are intertwined. To address this gap, we introduce Re2Bench, a benchmark specifically designed to disentangle and evaluate Relevance, Recency, and their hybrid combination. Building on this foundation, we propose Re3, a unified and lightweight framework that dynamically balances semantic and temporal information through a query-aware gating mechanism. On Re2Bench, Re3 achieves state-of-the-art results, leading in R@1 across all three subsets. Ablation studies with backbone sensitivity tests confirm robustness, showing strong generalization across diverse encoders and real-world settings. This work provides both a generalizable solution and a principled evaluation suite, advancing the development of temporally aware retrieval systems. Re3 and Re2Bench are available online: https://anonymous.4open.science/r/Re3-0C5A
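
The abstract describes a query-aware gate that balances semantic and temporal signals but gives no formula. The following is a minimal sketch of one way such a gate could work, not the authors' implementation. It assumes each candidate already has a semantic score and a temporal score in [0, 1], and that a scalar gate logit has been predicted from the query. The names `gated_score`, `rank`, `recall_at_1`, and the example scores are all hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function, maps a logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_score(semantic: float, temporal: float, gate_logit: float) -> float:
    """Blend two scores with a gate (assumed form, not taken from the paper).

    A gate near 1 favors the semantic score; a gate near 0 favors the
    temporal score. In a real system the logit would come from a query encoder.
    """
    g = sigmoid(gate_logit)
    return g * semantic + (1.0 - g) * temporal


def rank(candidates: list[dict], gate_logit: float) -> list[dict]:
    """Sort candidates by gated score, best first."""
    return sorted(
        candidates,
        key=lambda c: gated_score(c["semantic"], c["temporal"], gate_logit),
        reverse=True,
    )


def recall_at_1(ranked: list[dict], gold_id: str) -> float:
    """R@1 for a single query: 1.0 if the top result is the gold document."""
    return 1.0 if ranked and ranked[0]["id"] == gold_id else 0.0


# Two versions of the same document (illustrative scores only).
candidates = [
    {"id": "v1_old", "semantic": 0.90, "temporal": 0.20},
    {"id": "v2_new", "semantic": 0.85, "temporal": 0.90},
]

# A query with no freshness need: the gate leans semantic.
semantic_first = rank(candidates, gate_logit=4.0)
# A query that asks for the latest version: the gate leans temporal.
temporal_first = rank(candidates, gate_logit=-1.0)
```

With these illustrative numbers, the semantic-leaning gate ranks the older version first and the temporal-leaning gate ranks the newer version first, which is the kind of per-query trade-off the abstract attributes to the gating mechanism.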

Related papers

Unified Interactive Multimodal Moment Retrieval via Cascaded Embedding-Reranking and Temporal-Aware Score Fusion [0.0]
We propose a unified multimodal moment retrieval system with three key innovations. First, a cascaded dual-embedding pipeline combines BEIT-3 and SigLIP for broad retrieval. Second, a temporal-aware scoring mechanism applies exponential decay penalties to large temporal gaps via beam search. Third, agent-guided query decomposition (GPT-4o) automatically interprets ambiguous queries.
arXiv Detail & Related papers (2025-12-15T02:50:43Z)
LiveSearchBench: An Automatically Constructed Benchmark for Retrieval and Reasoning over Dynamic Knowledge [31.40589987269264]
We present LiveSearchBench, an automated pipeline for constructing retrieval-dependent benchmarks from recent knowledge updates. Our method computes deltas between successive Wikidata snapshots, filters candidate triples for quality, and synthesizes natural-language questions at three levels of reasoning difficulty. Experiments show a pronounced performance drop when models confront facts that post-date pretraining, with the gap most salient on multi-hop queries.
arXiv Detail & Related papers (2025-11-03T10:00:49Z)
FlashResearch: Real-time Agent Orchestration for Efficient Deep Research [62.03819662340356]
FlashResearch is a novel framework for efficient deep research. It transforms sequential processing into parallel, runtime orchestration. It can deliver up to a 5x speedup while maintaining comparable quality.
arXiv Detail & Related papers (2025-10-02T00:15:39Z)
MR$^2$-Bench: Going Beyond Matching to Reasoning in Multimodal Retrieval [86.35779264575154]
Multimodal retrieval is becoming a crucial component of modern AI applications, yet its evaluation lags behind the demands of more realistic and challenging scenarios. We introduce MR$^2$-Bench, a reasoning-intensive benchmark for multimodal retrieval.
arXiv Detail & Related papers (2025-09-30T15:09:14Z)
Reasoning-enhanced Query Understanding through Decomposition and Interpretation [87.56450566014625]
ReDI is a Reasoning-enhanced approach for query understanding through Decomposition and Interpretation. We compiled a large-scale dataset of real-world complex queries from a major search engine. Experiments on BRIGHT and BEIR demonstrate that ReDI consistently surpasses strong baselines in both sparse and dense retrieval paradigms.
arXiv Detail & Related papers (2025-09-08T10:58:42Z)
Reading Between the Timelines: RAG for Answering Diachronic Questions [8.969698902720799]
We propose a new framework that fundamentally redesigns the RAG pipeline to infuse temporal logic. Our approach yields substantial gains in answer accuracy, surpassing standard RAG implementations by 13% to 27%. This work provides a validated pathway toward RAG systems capable of performing the nuanced, evolutionary analysis required for complex, real-world questions.
arXiv Detail & Related papers (2025-07-21T05:19:41Z)
Temporal Information Retrieval via Time-Specifier Model Merging [9.690250070561461]
Time-Specifier Model Merging (TSM) is a novel method that enhances temporal retrieval while preserving accuracy on non-temporal queries. Extensive experiments on both temporal and non-temporal datasets demonstrate that TSM significantly improves performance on temporally constrained queries.
arXiv Detail & Related papers (2025-07-09T12:16:11Z)
Respecting Temporal-Causal Consistency: Entity-Event Knowledge Graphs for Retrieval-Augmented Generation [69.45495166424642]
We develop a robust and discriminative QA benchmark to measure temporal, causal, and character consistency understanding in narrative documents. We then introduce Entity-Event RAG (E2RAG), a dual-graph framework that keeps separate entity and event subgraphs linked by a bipartite mapping. Across ChronoQA, our approach outperforms state-of-the-art unstructured and KG-based RAG baselines, with notable gains on causal and character consistency queries.
arXiv Detail & Related papers (2025-06-06T10:07:21Z)
RARE: Retrieval-Aware Robustness Evaluation for Retrieval-Augmented Generation Systems [35.47591417637136]
Retrieval-Augmented Generation (RAG) enhances recency and factuality in answers. Existing evaluations rarely test how well these systems cope with real-world noise, conflicts between internal and external retrieved contexts, or fast-changing facts. We introduce Retrieval-Aware Robustness Evaluation (RARE), a unified framework and large-scale benchmark that jointly stress-tests query and document perturbations over dynamic, time-sensitive corpora.
arXiv Detail & Related papers (2025-06-01T02:42:36Z)
MultiConIR: Towards multi-condition Information Retrieval [57.6405602406446]
We introduce MultiConIR, the first benchmark designed to evaluate retrieval models in multi-condition scenarios. We propose three tasks to assess retrieval and reranking models on multi-condition robustness, monotonic relevance ranking, and query format sensitivity.
arXiv Detail & Related papers (2025-03-11T05:02:03Z)
MRAG: A Modular Retrieval Framework for Time-Sensitive Question Answering [3.117448929160824]
Understanding temporal relations and answering time-sensitive questions is a challenging task for question-answering systems powered by large language models (LLMs). We introduce the TempRAGEval benchmark, which repurposes existing datasets by incorporating temporal perturbations and gold evidence labels. On TempRAGEval, MRAG significantly outperforms baseline retrievers in retrieval performance, leading to further improvements in final answer accuracy.
arXiv Detail & Related papers (2024-12-20T03:58:27Z)
Unified Active Retrieval for Retrieval Augmented Generation [69.63003043712696]
In Retrieval-Augmented Generation (RAG), retrieval is not always helpful, and applying it to every instruction is sub-optimal. Existing active retrieval methods face two challenges: 1. They usually rely on a single criterion, which struggles with handling various types of instructions. 2. They depend on specialized and highly differentiated procedures, and thus combining them makes the RAG system more complicated.
arXiv Detail & Related papers (2024-06-18T12:09:02Z)
Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation [103.90033029330527]
Few-Shot Instance Segmentation (FSIS) requires detecting and segmenting novel classes with limited support examples. We introduce a unified framework, Reference Twice (RefT), to exploit the relationship between support and query features for FSIS.
arXiv Detail & Related papers (2023-01-03T15:33:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.