Related papers: LongEval at CLEF 2025: Longitudinal Evaluation of IR Systems on Web and Scientific Data

URL: http://arxiv.org/abs/2509.17469v1
Date: Mon, 22 Sep 2025 08:05:40 GMT
Title: LongEval at CLEF 2025: Longitudinal Evaluation of IR Systems on Web and Scientific Data
Authors: Matteo Cancellieri, Alaa El-Ebshihy, Tobias Fink, Maik Fröbe, Petra Galuščáková, Gabriela Gonzalez-Saez, Lorraine Goeuriot, David Iommi, Jüri Keller, Petr Knoth, Philippe Mulhem, Florina Piroi, David Pride, Philipp Schaer
Abstract summary: The LongEval lab focuses on the evaluation of information retrieval systems over time. Two datasets are provided that capture evolving search scenarios with changing documents, queries, and relevance assessments. We present an overview of this year's tasks and datasets, as well as the participating systems.
Score: 10.309769289748273
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The LongEval lab focuses on the evaluation of information retrieval systems over time. Two datasets are provided that capture evolving search scenarios with changing documents, queries, and relevance assessments. Systems are assessed from a temporal perspective, that is, by evaluating retrieval effectiveness as the data they operate on changes. In its third edition, LongEval featured two retrieval tasks: one in the area of ad-hoc web retrieval, and another focusing on scientific article retrieval. We present an overview of this year's tasks and datasets, as well as the participating systems. A total of 19 teams submitted their approaches, which we evaluated using nDCG and a variety of measures that quantify changes in retrieval effectiveness over time.
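As a rough illustration of the evaluation described in the abstract, the sketch below computes nDCG for one ranked list and then a relative drop between two points in time. The function names, the graded gains, and in particular the relative-drop formula are assumptions made for illustration; the abstract does not specify which change measures the lab actually uses.

```python
import math


def dcg(gains, k):
    """Discounted cumulative gain over the top-k gains (log2 rank discount)."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg(ranked_gains, judged_gains, k=10):
    """nDCG@k: DCG of the system ranking divided by the ideal DCG.

    ranked_gains: relevance grades of the retrieved documents, in rank order.
    judged_gains: relevance grades of all judged documents for the query,
                  used to build the ideal ranking.
    """
    ideal = dcg(sorted(judged_gains, reverse=True), k)
    return dcg(ranked_gains, k) / ideal if ideal > 0 else 0.0


def relative_drop(score_earlier, score_later):
    """Hypothetical change measure: fraction of effectiveness lost over time."""
    return (score_earlier - score_later) / score_earlier


# Made-up relevance grades for the same query at two snapshots.
judged = [3, 2, 1, 0]
ndcg_earlier = ndcg([3, 2, 1, 0], judged, k=4)
ndcg_later = ndcg([0, 1, 2, 3], judged, k=4)
print(ndcg_earlier, ndcg_later, relative_drop(ndcg_earlier, ndcg_later))
```

A positive relative drop indicates that the later snapshot is ranked less effectively than the earlier one; a value of zero means effectiveness was preserved.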

Related papers

Towards Anytime Retrieval: A Benchmark for Anytime Person Re-Identification [85.78039373517021]
Anytime Person Re-identification (AT-ReID) aims to achieve effective retrieval in multiple scenarios based on variations in time. We collect the first large-scale dataset, AT-USTC, which contains 403k images of individuals wearing multiple clothes. We propose a unified model named Uni-AT, which comprises a multi-scenario ReID framework for scenario-specific features learning.
arXiv Detail & Related papers (2025-09-20T11:20:22Z)
DS@GT at LongEval: Evaluating Temporal Performance in Web Search Systems and Topics with Two-Stage Retrieval [44.99833362998488]
The DS@GT competition team participated in the Longitudinal Evaluation of Model Performance (LongEval) lab at CLEF 2025. Our analysis of the Qwant web dataset includes exploratory data analysis with topic modeling over time. Our best system achieves an average NDCG@10 of 0.296 across the entire training and test dataset, with an overall best score of 0.395 on 2023-05.
arXiv Detail & Related papers (2025-07-11T07:23:08Z)
LongEval at CLEF 2025: Longitudinal Evaluation of IR Model Performance [5.4043491660907135]
LongEval Lab continues to explore the challenges of temporal persistence in Information Retrieval (IR). By evaluating how model performance degrades as test data diverge temporally from training data, LongEval seeks to advance the understanding of temporal dynamics in IR systems. The 2025 edition aims to engage the IR and NLP communities in addressing the development of adaptive models that can maintain retrieval quality over time in the domains of web search and scientific retrieval.
arXiv Detail & Related papers (2025-03-11T15:29:41Z)
Replicability Measures for Longitudinal Information Retrieval Evaluation [3.4917392789760147]
This work explores how the effectiveness measured in evolving experiments can be assessed. The persistency of effectiveness is investigated as a replicability task. It was found that the most effective systems are not necessarily the ones with the most persistent performance.
arXiv Detail & Related papers (2024-09-09T08:19:43Z)
Evaluation of Temporal Change in IR Test Collections [3.4917392789760147]
This work investigates how the temporal generalizability of effectiveness evaluations can be assessed. We show that the proposed measures can be well adapted to describe the changes in the retrieval results.
arXiv Detail & Related papers (2024-07-01T15:25:31Z)
A Comprehensive Survey on Underwater Image Enhancement Based on Deep Learning [51.7818820745221]
Underwater image enhancement (UIE) presents a significant challenge within computer vision research. Despite the development of numerous UIE algorithms, a thorough and systematic review is still absent.
arXiv Detail & Related papers (2024-05-30T04:46:40Z)
Exploring the Practicality of Generative Retrieval on Dynamic Corpora [41.223804434693875]
In this paper, we focus on Generative Retrievals (GR), which apply autoregressive language models to IR problems. Our results on the StreamingQA benchmark demonstrate that GR is more adaptable to evolving knowledge (4-11%), robust in learning knowledge with temporal information, and efficient in terms of FLOPs (x6), indexing time (x6), and storage footprint (x4). Our paper highlights the potential of GR for future use in practical IR systems within dynamic environments.
arXiv Detail & Related papers (2023-05-27T16:05:00Z)
Scaling up Search Engine Audits: Practical Insights for Algorithm Auditing [68.8204255655161]
We set up experiments for eight search engines with hundreds of virtual agents placed in different regions. We demonstrate the successful performance of our research infrastructure across multiple data collections. We conclude that virtual agents are a promising venue for monitoring the performance of algorithms across long periods of time.
arXiv Detail & Related papers (2021-06-10T15:49:58Z)
Overview of the TREC 2019 Fair Ranking Track [65.15263872493799]
The goal of the TREC Fair Ranking track was to develop a benchmark for evaluating retrieval systems in terms of fairness to different content providers. This paper presents an overview of the track, including the task definition, descriptions of the data and the annotation process.
arXiv Detail & Related papers (2020-03-25T21:34:58Z)
Deep Learning for Person Re-identification: A Survey and Outlook [233.36948173686602]
Person re-identification (Re-ID) aims at retrieving a person of interest across multiple non-overlapping cameras. By dissecting the involved components in developing a person Re-ID system, we categorize it into the closed-world and open-world settings.
arXiv Detail & Related papers (2020-01-13T12:49:22Z)

This list is automatically generated from the titles and abstracts of the papers on this site.