Overview of the TREC 2024 NeuCLIR Track
- URL: http://arxiv.org/abs/2509.14355v1
- Date: Wed, 17 Sep 2025 18:36:38 GMT
- Title: Overview of the TREC 2024 NeuCLIR Track
- Authors: Dawn Lawrie, Sean MacAvaney, James Mayfield, Paul McNamee, Douglas W. Oard, Luca Soldaini, Eugene Yang
- Abstract summary: The principal goal of the TREC Neural Cross-Language Information Retrieval (NeuCLIR) track is to study the effect of neural approaches on cross-language information access. NeuCLIR includes four task types: Cross-Language Information Retrieval (CLIR) from news, Multilingual Information Retrieval (MLIR) from news, Report Generation from news, and CLIR from technical documents.
- Score: 43.84164712459855
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The principal goal of the TREC Neural Cross-Language Information Retrieval (NeuCLIR) track is to study the effect of neural approaches on cross-language information access. The track has created test collections containing Chinese, Persian, and Russian news stories and Chinese academic abstracts. NeuCLIR includes four task types: Cross-Language Information Retrieval (CLIR) from news, Multilingual Information Retrieval (MLIR) from news, Report Generation from news, and CLIR from technical documents. A total of 274 runs were submitted by five participating teams (and as baselines by the track coordinators) for eight tasks across these four task types. Task descriptions and the available results are presented.
Related papers
- Overview of the TREC 2025 RAGTIME Track [48.045049884733196]
RAGTIME includes three task types: Multilingual Report Generation, English Report Generation, and Multilingual Information Retrieval (MLIR). A total of 125 runs were submitted by 13 participating teams (and as baselines by the track coordinators) for three tasks. This overview describes these three tasks and presents the available results.
arXiv Detail & Related papers (2026-02-10T17:47:20Z)
- NeuCLIRTech: Chinese Monolingual and Cross-Language Information Retrieval Evaluation in a Challenging Domain [49.3943974580576]
This paper presents NeuCLIRTech, an evaluation collection for cross-language retrieval over technical information. The collection consists of technical documents written in Chinese and those same documents machine translated into English. The collection supports two retrieval scenarios: monolingual retrieval in Chinese, and cross-language retrieval with English as the query language.
arXiv Detail & Related papers (2026-02-05T05:57:55Z)
- Simple Yet Effective Neural Ranking and Reranking Baselines for Cross-Lingual Information Retrieval [50.882816288076725]
Cross-lingual information retrieval is the task of searching documents in one language with queries in another.
We provide a conceptual framework for organizing different approaches to cross-lingual retrieval using multi-stage architectures for mono-lingual retrieval as a scaffold.
We implement simple yet effective reproducible baselines in the Anserini and Pyserini IR toolkits for test collections from the TREC 2022 NeuCLIR Track, in Persian, Russian, and Chinese.
arXiv Detail & Related papers (2023-04-03T14:17:00Z)
- I4U System Description for NIST SRE'20 CTS Challenge [87.17861348484455]
This manuscript describes the I4U submission to the 2020 NIST Speaker Recognition Evaluation (SRE'20) Conversational Telephone Speech (CTS) Challenge.
The I4U submission resulted from active collaboration among eight research teams.
The submission was based on the fusion of top performing sub-systems and sub-fusion systems contributed by individual teams.
arXiv Detail & Related papers (2022-11-02T13:04:27Z)
- UrduFake@FIRE2020: Shared Track on Fake News Identification in Urdu [62.6928395368204]
This paper gives the overview of the first shared task at FIRE 2020 on fake news detection in the Urdu language.
The goal is to identify fake news using a dataset composed of 900 annotated news articles for training and 400 news articles for testing.
The dataset contains news in five domains: (i) Health, (ii) Sports, (iii) Showbiz, (iv) Technology, and (v) Business.
arXiv Detail & Related papers (2022-07-25T03:46:51Z)
- Overview of the Shared Task on Fake News Detection in Urdu at FIRE 2020 [62.6928395368204]
The task was posed as a binary classification problem in which the goal is to differentiate between real and fake news.
We provided a dataset divided into 900 annotated news articles for training and 400 news articles for testing.
42 teams from 6 different countries (India, China, Egypt, Germany, Pakistan, and the UK) registered for the task.
arXiv Detail & Related papers (2022-07-25T03:41:32Z)
- Overview of the Shared Task on Fake News Detection in Urdu at FIRE 2021 [55.41644538483948]
The goal of the shared task is to motivate the community to come up with efficient methods for solving this vital problem.
The training set contains 1300 annotated news articles (750 real, 550 fake), while the testing set contains 300 news articles (200 real, 100 fake).
The best performing system obtained an F1-macro score of 0.679, which is lower than the past year's best result of 0.907 F1-macro.
arXiv Detail & Related papers (2022-07-11T18:58:36Z)
- HC4: A New Suite of Test Collections for Ad Hoc CLIR [3.816529552690824]
HC4 is a new suite of test collections for ad hoc Cross-Language Information Retrieval.
The HC4 collections contain 60 topics and about half a million documents for each of Chinese and Persian, and 54 topics and five million documents for Russian.
Documents were judged on a three-grade relevance scale.
arXiv Detail & Related papers (2022-01-24T22:52:11Z)
- KINNEWS and KIRNEWS: Benchmarking Cross-Lingual Text Classification for Kinyarwanda and Kirundi [18.01565807026177]
We introduce two news datasets for classification of news articles in Kinyarwanda and Kirundi, two low-resource African languages.
We provide statistics, guidelines for preprocessing, and monolingual and cross-lingual baseline models.
Our experiments show that training embeddings on the relatively higher-resourced Kinyarwanda yields successful cross-lingual transfer to Kirundi.
arXiv Detail & Related papers (2020-10-23T05:37:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.