MIA 2022 Shared Task: Evaluating Cross-lingual Open-Retrieval Question
Answering for 16 Diverse Languages
- URL: http://arxiv.org/abs/2207.00758v1
- Date: Sat, 2 Jul 2022 06:54:10 GMT
- Authors: Akari Asai, Shayne Longpre, Jungo Kasai, Chia-Hsuan Lee, Rui Zhang,
Junjie Hu, Ikuya Yamada, Jonathan H. Clark, Eunsol Choi
- Abstract summary: We evaluate cross-lingual open-retrieval question answering systems in 16 typologically diverse languages.
The best system leveraging iteratively mined diverse negative examples achieves 32.2 F1, outperforming our baseline by 4.5 points.
The second best system uses entity-aware contextualized representations for document retrieval, and achieves significant improvements in Tamil (20.8 F1), whereas most of the other systems yield nearly zero scores.
- Score: 54.002969723086075
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present the results of the Workshop on Multilingual Information Access
(MIA) 2022 Shared Task, evaluating cross-lingual open-retrieval question
answering (QA) systems in 16 typologically diverse languages. In this task, we
adapted two large-scale cross-lingual open-retrieval QA datasets in 14
typologically diverse languages, and newly annotated open-retrieval QA data in
2 underrepresented languages: Tagalog and Tamil. Four teams submitted their
systems. The best system leveraging iteratively mined diverse negative examples
and larger pretrained models achieves 32.2 F1, outperforming our baseline by
4.5 points. The second best system uses entity-aware contextualized
representations for document retrieval, and achieves significant improvements
in Tamil (20.8 F1), whereas most of the other systems yield nearly zero scores.
Related papers
- Datasets for Multilingual Answer Sentence Selection [59.28492975191415]
We introduce new high-quality datasets for AS2 in five European languages (French, German, Italian, Portuguese, and Spanish).
Results indicate that our datasets are pivotal in producing robust and powerful multilingual AS2 models.
arXiv Detail & Related papers (2024-06-14T16:50:29Z)
- MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering [58.92057773071854]
We introduce MTVQA, the first benchmark featuring high-quality human expert annotations across 9 diverse languages.
arXiv Detail & Related papers (2024-05-20T12:35:01Z)
- CUNI Submission to MRL 2023 Shared Task on Multi-lingual Multi-task
Information Retrieval [5.97515243922116]
We present the Charles University system for the MRL2023 Shared Task on Multi-lingual Multi-task Information Retrieval.
The goal of the shared task was to develop systems for named entity recognition and question answering in several under-represented languages.
Our solutions to both subtasks rely on the translate-test approach.
arXiv Detail & Related papers (2023-10-25T10:22:49Z)
- The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants [80.4837840962273]
We present Belebele, a dataset spanning 122 language variants.
This dataset enables the evaluation of text models in high-, medium-, and low-resource languages.
arXiv Detail & Related papers (2023-08-31T17:43:08Z)
- Lila: A Unified Benchmark for Mathematical Reasoning [59.97570380432861]
LILA is a unified mathematical reasoning benchmark consisting of 23 diverse tasks along four dimensions.
We construct our benchmark by extending 20 datasets with task instructions and solutions in the form of Python programs.
We introduce BHASKARA, a general-purpose mathematical reasoning model trained on LILA.
arXiv Detail & Related papers (2022-10-31T17:41:26Z)
- Polyglot Prompt: Multilingual Multitask PrompTraining [35.70124413465395]
This paper aims for a potential architectural breakthrough for multilingual learning and asks: could different tasks from different languages be modeled in a monolithic framework (without any task/language-specific module)?
We approach this goal by developing a learning framework Polyglot Prompt, where prompting methods are introduced to learn a unified semantic space for different languages and tasks after proper multilingual prompt engineering.
arXiv Detail & Related papers (2022-04-29T17:40:50Z)
- Facebook AI WMT21 News Translation Task Submission [23.69817809546458]
We describe Facebook's multilingual model submission to the WMT2021 shared task on news translation.
We participate in 14 language directions: English to and from Czech, German, Hausa, Icelandic, Japanese, Russian, and Chinese.
We utilize data from all available sources to create high quality bilingual and multilingual baselines.
arXiv Detail & Related papers (2021-08-06T18:26:38Z)
- Multilingual Answer Sentence Reranking via Automatically Translated Data [97.98885151955467]
We present a study on the design of multilingual Answer Sentence Selection (AS2) models, which are a core component of modern Question Answering (QA) systems.
The main idea is to transfer data created in one resource-rich language, e.g., English, to other languages with fewer resources.
arXiv Detail & Related papers (2021-02-20T03:52:08Z)
- MKQA: A Linguistically Diverse Benchmark for Multilingual Open Domain
Question Answering [6.452012363895865]
This dataset supplies the widest range of languages to date for evaluating question answering.
We benchmark a variety of state-of-the-art methods and baselines for generative and extractive question answering.
Results indicate this dataset is challenging even in English, but especially in low-resource languages.
arXiv Detail & Related papers (2020-07-30T03:33:46Z)
- A Bayesian Multilingual Document Model for Zero-shot Topic Identification and Discovery [1.9215779751499527]
The model is an extension of BaySMM [Kesiraju et al 2020] to the multilingual scenario.
We propagate the learned uncertainties through linear classifiers that benefit zero-shot cross-lingual topic identification.
We revisit cross-lingual topic identification in zero-shot settings by taking a deeper dive into current datasets.
arXiv Detail & Related papers (2020-07-02T19:55:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.