Multilingual Non-Factoid Question Answering with Answer Paragraph Selection
- URL: http://arxiv.org/abs/2408.10604v2
- Date: Wed, 19 Feb 2025 17:25:39 GMT
- Title: Multilingual Non-Factoid Question Answering with Answer Paragraph Selection
- Authors: Ritwik Mishra, Sreeram Vennam, Rajiv Ratn Shah, Ponnurangam Kumaraguru
- Abstract summary: This work presents MuNfQuAD, a multilingual QuAD with non-factoid questions.
The dataset comprises over 578K QA pairs across 38 languages.
- Abstract: Most existing Question Answering Datasets (QuADs) focus on factoid-based, short-context Question Answering (QA) in high-resource languages. Coverage of low-resource languages remains limited, with only a few works centered on factoid-based QuADs and none on non-factoid QuADs. This work therefore presents MuNfQuAD, a multilingual QuAD with non-factoid questions. It uses interrogative sub-headings from BBC news articles as questions and the corresponding paragraphs as silver answers. The dataset comprises over 578K QA pairs across 38 languages, encompassing several low-resource languages, and stands as the largest multilingual QA dataset to date. Based on manual annotations of 790 QA pairs from MuNfQuAD (the golden set), we observe that 98% of the questions can be answered using their corresponding silver answers. Our fine-tuned Answer Paragraph Selection (APS) model outperforms the baselines, attaining an accuracy of 80% and 72% and a macro F1 of 72% and 66% on the MuNfQuAD test set and the golden set, respectively. Furthermore, the APS model generalizes effectively to certain languages within the golden set, even after being fine-tuned on silver labels. We also observe that the fine-tuned APS model helps reduce the context associated with a question. These findings suggest that this resource will be a valuable contribution to the QA research community.
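The abstract describes the APS model only at the level of its task (deciding whether a paragraph answers a question) and its scores. One common way to realize such a model is as a multilingual cross-encoder fine-tuned for sequence-pair classification; the sketch below illustrates that setup. The backbone choice (xlm-roberta-base), the two-label head, the 512-token limit, and the helper function are illustrative assumptions rather than the paper's exact configuration, and the classification head only produces meaningful scores after fine-tuning on question-paragraph pairs such as MuNfQuAD's silver labels.

```python
# Minimal sketch of an answer-paragraph-selection (APS) scorer: a multilingual
# cross-encoder that classifies whether a candidate paragraph answers a question.
# The model name and labeling scheme are assumptions, not the authors' exact setup;
# the classifier head is randomly initialized until fine-tuned on APS data.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "xlm-roberta-base"  # assumed multilingual backbone
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()

def score_paragraphs(question: str, paragraphs: list[str]) -> list[float]:
    """Return the probability that each candidate paragraph answers the question."""
    batch = tokenizer(
        [question] * len(paragraphs),  # pair the question with every candidate
        paragraphs,
        truncation=True,
        padding=True,
        max_length=512,
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**batch).logits            # shape: (num_paragraphs, 2)
    return torch.softmax(logits, dim=-1)[:, 1].tolist()  # P(label == "answers")

# Example: keep only the best-scoring paragraph, shrinking the context for a question.
question = "Why did the central bank raise interest rates?"
candidates = [
    "The bank cited persistent inflation as the main reason for the hike.",
    "The city also announced a new public transport plan this week.",
]
scores = score_paragraphs(question, candidates)
best_paragraph = candidates[scores.index(max(scores))]
```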
Related papers
- AmaSQuAD: A Benchmark for Amharic Extractive Question Answering
This research presents a novel framework for translating extractive question-answering datasets into low-resource languages.
The methodology addresses challenges related to misalignment between translated questions and answers.
We fine-tune the XLM-R model on the AmaSQuAD synthetic dataset for Amharic Question-Answering.
arXiv Detail & Related papers (2025-02-04T06:27:39Z)
- UQA: Corpus for Urdu Question Answering
This paper introduces UQA, a novel dataset for question answering and text comprehension in Urdu.
UQA is generated by translating the Stanford Question Answering Dataset (SQuAD2.0), a large-scale English QA dataset.
The paper describes the process of selecting and evaluating the best translation model among two candidates: Google Translator and Seamless M4T.
arXiv Detail & Related papers (2024-05-02T16:44:31Z)
- From Multiple-Choice to Extractive QA: A Case Study for English and Arabic
We explore the feasibility of repurposing an existing multilingual dataset for a new NLP task.
We present annotation guidelines and a parallel EQA dataset for English and Modern Standard Arabic.
We aim to help others adapt our approach for the remaining 120 BELEBELE language variants, many of which are deemed under-resourced.
arXiv Detail & Related papers (2024-04-26T11:46:05Z)
- SEMQA: Semi-Extractive Multi-Source Question Answering
We introduce a new QA task for answering multi-answer questions by summarizing multiple diverse sources in a semi-extractive fashion.
We create the first dataset of this kind, QuoteSum, with human-written semi-extractive answers to natural and generated questions.
arXiv Detail & Related papers (2023-11-08T18:46:32Z)
- Evaluating and Modeling Attribution for Cross-Lingual Question Answering
This work is the first to study attribution for cross-lingual question answering.
We collect data in 5 languages to assess the attribution level of a state-of-the-art cross-lingual QA system.
We find that a substantial portion of the answers is not attributable to any retrieved passages.
arXiv Detail & Related papers (2023-05-23T17:57:46Z)
- AmQA: Amharic Question Answering Dataset
Question Answering (QA) returns concise answers or answer lists from natural language text given a context document.
There is no published or publicly available Amharic QA dataset.
We crowdsourced 2628 question-answer pairs over 378 Wikipedia articles.
arXiv Detail & Related papers (2023-03-06T17:06:50Z)
- RoMQA: A Benchmark for Robust, Multi-evidence, Multi-answer Question Answering
We introduce RoMQA, the first benchmark for robust, multi-evidence, multi-answer question answering (QA).
We evaluate state-of-the-art large language models in zero-shot, few-shot, and fine-tuning settings and find that RoMQA remains challenging for them, providing a quantifiable test for building more robust QA methods.
arXiv Detail & Related papers (2022-10-25T21:39:36Z)
- JaQuAD: Japanese Question Answering Dataset for Machine Reading Comprehension
We present the Japanese Question Answering dataset, JaQuAD, which is annotated by humans.
JaQuAD consists of 39,696 extractive question-answer pairs on Japanese Wikipedia articles.
We fine-tuned a baseline model that achieves an F1 score of 78.92% and an exact match (EM) of 63.38% on the test set.
arXiv Detail & Related papers (2022-02-03T18:40:25Z)
- Cross-Lingual GenQA: A Language-Agnostic Generative Question Answering Approach for Open-Domain Question Answering
Open-Retrieval Generative Question Answering (GenQA) has been shown to deliver high-quality, natural-sounding answers in English.
We present the first generalization of the GenQA approach for the multilingual environment.
arXiv Detail & Related papers (2021-10-14T04:36:29Z)