XLMRQA: Open-Domain Question Answering on Vietnamese Wikipedia-based
Textual Knowledge Source
- URL: http://arxiv.org/abs/2204.07002v1
- Date: Thu, 14 Apr 2022 14:54:33 GMT
- Title: XLMRQA: Open-Domain Question Answering on Vietnamese Wikipedia-based
Textual Knowledge Source
- Authors: Kiet Van Nguyen and Phong Nguyen-Thuan Do and Nhat Duy Nguyen and Tin
Van Huynh and Anh Gia-Tuan Nguyen and Ngan Luu-Thuy Nguyen
- Abstract summary: This paper presents XLMRQA, the first Vietnamese QA system using a supervised transformer-based reader on the Wikipedia-based textual knowledge source.
From the results obtained on the three systems, we analyze the influence of question types on the performance of the QA systems.
- Score: 2.348805691644086
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Question answering (QA) is a natural language understanding task within the
fields of information retrieval and information extraction that has attracted
much attention from the computational linguistics and artificial intelligence
research community in recent years because of the strong development of machine
reading comprehension-based models. A reader-based QA system is a high-level
search engine that can find correct answers to queries or questions in
open-domain or domain-specific texts using machine reading comprehension (MRC)
techniques. Most advances in data resources and machine-learning
approaches for MRC and QA systems, however, have come in
resource-rich languages such as English and Chinese. A low-resource language
like Vietnamese has seen little research on QA systems. This paper
presents XLMRQA, the first Vietnamese QA system using a supervised
transformer-based reader on the Wikipedia-based textual knowledge source (using
the UIT-ViQuAD corpus), outperforming two robust QA systems based on deep
neural network models, DrQA and BERTserini, by 24.46% and 6.28%, respectively.
From the results obtained on the three systems, we analyze the influence of
question types on the performance of the QA systems.
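The reader-based QA design described in the abstract (find relevant text in a knowledge source, then read an answer out of it) can be illustrated with a minimal sketch. This is not the XLMRQA implementation: the function names `retrieve`, `read`, and `answer` are hypothetical, the retriever is a toy IDF-weighted term-overlap ranker, and the reader is a sentence-overlap heuristic standing in for the supervised transformer-based reader the paper uses. The example passages are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into word tokens (\\w is Unicode-aware in Python 3)."""
    return re.findall(r"\w+", text.lower())


def retrieve(question, passages, top_k=1):
    """Toy retriever (assumption): rank passages by IDF-weighted term overlap."""
    doc_tokens = [set(tokenize(p)) for p in passages]
    n_docs = len(passages)
    # Document frequency of each term across the passage collection.
    df = Counter(term for tokens in doc_tokens for term in tokens)
    q_terms = set(tokenize(question))

    def score(tokens):
        # Rarer shared terms contribute more to the score.
        return sum(math.log(1 + n_docs / df[t]) for t in q_terms & tokens)

    ranked = sorted(range(n_docs), key=lambda i: score(doc_tokens[i]), reverse=True)
    return [passages[i] for i in ranked[:top_k]]


def read(question, passage):
    """Toy reader (assumption): return the sentence sharing the most terms
    with the question. A trained MRC model would predict an answer span instead."""
    q_terms = set(tokenize(question))
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", passage) if s.strip()]
    return max(sentences, key=lambda s: len(q_terms & set(tokenize(s))))


def answer(question, passages):
    """Retrieve the best-matching passage, then read an answer from it."""
    best = retrieve(question, passages, top_k=1)[0]
    return read(question, best)


# Made-up passages for illustration only.
passages = [
    "Hanoi is the capital of Vietnam. It lies on the Red River.",
    "Ho Chi Minh City is the largest city in Vietnam. It was formerly called Saigon.",
]
print(answer("What is the capital of Vietnam?", passages))
```

With these two passages, the retriever prefers the first one because it shares the rarer terms "capital" and "of" with the question, and the toy reader then returns its first sentence. Swapping `read` for a fine-tuned MRC model is where a system of the kind the abstract describes would differ most from this sketch.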
Related papers
- EWEK-QA: Enhanced Web and Efficient Knowledge Graph Retrieval for Citation-based Question Answering Systems [103.91826112815384]
Citation-based QA systems suffer from two shortcomings.
They usually rely only on the web as a source of extracted knowledge, and adding other external knowledge sources can hamper the efficiency of the system.
We propose our enhanced web and efficient knowledge graph (KG) retrieval solution (EWEK-QA) to enrich the content of the extracted knowledge fed to the system.
arXiv Detail & Related papers (2024-06-14T19:40:38Z)
- Building Efficient and Effective OpenQA Systems for Low-Resource Languages [17.64851283209797]
We show that effective, low-cost OpenQA systems can be developed for low-resource contexts.
Key ingredients are weak supervision using machine-translated labeled datasets and a relevant unstructured knowledge source.
We present SQuAD-TR, a machine translation of SQuAD2.0, and we build our OpenQA system by adapting ColBERT-QA and retraining it over Turkish resources.
arXiv Detail & Related papers (2024-01-07T22:11:36Z)
- Evaluating and Modeling Attribution for Cross-Lingual Question Answering [80.4807682093432]
This work is the first to study attribution for cross-lingual question answering.
We collect data in 5 languages to assess the attribution level of a state-of-the-art cross-lingual QA system.
We find that a substantial portion of the answers is not attributable to any retrieved passages.
arXiv Detail & Related papers (2023-05-23T17:57:46Z)
- PIE-QG: Paraphrased Information Extraction for Unsupervised Question Generation from Small Corpora [4.721845865189576]
PIE-QG uses Open Information Extraction (OpenIE) to generate synthetic training questions from paraphrased passages.
Triples in the form of <subject, predicate, object> are extracted from each passage, and questions are formed with subjects (or objects) and predicates while objects (or subjects) are considered as answers.
arXiv Detail & Related papers (2023-01-03T12:20:51Z)
- Utilizing Background Knowledge for Robust Reasoning over Traffic Situations [63.45021731775964]
We focus on a complementary research aspect of Intelligent Transportation: traffic understanding.
We scope our study to text-based methods and datasets given the abundant commonsense knowledge.
We adopt three knowledge-driven approaches for zero-shot QA over traffic situations.
arXiv Detail & Related papers (2022-12-04T09:17:24Z)
- A Comparative Study of Question Answering over Knowledge Bases [2.6135123648293717]
Question answering over knowledge bases (KBQA) has become a popular approach to help users extract information from knowledge bases.
We provide a comparative study of six representative KBQA systems on eight benchmark datasets.
We propose an advanced mapping algorithm to aid existing models in achieving superior results.
arXiv Detail & Related papers (2022-11-15T14:23:47Z)
- Learning to Answer Multilingual and Code-Mixed Questions [4.290420179006601]
Question-answering (QA) that comes naturally to humans is a critical component in seamless human-computer interaction.
Despite being one of the oldest research areas, the current QA system faces the critical challenge of handling multilingual queries.
This dissertation focuses on advancing QA techniques for handling end-user queries in multilingual environments.
arXiv Detail & Related papers (2022-11-14T16:49:58Z)
- Asking for Knowledge: Training RL Agents to Query External Knowledge Using Language [121.56329458876655]
We introduce two new environments: the grid-world-based Q-BabyAI and the text-based Q-TextWorld.
We propose the "Asking for Knowledge" (AFK) agent, which learns to generate language commands to query for meaningful knowledge.
arXiv Detail & Related papers (2022-05-12T14:20:31Z)
- Improving Unsupervised Question Answering via Summarization-Informed Question Generation [47.96911338198302]
Question Generation (QG) is the task of generating a plausible question for a <passage, answer> pair.
We make use of freely available news summary data, transforming declarative sentences into appropriate questions using dependency parsing, named entity recognition and semantic role labeling.
The resulting questions are then combined with the original news articles to train an end-to-end neural QG model.
arXiv Detail & Related papers (2021-09-16T13:08:43Z)
- Retrieving and Reading: A Comprehensive Survey on Open-domain Question Answering [62.88322725956294]
We review the latest research trends in OpenQA, with particular attention to systems that incorporate neural MRC techniques.
We introduce a modern OpenQA architecture named "Retriever-Reader" and analyze the various systems that follow this architecture.
We then discuss key challenges to developing OpenQA systems and offer an analysis of benchmarks that are commonly used.
arXiv Detail & Related papers (2021-01-04T04:47:46Z)