Bridging the Language Gap: Knowledge Injected Multilingual Question
Answering
- URL: http://arxiv.org/abs/2304.03159v1
- Date: Thu, 6 Apr 2023 15:41:25 GMT
- Title: Bridging the Language Gap: Knowledge Injected Multilingual Question
Answering
- Authors: Zhichao Duan, Xiuxing Li, Zhengyan Zhang, Zhenyu Li, Ning Liu,
Jianyong Wang
- Abstract summary: We propose a generalized cross-lingual transfer framework to enhance the model's ability to understand different languages.
Experimental results on the real-world MLQA dataset demonstrate that the proposed method improves performance by a large margin.
- Score: 19.768708263635176
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Question Answering (QA) is the task of automatically answering questions
posed by humans in natural languages. There are different settings to answer a
question, such as abstractive, extractive, boolean, and multiple-choice QA. As
a popular topic in natural language processing, the extractive question
answering task (extractive QA) has gained extensive attention in the past few
years. As the field has evolved, generalized cross-lingual transfer (G-XLT),
where the question and the answer context are in different languages, poses
unique challenges beyond cross-lingual transfer (XLT), where the question and
the answer context are in the same language. Spurred by the development of
related benchmarks, much work has been done to improve performance on QA tasks
across languages. However, only a few works are dedicated to the G-XLT task. In
this work, we propose a generalized
cross-lingual transfer framework to enhance the model's ability to understand
different languages. Specifically, we first assemble triples from different
languages to form multilingual knowledge. Since the lack of knowledge between
different languages greatly limits models' reasoning ability, we further design
a knowledge injection strategy via leveraging link prediction techniques to
enrich the model storage of multilingual knowledge. In this way, we can
profoundly exploit rich semantic knowledge. Experiment results on real-world
datasets MLQA demonstrate that the proposed method can improve the performance
by a large margin, outperforming the baseline method by 13.18%/12.00% F1/EM on
average.
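The abstract names link prediction over assembled triples as the knowledge-injection mechanism but does not specify the scoring model. The sketch below uses a TransE-style translational score with a margin ranking update, a common link-prediction choice assumed here purely for illustration; the triples, entity names, and hyperparameters are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy triples expressing the same fact in two languages. They are kept as
# distinct symbols here; the paper assembles such triples from different
# languages into a shared pool of multilingual knowledge.
triples = [
    ("Berlin", "capital_of", "Germany"),   # English triple
    ("Berlín", "capital_de", "Alemania"),  # Spanish triple (same fact)
]
entities = sorted({x for h, _, t in triples for x in (h, t)})
relations = sorted({r for _, r, _ in triples})
e_idx = {e: i for i, e in enumerate(entities)}
r_idx = {r: i for i, r in enumerate(relations)}

dim = 16
E = rng.normal(scale=0.1, size=(len(entities), dim))   # entity embeddings
R = rng.normal(scale=0.1, size=(len(relations), dim))  # relation embeddings

def score(h, r, t):
    """TransE-style plausibility: smaller ||h + r - t|| means more plausible."""
    return np.linalg.norm(E[e_idx[h]] + R[r_idx[r]] - E[e_idx[t]])

def train_step(h, r, t, t_neg, lr=0.01, margin=1.0):
    """One SGD step of a margin ranking loss against a corrupted tail.

    Only the positive-triple side is updated, for brevity; a full
    implementation would also push the corrupted tail away.
    """
    pos, neg = score(h, r, t), score(h, r, t_neg)
    if pos + margin > neg:  # margin violated -> pull h + r and t together
        v = E[e_idx[h]] + R[r_idx[r]] - E[e_idx[t]]
        v_hat = v / (np.linalg.norm(v) + 1e-9)
        E[e_idx[t]] += lr * v_hat
        E[e_idx[h]] -= lr * v_hat
```

Repeated `train_step` calls on a true triple drive its score down, so the embeddings come to encode the injected facts, which the QA model can then draw on.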
Related papers
- Can a Multichoice Dataset be Repurposed for Extractive Question Answering? [52.28197971066953]
We repurposed the Belebele dataset (Bandarkar et al., 2023), which was designed for multiple-choice question answering (MCQA).
We present annotation guidelines and a parallel EQA dataset for English and Modern Standard Arabic (MSA).
Our aim is to enable others to adapt our approach for the 120+ other language variants in Belebele, many of which are deemed under-resourced.
arXiv Detail & Related papers (2024-04-26T11:46:05Z) - Applying Multilingual Models to Question Answering (QA) [0.0]
We study the performance of monolingual and multilingual language models on the task of question-answering (QA) on three diverse languages: English, Finnish and Japanese.
We develop models for the tasks of (1) determining if a question is answerable given the context and (2) identifying the answer texts within the context using IOB tagging.
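The IOB scheme mentioned above labels each context token as Beginning, Inside, or Outside the answer span. A minimal sketch of the labeling step (the helper name and tokenization are illustrative, not the paper's code):

```python
def iob_tags(context_tokens, answer_tokens):
    """Tag each context token B/I/O for the first occurrence of the answer span."""
    tags = ["O"] * len(context_tokens)
    n = len(answer_tokens)
    for i in range(len(context_tokens) - n + 1):
        if context_tokens[i:i + n] == answer_tokens:
            tags[i] = "B"                      # first token of the answer
            for j in range(i + 1, i + n):
                tags[j] = "I"                  # remaining answer tokens
            break
    return tags
```

For example, tagging `"The capital of Finland is Helsinki ."` with the answer `["Helsinki"]` marks only the sixth token `B` and everything else `O`; a token-classification model is then trained to predict these tags.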
arXiv Detail & Related papers (2022-12-04T21:58:33Z) - Learning to Answer Multilingual and Code-Mixed Questions [4.290420179006601]
Question-answering (QA) that comes naturally to humans is a critical component in seamless human-computer interaction.
Despite being one of the oldest research areas, current QA systems face the critical challenge of handling multilingual queries.
This dissertation focuses on advancing QA techniques for handling end-user queries in multilingual environments.
arXiv Detail & Related papers (2022-11-14T16:49:58Z) - Delving Deeper into Cross-lingual Visual Question Answering [115.16614806717341]
We show that simple modifications to the standard training setup can substantially reduce the transfer gap to monolingual English performance.
We analyze cross-lingual VQA across different question types of varying complexity for different multilingual multimodal Transformers.
arXiv Detail & Related papers (2022-02-15T18:22:18Z) - Cross-Lingual GenQA: A Language-Agnostic Generative Question Answering
Approach for Open-Domain Question Answering [76.99585451345702]
Open-Retrieval Generative Question Answering (GenQA) has proven to deliver high-quality, natural-sounding answers in English.
We present the first generalization of the GenQA approach for the multilingual environment.
arXiv Detail & Related papers (2021-10-14T04:36:29Z) - X-METRA-ADA: Cross-lingual Meta-Transfer Learning Adaptation to Natural
Language Understanding and Question Answering [55.57776147848929]
We propose X-METRA-ADA, a cross-lingual MEta-TRAnsfer learning ADAptation approach for Natural Language Understanding (NLU).
Our approach adapts MAML, an optimization-based meta-learning approach, to learn to adapt to new languages.
We show that our approach outperforms naive fine-tuning, reaching competitive performance on both tasks for most languages.
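X-METRA-ADA's exact training loop is not given here; a first-order MAML sketch on a toy scalar regression task illustrates the learn-to-adapt idea it builds on. Every task, value, and function name below is illustrative.

```python
import numpy as np

def loss_grad(w, x, y):
    # Gradient of the mean-squared-error loss for the scalar model y_hat = w * x.
    return np.mean(2 * (w * x - y) * x)

def fomaml(tasks, w=0.0, inner_lr=0.05, outer_lr=0.05, steps=200):
    """First-order MAML: learn a w that adapts well after one inner step.

    Each task is an (x, y) pair for a target function y = w_task * x;
    in the cross-lingual setting, a "task" would be a language instead.
    """
    for _ in range(steps):
        meta_grad = 0.0
        for x, y in tasks:
            w_adapted = w - inner_lr * loss_grad(w, x, y)   # inner adaptation
            meta_grad += loss_grad(w_adapted, x, y)         # first-order outer grad
        w -= outer_lr * meta_grad / len(tasks)              # meta-update
    return w
```

With two symmetric tasks, say `y = 2x` and `y = 4x` on the same inputs, the meta-parameter converges to the midpoint `w = 3`: the point from which one gradient step adapts equally well to either task.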
arXiv Detail & Related papers (2021-04-20T00:13:35Z) - Multilingual Answer Sentence Reranking via Automatically Translated Data [97.98885151955467]
We present a study on the design of multilingual Answer Sentence Selection (AS2) models, which are a core component of modern Question Answering (QA) systems.
The main idea is to transfer data, created in one resource-rich language, e.g., English, to other, less resource-rich languages.
arXiv Detail & Related papers (2021-02-20T03:52:08Z) - Multilingual Transfer Learning for QA Using Translation as Data
Augmentation [13.434957024596898]
We explore strategies that improve cross-lingual transfer by bringing the multilingual embeddings closer in the semantic space.
We propose two novel strategies, language adversarial training and language arbitration framework, which significantly improve the (zero-resource) cross-lingual transfer performance.
Empirically, we show that the proposed models outperform the previous zero-shot baseline on the recently introduced multilingual MLQA and TyDiQA datasets.
arXiv Detail & Related papers (2020-12-10T20:29:34Z) - XOR QA: Cross-lingual Open-Retrieval Question Answering [75.20578121267411]
This work extends open-retrieval question answering to a cross-lingual setting.
We construct a large-scale dataset built on questions lacking same-language answers.
arXiv Detail & Related papers (2020-10-22T16:47:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.