VLSP 2021 Shared Task: Vietnamese Machine Reading Comprehension
- URL: http://arxiv.org/abs/2203.11400v1
- Date: Tue, 22 Mar 2022 00:44:41 GMT
- Title: VLSP 2021 Shared Task: Vietnamese Machine Reading Comprehension
- Authors: Kiet Van Nguyen, Son Quoc Tran, Luan Thanh Nguyen, Tin Van Huynh, Son
T. Luu, Ngan Luu-Thuy Nguyen
- Abstract summary: This article presents details of the organization of the shared task, an overview of the methods employed by shared-task participants, and the results.
We provide a benchmark dataset named UIT-ViQuAD 2.0 for evaluating the MRC task and question answering systems for the Vietnamese language.
The UIT-ViQuAD 2.0 dataset motivates more researchers to explore Vietnamese machine reading comprehension, question answering, and question generation.
- Score: 2.348805691644086
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: One of the emerging research trends in natural language understanding is
machine reading comprehension (MRC) which is the task to find answers to human
questions based on textual data. Existing Vietnamese datasets for MRC research
concentrate solely on answerable questions. However, in reality, questions can
be unanswerable for which the correct answer is not stated in the given textual
data. To address the weakness, we provide the research community with a
benchmark dataset named UIT-ViQuAD 2.0 for evaluating the MRC task and question
answering systems for the Vietnamese language. We use UIT-ViQuAD 2.0 as a
benchmark dataset for the shared task on Vietnamese MRC at the Eighth Workshop
on Vietnamese Language and Speech Processing (VLSP 2021). This task attracted
77 participant teams from 34 universities and other organizations. In this
article, we present details of the organization of the shared task, an overview
of the methods employed by shared-task participants, and the results. The
highest performances are 77.24% EM and 67.43% F1-score on the private test set.
The Vietnamese MRC systems proposed by the top 3 teams use XLM-RoBERTa, a
powerful pre-trained language model using the transformer architecture. The
UIT-ViQuAD 2.0 dataset motivates more researchers to explore Vietnamese machine
reading comprehension, question answering, and question generation.
Related papers
- 3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset [90.95948101052073]
We introduce 3AM, an ambiguity-aware MMT dataset comprising 26,000 parallel sentence pairs in English and Chinese.
Our dataset is specifically designed to include more ambiguity and a greater variety of both captions and images than other MMT datasets.
Experimental results show that MMT models trained on our dataset exhibit a greater ability to exploit visual information than those trained on other MMT datasets.
arXiv Detail & Related papers (2024-04-29T04:01:30Z) - VlogQA: Task, Dataset, and Baseline Models for Vietnamese Spoken-Based Machine Reading Comprehension [1.3942150186842373]
This paper presents the development process of a Vietnamese spoken language corpus for machine reading comprehension tasks.
The existing MRC corpora in Vietnamese mainly focus on formal written documents such as Wikipedia articles, online newspapers, or textbooks.
In contrast, the VlogQA consists of 10,076 question-answer pairs based on 1,230 transcript documents sourced from YouTube.
arXiv Detail & Related papers (2024-02-05T00:54:40Z) - Look Before You Leap: A Universal Emergent Decomposition of Retrieval
Tasks in Language Models [58.57279229066477]
We study how language models (LMs) solve retrieval tasks in diverse situations.
We introduce ORION, a collection of structured retrieval tasks spanning six domains.
We find that LMs internally decompose retrieval tasks in a modular way.
arXiv Detail & Related papers (2023-12-13T18:36:43Z) - Rethinking and Improving Multi-task Learning for End-to-end Speech
Translation [51.713683037303035]
We investigate the consistency between different tasks, considering different times and modules.
We find that the textual encoder primarily facilitates cross-modal conversion, but the presence of noise in speech impedes the consistency between text and speech representations.
We propose an improved multi-task learning (IMTL) approach for the ST task, which bridges the modal gap by mitigating the difference in length and representation.
arXiv Detail & Related papers (2023-11-07T08:48:46Z) - Evaluating the Symbol Binding Ability of Large Language Models for
Multiple-Choice Questions in Vietnamese General Education [0.16317061277457]
We evaluate the ability of large language models (LLMs) to perform multiple choice symbol binding (MCSB) for multiple choice question answering (MCQA) tasks in zero-shot, one-shot, and few-shot settings.
This dataset can be used to evaluate the MCSB ability of LLMs and smaller language models (LMs) because it is typed in a strict style.
arXiv Detail & Related papers (2023-10-18T15:48:07Z) - Interactive Mobile App Navigation with Uncertain or Under-specified
Natural Language Commands [47.282510186109775]
We introduce Mobile app Tasks with Iterative Feedback (MoTIF), a new dataset where the goal is to complete a natural language query in a mobile app.
Current datasets for related tasks in interactive question answering, visual common sense reasoning, and question-answer plausibility prediction do not support research in resolving ambiguous natural language requests.
MoTIF contains natural language requests that are not satisfiable, the first such work to investigate this issue for interactive vision-language tasks.
arXiv Detail & Related papers (2022-02-04T18:51:50Z) - Sentence Extraction-Based Machine Reading Comprehension for Vietnamese [0.2446672595462589]
We introduce the UIT-ViWikiQA, the first dataset for evaluating sentence extraction-based machine reading comprehension in Vietnamese language.
The dataset consists of comprises 23.074 question-answers based on 5.109 passages of 174 Vietnamese articles from Wikipedia.
Our experiments show that the best machine model is XLM-R$_Large, which achieves an exact match (EM) score of 85.97% and an F1-score of 88.77% on our dataset.
arXiv Detail & Related papers (2021-05-19T10:22:27Z) - Inquisitive Question Generation for High Level Text Comprehension [60.21497846332531]
We introduce INQUISITIVE, a dataset of 19K questions that are elicited while a person is reading through a document.
We show that readers engage in a series of pragmatic strategies to seek information.
We evaluate question generation models based on GPT-2 and show that our model is able to generate reasonable questions.
arXiv Detail & Related papers (2020-10-04T19:03:39Z) - A Vietnamese Dataset for Evaluating Machine Reading Comprehension [2.7528170226206443]
We present UIT-ViQuAD, a new dataset for the low-resource language as Vietnamese to evaluate machine reading comprehension models.
This dataset comprises over 23,000 human-generated question-answer pairs based on 5,109 passages of 174 Vietnamese articles from Wikipedia.
We conduct experiments on state-of-the-art MRC methods for English and Chinese as the first experimental models on UIT-ViQuAD.
arXiv Detail & Related papers (2020-09-30T15:06:56Z) - A Sentence Cloze Dataset for Chinese Machine Reading Comprehension [64.07894249743767]
We propose a new task called Sentence Cloze-style Machine Reading (SC-MRC)
The proposed task aims to fill the right candidate sentence into the passage that has several blanks.
We built a Chinese dataset called CMRC 2019 to evaluate the difficulty of the SC-MRC task.
arXiv Detail & Related papers (2020-04-07T04:09:00Z) - Enhancing lexical-based approach with external knowledge for Vietnamese
multiple-choice machine reading comprehension [2.5199066832791535]
We construct a dataset which consists of 2,783 pairs of multiple-choice questions and answers based on 417 Vietnamese texts.
We propose a lexical-based MRC method that utilizes semantic similarity measures and external knowledge sources to analyze questions and extract answers from the given text.
Our proposed method achieves 61.81% by accuracy, which is 5.51% higher than the best baseline model.
arXiv Detail & Related papers (2020-01-16T08:09:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.