An Understanding-Oriented Robust Machine Reading Comprehension Model
- URL: http://arxiv.org/abs/2207.00187v1
- Date: Fri, 1 Jul 2022 03:32:02 GMT
- Title: An Understanding-Oriented Robust Machine Reading Comprehension Model
- Authors: Feiliang Ren, Yongkang Liu, Bochao Li, Shilei Liu, Bingchao Wang,
Jiaqi Wang, Chunchao Liu, Qi Ma
- Abstract summary: We propose an understanding-oriented machine reading comprehension model to address three kinds of robustness issues: over-sensitivity, over-stability, and generalization.
Specifically, we first use a natural language inference module to help the model understand the accurate semantic meanings of input questions.
Second, a memory-guided multi-head attention method deepens the understanding of input questions and passages; third, a multi-language learning mechanism addresses the issue of generalization.
- Score: 12.870425062204035
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Although existing machine reading comprehension models are making rapid
progress on many datasets, they are far from robust. In this paper, we propose
an understanding-oriented machine reading comprehension model to address three
kinds of robustness issues: over-sensitivity, over-stability, and
generalization. Specifically, we first use a natural language inference module
to help the model understand the accurate semantic meanings of input questions,
which addresses the issues of over-sensitivity and over-stability. Then, in the
machine reading comprehension module, we propose a memory-guided multi-head
attention method that further improves the model's understanding of the
semantic meanings of input questions and passages. Third, we propose a
multi-language learning mechanism to address the issue of generalization.
Finally, these modules are integrated with a multi-task learning based method.
We evaluate our model on three benchmark datasets designed to measure model
robustness, including DuReader (robust) and two SQuAD-related datasets.
Extensive experiments show that our model effectively addresses the three
robustness issues. It achieves much better results than the compared
state-of-the-art models on all these datasets under different evaluation
metrics, even under some extreme and unfair evaluations. The source code of our
work is available at: https://github.com/neukg/RobustMRC.
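To make the described architecture concrete, below is a minimal sketch of what a memory-guided multi-head attention layer could look like, plus the multi-task weighting that ties the modules together. This is a hedged reading, not the authors' implementation: the MemoryGuidedAttention class, the memory-slot design, and the lambda_nli weight are all hypothetical, and the actual code lives in the linked repository.

```python
import torch
import torch.nn as nn

class MemoryGuidedAttention(nn.Module):
    """One plausible (hypothetical) reading of memory-guided multi-head
    attention: a bank of learned memory slots is concatenated to the
    keys/values so every head attends over both the passage tokens and
    a persistent, input-independent memory."""

    def __init__(self, d_model: int = 768, n_heads: int = 12, n_slots: int = 32):
        super().__init__()
        self.memory = nn.Parameter(torch.randn(n_slots, d_model) * 0.02)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, question: torch.Tensor, passage: torch.Tensor) -> torch.Tensor:
        # question: (batch, q_len, d_model); passage: (batch, p_len, d_model)
        mem = self.memory.unsqueeze(0).expand(passage.size(0), -1, -1)
        keys = torch.cat([passage, mem], dim=1)  # passage tokens + memory slots
        out, _ = self.attn(question, keys, keys)
        return out

# Multi-task integration (weighting scheme assumed, not from the paper):
#   total_loss = mrc_span_loss + lambda_nli * nli_loss
```

Treating the memory as extra key/value slots is a common way to inject persistent context into attention; the paper's exact mechanism may differ.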
Related papers
- In-Context Language Learning: Architectures and Algorithms [73.93205821154605]
We study ICL through the lens of a new family of model problems we term in-context language learning (ICLL).
We evaluate a diverse set of neural sequence models on regular ICLL tasks.
arXiv Detail & Related papers (2024-01-23T18:59:21Z) - Interpretability at Scale: Identifying Causal Mechanisms in Alpaca [62.65877150123775]
We use Boundless DAS to efficiently search for interpretable causal structure in large language models while they follow instructions.
Our findings mark a first step toward faithfully understanding the inner workings of our ever-growing and most widely deployed language models.
arXiv Detail & Related papers (2023-05-15T17:15:40Z) - Ensemble Transfer Learning for Multilingual Coreference Resolution [60.409789753164944]
A problem that frequently occurs when working with a non-English language is the scarcity of annotated training data.
We design a simple but effective ensemble-based framework that combines various transfer learning techniques.
We also propose a low-cost TL method that bootstraps coreference resolution models by utilizing Wikipedia anchor texts.
arXiv Detail & Related papers (2023-01-22T18:22:55Z) - Robustness Analysis of Video-Language Models Against Visual and Language
Perturbations [10.862722733649543]
This study is the first extensive analysis of the robustness of video-language models against various real-world perturbations.
We propose two large-scale benchmark datasets, MSRVTT-P and YouCook2-P, which utilize 90 different visual and 35 different text perturbations.
arXiv Detail & Related papers (2022-07-05T16:26:05Z) - ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented
Visual Models [102.63817106363597]
We build ELEVATER, the first benchmark to compare and evaluate pre-trained language-augmented visual models.
It consists of 20 image classification datasets and 35 object detection datasets, each of which is augmented with external knowledge.
We will release our toolkit and evaluation platforms for the research community.
arXiv Detail & Related papers (2022-04-19T10:23:42Z) - Deep Understanding based Multi-Document Machine Reading Comprehension [22.319892892352414]
We propose a deep understanding based model for multi-document machine reading comprehension.
It has three cascaded deep understanding modules which are designed to understand the accurate semantic meaning of words.
We evaluate our model on two large scale benchmark datasets, namely TriviaQA Web and DuReader.
arXiv Detail & Related papers (2022-02-25T12:56:02Z) - Leveraging Advantages of Interactive and Non-Interactive Models for
Vector-Based Cross-Lingual Information Retrieval [12.514666775853598]
We propose a novel framework to leverage the advantages of interactive and non-interactive models.
We introduce a semi-interactive mechanism, which builds our model upon a non-interactive architecture but encodes each document together with its associated multilingual queries.
Our methods significantly boost the retrieval accuracy while maintaining the computational efficiency.
arXiv Detail & Related papers (2021-11-03T03:03:19Z) - Towards Trustworthy Deception Detection: Benchmarking Model Robustness
across Domains, Modalities, and Languages [10.131671217810581]
We evaluate model robustness to out-of-domain data, modality-specific features, and languages other than English.
We find that with additional image content as input, ELMo embeddings yield significantly fewer errors compared to BERT or GloVe.
arXiv Detail & Related papers (2021-04-23T18:05:52Z) - Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking (sketched after this list).
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z) - ORB: An Open Reading Benchmark for Comprehensive Evaluation of Machine
Reading Comprehension [53.037401638264235]
We present an evaluation server, ORB, that reports performance on seven diverse reading comprehension datasets.
The evaluation server places no restrictions on how models are trained, so it is a suitable test bed for exploring training paradigms and representation learning.
arXiv Detail & Related papers (2019-12-29T07:27:23Z)
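The Dynamic Blocking decoding algorithm mentioned in the Unsupervised Paraphrasing entry above admits a compact sketch of its core idea as that paper describes it: whenever the decoder has just emitted a token that also occurs in the source, the source token that immediately follows that occurrence is blocked, steering generation away from the source surface form. The function name, signature, and blocking probability below are illustrative, not the paper's code.

```python
import torch

def dynamic_blocking_mask(prev_token: int, source_ids: list,
                          vocab_size: int, block_prob: float = 0.5) -> torch.Tensor:
    """Hypothetical sketch of Dynamic Blocking: mark as blocked every
    source token that immediately follows an occurrence of the token
    the decoder just generated."""
    mask = torch.zeros(vocab_size, dtype=torch.bool)
    for i, tok in enumerate(source_ids[:-1]):
        # Block successors only with some probability, so diverse
        # candidate paraphrases can still be sampled.
        if tok == prev_token and torch.rand(()).item() < block_prob:
            mask[source_ids[i + 1]] = True
    return mask

# Illustrative use inside a decoding loop:
#   blocked = dynamic_blocking_mask(prev_token, source_ids, logits.size(-1))
#   logits[blocked] = float("-inf")
```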