Bridging the Gap between Language Model and Reading Comprehension:
Unsupervised MRC via Self-Supervision
- URL: http://arxiv.org/abs/2107.08582v1
- Date: Mon, 19 Jul 2021 02:14:36 GMT
- Title: Bridging the Gap between Language Model and Reading Comprehension:
Unsupervised MRC via Self-Supervision
- Authors: Ning Bian, Xianpei Han, Bo Chen, Hongyu Lin, Ben He, Le Sun
- Abstract summary: We propose a new framework for unsupervised machine reading comprehension (MRC)
We learn to spot answer spans in documents via self-supervised learning, by designing a self-supervision pretext task for MRC - Spotting-MLM.
Experiments show that our method achieves a new state-of-the-art performance for unsupervised MRC.
- Score: 34.01738910736325
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite recent success in machine reading comprehension (MRC), learning
high-quality MRC models still requires large-scale labeled training data, even
using strong pre-trained language models (PLMs). The pre-training tasks for
PLMs are not question-answering or MRC-based tasks, making existing PLMs unable
to be directly used for unsupervised MRC. Specifically, MRC aims to spot an
accurate answer span from the given document, but PLMs focus on token filling
in sentences. In this paper, we propose a new framework for unsupervised MRC.
Firstly, we propose to learn to spot answer spans in documents via
self-supervised learning, by designing a self-supervision pretext task for MRC
- Spotting-MLM. Solving this task requires capturing deep interactions between
sentences in documents. Secondly, we apply a simple sentence rewriting strategy
in the inference stage to alleviate the expression mismatch between questions
and documents. Experiments show that our method achieves a new state-of-the-art
performance for unsupervised MRC.
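To make the pretext task concrete, the following is a minimal Python sketch (not the authors' released code) of how a Spotting-MLM-style training example could be constructed under one plausible reading of the abstract: a contiguous span inside one sentence of a document is masked to form a pseudo-question, and the location of that span in the unmodified document becomes the self-supervised answer to spot. The helper names `make_spotting_example` and `SpottingExample` are illustrative assumptions.

```python
# A minimal sketch, NOT the authors' released code: one plausible way to build a
# Spotting-MLM-style pretext example. A contiguous span inside one sentence is
# masked to form a pseudo-question, and the character offsets of that span in
# the unmodified document serve as the self-supervised "answer" to spot.
import random
import re
from dataclasses import dataclass
from typing import Tuple

MASK = "[MASK]"


@dataclass
class SpottingExample:
    query: str                    # masked sentence acting as a pseudo-question
    document: str                 # original document in which to spot the span
    answer_span: Tuple[int, int]  # character offsets of the masked-out span


def make_spotting_example(document: str, seed: int = 0) -> SpottingExample:
    """Pick a sentence, mask a contiguous sub-span of it, and record where that
    sub-span occurs in the document (hypothetical helper, for illustration)."""
    rng = random.Random(seed)
    # Naive sentence segmentation; a real pipeline would use a proper tokenizer.
    sentence_spans = [(m.start(), m.end())
                      for m in re.finditer(r"[^.!?]+[.!?]", document)]
    sent_start, sent_end = rng.choice(sentence_spans)
    tokens = document[sent_start:sent_end].split()

    span_len = max(1, len(tokens) // 4)            # mask roughly a quarter of it
    start_tok = rng.randrange(len(tokens) - span_len + 1)
    answer_text = " ".join(tokens[start_tok:start_tok + span_len])

    masked = tokens[:start_tok] + [MASK] * span_len + tokens[start_tok + span_len:]
    query = " ".join(masked)

    # The answer is spotted in the full, unmodified document.
    ans_start = document.find(answer_text, sent_start)
    return SpottingExample(query, document, (ans_start, ans_start + len(answer_text)))


if __name__ == "__main__":
    doc = ("The Nile is the longest river in Africa. "
           "It flows northward through eleven countries. "
           "Its two major tributaries are the White Nile and the Blue Nile.")
    example = make_spotting_example(doc, seed=3)
    start, end = example.answer_span
    print(example.query)   # the masked pseudo-question
    print(doc[start:end])  # the span the model should learn to spot
```

Under this kind of construction, the model learns to align masked query positions with document spans without any labeled question-answer pairs; the sentence-rewriting step mentioned in the abstract would then, at inference time, make real questions resemble these masked, declarative pseudo-queries.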
Related papers
- Teach model to answer questions after comprehending the document [1.4264737570114632]
Multi-choice Machine Reading Comprehension (MRC) is a challenging extension of Natural Language Processing (NLP).
We propose a two-stage knowledge distillation method that teaches the model to better comprehend the document by dividing the MRC task into two separate stages.
arXiv Detail & Related papers (2023-07-18T02:38:02Z) - Multi-Task Instruction Tuning of LLaMa for Specific Scenarios: A
Preliminary Study on Writing Assistance [60.40541387785977]
Small foundational models can display remarkable proficiency in tackling diverse tasks when fine-tuned using instruction-driven data.
In this work, we investigate a practical problem setting where the primary focus is on one or a few particular tasks rather than general-purpose instruction following.
Experimental results show that fine-tuning LLaMA on writing instruction data significantly improves its ability on writing tasks.
arXiv Detail & Related papers (2023-05-22T16:56:44Z) - KECP: Knowledge Enhanced Contrastive Prompting for Few-shot Extractive
Question Answering [28.18555591429343]
We propose a novel framework named Knowledge Enhanced Contrastive Prompt-tuning (KECP).
Instead of adding pointer heads to PLMs, we transform the task into a non-autoregressive Masked Language Modeling (MLM) generation problem (see the sketch after this list).
Our method consistently outperforms state-of-the-art approaches in few-shot settings by a large margin.
arXiv Detail & Related papers (2022-05-06T08:31:02Z) - Bridging the Gap between Language Models and Cross-Lingual Sequence
Labeling [101.74165219364264]
Large-scale cross-lingual pre-trained language models (xPLMs) have shown effectiveness in cross-lingual sequence labeling tasks.
Despite the great success, we draw an empirical observation that there is a training objective gap between pre-training and fine-tuning stages.
In this paper, we first design a pre-training task tailored for cross-lingual sequence labeling (xSL), named Cross-lingual Language Informative Span Masking (CLISM), to eliminate the objective gap.
Second, we present ContrAstive-Consistency Regularization (CACR), which utilizes contrastive learning to encourage consistency between the representations of input parallel sequences.
arXiv Detail & Related papers (2022-04-11T15:55:20Z) - Analysing the Effect of Masking Length Distribution of MLM: An
Evaluation Framework and Case Study on Chinese MRC Datasets [0.8566457170664925]
Masked language modeling (MLM) is a self-supervised training objective widely used in various pre-trained models (PTMs).
In different machine reading comprehension tasks, the length of the answer is also different, and the answer is often a word, phrase, or sentence.
In this paper, we try to uncover how much of MLM's success in machine reading comprehension tasks comes from the correlation between the masking length distribution and the answer length in MRC datasets.
arXiv Detail & Related papers (2021-09-29T04:07:05Z) - REPT: Bridging Language Models and Machine Reading Comprehension via
Retrieval-Based Pre-training [45.21249008835556]
We present REPT, a REtrieval-based Pre-Training approach to bridge the gap between general PLMs and MRC.
In particular, we introduce two self-supervised tasks to strengthen evidence extraction during pre-training.
Our approach is able to enhance the capacity of evidence extraction without explicit supervision.
arXiv Detail & Related papers (2021-05-10T08:54:46Z) - Masked Language Modeling and the Distributional Hypothesis: Order Word
Matters Pre-training for Little [74.49773960145681]
A possible explanation for the impressive performance of masked language model (MLM) pre-training is that such models have learned to represent the syntactic structures prevalent in NLP pipelines.
In this paper, we propose a different explanation: MLMs succeed on downstream tasks almost entirely due to their ability to model higher-order word co-occurrence statistics.
Our results show that purely distributional information largely explains the success of pre-training, and underscore the importance of curating challenging evaluation datasets that require deeper linguistic knowledge.
arXiv Detail & Related papers (2021-04-14T06:30:36Z) - Enhancing Answer Boundary Detection for Multilingual Machine Reading
Comprehension [86.1617182312817]
We propose two auxiliary tasks in the fine-tuning stage to create additional phrase boundary supervision.
One is a mixed Machine Reading Comprehension (MRC) task, which translates the question or passage into other languages and builds cross-lingual question-passage pairs.
The other is a language-agnostic knowledge masking task that leverages knowledge phrases mined from the web.
arXiv Detail & Related papers (2020-04-29T10:44:00Z) - Retrospective Reader for Machine Reading Comprehension [90.6069071495214]
Machine reading comprehension (MRC) is an AI challenge that requires a machine to determine the correct answers to questions based on a given passage.
When unanswerable questions are involved in the MRC task, an essential verification module (the verifier) is required in addition to the encoder.
This paper devotes itself to exploring better verifier design for the MRC task with unanswerable questions.
arXiv Detail & Related papers (2020-01-27T11:14:34Z)
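The KECP entry above casts extractive question answering as a non-autoregressive Masked Language Modeling problem. The sketch below is a generic, hedged illustration of that formulation only, not KECP itself (which additionally uses knowledge enhancement and contrastive prompt-tuning): candidate answer spans from the passage are scored by how well a masked language model predicts their tokens at [MASK] positions in a cloze-style prompt, all in a single forward pass. The prompt template, model choice, and function name are assumptions for illustration.

```python
# Hedged sketch of extractive QA as non-autoregressive MLM scoring
# (a generic illustration of the idea in the KECP entry, not KECP's method;
# the prompt template and helper names are assumptions).
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL_NAME = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)
model.eval()


def score_candidate(question: str, context: str, candidate: str) -> float:
    """Place as many [MASK] tokens as the candidate has word pieces into a
    cloze prompt, run one forward pass, and sum the log-probabilities of the
    candidate's tokens at those masked positions."""
    cand_ids = tokenizer(candidate, add_special_tokens=False)["input_ids"]
    masks = " ".join([tokenizer.mask_token] * len(cand_ids))
    prompt = f"{question} The answer is {masks}. {context}"

    enc = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
    mask_pos = (enc["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]

    with torch.no_grad():
        log_probs = model(**enc).logits[0].log_softmax(dim=-1)

    # Non-autoregressive: every masked slot is predicted in the same pass.
    total = sum(log_probs[p, t].item() for p, t in zip(mask_pos, cand_ids))
    return total / len(cand_ids)                   # length-normalised score


if __name__ == "__main__":
    context = "The Amazon rainforest spans nine countries, with most of it in Brazil."
    question = "Which country contains most of the Amazon rainforest?"
    # A real system would enumerate spans from the passage; these are hand-picked.
    candidates = ["nine countries", "Brazil", "the Amazon rainforest"]
    best = max(candidates, key=lambda c: score_candidate(question, context, c))
    print(best)  # expected to prefer "Brazil" with a reasonable MLM
```

Because every [MASK] position is filled in the same forward pass, the scoring is non-autoregressive; length normalisation keeps longer candidates from being unfairly penalised.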
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences.