Teach model to answer questions after comprehending the document
- URL: http://arxiv.org/abs/2307.08931v1
- Date: Tue, 18 Jul 2023 02:38:02 GMT
- Title: Teach model to answer questions after comprehending the document
- Authors: Ruiqing Sun and Ping Jian
- Abstract summary: Multi-choice Machine Reading Comprehension (MRC) is a challenging extension of Natural Language Processing (NLP).
We propose a two-stage knowledge distillation method that teaches the model to better comprehend the document by dividing the MRC task into two separate stages.
- Score: 1.4264737570114632
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Multi-choice Machine Reading Comprehension (MRC) is a challenging extension
of Natural Language Processing (NLP) that requires the ability to comprehend
the semantics and logical relationships between entities in a given text. The
MRC task has traditionally been viewed as a process of answering questions
based on the given text. This single-stage approach has often led the network
to concentrate on generating the correct answer, potentially neglecting the
comprehension of the text itself. As a result, many prevalent models have faced
challenges in performing well on this task when dealing with longer texts. In
this paper, we propose a two-stage knowledge distillation method that teaches
the model to better comprehend the document by dividing the MRC task into two
separate stages. Our experimental results show that the student model, when
equipped with our method, achieves significant improvements, demonstrating the
effectiveness of our method.
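The abstract leaves the training details to the paper itself, but the two-stage idea can be read as: first align the student's document representation with a teacher's (a "comprehension" stage), then train on answers with standard logit distillation (an "answering" stage). The sketch below illustrates that reading only; the toy encoder, the MSE representation loss, and the temperature are assumptions for illustration, not the authors' actual setup.
```python
# A minimal two-stage distillation sketch. Everything here (the toy encoder,
# the MSE representation loss, the temperature T) is an assumption made for
# illustration; the paper's actual architecture and losses may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Toy stand-in for an encoder over (document, question, options) token ids."""
    def __init__(self, vocab=1000, hidden=64, options=4):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)
        self.mix = nn.Linear(hidden, hidden)
        self.classify = nn.Linear(hidden, options)

    def forward(self, ids):
        h = self.mix(self.embed(ids)).mean(dim=1)  # pooled document representation
        return h, self.classify(h)                 # (representation, option logits)

teacher, student = Encoder(), Encoder()
teacher.eval()                                     # assume a pre-trained, frozen teacher
opt = torch.optim.Adam(student.parameters(), lr=1e-4)

ids = torch.randint(0, 1000, (8, 128))             # dummy batch: 8 examples, 128 tokens
gold = torch.randint(0, 4, (8,))                   # dummy gold option indices

with torch.no_grad():
    t_repr, t_logits = teacher(ids)

# Stage 1 ("comprehension"): match the teacher's document representation;
# no answer signal is involved yet.
s_repr, _ = student(ids)
F.mse_loss(s_repr, t_repr).backward()
opt.step(); opt.zero_grad()

# Stage 2 ("answering"): supervised answer loss plus soft-label distillation.
_, s_logits = student(ids)
T = 2.0                                            # distillation temperature (assumed)
kd = F.kl_div(F.log_softmax(s_logits / T, dim=-1),
              F.softmax(t_logits / T, dim=-1),
              reduction="batchmean") * T * T
(F.cross_entropy(s_logits, gold) + kd).backward()
opt.step(); opt.zero_grad()
```
The point of the split is that stage 1 carries no answer signal, so the student is pushed to model the document itself before it ever optimizes for the correct option.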
Related papers
- From Large to Tiny: Distilling and Refining Mathematical Expertise for Math Word Problems with Weakly Supervision [12.023661884821554]
We introduce an innovative two-stage framework that adeptly transfers mathematical expertise from large to tiny language models.
Our method fully leverages semantic understanding capabilities when searching for 'problem-equation' pairs.
It demonstrates significantly improved performance on the Math23K and Weak12K datasets compared to existing small model methods.
arXiv Detail & Related papers (2024-03-21T13:29:54Z)
- Rethinking and Improving Multi-task Learning for End-to-end Speech Translation [51.713683037303035]
We investigate the consistency between different tasks, considering different training stages and modules.
We find that the textual encoder primarily facilitates cross-modal conversion, but the presence of noise in speech impedes the consistency between text and speech representations.
We propose an improved multi-task learning (IMTL) approach for the ST task, which bridges the modal gap by mitigating the difference in length and representation.
arXiv Detail & Related papers (2023-11-07T08:48:46Z)
- Effective Cross-Task Transfer Learning for Explainable Natural Language Inference with T5 [50.574918785575655]
We compare sequential fine-tuning with multi-task learning in the context of boosting performance on two tasks.
Our results show that while sequential multi-task learning can be tuned to be good at the first of two target tasks, it performs less well on the second and additionally struggles with overfitting.
arXiv Detail & Related papers (2022-10-31T13:26:08Z)
- Can Unsupervised Knowledge Transfer from Social Discussions Help Argument Mining? [25.43442712037725]
To overcome the challenges of argument mining, we propose a novel transfer learning strategy that utilizes argumentation-rich social discussions from the ChangeMyView subreddit as a source of unsupervised, argumentative discourse-aware knowledge.
We introduce a novel prompt-based strategy for inter-component relation prediction that complements our proposed fine-tuning method.
arXiv Detail & Related papers (2022-03-24T06:48:56Z)
- Bridging the Gap between Language Model and Reading Comprehension: Unsupervised MRC via Self-Supervision [34.01738910736325]
We propose a new framework for unsupervised machine reading comprehension (MRC).
We learn to spot answer spans in documents via self-supervised learning, by designing a self-supervision pretext task for MRC called Spotting-MLM (see the sketch after this list).
Experiments show that our method achieves a new state-of-the-art performance for unsupervised MRC.
arXiv Detail & Related papers (2021-07-19T02:14:36Z)
- Knowledge-Aware Procedural Text Understanding with Multi-Stage Training [110.93934567725826]
We focus on the task of procedural text understanding, which aims to comprehend such documents and track entities' states and locations during a process.
Two challenges, the difficulty of commonsense reasoning and data insufficiency, remain unsolved.
We propose a novel KnOwledge-Aware proceduraL text understAnding (KOALA) model, which effectively leverages multiple forms of external knowledge.
arXiv Detail & Related papers (2020-09-28T10:28:40Z)
- Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension [86.1617182312817]
We propose two auxiliary tasks in the fine-tuning stage to create additional phrase boundary supervision.
A mixed Machine Reading Comprehension task translates the question or passage into other languages and builds cross-lingual question-passage pairs.
A language-agnostic knowledge masking task leverages knowledge phrases mined from the web.
arXiv Detail & Related papers (2020-04-29T10:44:00Z)
- Knowledgeable Dialogue Reading Comprehension on Key Turns [84.1784903043884]
Multi-choice machine reading comprehension (MRC) requires models to choose the correct answer from candidate options given a passage and a question.
Our research focuses on dialogue-based MRC, where the passages are multi-turn dialogues.
It suffers from two challenges: the answer selection decision is made without support from latently helpful commonsense, and the multi-turn context may hide considerable irrelevant information.
arXiv Detail & Related papers (2020-04-29T07:04:43Z)
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer [64.22926988297685]
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP).
In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format.
arXiv Detail & Related papers (2019-10-23T17:37:36Z)
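As flagged in the Spotting-MLM entry above, here is a minimal sketch of what a "spot the masked span" pretext objective can look like: sample a span from the document, mask it inside a pseudo-question window, and train pointer heads to locate the original span in the document. The windowing scheme, the GRU encoder, and the pointer heads are all assumptions for illustration, not the paper's formulation.
```python
# Hedged sketch of a self-supervised span-spotting pretext task; the sampling
# scheme, model, and losses below are illustrative assumptions only.
import torch
import torch.nn as nn
import torch.nn.functional as F

MASK_ID, VOCAB, HIDDEN, SEQ, QWIN = 0, 1000, 64, 128, 8

class SpanSpotter(nn.Module):
    """Toy encoder with start/end pointer heads over input positions."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.enc = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.start_head = nn.Linear(HIDDEN, 1)
        self.end_head = nn.Linear(HIDDEN, 1)

    def forward(self, ids):
        h, _ = self.enc(self.embed(ids))            # (batch, length, hidden)
        return self.start_head(h).squeeze(-1), self.end_head(h).squeeze(-1)

model = SpanSpotter()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

doc = torch.randint(1, VOCAB, (4, SEQ))             # dummy documents (0 is reserved for MASK)
start = torch.randint(QWIN, SEQ - QWIN - 4, (4,))   # sample a 4-token pseudo-answer span
end = start + 3

# Build a pseudo-question: the window around each span, with the span masked.
query = torch.stack([doc[i, start[i] - QWIN : end[i] + 1 + QWIN].clone() for i in range(4)])
query[:, QWIN : QWIN + 4] = MASK_ID

# Pretext objective: given [pseudo-question ; document], point back to where
# the hidden span sits in the document, mimicking answer-span extraction
# without any QA labels.
inp = torch.cat([query, doc], dim=1)
s_logits, e_logits = model(inp)
offset = query.size(1)                              # span targets live in the document segment
loss = F.cross_entropy(s_logits, start + offset) + F.cross_entropy(e_logits, end + offset)
loss.backward(); opt.step()
```
In practice the pseudo-questions would come from natural text and the spans would be chosen to be answer-like; the skeleton above only shows the shape of the objective.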
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.