Adversarial Training for Machine Reading Comprehension with Virtual
Embeddings
- URL: http://arxiv.org/abs/2106.04437v1
- Date: Tue, 8 Jun 2021 15:16:34 GMT
- Title: Adversarial Training for Machine Reading Comprehension with Virtual
Embeddings
- Authors: Ziqing Yang, Yiming Cui, Chenglei Si, Wanxiang Che, Ting Liu, Shijin
Wang, Guoping Hu
- Abstract summary: We propose a novel adversarial training method called PQAT that perturbs the embedding matrix instead of word vectors.
We test the method on a wide range of machine reading comprehension tasks, including span-based extractive RC and multiple-choice RC.
The results show that adversarial training is effective universally, and PQAT further improves the performance.
- Score: 45.12957199981406
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training (AT) as a regularization method has proved its
effectiveness on various tasks. Though there are successful applications of AT
on some NLP tasks, the distinguishing characteristics of NLP tasks have not
been exploited. In this paper, we aim to apply AT on machine reading
comprehension (MRC) tasks. Furthermore, we adapt AT for MRC tasks by proposing
a novel adversarial training method called PQAT that perturbs the embedding
matrix instead of word vectors. To differentiate the roles of passages and
questions, PQAT uses additional virtual P/Q-embedding matrices to gather the
global perturbations of words from passages and questions separately. We test
the method on a wide range of MRC tasks, including span-based extractive RC and
multiple-choice RC. The results show that adversarial training is effective
universally, and PQAT further improves the performance.
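The abstract's core idea, accumulating perturbations per word type into separate virtual passage (P) and question (Q) embedding matrices rather than perturbing each word-vector occurrence independently, can be sketched in NumPy. Everything below (the FGM-style normalized-gradient step, the shapes, and all variable names) is an illustrative assumption based only on the abstract, not the paper's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size, dim, eps = 10, 4, 0.1
embedding = rng.normal(size=(vocab_size, dim))

# Virtual perturbation matrices for passage (P) and question (Q) words.
# Perturbations accumulate per word type (one row per vocabulary entry),
# not per occurrence -- the "global" perturbation the abstract describes.
delta_p = np.zeros_like(embedding)
delta_q = np.zeros_like(embedding)

def perturb(token_ids, grad, delta, eps=eps):
    """FGM-style step: push each token's row of the virtual matrix along
    the L2-normalized gradient of the loss w.r.t. that embedding row."""
    for t, g in zip(token_ids, grad):
        norm = np.linalg.norm(g)
        if norm > 0:
            delta[t] += eps * g / norm
    return delta

# Hypothetical token ids and per-token loss gradients for one batch.
passage_ids = [1, 2, 3]
question_ids = [2, 5]
grad_passage = rng.normal(size=(len(passage_ids), dim))
grad_question = rng.normal(size=(len(question_ids), dim))

delta_p = perturb(passage_ids, grad_passage, delta_p)
delta_q = perturb(question_ids, grad_question, delta_q)

# The adversarial forward pass would then look up perturbed embeddings,
# selecting the P or Q matrix according to the token's role:
adv_passage_emb = embedding[passage_ids] + delta_p[passage_ids]
adv_question_emb = embedding[question_ids] + delta_q[question_ids]
```

Note that token 2 appears in both the passage and the question here, and receives distinct P- and Q-perturbations, which is the role separation the virtual matrices are meant to provide.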
Related papers
- Learning Task Representations from In-Context Learning [73.72066284711462]
Large language models (LLMs) have demonstrated remarkable proficiency in in-context learning.
We introduce an automated formulation for encoding task information in ICL prompts as a function of attention heads.
We show that our method's effectiveness stems from aligning the distribution of the last hidden state with that of an optimally performing in-context-learned model.
arXiv Detail & Related papers (2025-02-08T00:16:44Z)
- Online Inductive Learning from Answer Sets for Efficient Reinforcement Learning Exploration [52.03682298194168]
We exploit inductive learning of answer set programs to learn a set of logical rules representing an explainable approximation of the agent policy.
We then perform answer set reasoning on the learned rules to guide the exploration of the learning agent at the next batch.
Our methodology produces a significant boost in the discounted return achieved by the agent, even in the first batches of training.
arXiv Detail & Related papers (2025-01-13T16:13:22Z)
- Efficient Cross-Task Prompt Tuning for Few-Shot Conversational Emotion Recognition [6.988000604392974]
Emotion Recognition in Conversation (ERC) has been widely studied due to its importance in developing emotion-aware empathetic machines.
We propose a derivative-free optimization method called Cross-Task Prompt Tuning (CTPT) for few-shot conversational emotion recognition.
arXiv Detail & Related papers (2023-10-23T06:46:03Z)
- Rethinking Label Smoothing on Multi-hop Question Answering [87.68071401870283]
Multi-Hop Question Answering (MHQA) is a significant area in question answering.
In this work, we analyze the primary factors limiting the performance of multi-hop reasoning.
We propose a novel label smoothing technique, F1 Smoothing, which incorporates uncertainty into the learning process.
arXiv Detail & Related papers (2022-12-19T14:48:08Z)
- KECP: Knowledge Enhanced Contrastive Prompting for Few-shot Extractive Question Answering [28.18555591429343]
We propose a novel framework named Knowledge Enhanced Contrastive Prompt-tuning (KECP).
Instead of adding pointer heads to PLMs, we transform the task into a non-autoregressive Masked Language Modeling (MLM) generation problem.
Our method consistently outperforms state-of-the-art approaches in few-shot settings by a large margin.
arXiv Detail & Related papers (2022-05-06T08:31:02Z)
- Making Pre-trained Language Models End-to-end Few-shot Learners with Contrastive Prompt Tuning [41.15017636192417]
We present CP-Tuning, the first end-to-end Contrastive Prompt Tuning framework for fine-tuning Language Models.
It integrates a task-invariant continuous prompt encoding technique with fully trainable prompt parameters.
Experiments over a variety of language understanding tasks used in IR systems and different PLMs show that CP-Tuning outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-04-01T02:24:24Z)
- Learning to Ask Conversational Questions by Optimizing Levenshtein Distance [83.53855889592734]
We introduce a Reinforcement Iterative Sequence Editing (RISE) framework that optimizes the minimum Levenshtein distance (MLD) through explicit editing actions.
RISE is able to pay attention to tokens that are related to conversational characteristics.
Experimental results on two benchmark datasets show that RISE significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-06-30T08:44:19Z)
- Weighted Training for Cross-Task Learning [71.94908559469475]
We introduce Target-Aware Weighted Training (TAWT), a weighted training algorithm for cross-task learning.
We show that TAWT is easy to implement, is computationally efficient, requires little hyperparameter tuning, and enjoys non-asymptotic learning-theoretic guarantees.
As a byproduct, the proposed representation-based task distance allows one to reason in a theoretically principled way about several critical aspects of cross-task learning.
arXiv Detail & Related papers (2021-05-28T20:27:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.