Conversational Question Reformulation via Sequence-to-Sequence
Architectures and Pretrained Language Models
- URL: http://arxiv.org/abs/2004.01909v1
- Date: Sat, 4 Apr 2020 11:07:54 GMT
- Title: Conversational Question Reformulation via Sequence-to-Sequence
Architectures and Pretrained Language Models
- Authors: Sheng-Chieh Lin, Jheng-Hong Yang, Rodrigo Nogueira, Ming-Feng Tsai,
Chuan-Ju Wang, Jimmy Lin
- Abstract summary: This paper presents an empirical study of conversational question reformulation (CQR) with sequence-to-sequence architectures and pretrained language models (PLMs).
We leverage PLMs to address the strong token-to-token independence assumption made in the common objective, maximum likelihood estimation, for the CQR task.
We evaluate fine-tuned PLMs on the recently-introduced CANARD dataset as an in-domain task and validate the models using data from the TREC 2019 CAsT Track as an out-domain task.
- Score: 56.268862325167575
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents an empirical study of conversational question
reformulation (CQR) with sequence-to-sequence architectures and pretrained
language models (PLMs). We leverage PLMs to address the strong token-to-token
independence assumption made in the common objective, maximum likelihood
estimation, for the CQR task. In CQR benchmarks of task-oriented dialogue
systems, we evaluate fine-tuned PLMs on the recently-introduced CANARD dataset
as an in-domain task and validate the models using data from the TREC 2019 CAsT
Track as an out-domain task. Examining a variety of architectures with
different numbers of parameters, we demonstrate that the recent text-to-text
transfer transformer (T5) achieves the best results both on CANARD and CAsT
with fewer parameters, compared to similar transformer architectures.
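To make the setup concrete, here is a minimal sketch of seq2seq fine-tuning of T5 for CQR: the source concatenates the conversation history with the current context-dependent question, and the target is the self-contained rewrite, as in CANARD. The Hugging Face transformers library, the t5-base checkpoint, the separator string, and the hyperparameters are illustrative assumptions, not details taken from the paper.

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# One CANARD-style training example (illustrative, not taken from the dataset).
history = ["Who wrote The Old Man and the Sea?", "Ernest Hemingway."]
question = "When did he die?"              # context-dependent question
target = "When did Ernest Hemingway die?"  # reformulated, self-contained question

# The separator string is an assumption; the paper simply conditions on the history.
source = " ||| ".join(history + [question])
inputs = tokenizer(source, return_tensors="pt", truncation=True)
labels = tokenizer(target, return_tensors="pt").input_ids

# One step of standard maximum likelihood (teacher-forced seq2seq) training.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()

# After fine-tuning, reformulations are decoded from the same kind of input.
rewritten = tokenizer.decode(
    model.generate(**inputs, max_length=64)[0], skip_special_tokens=True
)
```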
Related papers
- FLIP: Fine-grained Alignment between ID-based Models and Pretrained Language Models for CTR Prediction [49.510163437116645]
Click-through rate (CTR) prediction serves as a core function module in personalized online services.
Traditional ID-based models for CTR prediction take one-hot encoded ID features of the tabular modality as inputs.
Pretrained Language Models (PLMs) have given rise to another paradigm, which takes sentences of the textual modality as inputs.
We propose to conduct Fine-grained feature-level ALignment between ID-based Models and Pretrained Language Models (FLIP) for CTR prediction.
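As a minimal illustration of the two input modalities contrasted above, the sketch below encodes the same CTR record both as one-hot ID features (for a tabular model) and as a templated sentence (for a PLM). The field names, vocabulary sizes, and template are hypothetical; FLIP's actual preprocessing is not described in this summary.

```python
import numpy as np

# A hypothetical CTR record with categorical fields.
record = {"user_id": 42, "item_id": 7, "category": "electronics"}
vocab_sizes = {"user_id": 1000, "item_id": 500, "category": 20}
category_index = {"electronics": 3}

def one_hot(index: int, size: int) -> np.ndarray:
    """One-hot encode a single categorical index."""
    v = np.zeros(size, dtype=np.float32)
    v[index] = 1.0
    return v

# (a) Tabular modality: concatenated one-hot ID features for an ID-based model.
id_features = np.concatenate([
    one_hot(record["user_id"], vocab_sizes["user_id"]),
    one_hot(record["item_id"], vocab_sizes["item_id"]),
    one_hot(category_index[record["category"]], vocab_sizes["category"]),
])

# (b) Textual modality: the same record verbalized as a sentence for a PLM.
text_features = (f"user {record['user_id']} clicked item {record['item_id']} "
                 f"in category {record['category']}")
```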
arXiv Detail & Related papers (2023-10-30T11:25:03Z)
- Structural Self-Supervised Objectives for Transformers [3.018656336329545]
This thesis focuses on improving the pre-training of natural language models using unsupervised raw data.
In the first part, we introduce three alternative pre-training objectives to BERT's Masked Language Modeling (MLM).
In the second part, we propose self-supervised pre-training tasks that align structurally with downstream applications.
arXiv Detail & Related papers (2023-09-15T09:30:45Z)
- Parameter-Efficient Abstractive Question Answering over Tables or Text [60.86457030988444]
A long-term ambition of information seeking QA systems is to reason over multi-modal contexts and generate natural answers to user queries.
Memory intensive pre-trained language models are adapted to downstream tasks such as QA by fine-tuning the model on QA data in a specific modality like unstructured text or structured tables.
To avoid training such memory-hungry models while utilizing a uniform architecture for each modality, parameter-efficient adapters add and train small task-specific bottleneck layers between transformer layers.
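A rough PyTorch sketch of the bottleneck-adapter idea mentioned above: a small down-projection, nonlinearity, and up-projection with a residual connection, inserted between frozen transformer layers so that only the adapter parameters are trained. The hidden and bottleneck sizes are illustrative; the paper's exact adapter configuration may differ.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Task-specific adapter: down-project, nonlinearity, up-project, residual.
    Only these small layers are trained; the surrounding transformer stays frozen."""

    def __init__(self, hidden_size: int = 768, bottleneck_size: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)
        self.up = nn.Linear(bottleneck_size, hidden_size)
        self.activation = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.up(self.activation(self.down(hidden_states)))
```

Each transformer layer receives its own adapter, so the trainable parameter count remains a small fraction of the full model.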
arXiv Detail & Related papers (2022-04-07T10:56:29Z)
- Domain Adaptation with Pre-trained Transformers for Query Focused Abstractive Text Summarization [18.791701342934605]
The Query Focused Text Summarization (QFTS) task aims at building systems that generate the summary of the text document(s) based on a given query.
A key challenge in addressing this task is the lack of large labeled data for training the summarization model.
We address this challenge by exploring a series of domain adaptation techniques.
arXiv Detail & Related papers (2021-12-22T05:34:56Z)
- BET: A Backtranslation Approach for Easy Data Augmentation in Transformer-based Paraphrase Identification Context [0.0]
We call this approach BET, with which we analyze backtranslation data augmentation on transformer-based architectures.
Our findings suggest that BET improves paraphrase identification performance on the Microsoft Research Paraphrase Corpus by more than 3% in both accuracy and F1 score.
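For illustration, here is a minimal backtranslation-augmentation sketch with publicly available MarianMT checkpoints (English to German and back). The pivot language, checkpoints, and decoding settings are assumptions; BET's actual translation systems are not specified in this summary.

```python
from transformers import MarianMTModel, MarianTokenizer

def backtranslate(sentences,
                  src2piv="Helsinki-NLP/opus-mt-en-de",
                  piv2src="Helsinki-NLP/opus-mt-de-en"):
    """Translate to a pivot language and back to obtain paraphrase-like variants."""
    def translate(texts, checkpoint):
        tok = MarianTokenizer.from_pretrained(checkpoint)
        mdl = MarianMTModel.from_pretrained(checkpoint)
        batch = tok(texts, return_tensors="pt", padding=True, truncation=True)
        out = mdl.generate(**batch, max_length=128)
        return tok.batch_decode(out, skip_special_tokens=True)
    return translate(translate(sentences, src2piv), piv2src)

# Augmented training pairs: each original sentence paired with its backtranslation.
augmented = backtranslate(["The quick brown fox jumps over the lazy dog."])
```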
arXiv Detail & Related papers (2020-09-25T22:06:06Z)
- Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs [62.71505254770827]
We propose a hierarchical conditional variational autoencoder (HCVAE) for generating QA pairs given unstructured texts as contexts.
Our model obtains impressive performance gains over all baselines on both tasks, using only a fraction of data for training.
arXiv Detail & Related papers (2020-05-28T08:26:06Z)
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks [133.93803565077337]
Retrieval-augmented generation (RAG) models combine pre-trained parametric and non-parametric memory for language generation.
We show that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.
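A minimal usage sketch of the publicly released RAG checkpoints in Hugging Face transformers (it requires the datasets and faiss packages; the dummy index below stands in for the full Wikipedia index). This illustrates the retrieve-then-generate interface rather than the paper's training procedure.

```python
from transformers import RagTokenizer, RagRetriever, RagSequenceForGeneration

tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")
retriever = RagRetriever.from_pretrained(
    "facebook/rag-sequence-nq", index_name="exact", use_dummy_dataset=True
)
model = RagSequenceForGeneration.from_pretrained(
    "facebook/rag-sequence-nq", retriever=retriever
)

# Non-parametric memory: the retriever fetches passages; parametric memory: the
# seq2seq generator conditions on them to produce the answer.
inputs = tokenizer("who wrote the old man and the sea", return_tensors="pt")
generated = model.generate(input_ids=inputs["input_ids"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```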
arXiv Detail & Related papers (2020-05-22T21:34:34Z)
- Document Ranking with a Pretrained Sequence-to-Sequence Model [56.44269917346376]
We show how a sequence-to-sequence model can be trained to generate relevance labels as "target words".
Our approach significantly outperforms an encoder-only model in a data-poor regime.
arXiv Detail & Related papers (2020-03-14T22:29:50Z)
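To make the "relevance labels as target words" idea in the last entry concrete, here is a sketch of scoring a query-document pair with a T5 model by comparing the probabilities of the words "true" and "false" at the first decoding step. The t5-base checkpoint is a placeholder (the model must first be fine-tuned on relevance-labeled data), and the prompt format is an assumption.

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base").eval()

def relevance_score(query: str, document: str) -> float:
    """Return P('true') vs. P('false') at the first decoded position."""
    prompt = f"Query: {query} Document: {document} Relevant:"
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    # Decode a single step from the decoder start token and inspect its logits.
    decoder_input_ids = torch.tensor([[model.config.decoder_start_token_id]])
    with torch.no_grad():
        logits = model(**inputs, decoder_input_ids=decoder_input_ids).logits[0, -1]
    # First subword ids of the two target words.
    true_id = tokenizer.encode("true", add_special_tokens=False)[0]
    false_id = tokenizer.encode("false", add_special_tokens=False)[0]
    probs = torch.softmax(logits[[true_id, false_id]], dim=0)
    return probs[0].item()

score = relevance_score("what causes tides",
                        "Tides are caused by the gravitational pull of the moon.")
```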