Related papers: Text-to-Text Pre-Training for Data-to-Text Tasks

URL: http://arxiv.org/abs/2005.10433v3
Date: Fri, 9 Jul 2021 00:42:32 GMT
Title: Text-to-Text Pre-Training for Data-to-Text Tasks
Authors: Mihir Kale, Abhinav Rastogi
Abstract summary: We study the pre-train + fine-tune strategy for data-to-text tasks. Our experiments indicate that text-to-text pre-training in the form of T5 enables simple, end-to-end transformer-based models to outperform pipelined neural architectures tailored for data-to-text generation.
Score: 9.690158790639131
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We study the pre-train + fine-tune strategy for data-to-text tasks. Our experiments indicate that text-to-text pre-training in the form of T5 enables simple, end-to-end transformer-based models to outperform pipelined neural architectures tailored for data-to-text generation, as well as alternative language-model-based pre-training techniques such as BERT and GPT-2. Importantly, T5 pre-training leads to better generalization, as evidenced by large improvements on out-of-domain test sets. We hope our work serves as a useful baseline for future research, as transfer learning becomes ever more prevalent for data-to-text tasks.

Related papers

VTechAGP: An Academic-to-General-Audience Text Paraphrase Dataset and Benchmark Models [5.713983191152314]
VTechAGP is the first academic-to-general-audience text paraphrase dataset. For training, we leverage a contrastive-generative loss function to learn the keyword vectors in the dynamic prompt. For inference, we adopt a crowd-sampling decoding strategy at both semantic and structural levels.
arXiv Detail & Related papers (2024-11-07T16:06:00Z)
ViHateT5: Enhancing Hate Speech Detection in Vietnamese With A Unified Text-to-Text Transformer Model [0.0]
We introduce ViHateT5, a T5-based model pre-trained on our proposed large-scale domain-specific dataset named VOZ-HSD. By harnessing the power of a text-to-text architecture, ViHateT5 can tackle multiple tasks using a unified model and achieve state-of-the-art performance across all standard HSD benchmarks in Vietnamese.
arXiv Detail & Related papers (2024-05-23T03:31:50Z)
Efficient Grammatical Error Correction Via Multi-Task Training and Optimized Training Schedule [55.08778142798106]
We propose auxiliary tasks that exploit the alignment between the original and corrected sentences. We formulate each task as a sequence-to-sequence problem and perform multi-task training. We find that the order of datasets used for training and even individual instances within a dataset may have important effects on the final performance.
arXiv Detail & Related papers (2023-11-20T14:50:12Z)
Investigating Pre-trained Language Models on Cross-Domain Datasets, a Step Closer to General AI [0.8889304968879164]
We investigate the ability of pre-trained language models to generalize to different non-language tasks. The four pre-trained models that we used, T5, BART, BERT, and GPT-2, achieve outstanding results.
arXiv Detail & Related papers (2023-06-21T11:55:17Z)
GPT-Sentinel: Distinguishing Human and ChatGPT Generated Content [27.901155229342375]
We present a novel approach for detecting ChatGPT-generated vs. human-written text using language models. Our models achieved remarkable results, with an accuracy of over 97% on the test dataset, as evaluated through various metrics.
arXiv Detail & Related papers (2023-05-13T17:12:11Z)
Curriculum-Based Self-Training Makes Better Few-Shot Learners for Data-to-Text Generation [56.98033565736974]
We propose Curriculum-Based Self-Training (CBST) to leverage unlabeled data in a rearranged order determined by the difficulty of text generation. Our method can outperform fine-tuning and task-adaptive pre-training methods, and achieve state-of-the-art performance in the few-shot setting of data-to-text generation.
arXiv Detail & Related papers (2022-06-06T16:11:58Z)
Evaluation of Transfer Learning for Polish with a Text-to-Text Model [54.81823151748415]
We introduce a new benchmark for assessing the quality of text-to-text models for Polish. The benchmark consists of diverse tasks and datasets: KLEJ benchmark adapted for text-to-text, en-pl translation, summarization, and question answering. We present plT5 - a general-purpose text-to-text model for Polish that can be fine-tuned on various Natural Language Processing (NLP) tasks with a single training objective.
arXiv Detail & Related papers (2022-05-18T09:17:14Z)
DSGPT: Domain-Specific Generative Pre-Training of Transformers for Text Generation in E-commerce Title and Review Summarization [14.414693156937782]
We propose a novel domain-specific generative pre-training (DS-GPT) method for text generation. We apply it to the product title and review summarization problems on E-commerce mobile display.
arXiv Detail & Related papers (2021-12-15T19:02:49Z)
mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs [51.67970832510462]
We improve multilingual text-to-text transfer Transformer with translation pairs (mT6). We explore three cross-lingual text-to-text pre-training tasks, namely, machine translation, translation pair span corruption, and translation span corruption. Experimental results show that the proposed mT6 improves cross-lingual transferability over mT5.
arXiv Detail & Related papers (2021-04-18T03:24:07Z)
Data-to-Text Generation with Iterative Text Editing [3.42658286826597]
We present a novel approach to data-to-text generation based on iterative text editing. We first transform data items to text using trivial templates, and then we iteratively improve the resulting text by a neural model trained for the sentence fusion task. The output of the model is filtered by a simple heuristic and reranked with an off-the-shelf pre-trained language model.
arXiv Detail & Related papers (2020-11-03T13:32:38Z)
mT5: A massively multilingual pre-trained text-to-text transformer [60.0210636815514]
"Text-to-Text Transfer Transformer" (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on English-language NLP tasks. We introduce mT5, a multilingual variant of T5 that was pre-trained on a new Common Crawl-based dataset covering 101 languages.
arXiv Detail & Related papers (2020-10-22T17:58:14Z)
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer [64.22926988297685]
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format.
arXiv Detail & Related papers (2019-10-23T17:37:36Z)

This list is automatically generated from the titles and abstracts of the papers on this site.