Enabling Language Models to Fill in the Blanks
- URL: http://arxiv.org/abs/2005.05339v2
- Date: Thu, 10 Sep 2020 18:03:11 GMT
- Title: Enabling Language Models to Fill in the Blanks
- Authors: Chris Donahue, Mina Lee, Percy Liang
- Abstract summary: We present a simple approach for text infilling, the task of predicting missing spans of text at any position in a document.
We train (or fine-tune) off-the-shelf language models on sequences containing the concatenation of artificially-masked text and the text which was masked.
We show that this approach, which we call infilling by language modeling, can enable LMs to infill entire sentences effectively on three different domains: short stories, scientific abstracts, and lyrics.
- Score: 81.59381915581892
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a simple approach for text infilling, the task of predicting missing spans of text at any position in a document. While infilling could enable rich functionality, especially for writing assistance tools, more attention has been devoted to language modeling, a special case of infilling where text is predicted at the end of a document. In this paper, we aim to extend the capabilities of language models (LMs) to the more general task of infilling. To this end, we train (or fine-tune) off-the-shelf LMs on sequences containing the concatenation of artificially-masked text and the text which was masked. We show that this approach, which we call infilling by language modeling, can enable LMs to infill entire sentences effectively on three different domains: short stories, scientific abstracts, and lyrics. Furthermore, we show that humans have difficulty identifying sentences infilled by our approach as machine-generated in the domain of short stories.
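To make the training-sequence construction concrete, below is a minimal Python sketch of how such masked/answer sequences can be assembled and later stitched back together. The special-token strings ([blank], [sep], [answer]) and the helper names are illustrative assumptions for this sketch, not the paper's released implementation.

```python
# Special tokens marking masked spans and their answers. The exact strings
# here are illustrative assumptions; the released code defines its own
# special vocabulary items.
BLANK, SEP, ANSWER = "[blank]", "[sep]", "[answer]"

def make_ilm_example(tokens, span_ranges):
    """Build one training sequence: masked text, separator, then the
    masked spans in order, each terminated by the answer token."""
    masked, answers = [], []
    prev = 0
    for start, end in sorted(span_ranges):   # (start, end), end exclusive
        masked.extend(tokens[prev:start])    # keep the unmasked context
        masked.append(BLANK)                 # replace the span with a blank
        answers.extend(tokens[start:end] + [ANSWER])
        prev = end
    masked.extend(tokens[prev:])
    return " ".join(masked + [SEP] + answers)

def stitch(masked_text, generated):
    """Substitute generated answers back into the blanks, left to right."""
    answers = [a.strip() for a in generated.split(ANSWER) if a.strip()]
    out = masked_text
    for a in answers:
        out = out.replace(BLANK, a, 1)
    return out

tokens = "She ate leftover pasta for lunch .".split()
print(make_ilm_example(tokens, [(2, 4)]))
# -> She ate [blank] for lunch . [sep] leftover pasta [answer]
print(stitch("She ate [blank] for lunch .", "leftover pasta [answer]"))
# -> She ate leftover pasta for lunch .
```

At inference time, the fine-tuned LM is given the masked text followed by the separator and decodes the answers, which are then substituted back into the blanks (as in the stitch helper above).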
Related papers
- Segment Any Text: A Universal Approach for Robust, Efficient and Adaptable Sentence Segmentation [9.703886326323644]
We introduce a new model - Segment any Text (SaT) - to solve this problem.
To enhance robustness, we propose a new pretraining scheme that ensures less reliance on punctuation.
To address adaptability, we introduce an extra stage of parameter-efficient fine-tuning, establishing state-of-the-art performance in distinct domains.
arXiv Detail & Related papers (2024-06-24T14:36:11Z)
- Taking a Deep Breath: Enhancing Language Modeling of Large Language Models with Sentinel Tokens [21.61634020256455]
Transformer-based large language models (LLMs) suffer a performance degradation when modeling long-term contexts.
We propose a simple yet effective method to enable LLMs to take a deep breath, encouraging them to summarize information contained within discrete text chunks.
arXiv Detail & Related papers (2024-06-16T15:50:10Z)
- Spotting AI's Touch: Identifying LLM-Paraphrased Spans in Text [61.22649031769564]
We propose a novel framework, paraphrased text span detection (PTD).
PTD aims to identify paraphrased text spans within a text.
We construct a dedicated dataset, PASTED, for paraphrased text span detection.
arXiv Detail & Related papers (2024-05-21T11:22:27Z)
- Towards Improving Document Understanding: An Exploration on Text-Grounding via MLLMs [96.54224331778195]
We present a text-grounding document understanding model, termed TGDoc, which enhances MLLMs with the ability to discern the spatial positioning of text within images.
We formulate instruction tuning tasks including text detection, recognition, and spotting to facilitate the cohesive alignment between the visual encoder and large language model.
Our method achieves state-of-the-art performance across multiple text-rich benchmarks, validating its effectiveness.
arXiv Detail & Related papers (2023-11-22T06:46:37Z)
- PerPLM: Personalized Fine-tuning of Pretrained Language Models via Writer-specific Intermediate Learning and Prompts [16.59511985633798]
Pretrained language models (PLMs) are powerful tools for capturing context.
PLMs are typically pretrained and fine-tuned for universal use across different writers.
This study aims to improve the accuracy of text understanding tasks by personalizing the fine-tuning of PLMs for specific writers.
arXiv Detail & Related papers (2023-09-14T14:03:48Z)
- Controlled Text Reduction [15.102190738450092]
We formalize Controlled Text Reduction as a standalone task.
A model then needs to generate a coherent text that includes all and only the target information.
arXiv Detail & Related papers (2022-10-24T17:59:03Z)
- SCROLLS: Standardized CompaRison Over Long Language Sequences [62.574959194373264]
We introduce SCROLLS, a suite of tasks that require reasoning over long texts.
SCROLLS contains summarization, question answering, and natural language inference tasks.
We make all datasets available in a unified text-to-text format and host a live leaderboard to facilitate research on model architecture and pretraining methods.
arXiv Detail & Related papers (2022-01-10T18:47:15Z)
- Pre-training Language Model Incorporating Domain-specific Heterogeneous Knowledge into A Unified Representation [49.89831914386982]
We propose a unified pre-trained language model (PLM) for all forms of text, including unstructured text, semi-structured text, and well-structured text.
Our approach outperforms pre-training on plain text while using only 1/4 of the data.
arXiv Detail & Related papers (2021-09-02T16:05:24Z)
- Pre-training via Paraphrasing [96.79972492585112]
We introduce MARGE, a pre-trained sequence-to-sequence model learned with an unsupervised multi-lingual paraphrasing objective.
We show it is possible to jointly learn to do retrieval and reconstruction, given only a random initialization.
For example, with no additional task-specific training we achieve BLEU scores of up to 35.8 for document translation.
arXiv Detail & Related papers (2020-06-26T14:43:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.