Controlling Pre-trained Language Models for Grade-Specific Text
Simplification
- URL: http://arxiv.org/abs/2305.14993v2
- Date: Thu, 30 Nov 2023 14:14:31 GMT
- Title: Controlling Pre-trained Language Models for Grade-Specific Text
Simplification
- Authors: Sweta Agrawal and Marine Carpuat
- Abstract summary: We study how different control mechanisms impact the adequacy and simplicity of text simplification systems.
We introduce a simple method that predicts the edit operations required for simplifying a text for a specific grade level on an instance-per-instance basis.
- Score: 22.154454849167077
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text simplification (TS) systems rewrite text to make it more readable while
preserving its content. However, what makes a text easy to read depends on the
intended readers. Recent work has shown that pre-trained language models can
simplify text using a wealth of techniques to control output simplicity,
ranging from specifying only the desired reading grade level, to directly
specifying low-level edit operations. Yet it remains unclear how to set these
control parameters in practice. Existing approaches set them at the corpus
level, disregarding the complexity of individual inputs and considering only
one level of output complexity. In this work, we conduct an empirical study to
understand how different control mechanisms impact the adequacy and simplicity
of text simplification systems. Based on these insights, we introduce a simple
method that predicts the edit operations required for simplifying a text for a
specific grade level on an instance-per-instance basis. This approach improves
the quality of the simplified outputs over corpus-level search-based
heuristics.
Related papers
- Automating Easy Read Text Segmentation [2.7309692684728617]
Easy Read text is one of the main forms of access to information for people with reading difficulties.
One of the key characteristics of this type of text is the requirement to split sentences into smaller grammatical segments.
We study novel methods for the task, leveraging masked and generative language models, along with constituent parsing.
arXiv Detail & Related papers (2024-06-17T12:25:25Z) - TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z) - Teaching the Pre-trained Model to Generate Simple Texts for Text
Simplification [59.625179404482594]
Randomly masking text spans in ordinary texts in the pre-training stage hardly allows models to acquire the ability to generate simple texts.
We propose a new continued pre-training strategy to teach the pre-trained model to generate simple texts.
arXiv Detail & Related papers (2023-05-21T14:03:49Z) - Composable Text Controls in Latent Space with ODEs [97.12426987887021]
This paper proposes a new efficient approach for composable text operations in the compact latent space of text.
By connecting pretrained LMs to the latent space through efficient adaption, we then decode the sampled vectors into desired text sequences.
Experiments show that composing those operators within our approach manages to generate or edit high-quality text.
arXiv Detail & Related papers (2022-08-01T06:51:45Z) - Classifiers are Better Experts for Controllable Text Generation [63.17266060165098]
We show that the proposed method significantly outperforms recent PPLM, GeDi, and DExperts on PPL and sentiment accuracy based on the external classifier of generated texts.
The same time, it is also easier to implement and tune, and has significantly fewer restrictions and requirements.
arXiv Detail & Related papers (2022-05-15T12:58:35Z) - Text Revision by On-the-Fly Representation Optimization [76.11035270753757]
Current state-of-the-art methods formulate these tasks as sequence-to-sequence learning problems.
We present an iterative in-place editing approach for text revision, which requires no parallel data.
It achieves competitive and even better performance than state-of-the-art supervised methods on text simplification.
arXiv Detail & Related papers (2022-04-15T07:38:08Z) - Text Simplification for Comprehension-based Question-Answering [7.144235435987265]
We release Simple-SQuAD, a simplified version of the widely-used SQuAD dataset.
We benchmark the newly created corpus and perform an ablation study for examining the effect of the simplification process in the SQuAD-based question answering task.
arXiv Detail & Related papers (2021-09-28T18:48:00Z) - CUT: Controllable Unsupervised Text Simplification [0.0]
We propose two unsupervised mechanisms for controlling the output complexity of generated texts.
We show that by nudging a back-translation algorithm to understand the relative simplicity of a text in comparison to its noisy translation, the algorithm self-supervises itself to produce the desired complexity.
arXiv Detail & Related papers (2020-12-03T14:14:30Z) - Explainable Prediction of Text Complexity: The Missing Preliminaries for
Text Simplification [13.447565774887215]
Text simplification reduces the language complexity of professional content for accessibility purposes.
End-to-end neural network models have been widely adopted to directly generate the simplified version of input text.
We show that text simplification can be decomposed into a compact pipeline of tasks to ensure the transparency and explainability of the process.
arXiv Detail & Related papers (2020-07-31T03:33:37Z) - ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification
Models with Multiple Rewriting Transformations [97.27005783856285]
This paper introduces ASSET, a new dataset for assessing sentence simplification in English.
We show that simplifications in ASSET are better at capturing characteristics of simplicity when compared to other standard evaluation datasets for the task.
arXiv Detail & Related papers (2020-05-01T16:44:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.