SimpleBERT: A Pre-trained Model That Learns to Generate Simple Words
- URL: http://arxiv.org/abs/2204.07779v1
- Date: Sat, 16 Apr 2022 11:28:01 GMT
- Title: SimpleBERT: A Pre-trained Model That Learns to Generate Simple Words
- Authors: Renliang Sun and Xiaojun Wan
- Abstract summary: In this work, we propose a continued pre-training method for text simplification.
We use a small-scale simple text dataset for continued pre-training and employ two methods to identify simple words.
We obtain SimpleBERT, which surpasses BERT in both lexical simplification and sentence simplification tasks.
- Score: 59.142185753887645
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pre-trained models are now widely used in natural language
processing tasks. However, in the specific field of text simplification,
research on improving pre-trained models remains largely unexplored. In this work, we
propose a continued pre-training method for text simplification. Specifically,
we propose a new masked language modeling (MLM) mechanism, which does not
randomly mask words but only masks simple words. The new mechanism can make the
model learn to generate simple words. We use a small-scale simple text dataset
for continued pre-training and employ two methods to identify simple words from
the texts. We choose BERT, a representative pre-trained model, and continue
pre-training it using our proposed method. Finally, we obtain SimpleBERT, which
surpasses BERT in both lexical simplification and sentence simplification tasks
and achieves state-of-the-art results on multiple datasets. Moreover,
SimpleBERT can replace BERT in existing simplification models without
modification.
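To make the proposed masking mechanism concrete, below is a minimal sketch (not the authors' released code) of a data-preparation step that masks only simple words for the MLM objective. The simple-word lexicon and the 50% masking probability are illustrative assumptions; the paper identifies simple words with two methods of its own.

```python
# Minimal sketch of the masking idea described above: instead of masking random
# tokens, mask only tokens judged "simple", so the MLM objective teaches the
# model to generate simple words. The lexicon and masking probability below are
# assumptions for illustration only.
import random

import torch
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

# Hypothetical simple-word list; in practice this could come from a word
# frequency list or a lexical-simplicity resource.
SIMPLE_WORDS = {"good", "big", "small", "help", "show", "use", "new"}

def mask_simple_words(sentence: str, mask_prob: float = 0.5):
    """Return (input_ids, labels) where only simple words may be masked."""
    enc = tokenizer(sentence, return_tensors="pt")
    input_ids = enc["input_ids"].clone()
    labels = torch.full_like(input_ids, -100)  # -100 is ignored by the MLM loss

    tokens = tokenizer.convert_ids_to_tokens(input_ids[0].tolist())
    for i, tok in enumerate(tokens):
        if tok.lstrip("#") in SIMPLE_WORDS and random.random() < mask_prob:
            labels[0, i] = input_ids[0, i]             # predict the original token
            input_ids[0, i] = tokenizer.mask_token_id  # replace it with [MASK]
    return input_ids, labels

ids, labels = mask_simple_words("The new method can help models use simple words.")
print(tokenizer.convert_ids_to_tokens(ids[0].tolist()))
```

Positions whose label stays at -100 are skipped by the cross-entropy loss, so only the masked simple words drive the continued pre-training signal.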
Related papers
- Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation [82.5217996570387]
We adapt a pre-trained language model for auto-regressive text-to-image generation.
We find that pre-trained language models offer limited help.
arXiv Detail & Related papers (2023-11-27T07:19:26Z)
- Controlled Text Generation via Language Model Arithmetic [7.687678490751105]
We introduce model arithmetic, a novel inference framework for composing and biasing Large Language Models.
We show that model arithmetic allows fine-grained control of generated text while outperforming state-of-the-art on the task of toxicity reduction.
arXiv Detail & Related papers (2023-11-24T13:41:12Z)
- Teaching the Pre-trained Model to Generate Simple Texts for Text Simplification [59.625179404482594]
Randomly masking text spans in ordinary texts in the pre-training stage hardly allows models to acquire the ability to generate simple texts.
We propose a new continued pre-training strategy to teach the pre-trained model to generate simple texts.
arXiv Detail & Related papers (2023-05-21T14:03:49Z)
- TextPruner: A Model Pruning Toolkit for Pre-Trained Language Models [18.49325959450621]
We introduce TextPruner, an open-source model pruning toolkit for pre-trained language models.
TextPruner offers structured post-training pruning methods, including vocabulary pruning and transformer pruning.
Our experiments with several NLP tasks demonstrate the ability of TextPruner to reduce the model size without re-training the model.
arXiv Detail & Related papers (2022-03-30T02:10:33Z)
- Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing [78.8500633981247]
This paper surveys and organizes research works in a new paradigm in natural language processing, which we dub "prompt-based learning".
Unlike traditional supervised learning, which trains a model to take in an input x and predict an output y as P(y|x), prompt-based learning is based on language models that model the probability of text directly (a small cloze-style sketch is given after this list).
arXiv Detail & Related papers (2021-07-28T18:09:46Z)
- A Comprehensive Comparison of Pre-training Language Models [0.5139874302398955]
We pre-train a list of transformer-based models with the same amount of text and the same training steps.
The experimental results show that the largest improvement over the original BERT comes from adding an RNN layer to capture more contextual information for short-text understanding.
arXiv Detail & Related papers (2021-06-22T02:12:29Z)
- CharBERT: Character-aware Pre-trained Language Model [36.9333890698306]
We propose a character-aware pre-trained language model named CharBERT.
We first construct the contextual word embedding for each token from the sequential character representations.
We then fuse the representations of characters and the subword representations by a novel heterogeneous interaction module.
arXiv Detail & Related papers (2020-11-03T07:13:06Z)
- Controllable Text Simplification with Explicit Paraphrasing [88.02804405275785]
Text Simplification improves the readability of sentences through several rewriting transformations, such as lexical paraphrasing, deletion, and splitting.
Current simplification systems are predominantly sequence-to-sequence models that are trained end-to-end to perform all these operations simultaneously.
We propose a novel hybrid approach that leverages linguistically-motivated rules for splitting and deletion, and couples them with a neural paraphrasing model to produce varied rewriting styles.
arXiv Detail & Related papers (2020-10-21T13:44:40Z)
- ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations [97.27005783856285]
This paper introduces ASSET, a new dataset for assessing sentence simplification in English.
We show that simplifications in ASSET are better at capturing characteristics of simplicity when compared to other standard evaluation datasets for the task.
arXiv Detail & Related papers (2020-05-01T16:44:54Z)
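To make the cloze idea from the prompting survey above concrete, here is a small, hedged sketch: rather than training a classifier for P(y|x), a masked language model scores candidate label words inside a prompt template. The template and the label words are assumptions made for this example, not taken from the survey.

```python
# Illustrative cloze-style prompt scored by a masked language model (a generic
# sketch of prompt-based learning, not code from the survey). The template and
# the label words "great"/"terrible" are assumptions for this example.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

review = "The plot was dull and the acting was worse."
prompt = f"{review} Overall, the movie was [MASK]."

# Instead of training a classifier for P(y|x), score label words with the LM.
label_words = ["great", "terrible"]
scores = {c["token_str"]: c["score"] for c in fill(prompt, targets=label_words)}
print(scores)  # a higher score for "terrible" signals negative sentiment
```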