SimpleBERT: A Pre-trained Model That Learns to Generate Simple Words
- URL: http://arxiv.org/abs/2204.07779v1
- Date: Sat, 16 Apr 2022 11:28:01 GMT
- Title: SimpleBERT: A Pre-trained Model That Learns to Generate Simple Words
- Authors: Renliang Sun and Xiaojun Wan
- Abstract summary: In this work, we propose a continued pre-training method for text simplification.
We use a small-scale simple text dataset for continued pre-training and employ two methods to identify simple words.
We obtain SimpleBERT, which surpasses BERT in both lexical simplification and sentence simplification tasks.
- Score: 59.142185753887645
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pre-trained models are now widely used in natural language
processing tasks. However, in the specific field of text simplification,
research on improving pre-trained models remains largely unexplored. In this work, we
propose a continued pre-training method for text simplification. Specifically,
we propose a new masked language modeling (MLM) mechanism, which does not
randomly mask words but only masks simple words. The new mechanism can make the
model learn to generate simple words. We use a small-scale simple text dataset
for continued pre-training and employ two methods to identify simple words from
the texts. We choose BERT, a representative pre-trained model, and continue
pre-training it using our proposed method. Finally, we obtain SimpleBERT, which
surpasses BERT in both lexical simplification and sentence simplification tasks
and achieves state-of-the-art results on multiple datasets. Moreover,
SimpleBERT can replace BERT in existing simplification models without
modification.
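To make the proposed masking mechanism concrete, below is a minimal sketch (not the authors' released code) of a data-preparation step that masks only simple words for the MLM objective. The simple-word lexicon and the 50% masking probability are illustrative assumptions; the paper identifies simple words with two methods of its own.

```python
# Minimal sketch of the masking idea described above: instead of masking random
# tokens, mask only tokens judged "simple", so the MLM objective teaches the
# model to generate simple words. The lexicon and masking probability below are
# assumptions for illustration only.
import random

import torch
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

# Hypothetical simple-word list; in practice this could come from a word
# frequency list or a lexical-simplicity resource.
SIMPLE_WORDS = {"good", "big", "small", "help", "show", "use", "new"}

def mask_simple_words(sentence: str, mask_prob: float = 0.5):
    """Return (input_ids, labels) where only simple words may be masked."""
    enc = tokenizer(sentence, return_tensors="pt")
    input_ids = enc["input_ids"].clone()
    labels = torch.full_like(input_ids, -100)  # -100 is ignored by the MLM loss

    tokens = tokenizer.convert_ids_to_tokens(input_ids[0].tolist())
    for i, tok in enumerate(tokens):
        if tok.lstrip("#") in SIMPLE_WORDS and random.random() < mask_prob:
            labels[0, i] = input_ids[0, i]             # predict the original token
            input_ids[0, i] = tokenizer.mask_token_id  # replace it with [MASK]
    return input_ids, labels

ids, labels = mask_simple_words("The new method can help models use simple words.")
print(tokenizer.convert_ids_to_tokens(ids[0].tolist()))
```

Positions whose label stays at -100 are skipped by the cross-entropy loss, so only the masked simple words drive the continued pre-training signal.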
Related papers
- Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation [82.5217996570387]
We adapt a pre-trained language model for auto-regressive text-to-image generation.
We find that pre-trained language models offer limited help.
arXiv Detail & Related papers (2023-11-27T07:19:26Z)
- Controlled Text Generation via Language Model Arithmetic [7.687678490751105]
We introduce model arithmetic, a novel inference framework for composing and biasing Large Language Models.
We show that model arithmetic allows fine-grained control of generated text while outperforming state-of-the-art on the task of toxicity reduction.
arXiv Detail & Related papers (2023-11-24T13:41:12Z)
- Teaching the Pre-trained Model to Generate Simple Texts for Text Simplification [59.625179404482594]
Randomly masking text spans in ordinary texts in the pre-training stage hardly allows models to acquire the ability to generate simple texts.
We propose a new continued pre-training strategy to teach the pre-trained model to generate simple texts.
arXiv Detail & Related papers (2023-05-21T14:03:49Z)
- TextPruner: A Model Pruning Toolkit for Pre-Trained Language Models [18.49325959450621]
We introduce TextPruner, an open-source model pruning toolkit for pre-trained language models.
TextPruner offers structured post-training pruning methods, including vocabulary pruning and transformer pruning.
Our experiments with several NLP tasks demonstrate the ability of TextPruner to reduce the model size without re-training the model.
arXiv Detail & Related papers (2022-03-30T02:10:33Z)
- Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing [78.8500633981247]
This paper surveys and organizes research works in a new paradigm in natural language processing, which we dub "prompt-based learning".
Unlike traditional supervised learning, which trains a model to take in an input x and predict an output y as P(y|x), prompt-based learning is based on language models that model the probability of text directly (a small cloze-style sketch is given after this list).
arXiv Detail & Related papers (2021-07-28T18:09:46Z)
- A Comprehensive Comparison of Pre-training Language Models [0.5139874302398955]
We pre-train a list of transformer-based models with the same amount of text and the same training steps.
The experimental results show that the largest improvement over the original BERT comes from adding an RNN layer to capture more contextual information for short-text understanding.
arXiv Detail & Related papers (2021-06-22T02:12:29Z)
- CharBERT: Character-aware Pre-trained Language Model [36.9333890698306]
We propose a character-aware pre-trained language model named CharBERT.
We first construct the contextual word embedding for each token from the sequential character representations.
We then fuse the representations of characters and the subword representations by a novel heterogeneous interaction module.
arXiv Detail & Related papers (2020-11-03T07:13:06Z)
- Controllable Text Simplification with Explicit Paraphrasing [88.02804405275785]
Text Simplification improves the readability of sentences through several rewriting transformations, such as lexical paraphrasing, deletion, and splitting.
Current simplification systems are predominantly sequence-to-sequence models that are trained end-to-end to perform all these operations simultaneously.
We propose a novel hybrid approach that leverages linguistically-motivated rules for splitting and deletion, and couples them with a neural paraphrasing model to produce varied rewriting styles.
arXiv Detail & Related papers (2020-10-21T13:44:40Z)
- ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations [97.27005783856285]
This paper introduces ASSET, a new dataset for assessing sentence simplification in English.
We show that simplifications in ASSET are better at capturing characteristics of simplicity when compared to other standard evaluation datasets for the task.
arXiv Detail & Related papers (2020-05-01T16:44:54Z)
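To make the cloze idea from the prompting survey above concrete, here is a small, hedged sketch: rather than training a classifier for P(y|x), a masked language model scores candidate label words inside a prompt template. The template and the label words are assumptions made for this example, not taken from the survey.

```python
# Illustrative cloze-style prompt scored by a masked language model (a generic
# sketch of prompt-based learning, not code from the survey). The template and
# the label words "great"/"terrible" are assumptions for this example.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

review = "The plot was dull and the acting was worse."
prompt = f"{review} Overall, the movie was [MASK]."

# Instead of training a classifier for P(y|x), score label words with the LM.
label_words = ["great", "terrible"]
scores = {c["token_str"]: c["score"] for c in fill(prompt, targets=label_words)}
print(scores)  # a higher score for "terrible" signals negative sentiment
```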