Enhancing Pre-trained Language Model with Lexical Simplification
- URL: http://arxiv.org/abs/2012.15070v1
- Date: Wed, 30 Dec 2020 07:49:00 GMT
- Title: Enhancing Pre-trained Language Model with Lexical Simplification
- Authors: Rongzhou Bao, Jiayi Wang, Zhuosheng Zhang, Hai Zhao
- Abstract summary: Lexical simplification (LS) is a recognized method for reducing lexical diversity.
We propose a novel LS-based approach that effectively improves the performance of PrLMs in text classification.
- Score: 41.34550924004487
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For both human readers and pre-trained language models (PrLMs), lexical
diversity may lead to confusion and inaccuracy when understanding the
underlying semantic meanings of given sentences. By substituting complex words
with simple alternatives, lexical simplification (LS) is a recognized method to
reduce such lexical diversity, and therefore to improve the understandability
of sentences. In this paper, we leverage LS and propose a novel approach which
can effectively improve the performance of PrLMs in text classification. A
rule-based simplification process is applied to a given sentence. PrLMs are
encouraged to predict the real label of the given sentence with auxiliary
inputs from the simplified version. Using strong PrLMs (BERT and ELECTRA) as
baselines, our approach further improves performance on various text
classification tasks.
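A minimal sketch of the idea described in the abstract follows. The substitution dictionary, the checkpoint, and the way the simplified sentence is supplied as an auxiliary input (here, as the second segment of a sentence pair) are illustrative assumptions, not the paper's exact design.

```python
# Minimal sketch: rule-based lexical simplification used as an auxiliary
# input for PrLM-based classification. The substitution dictionary, the
# checkpoint, and the sentence-pair encoding are illustrative assumptions.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Hypothetical rule set: map complex words to simpler alternatives.
SIMPLE_SUBSTITUTES = {"utilize": "use", "commence": "begin", "terminate": "end"}

def simplify(sentence: str) -> str:
    """Rule-based lexical simplification: swap known complex words."""
    return " ".join(SIMPLE_SUBSTITUTES.get(w.lower(), w) for w in sentence.split())

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

sentence = "We will commence the experiment and utilize the new device."
# Encode the original sentence together with its simplified version so the
# PrLM can predict the label with help from the auxiliary (simplified) view.
inputs = tokenizer(sentence, simplify(sentence), return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())
```

In practice the classification head would be fine-tuned on the target task, with the simplified view supplied for every training and test sentence.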
Related papers
- SentenceVAE: Enable Next-sentence Prediction for Large Language Models with Faster Speed, Higher Accuracy and Longer Context [49.9628075245959]
We present Sentence Variational Autoencoder (SentenceVAE), which includes a Sentence Encoder to compress multiple tokens in a sentence into a single token, and a Sentence Decoder to reconstruct it.
The proposed method can accelerate inference speed by 204%-365%, reduce perplexity (PPL) to 46%-75% of its original metric, and decrease memory overhead by 86%-91% for the equivalent context length.
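A toy sketch of the compress-then-reconstruct idea summarized above; the pooling scheme, hidden sizes, and decoder shown are assumptions for illustration rather than the SentenceVAE architecture.

```python
# Toy sketch of the compress-then-reconstruct idea: one "sentence token"
# per sentence. Hidden sizes, pooling, and the decoder are illustrative
# assumptions, not the SentenceVAE architecture.
import torch
import torch.nn as nn

class SentenceEncoder(nn.Module):
    """Compress a sequence of token embeddings into a single sentence token."""
    def __init__(self, d_model: int = 256):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:  # (B, T, D)
        query = tokens.mean(dim=1, keepdim=True)               # (B, 1, D)
        pooled, _ = self.attn(query, tokens, tokens)
        return pooled                                           # (B, 1, D)

class SentenceDecoder(nn.Module):
    """Reconstruct the token sequence from the single sentence token."""
    def __init__(self, d_model: int = 256, max_len: int = 16):
        super().__init__()
        self.proj = nn.Linear(d_model, max_len * d_model)
        self.max_len = max_len

    def forward(self, sent_token: torch.Tensor) -> torch.Tensor:
        out = self.proj(sent_token.squeeze(1))                   # (B, T*D)
        return out.view(-1, self.max_len, sent_token.size(-1))   # (B, T, D)

tokens = torch.randn(2, 16, 256)                     # fake token embeddings
recon = SentenceDecoder()(SentenceEncoder()(tokens))
print(recon.shape)                                   # torch.Size([2, 16, 256])
```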
arXiv Detail & Related papers (2024-08-01T15:45:19Z)
- An LLM-Enhanced Adversarial Editing System for Lexical Simplification [10.519804917399744]
Lexical Simplification aims to simplify text at the lexical level.
Existing methods rely heavily on annotated data.
We propose a novel LS method without parallel corpora.
arXiv Detail & Related papers (2024-02-22T17:04:30Z)
- Improving Factual Consistency of Text Summarization by Adversarially Decoupling Comprehension and Embellishment Abilities of LLMs [67.56087611675606]
Large language models (LLMs) generate summaries that are factually inconsistent with original articles.
These hallucinations are challenging to detect through traditional methods.
We propose DECENT, an adversarial DEcoupling method that disentangles the comprehension and embellishment abilities of LLMs.
arXiv Detail & Related papers (2023-10-30T08:40:16Z)
- Alleviating Over-smoothing for Unsupervised Sentence Representation [96.19497378628594]
We present a Simple method named Self-Contrastive Learning (SSCL) to alleviate this issue.
Our proposed method is quite simple and can be easily extended to various state-of-the-art models for performance boosting.
arXiv Detail & Related papers (2023-05-09T11:00:02Z)
- Sentence Simplification via Large Language Models [15.07021692249856]
Sentence Simplification aims to rephrase complex sentences into simpler sentences while retaining original meaning.
Large language models (LLMs) have demonstrated the ability to perform a variety of natural language processing tasks.
arXiv Detail & Related papers (2023-02-23T12:11:58Z)
- Sentence Representation Learning with Generative Objective rather than Contrastive Objective [86.01683892956144]
We propose a novel generative self-supervised learning objective based on phrase reconstruction.
Our generative objective delivers substantial performance improvements and outperforms the current state-of-the-art contrastive methods.
arXiv Detail & Related papers (2022-10-16T07:47:46Z)
- Span Fine-tuning for Pre-trained Language Models [43.352833140317486]
This paper presents a novel span fine-tuning method for PrLMs.
Any sentences processed by the PrLM will be segmented into multiple spans according to a pre-sampled dictionary.
Experiments on GLUE benchmark show that the proposed span fine-tuning method significantly enhances the PrLM.
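A small sketch of dictionary-driven span segmentation as described above; the greedy longest-match strategy and the dictionary entries are assumptions, not the paper's pre-sampled dictionary.

```python
# Greedy longest-match span segmentation against a pre-sampled dictionary
# (the matching strategy and dictionary entries are illustrative assumptions).
SPAN_DICT = {"new york", "machine learning", "language model"}
MAX_SPAN_WORDS = 3

def segment_into_spans(sentence: str):
    words = sentence.lower().split()
    spans, i = [], 0
    while i < len(words):
        match = None
        # Try the longest dictionary phrase starting at position i.
        for j in range(min(len(words), i + MAX_SPAN_WORDS), i, -1):
            if " ".join(words[i:j]) in SPAN_DICT:
                match = j
                break
        spans.append(" ".join(words[i:match]) if match else words[i])
        i = match if match else i + 1
    return spans

print(segment_into_spans("a pre-trained language model for machine learning tasks"))
# ['a', 'pre-trained', 'language model', 'for', 'machine learning', 'tasks']
```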
arXiv Detail & Related papers (2021-08-29T14:11:38Z)
- Controllable Text Simplification with Explicit Paraphrasing [88.02804405275785]
Text Simplification improves the readability of sentences through several rewriting transformations, such as lexical paraphrasing, deletion, and splitting.
Current simplification systems are predominantly sequence-to-sequence models that are trained end-to-end to perform all these operations simultaneously.
We propose a novel hybrid approach that leverages linguistically-motivated rules for splitting and deletion, and couples them with a neural paraphrasing model to produce varied rewriting styles.
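A rough sketch of such a hybrid pipeline; the splitting and deletion rules below are toy regular expressions, and the paraphraser is a stub standing in for a trained neural model.

```python
# Hybrid simplification sketch: rule-based splitting and deletion feed a
# neural paraphraser. The regex rules are toy examples and the paraphraser
# is a stub standing in for a trained seq2seq model.
import re

def split_and_delete(sentence: str) -> list[str]:
    """Toy linguistic rules: drop parentheticals, split at clause boundaries."""
    sentence = re.sub(r"\([^)]*\)", "", sentence)                # deletion rule
    sentence = re.sub(r"\s{2,}", " ", sentence)                  # tidy whitespace
    clauses = re.split(r",\s*(?:and|but|which)\s+", sentence)    # splitting rule
    return [c.strip().rstrip(".") + "." for c in clauses if c.strip()]

def neural_paraphrase(clause: str) -> str:
    """Placeholder for a trained model that rewrites a clause in simpler words."""
    return clause

def simplify(sentence: str) -> list[str]:
    return [neural_paraphrase(c) for c in split_and_delete(sentence)]

print(simplify("The committee (established in 1998) approved the plan, "
               "which surprised many observers."))
# ['The committee approved the plan.', 'surprised many observers.']
```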
arXiv Detail & Related papers (2020-10-21T13:44:40Z)
- Chinese Lexical Simplification [29.464388721085548]
There has been no prior research on the Chinese lexical simplification (CLS) task.
To circumvent difficulties in acquiring annotations, we manually create the first benchmark dataset for CLS.
We present five different types of methods as baselines to generate substitute candidates for the complex word.
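One common way to produce substitute candidates is to mask the complex word and let a pretrained masked language model propose fillers. The sketch below shows only that single strategy; the bert-base-chinese checkpoint and the example sentence are assumptions, and the paper's five baseline methods are not reproduced here.

```python
# Masked-LM substitute generation for a complex word (one illustrative
# strategy; a single [MASK] is used even though the target word spans
# two characters, which a real CLS system would handle more carefully).
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-chinese")

sentence = "他的文章十分晦涩。"   # "His article is very obscure."
complex_word = "晦涩"            # the complex word to be simplified
masked = sentence.replace(complex_word, fill.tokenizer.mask_token)

for cand in fill(masked, top_k=5):
    print(cand["token_str"], round(cand["score"], 3))
```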
arXiv Detail & Related papers (2020-10-14T12:55:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.