LSBert: A Simple Framework for Lexical Simplification
- URL: http://arxiv.org/abs/2006.14939v1
- Date: Thu, 25 Jun 2020 09:15:42 GMT
- Title: LSBert: A Simple Framework for Lexical Simplification
- Authors: Jipeng Qiang and Yun Li and Yi Zhu and Yunhao Yuan and Xindong Wu
- Abstract summary: We propose LSBert, a lexical simplification framework based on the pretrained representation model BERT.
We show that our system outputs lexical simplifications that are grammatically correct and semantically appropriate.
- Score: 32.75631197427934
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Lexical simplification (LS) aims to replace complex words in a given sentence
with their simpler alternatives of equivalent meaning, to simplify the
sentence. Recent unsupervised lexical simplification approaches rely only on
the complex word itself, regardless of the given sentence, to generate candidate
substitutions, which inevitably produces a large number of spurious candidates.
In this paper, we propose LSBert, a lexical simplification framework based on
the pretrained representation model BERT, that is capable of (1) making use of
the wider context both when detecting the words in need of simplification and
when generating substitute candidates, and (2) taking five high-quality
features into account for ranking candidates, including BERT prediction order,
a BERT-based language model, and the paraphrase database PPDB, in addition to
the word frequency and word similarity commonly used in other LS
methods. We show that our system outputs lexical simplifications that are
grammatically correct and semantically appropriate, and obtains clear
improvements over existing baselines, outperforming the state of the art by
29.8 accuracy points on three well-known benchmarks.
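As a rough illustration of the candidate-generation step described in the abstract, the sketch below masks the complex word and feeds the original sentence together with its masked copy to a BERT masked language model as a sentence pair, so the prediction for the masked position is conditioned on the full sentence context. This is a minimal sketch using the Hugging Face transformers library rather than the authors' released code; the model name, top_k value, and helper function are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of LSBert-style substitute generation with a BERT masked LM.
# Assumptions: bert-base-uncased as the model, top_k=10 candidates, and a toy
# filtering step; none of these specifics come from the paper itself.
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def generate_candidates(sentence: str, complex_word: str, top_k: int = 10):
    """Return up to top_k substitute candidates for complex_word in sentence."""
    # Copy of the sentence with the complex word replaced by [MASK].
    masked = sentence.replace(complex_word, tokenizer.mask_token, 1)
    # Feed the original and the masked sentence as a pair, so BERT predicts the
    # masked token while still seeing the complex word and its wider context.
    inputs = tokenizer(sentence, masked, return_tensors="pt")
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = logits[0, mask_pos[0]].softmax(dim=-1)
    top = torch.topk(probs, top_k + 5)  # a few extra so filtering still leaves top_k
    candidates = tokenizer.convert_ids_to_tokens(top.indices.tolist())
    # Drop the complex word itself and WordPiece continuation pieces.
    filtered = [c for c in candidates
                if c.lower() != complex_word.lower() and not c.startswith("##")]
    return filtered[:top_k]

print(generate_candidates("John composed these verses in 1995.", "composed"))
```

The candidates produced this way would then be ranked by combining the five features listed in the abstract (BERT prediction order, a BERT-based language model score, PPDB, word frequency, and word similarity), for example by averaging each candidate's rank under each feature.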
Related papers
- SimpLex: a lexical text simplification architecture [0.5156484100374059]
We present SimpLex, a novel simplification architecture for generating simplified English sentences.
The proposed architecture uses either word embeddings (i.e., Word2Vec) and perplexity, or sentence transformers (i.e., BERT, RoBERTa, and GPT2) and cosine similarity.
The solution is incorporated into a user-friendly and simple-to-use software.
arXiv Detail & Related papers (2023-04-14T08:52:31Z) - Sentence Simplification via Large Language Models [15.07021692249856]
Sentence Simplification aims to rephrase complex sentences into simpler sentences while retaining original meaning.
Large Language models (LLMs) have demonstrated the ability to perform a variety of natural language processing tasks.
arXiv Detail & Related papers (2023-02-23T12:11:58Z) - NapSS: Paragraph-level Medical Text Simplification via Narrative
Prompting and Sentence-matching Summarization [46.772517928718216]
We propose a summarize-then-simplify two-stage strategy, which we call NapSS.
NapSS identifies the relevant content to simplify while ensuring that the original narrative flow is preserved.
Our model achieves significantly better results than the seq2seq baseline on an English medical corpus.
arXiv Detail & Related papers (2023-02-11T02:20:25Z) - MANTIS at TSAR-2022 Shared Task: Improved Unsupervised Lexical
Simplification with Pretrained Encoders [31.64341800095214]
We present our contribution to the TSAR-2022 Shared Task on Lexical Simplification of the EMNLP 2022 Workshop on Text Simplification, Accessibility, and Readability.
Our approach builds on and extends LSBert, an unsupervised lexical simplification system with pretrained encoders.
Our best-performing system improves on LSBert by 5.9% accuracy and achieves second place out of 33 ranked solutions.
arXiv Detail & Related papers (2022-12-19T20:57:45Z) - SimpleBERT: A Pre-trained Model That Learns to Generate Simple Words [59.142185753887645]
In this work, we propose a continued pre-training method for text simplification.
We use a small-scale simple text dataset for continued pre-training and employ two methods to identify simple words.
We obtain SimpleBERT, which surpasses BERT in both lexical simplification and sentence simplification tasks.
arXiv Detail & Related papers (2022-04-16T11:28:01Z) - Short-Term Word-Learning in a Dynamically Changing Environment [63.025297637716534]
We show how to supplement an end-to-end ASR system with a word/phrase memory and a mechanism to access this memory to recognize the words and phrases correctly.
We demonstrate significant improvements in the detection rate of new words with only a minor increase in false alarms.
arXiv Detail & Related papers (2022-03-29T10:05:39Z) - LexSubCon: Integrating Knowledge from Lexical Resources into Contextual
Embeddings for Lexical Substitution [76.615287796753]
We introduce LexSubCon, an end-to-end lexical substitution framework based on contextual embedding models.
This is achieved by combining contextual information with knowledge from structured lexical resources.
Our experiments show that LexSubCon outperforms previous state-of-the-art methods on LS07 and CoInCo benchmark datasets.
arXiv Detail & Related papers (2021-07-11T21:25:56Z) - Enhancing Pre-trained Language Model with Lexical Simplification [41.34550924004487]
Lexical simplification (LS) is a recognized method to reduce such lexical diversity.
We propose a novel approach which can effectively improve the performance of PrLMs in text classification.
arXiv Detail & Related papers (2020-12-30T07:49:00Z) - Chinese Lexical Simplification [29.464388721085548]
There is no prior research on the Chinese lexical simplification (CLS) task.
To circumvent difficulties in acquiring annotations, we manually create the first benchmark dataset for CLS.
We present five different types of methods as baselines to generate substitute candidates for the complex word.
arXiv Detail & Related papers (2020-10-14T12:55:36Z) - ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification
Models with Multiple Rewriting Transformations [97.27005783856285]
This paper introduces ASSET, a new dataset for assessing sentence simplification in English.
We show that simplifications in ASSET are better at capturing characteristics of simplicity when compared to other standard evaluation datasets for the task.
arXiv Detail & Related papers (2020-05-01T16:44:54Z)