MANTIS at TSAR-2022 Shared Task: Improved Unsupervised Lexical
Simplification with Pretrained Encoders
- URL: http://arxiv.org/abs/2212.09855v1
- Date: Mon, 19 Dec 2022 20:57:45 GMT
- Title: MANTIS at TSAR-2022 Shared Task: Improved Unsupervised Lexical
Simplification with Pretrained Encoders
- Authors: Xiaofei Li, Daniel Wiechmann, Yu Qiao, Elma Kerz
- Abstract summary: We present our contribution to the TSAR-2022 Shared Task on Lexical Simplification of the EMNLP 2022 Workshop on Text Simplification, Accessibility, and Readability.
Our approach builds on and extends LSBert, an unsupervised lexical simplification system with pretrained encoders.
Our best-performing system improves LSBert by 5.9% in accuracy and achieves second place out of 33 ranked solutions.
- Score: 31.64341800095214
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we present our contribution to the TSAR-2022 Shared Task on
Lexical Simplification of the EMNLP 2022 Workshop on Text Simplification,
Accessibility, and Readability. Our approach builds on and extends the
unsupervised lexical simplification system with pretrained encoders (LSBert)
in the following ways: For the subtask of simplification candidate
selection, it utilizes a RoBERTa transformer language model and expands the
size of the generated candidate list. For subsequent substitution ranking, it
introduces a new feature weighting scheme and adopts a candidate filtering
method based on textual entailment to maximize semantic similarity between the
target word and its simplification. Our best-performing system improves LSBert
by 5.9% accuracy and achieves second place out of 33 ranked solutions.
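The two extensions described above can be approximated with off-the-shelf components. The sketch below is illustrative only: it uses a RoBERTa fill-mask pipeline to propose substitution candidates for the target word and a generic NLI cross-encoder (roberta-large-mnli) as a stand-in entailment filter; the model choices, candidate-list size, and threshold are assumptions, not the MANTIS configuration.

```python
# Hedged sketch of masked-LM candidate generation plus entailment filtering.
# Model names, top_k, and the 0.8 threshold are assumptions for illustration.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="roberta-base")            # candidate generator
nli = pipeline("text-classification", model="roberta-large-mnli")  # entailment filter

def simplification_candidates(sentence, target, top_k=30, threshold=0.8):
    """Propose and filter substitution candidates for `target` in `sentence`."""
    masked = sentence.replace(target, fill_mask.tokenizer.mask_token, 1)
    kept = []
    for p in fill_mask(masked, top_k=top_k):
        candidate = p["token_str"].strip()
        if candidate.lower() == target.lower() or not candidate.isalpha():
            continue
        simplified = sentence.replace(target, candidate, 1)
        # Keep a candidate only if the original sentence entails the simplified
        # one, as a proxy for preserving the target word's meaning.
        verdict = nli([{"text": sentence, "text_pair": simplified}])[0]
        if verdict["label"] == "ENTAILMENT" and verdict["score"] >= threshold:
            kept.append((candidate, p["score"]))
    return kept

print(simplification_candidates(
    "The committee will scrutinize the proposal.", "scrutinize"))
```

The paper's second extension, a reweighted feature-based ranking, would then be applied on top of the (candidate, score) pairs returned by such a generator.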
Related papers
- Balancing Diversity and Risk in LLM Sampling: How to Select Your Method and Parameter for Open-Ended Text Generation [60.493180081319785]
We propose a systematic way to estimate the intrinsic capacity of a truncation sampling method by considering the trade-off between diversity and risk at each decoding step.
Our work provides a comprehensive comparison between existing truncation sampling methods, as well as their recommended parameters as a guideline for users.
arXiv Detail & Related papers (2024-08-24T14:14:32Z)
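For readers unfamiliar with the method family analyzed in the entry above, here is a minimal sketch of one common truncation sampling scheme, nucleus (top-p) sampling; the toy distribution and the 0.9 threshold are illustrative and are not parameters recommended by that paper.

```python
import numpy as np

def top_p_sample(probs, p=0.9, rng=np.random.default_rng()):
    """Nucleus sampling: keep the smallest prefix of tokens (sorted by
    probability) whose cumulative mass reaches p, renormalize, and sample."""
    order = np.argsort(probs)[::-1]               # token ids, most probable first
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, p) + 1   # size of the nucleus
    nucleus = order[:cutoff]
    renormalized = probs[nucleus] / probs[nucleus].sum()
    return rng.choice(nucleus, p=renormalized)

# Toy next-token distribution over a six-word vocabulary (made-up numbers).
vocab = ["the", "a", "cat", "dog", "sat", "ran"]
probs = np.array([0.40, 0.25, 0.15, 0.10, 0.06, 0.04])
print(vocab[top_p_sample(probs, p=0.9)])
```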
- Sub-SA: Strengthen In-context Learning via Submodular Selective Annotation [4.846839863393725]
We propose Sub-SA (Submodular Selective Annotation), a submodular selective annotation method.
The aim of Sub-SA is to reduce annotation costs while improving the quality of in-context examples.
We also propose RPR (Reward and Penalty Regularization) to better balance the diversity and representativeness of the unlabeled dataset.
arXiv Detail & Related papers (2024-07-08T07:47:30Z)
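The summary above does not spell out Sub-SA's exact objective, so the sketch below only illustrates the general recipe behind submodular selective annotation: greedily maximize a facility-location function over pairwise example similarities to pick a small, representative annotation budget. The similarity matrix and budget here are made-up inputs, not the paper's setup.

```python
import numpy as np

def greedy_facility_location(similarity, budget):
    """Greedily select `budget` examples maximizing the facility-location
    objective f(S) = sum_i max_{j in S} similarity[i, j]."""
    n = similarity.shape[0]
    selected, coverage = [], np.zeros(n)
    for _ in range(budget):
        # Marginal gain of adding each candidate column j to the current set.
        gains = np.maximum(similarity, coverage[:, None]).sum(axis=0) - coverage.sum()
        gains[selected] = -np.inf                 # never re-pick a selected example
        best = int(np.argmax(gains))
        selected.append(best)
        coverage = np.maximum(coverage, similarity[:, best])
    return selected

# Toy usage: cosine similarities of five random embeddings, annotate two of them.
rng = np.random.default_rng(0)
emb = rng.normal(size=(5, 8))
emb /= np.linalg.norm(emb, axis=1, keepdims=True)
print(greedy_facility_location(emb @ emb.T, budget=2))
```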
- An LLM-Enhanced Adversarial Editing System for Lexical Simplification [10.519804917399744]
Lexical Simplification aims to simplify text at the lexical level.
Existing methods rely heavily on annotated data.
We propose a novel LS method without parallel corpora.
arXiv Detail & Related papers (2024-02-22T17:04:30Z)
- Lexical Simplification using multi level and modular approach [1.9559144041082446]
This paper describes the work done by our team "teamPN" for the English subtask.
We created a modular pipeline that combines modern transformer-based models with traditional NLP methods.
arXiv Detail & Related papers (2023-02-03T15:57:54Z)
- UniHD at TSAR-2022 Shared Task: Is Compute All We Need for Lexical Simplification? [2.931632009516441]
We describe a pipeline based on prompted GPT-3 responses, beating competing approaches by a wide margin in settings with few training instances.
Applied to the Spanish and Portuguese subsets, the pipeline achieves state-of-the-art results with only minor modifications to the original prompts.
arXiv Detail & Related papers (2023-01-04T18:59:20Z)
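As a rough illustration of what a prompted-LLM lexical simplification pipeline such as the one above can look like, the snippet below builds a few-shot prompt and parses comma-separated substitutes from the model's reply. The prompt wording and the `complete` callable are placeholders, not the UniHD team's released prompts or code.

```python
# Placeholder few-shot prompt; any text-completion backend can be plugged in
# via the `complete` callable.
FEW_SHOT = """\
Context: The treaty was ratified by both countries.
Difficult word: ratified
Simpler substitutes: approved, accepted, confirmed

Context: {sentence}
Difficult word: {word}
Simpler substitutes:"""

def simplify(sentence, word, complete):
    """`complete` is any prompt-to-text callable, e.g. a wrapper around an LLM API."""
    response = complete(FEW_SHOT.format(sentence=sentence, word=word))
    return [c.strip() for c in response.split(",") if c.strip()]

# Dummy completion function standing in for a real model call.
print(simplify("The committee will scrutinize the proposal.", "scrutinize",
               complete=lambda prompt: " examine, study, review"))
```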
- Hierarchical Sketch Induction for Paraphrase Generation [79.87892048285819]
We introduce Hierarchical Refinement Quantized Variational Autoencoders (HRQ-VAE), a method for learning decompositions of dense encodings.
We use HRQ-VAE to encode the syntactic form of an input sentence as a path through the hierarchy, allowing us to more easily predict syntactic sketches at test time.
arXiv Detail & Related papers (2022-03-07T15:28:36Z)
- Inducing Transformer's Compositional Generalization Ability via Auxiliary Sequence Prediction Tasks [86.10875837475783]
Systematic compositionality is an essential mechanism in human language, allowing the recombination of known parts to create novel expressions.
Existing neural models have been shown to lack this basic ability in learning symbolic structures.
We propose two auxiliary sequence prediction tasks that track the progress of function and argument semantics.
arXiv Detail & Related papers (2021-09-30T16:41:19Z)
- Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting [54.03356526990088]
We propose Sequence Span Rewriting (SSR) as a self-supervised sequence-to-sequence (seq2seq) pre-training objective.
SSR provides more fine-grained learning signals for text representations by supervising the model to rewrite imperfect spans to ground truth.
Our experiments with T5 models on various seq2seq tasks show that SSR can substantially improve seq2seq pre-training.
arXiv Detail & Related papers (2021-01-02T10:27:11Z)
- Controllable Text Simplification with Explicit Paraphrasing [88.02804405275785]
Text Simplification improves the readability of sentences through several rewriting transformations, such as lexical paraphrasing, deletion, and splitting.
Current simplification systems are predominantly sequence-to-sequence models that are trained end-to-end to perform all these operations simultaneously.
We propose a novel hybrid approach that leverages linguistically-motivated rules for splitting and deletion, and couples them with a neural paraphrasing model to produce varied rewriting styles.
arXiv Detail & Related papers (2020-10-21T13:44:40Z)
- LSBert: A Simple Framework for Lexical Simplification [32.75631197427934]
We propose LSBert, a lexical simplification framework based on the pretrained representation model BERT.
We show that our system outputs lexical simplifications that are grammatically correct and semantically appropriate.
arXiv Detail & Related papers (2020-06-25T09:15:42Z)
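LSBert ranks candidates with several features, one of which is corpus frequency. The toy ranker below mixes a masked-LM score with Zipf frequency from the wordfreq package; the equal weights and the division by 8 (the rough top of the Zipf scale) are illustrative choices, not LSBert's actual weighting scheme.

```python
# Toy feature-weighted ranking of substitution candidates, loosely in the
# spirit of LSBert's ranking step; the weights here are illustrative only.
from wordfreq import zipf_frequency

def rank_candidates(candidates, w_lm=0.5, w_freq=0.5, lang="en"):
    """`candidates` is a list of (word, lm_score) pairs, e.g. the output of a
    fill-mask pipeline; higher Zipf frequency usually signals a simpler word."""
    def score(item):
        word, lm_score = item
        return w_lm * lm_score + w_freq * zipf_frequency(word, lang) / 8.0
    return sorted(candidates, key=score, reverse=True)

print(rank_candidates([("examine", 0.21), ("study", 0.14), ("check", 0.09)]))
```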