Lexical Simplification using multi level and modular approach
- URL: http://arxiv.org/abs/2302.01823v1
- Date: Fri, 3 Feb 2023 15:57:54 GMT
- Title: Lexical Simplification using multi level and modular approach
- Authors: Nikita Katyal, Pawan Kumar Rajpoot
- Abstract summary: This paper explains the work done by our team "teamPN" for the English sub-task.
We created a modular pipeline which combines modern-day transformer-based models with traditional NLP methods.
- Score: 1.9559144041082446
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text Simplification is an ongoing problem in Natural Language
Processing, a solution to which has varied implications. In conjunction with
the TSAR-2022 Workshop @EMNLP2022, Lexical Simplification is the process of
reducing the lexical complexity of a text by replacing difficult words with
easier-to-read (or easier-to-understand) expressions while preserving the
original information and meaning. This paper explains the work done by our
team "teamPN" for the English sub-task. We created a multi-level, modular
pipeline which combines modern-day transformer-based models with traditional
NLP methods like paraphrasing and verb sense disambiguation, and in which the
target text is treated according to its semantics (Part-of-Speech tag). The
pipeline is multi-level because we utilize multiple source models to find
potential replacement candidates, and modular because we can switch the source
models and their weightage in the final re-ranking.
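The abstract describes the pipeline only at a high level. Below is a minimal sketch of the multi-level, modular idea: several source models propose substitution candidates for a complex word, routed by its Part-of-Speech tag, and a weighted sum re-ranks the merged candidates. The generator functions, weights, and toy lexicon are illustrative assumptions, not the authors' actual models or data.

```python
"""Minimal sketch of a multi-level, modular lexical-simplification pipeline.
All module names, weights, and the toy lexicon below are illustrative
assumptions, not the paper's actual implementation."""

from typing import Callable, Dict, List, Tuple

# A candidate generator maps (sentence, complex_word, pos_tag) to a list of
# (candidate, score) pairs. Real generators could be a masked language model,
# a paraphraser, or a verb-sense-disambiguation lexicon.
Generator = Callable[[str, str, str], List[Tuple[str, float]]]


def lexicon_generator(sentence: str, word: str, pos: str) -> List[Tuple[str, float]]:
    """Toy stand-in for a synonym lexicon such as WordNet."""
    toy_lexicon = {
        ("arduous", "ADJ"): [("hard", 0.9), ("difficult", 0.8)],
        ("commence", "VERB"): [("start", 0.9), ("begin", 0.85)],
    }
    return toy_lexicon.get((word, pos), [])


def masked_lm_generator(sentence: str, word: str, pos: str) -> List[Tuple[str, float]]:
    """Placeholder for a transformer fill-mask model scoring in-context substitutes."""
    toy_predictions = {
        "arduous": [("difficult", 0.7), ("long", 0.4)],
        "commence": [("begin", 0.75), ("open", 0.3)],
    }
    return toy_predictions.get(word, [])


# Modularity: the source models and their weights can be swapped per POS tag.
GENERATORS_BY_POS: Dict[str, List[Tuple[Generator, float]]] = {
    "ADJ": [(lexicon_generator, 0.4), (masked_lm_generator, 0.6)],
    "VERB": [(lexicon_generator, 0.5), (masked_lm_generator, 0.5)],
}


def simplify(sentence: str, word: str, pos: str, top_k: int = 3) -> List[str]:
    """Collect candidates from every source model and re-rank by weighted score."""
    scores: Dict[str, float] = {}
    for generator, weight in GENERATORS_BY_POS.get(pos, []):
        for candidate, score in generator(sentence, word, pos):
            scores[candidate] = scores.get(candidate, 0.0) + weight * score
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


if __name__ == "__main__":
    print(simplify("The climb was arduous.", "arduous", "ADJ"))
    # -> ['difficult', 'hard', 'long']
```

Swapping a source model or changing its weight only touches the GENERATORS_BY_POS table, which is the sense in which such a pipeline is modular.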
Related papers
- External Knowledge Augmented Polyphone Disambiguation Using Large
Language Model [3.372242769313867]
We introduce a novel method to solve the problem as a generation task.
The retrieval module incorporates external knowledge in the form of a multi-level semantic dictionary of Chinese polyphonic characters.
The generation module adopts a decoder-only Transformer architecture to induce the target text.
The post-processing module corrects the generated text into a valid result if needed.
arXiv Detail & Related papers (2023-12-19T08:00:10Z) - Multilingual Lexical Simplification via Paraphrase Generation [19.275642346073557]
We propose a novel multilingual LS method via paraphrase generation.
We regard paraphrasing as a zero-shot translation task within multilingual neural machine translation.
Our approach significantly surpasses BERT-based methods and a zero-shot GPT-3-based method on English, Spanish, and Portuguese.
arXiv Detail & Related papers (2023-07-28T03:47:44Z) - On Conditional and Compositional Language Model Differentiable Prompting [75.76546041094436]
Prompts have been shown to be an effective method to adapt a frozen Pretrained Language Model (PLM) to perform well on downstream tasks.
We propose a new model, Prompt Production System (PRopS), which learns to transform task instructions or input metadata, into continuous prompts.
arXiv Detail & Related papers (2023-07-04T02:47:42Z) - SimpLex: a lexical text simplification architecture [0.5156484100374059]
We present SimpLex, a novel simplification architecture for generating simplified English sentences.
The proposed architecture uses either word embeddings (i.e., Word2Vec) and perplexity, or sentence transformers (i.e., BERT, RoBERTa, and GPT2) and cosine similarity.
The solution is incorporated into a user-friendly and simple-to-use software.
arXiv Detail & Related papers (2023-04-14T08:52:31Z) - UniHD at TSAR-2022 Shared Task: Is Compute All We Need for Lexical
Simplification? [2.931632009516441]
We describe a pipeline based on prompted GPT-3 responses that beats competing approaches by a wide margin in settings with few training instances (see the hedged prompt sketch after this list).
When applied to the Spanish and Portuguese subsets, we achieve state-of-the-art results with only minor modifications to the original prompts.
arXiv Detail & Related papers (2023-01-04T18:59:20Z) - Finstreder: Simple and fast Spoken Language Understanding with Finite
State Transducers using modern Speech-to-Text models [69.35569554213679]
In Spoken Language Understanding (SLU) the task is to extract important information from audio commands.
This paper presents a simple method for embedding intents and entities into Finite State Transducers.
arXiv Detail & Related papers (2022-06-29T12:49:53Z) - Charformer: Fast Character Transformers via Gradient-based Subword
Tokenization [50.16128796194463]
We propose a new model inductive bias that learns a subword tokenization end-to-end as part of the model.
We introduce a soft gradient-based subword tokenization module (GBST) that automatically learns latent subword representations from characters.
We additionally introduce Charformer, a deep Transformer model that integrates GBST and operates on the byte level.
arXiv Detail & Related papers (2021-06-23T22:24:14Z) - SML: a new Semantic Embedding Alignment Transformer for efficient
cross-lingual Natural Language Inference [71.57324258813674]
The ability of Transformers to perform a variety of tasks with precision, such as question answering, Natural Language Inference (NLI) or summarising, has enabled them to be ranked among the best paradigms for addressing such tasks at present.
NLI is one of the best scenarios in which to test these architectures, due to the knowledge required to understand complex sentences and establish a relation between a hypothesis and a premise.
In this paper, we propose a new architecture, a siamese multilingual transformer, to efficiently align multilingual embeddings for Natural Language Inference.
arXiv Detail & Related papers (2021-03-17T13:23:53Z) - Controllable Text Simplification with Explicit Paraphrasing [88.02804405275785]
Text Simplification improves the readability of sentences through several rewriting transformations, such as lexical paraphrasing, deletion, and splitting.
Current simplification systems are predominantly sequence-to-sequence models that are trained end-to-end to perform all these operations simultaneously.
We propose a novel hybrid approach that leverages linguistically-motivated rules for splitting and deletion, and couples them with a neural paraphrasing model to produce varied rewriting styles.
arXiv Detail & Related papers (2020-10-21T13:44:40Z) - Deep Transformer based Data Augmentation with Subword Units for
Morphologically Rich Online ASR [0.0]
Deep Transformer models have proven to be particularly powerful in language modeling tasks for ASR.
Recent studies showed that a considerable part of the knowledge of neural network Language Models (LM) can be transferred to traditional n-grams by using neural text generation based data augmentation.
We show that although data augmentation with Transformer-generated text works well for isolating languages, it causes a vocabulary explosion in a morphologically rich language.
We propose a new method called subword-based neural text augmentation, where we retokenize the generated text into statistically derived subwords.
arXiv Detail & Related papers (2020-07-14T10:22:05Z) - Neural Syntactic Preordering for Controlled Paraphrase Generation [57.5316011554622]
Our work uses syntactic transformations to softly "reorder" the source sentence and guide our neural paraphrasing model.
First, given an input sentence, we derive a set of feasible syntactic rearrangements using an encoder-decoder model.
Next, we use each proposed rearrangement to produce a sequence of position embeddings, which encourages our final encoder-decoder paraphrase model to attend to the source words in a particular order.
arXiv Detail & Related papers (2020-05-05T09:02:25Z)
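The UniHD entry above reports lexical simplification driven purely by prompting GPT-3. The sketch below illustrates what such a prompt-and-parse step might look like; the prompt wording, the query_llm stub, and the canned response are hypothetical and are not taken from that paper.

```python
"""Hedged sketch of prompt-based lexical simplification, in the spirit of the
UniHD TSAR-2022 entry. The prompt text and query_llm stub are hypothetical."""

from typing import List


def build_prompt(sentence: str, complex_word: str, n: int = 5) -> str:
    """Assemble an instruction prompt asking a large language model for substitutes."""
    return (
        f"Sentence: {sentence}\n"
        f"Give {n} simpler English substitutes for the word \"{complex_word}\" "
        "that keep the sentence's meaning. Answer as a comma-separated list."
    )


def query_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; returns a canned response for this demo."""
    return "hard, tough, difficult, tiring, demanding"


def parse_candidates(response: str) -> List[str]:
    """Split the model's comma-separated answer into ranked candidates."""
    return [c.strip() for c in response.split(",") if c.strip()]


if __name__ == "__main__":
    prompt = build_prompt("The climb was arduous.", "arduous")
    print(parse_candidates(query_llm(prompt)))
    # -> ['hard', 'tough', 'difficult', 'tiring', 'demanding']
```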