Unsupervised Sentence Simplification via Dependency Parsing
- URL: http://arxiv.org/abs/2206.12261v1
- Date: Fri, 10 Jun 2022 07:55:25 GMT
- Title: Unsupervised Sentence Simplification via Dependency Parsing
- Authors: Vy Vo, Weiqing Wang and Wray Buntine
- Abstract summary: We propose a simple yet novel unsupervised sentence simplification system.
It harnesses parsing structures together with sentence embeddings to produce linguistically effective simplifications.
We establish the unsupervised state-of-the-art at 39.13 SARI on the TurkCorpus set and perform competitively against supervised baselines on various quality metrics.
- Score: 4.337513096197002
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text simplification is the task of rewriting a text so that it is readable
and easily understood. In this paper, we propose a simple yet novel
unsupervised sentence simplification system that harnesses parsing structures
together with sentence embeddings to produce linguistically effective
simplifications. This means our model is capable of introducing substantial
modifications to simplify a sentence while maintaining its original semantics
and adequate fluency. We establish the unsupervised state-of-the-art at 39.13
SARI on the TurkCorpus set and perform competitively against supervised baselines
on various quality metrics. Furthermore, we demonstrate our framework's
extensibility to other languages via a proof-of-concept on Vietnamese data.
Code for reproduction is published at https://github.com/isVy08/USDP.
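The abstract does not spell out the algorithm, so the following is only a rough illustration of how a parse structure and a similarity score could work together; it is not the authors' implementation (the linked repository holds that). The sketch assumes a hand-written toy dependency parse, uses bag-of-words cosine similarity as a crude stand-in for sentence embeddings, and the names `REMOVABLE`, `subtree`, `cosine`, `simplify`, and the 0.75 threshold are all hypothetical choices made for this example.

```python
import math
from collections import Counter

# Hand-written toy parse: (index, word, head index, relation); head 0 marks the root.
# Assumption: a real system would obtain this from a dependency parser.
PARSE = [
    (1, "The", 2, "det"),
    (2, "committee", 7, "nsubj"),
    (3, "which", 4, "nsubj"),
    (4, "met", 2, "acl:relcl"),
    (5, "on", 6, "case"),
    (6, "Monday", 4, "obl"),
    (7, "approved", 0, "root"),
    (8, "the", 9, "det"),
    (9, "budget", 7, "obj"),
]

# Hypothetical set of relations whose subtrees may be dropped.
REMOVABLE = {"acl:relcl", "obl", "advmod", "amod", "appos"}


def subtree(parse, root):
    """Return the indices of `root` and all of its descendants."""
    nodes = {root}
    changed = True
    while changed:
        changed = False
        for idx, _, head, _ in parse:
            if head in nodes and idx not in nodes:
                nodes.add(idx)
                changed = True
    return nodes


def cosine(a, b):
    """Bag-of-words cosine similarity (stand-in for sentence embeddings)."""
    va = Counter(w.lower() for w in a)
    vb = Counter(w.lower() for w in b)
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def simplify(parse, min_similarity=0.75):
    """Drop one removable subtree; keep the shortest candidate that stays similar."""
    source = [w for _, w, _, _ in parse]
    best = source
    for idx, _, _, rel in parse:
        if rel not in REMOVABLE:
            continue
        dropped = subtree(parse, idx)
        candidate = [w for i, w, _, _ in parse if i not in dropped]
        if cosine(source, candidate) >= min_similarity and len(candidate) < len(best):
            best = candidate
    return " ".join(best)


print(simplify(PARSE))
```

In this toy setup, deleting the relative clause yields a shorter sentence that still passes the similarity threshold, which is the kind of trade-off between simplicity and meaning preservation the abstract describes.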
Related papers
- A New Dataset and Empirical Study for Sentence Simplification in Chinese [50.0624778757462]
This paper introduces CSS, a new dataset for assessing sentence simplification in Chinese.
We collect manual simplifications from human annotators and perform data analysis to show the difference between English and Chinese sentence simplifications.
In the end, we explore whether Large Language Models can serve as high-quality Chinese sentence simplification systems by evaluating them on CSS.
arXiv Detail & Related papers (2023-06-07T06:47:34Z)
- Elaborative Simplification as Implicit Questions Under Discussion [51.17933943734872]
This paper proposes to view elaborative simplification through the lens of the Question Under Discussion (QUD) framework.
We show that explicitly modeling QUD provides essential understanding of elaborative simplification and how the elaborations connect with the rest of the discourse.
arXiv Detail & Related papers (2023-05-17T17:26:16Z)
- Context-Aware Document Simplification [3.2880869992413237]
We explore systems that use document context within the simplification process itself.
We achieve state-of-the-art performance on the document simplification task, even when not relying on plan-guidance.
arXiv Detail & Related papers (2023-05-10T16:06:36Z)
- Syntactic Complexity Identification, Measurement, and Reduction Through Controlled Syntactic Simplification [0.0]
We present a classical syntactic dependency-based approach to split and rephrase a compound and complex sentence into a set of simplified sentences.
The paper also introduces an algorithm to identify and measure a sentence's syntactic complexity.
This work was accepted and presented at the International Workshop on Learning with Knowledge Graphs (IWLKG) at the WSDM 2023 conference.
arXiv Detail & Related papers (2023-04-16T13:13:58Z)
- Text Revision by On-the-Fly Representation Optimization [76.11035270753757]
Current state-of-the-art methods formulate these tasks as sequence-to-sequence learning problems.
We present an iterative in-place editing approach for text revision, which requires no parallel data.
It achieves competitive and even better performance than state-of-the-art supervised methods on text simplification.
arXiv Detail & Related papers (2022-04-15T07:38:08Z)
- Automatic Lexical Simplification for Turkish [0.0]
We present the first automatic lexical simplification system for the Turkish language.
Recent text simplification efforts rely on manually crafted simplified corpora and comprehensive NLP tools.
We present a new text simplification pipeline based on pretrained representation model BERT together with morphological features to generate grammatically correct and semantically appropriate word-level simplifications.
arXiv Detail & Related papers (2022-01-15T15:58:44Z)
- Text Simplification for Comprehension-based Question-Answering [7.144235435987265]
We release Simple-SQuAD, a simplified version of the widely-used SQuAD dataset.
We benchmark the newly created corpus and perform an ablation study for examining the effect of the simplification process in the SQuAD-based question answering task.
arXiv Detail & Related papers (2021-09-28T18:48:00Z)
- Dependency Induction Through the Lens of Visual Perception [81.91502968815746]
We propose an unsupervised grammar induction model that leverages word concreteness and a structural vision-based heuristic to jointly learn constituency-structure and dependency-structure grammars.
Our experiments show that the proposed extension outperforms the current state-of-the-art visually grounded models in constituency parsing even with a smaller grammar size.
arXiv Detail & Related papers (2021-09-20T18:40:37Z)
- Controllable Text Simplification with Explicit Paraphrasing [88.02804405275785]
Text Simplification improves the readability of sentences through several rewriting transformations, such as lexical paraphrasing, deletion, and splitting.
Current simplification systems are predominantly sequence-to-sequence models that are trained end-to-end to perform all these operations simultaneously.
We propose a novel hybrid approach that leverages linguistically-motivated rules for splitting and deletion, and couples them with a neural paraphrasing model to produce varied rewriting styles.
arXiv Detail & Related papers (2020-10-21T13:44:40Z)
- Elaborative Simplification: Content Addition and Explanation Generation in Text Simplification [33.08519864889526]
We present the first data-driven study of content addition in text simplification.
We analyze how entities, ideas, and concepts are elaborated through the lens of contextual specificity.
Our results illustrate the complexities of elaborative simplification, suggesting many interesting directions for future work.
arXiv Detail & Related papers (2020-10-20T05:06:23Z)
- ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations [97.27005783856285]
This paper introduces ASSET, a new dataset for assessing sentence simplification in English.
We show that simplifications in ASSET are better at capturing characteristics of simplicity when compared to other standard evaluation datasets for the task.
arXiv Detail & Related papers (2020-05-01T16:44:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.