Discourse-Aware Text Simplification: From Complex Sentences to Linked
Propositions
- URL: http://arxiv.org/abs/2308.00425v1
- Date: Tue, 1 Aug 2023 10:10:59 GMT
- Title: Discourse-Aware Text Simplification: From Complex Sentences to Linked
Propositions
- Authors: Christina Niklaus, Matthias Cetto, André Freitas, Siegfried
Handschuh
- Abstract summary: Text Simplification (TS) aims to modify sentences in order to make them easier to process.
We present a discourse-aware TS approach that splits and rephrases complex English sentences.
We generate a semantic hierarchy of minimal propositions that puts a semantic layer on top of the simplified sentences.
- Score: 11.335080241393191
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sentences that present a complex syntax act as a major stumbling block for
downstream Natural Language Processing applications whose predictive quality
deteriorates with sentence length and complexity. The task of Text
Simplification (TS) may remedy this situation. It aims to modify sentences in
order to make them easier to process, using a set of rewriting operations, such
as reordering, deletion, or splitting. State-of-the-art syntactic TS approaches
suffer from two major drawbacks: first, they follow a very conservative
approach in that they tend to retain the input rather than transforming it, and
second, they ignore the cohesive nature of texts, where context spread across
clauses or sentences is needed to infer the true meaning of a statement. To
address these problems, we present a discourse-aware TS approach that splits
and rephrases complex English sentences within the semantic context in which
they occur. Based on a linguistically grounded transformation stage that uses
clausal and phrasal disembedding mechanisms, complex sentences are transformed
into shorter utterances with a simple canonical structure that can be easily
analyzed by downstream applications. With sentence splitting, we thus address a
TS task that has hardly been explored so far. Moreover, we introduce the notion
of minimality in this context, as we aim to decompose source sentences into a
set of self-contained minimal semantic units. To avoid breaking down the input
into a disjointed sequence of statements that is difficult to interpret because
important contextual information is missing, we incorporate the semantic
context between the split propositions in the form of hierarchical structures
and semantic relationships. In that way, we generate a semantic hierarchy of
minimal propositions that leads to a novel representation of complex assertions
that puts a semantic layer on top of the simplified sentences.
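To make the target representation more concrete, below is a minimal, hypothetical Python sketch of what such a semantic hierarchy of linked minimal propositions could look like for a single input sentence. The class names, the level field, and the relation labels are illustrative assumptions for exposition only, not the authors' implementation or output format.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical data model (not the paper's actual format): each minimal
# proposition is a short, self-contained sentence with a simple canonical
# structure; links record the semantic relation that connected the clauses
# in the original complex sentence, so contextual information is not lost.

@dataclass
class Proposition:
    text: str                       # simplified, self-contained sentence
    level: int                      # depth in the hierarchy (0 = core statement)
    links: List["Link"] = field(default_factory=list)

@dataclass
class Link:
    relation: str                   # e.g. "CONTRAST", "BACKGROUND" (illustrative labels)
    target: Proposition             # the contextual proposition being linked

# Example input: "Although the economy slowed, the company, which was founded
# in 1998, kept hiring."  ->  three minimal propositions, linked hierarchically.
core = Proposition("The company kept hiring.", level=0)
ctx_contrast = Proposition("The economy slowed.", level=1)
ctx_background = Proposition("The company was founded in 1998.", level=1)

core.links.append(Link("CONTRAST", ctx_contrast))
core.links.append(Link("BACKGROUND", ctx_background))

for link in core.links:
    print(f"{core.text} --{link.relation}--> {link.target.text}")
```

The point of such a structure is that splitting does not discard the cohesive context: each contextual proposition stays attached to the core statement through an explicit relation, so downstream applications can still recover how the pieces of the original sentence relate.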
Related papers
- A Unified View on Forgetting and Strong Equivalence Notions in Answer
Set Programming [14.342696862884704]
We introduce a novel relativized equivalence notion, which is able to capture all related notions from the literature.
We then introduce an operator that combines projection and a relaxation of (SP)-forgetting to obtain the relativized simplifications.
arXiv Detail & Related papers (2023-12-13T09:05:48Z)
- Conjunct Resolution in the Face of Verbal Omissions [51.220650412095665]
We propose a conjunct resolution task that operates directly on the text and makes use of a split-and-rephrase paradigm in order to recover the missing elements in the coordination structure.
We curate a large dataset, containing over 10K examples of naturally-occurring verbal omissions with crowd-sourced annotations.
We train various neural baselines for this task, and show that while our best method obtains decent performance, it leaves ample space for improvement.
arXiv Detail & Related papers (2023-05-26T08:44:02Z)
- Bridging Continuous and Discrete Spaces: Interpretable Sentence
Representation Learning via Compositional Operations [80.45474362071236]
It is unclear whether the compositional semantics of sentences can be directly reflected as compositional operations in the embedding space.
We propose InterSent, an end-to-end framework for learning interpretable sentence embeddings.
arXiv Detail & Related papers (2023-05-24T00:44:49Z)
- Elaborative Simplification as Implicit Questions Under Discussion [51.17933943734872]
This paper proposes to view elaborative simplification through the lens of the Question Under Discussion (QUD) framework.
We show that explicitly modeling QUD provides essential understanding of elaborative simplification and how the elaborations connect with the rest of the discourse.
arXiv Detail & Related papers (2023-05-17T17:26:16Z)
- Syntactic Complexity Identification, Measurement, and Reduction Through
Controlled Syntactic Simplification [0.0]
We present a classical syntactic dependency-based approach to split and rephrase compound and complex sentences into sets of simplified sentences.
The paper also introduces an algorithm to identify and measure a sentence's syntactic complexity.
This work was accepted and presented at the International Workshop on Learning with Knowledge Graphs (IWLKG) at the WSDM 2023 conference.
arXiv Detail & Related papers (2023-04-16T13:13:58Z)
- Context-Preserving Text Simplification [11.830061911323025]
We present a context-preserving text simplification (TS) approach that splits and rephrases complex English sentences into a semantic hierarchy of simplified sentences.
Using a set of linguistically principled transformation patterns, input sentences are converted into a hierarchical representation in the form of core sentences and accompanying contexts that are linked via rhetorical relations.
A comparative analysis with the annotations contained in the RST-DT shows that we are able to capture the contextual hierarchy between the split sentences with a precision of 89% and reach an average precision of 69% for the classification of the rhetorical relations that hold between them.
arXiv Detail & Related papers (2021-05-24T09:54:56Z)
- Controllable Text Simplification with Explicit Paraphrasing [88.02804405275785]
Text Simplification improves the readability of sentences through several rewriting transformations, such as lexical paraphrasing, deletion, and splitting.
Current simplification systems are predominantly sequence-to-sequence models that are trained end-to-end to perform all these operations simultaneously.
We propose a novel hybrid approach that leverages linguistically-motivated rules for splitting and deletion, and couples them with a neural paraphrasing model to produce varied rewriting styles.
arXiv Detail & Related papers (2020-10-21T13:44:40Z)
- XTE: Explainable Text Entailment [8.036150169408241]
Entailment is the task of determining whether a piece of text logically follows from another piece of text.
XTE - Explainable Text Entailment - is a novel composite approach for recognizing text entailment.
arXiv Detail & Related papers (2020-09-25T20:49:07Z)
- Explainable Prediction of Text Complexity: The Missing Preliminaries for
Text Simplification [13.447565774887215]
Text simplification reduces the language complexity of professional content for accessibility purposes.
End-to-end neural network models have been widely adopted to directly generate the simplified version of input text.
We show that text simplification can be decomposed into a compact pipeline of tasks to ensure the transparency and explainability of the process.
arXiv Detail & Related papers (2020-07-31T03:33:37Z)
- ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification
Models with Multiple Rewriting Transformations [97.27005783856285]
This paper introduces ASSET, a new dataset for assessing sentence simplification in English.
We show that simplifications in ASSET are better at capturing characteristics of simplicity when compared to other standard evaluation datasets for the task.
arXiv Detail & Related papers (2020-05-01T16:44:54Z)
- Syn-QG: Syntactic and Shallow Semantic Rules for Question Generation [49.671882751569534]
We develop Syn-QG, a set of transparent syntactic rules which transform declarative sentences into question-answer pairs.
We utilize PropBank argument descriptions and VerbNet state predicates to incorporate shallow semantic content.
In order to improve syntactic fluency and eliminate grammatically incorrect questions, we employ back-translation over the output of these syntactic rules.
arXiv Detail & Related papers (2020-04-18T19:57:39Z)
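As a rough illustration of the kind of surface-syntactic rule summarized above, the sketch below turns a simple declarative sentence into a subject-targeting question-answer pair. The regular-expression pattern and the question wording are simplified assumptions for exposition; the actual Syn-QG rules operate over parses, PropBank argument descriptions, and VerbNet predicates rather than raw strings.

```python
import re

# Toy subject-question rule (an assumption for illustration, not a Syn-QG rule):
# "X <verb-ed> Y."  ->  ("Who or what <verb-ed> Y?", "X")

def subject_wh_question(sentence: str):
    match = re.match(
        r"^(?P<subj>[A-Z][\w ]+?) (?P<verb>\w+ed) (?P<rest>.+)\.$",
        sentence,
    )
    if not match:
        return None  # rule does not apply to this sentence shape
    question = f"Who or what {match['verb']} {match['rest']}?"
    return question, match["subj"]

print(subject_wh_question("Marie Curie discovered polonium in 1898."))
# ('Who or what discovered polonium in 1898?', 'Marie Curie')
```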
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.