Revision for Concision: A Constrained Paraphrase Generation Task
- URL: http://arxiv.org/abs/2210.14257v1
- Date: Tue, 25 Oct 2022 18:20:54 GMT
- Title: Revision for Concision: A Constrained Paraphrase Generation Task
- Authors: Wenchuan Mu and Kwan Hui Lim
- Abstract summary: Revising for concision is a natural language processing task at the sentence level.
Revising for concision requires algorithms to use only necessary words to rewrite a sentence.
We curate and make available a benchmark parallel dataset that can depict revising for concision.
- Score: 0.3121997724420106
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Academic writing should be concise as concise sentences better keep the
readers' attention and convey meaning clearly. Writing concisely is
challenging, for writers often struggle to revise their drafts. We introduce
and formulate revising for concision as a natural language processing task at
the sentence level. Revising for concision requires algorithms to use only
necessary words to rewrite a sentence while preserving its meaning. The revised
sentence should be evaluated according to its word choice, sentence structure,
and organization. The revised sentence also needs to preserve the original
meaning and remain syntactically sound. To aid these efforts, we curate and make available a
benchmark parallel dataset that can depict revising for concision. The dataset
contains 536 pairs of sentences before and after revising, and all pairs are
collected from college writing centres. We also present and evaluate
approaches to this problem, which may assist researchers in this area.
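As a minimal illustration of what such a parallel dataset enables (this is our sketch, not the authors' evaluation protocol, and the example pair is invented), a draft sentence and its concise revision can be compared with simple token statistics: a compression ratio, and a crude word-overlap proxy for semantic retention.

```python
def concision_stats(before: str, after: str):
    """Compare a draft sentence with its concise revision.

    Returns the compression ratio (revised length / draft length) and
    the fraction of the revision's tokens that already appear in the
    draft (a rough proxy for semantic retention).
    """
    src = before.lower().split()
    rev = after.lower().split()
    compression = len(rev) / len(src)
    retained = sum(1 for w in rev if w in src) / len(rev)
    return compression, retained

# Invented example pair in the style of writing-centre revisions.
draft = "due to the fact that the experiment was a failure we stopped"
revision = "because the experiment failed we stopped"
ratio, overlap = concision_stats(draft, revision)
```

Real evaluation of word choice, structure, and organization of course requires more than token counts; this only shows the shape of the before/after data.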
Related papers
- CorpusStudio: Surfacing Emergent Patterns in a Corpus of Prior Work while Writing [30.18692324895119]
Many communities, including the scientific community, develop implicit writing norms.
It is difficult to both externalize this knowledge and apply it to one's own writing.
We propose two new writing support concepts that reify document and sentence-level patterns in a given text corpus.
arXiv Detail & Related papers (2025-03-16T10:16:21Z)
- Analysing Zero-Shot Readability-Controlled Sentence Simplification [54.09069745799918]
We investigate how different types of contextual information affect a model's ability to generate sentences with the desired readability.
Results show that all tested models struggle to simplify sentences due to models' limitations and characteristics of the source sentences.
Our experiments also highlight the need for better automatic evaluation metrics tailored to RCTS.
arXiv Detail & Related papers (2024-09-30T12:36:25Z)
- Fine-grained Controllable Text Generation through In-context Learning with Feedback [57.396980277089135]
We present a method for rewriting an input sentence to match specific values of nontrivial linguistic features, such as dependency depth.
In contrast to earlier work, our method uses in-context learning rather than finetuning, making it applicable in use cases where data is sparse.
arXiv Detail & Related papers (2024-06-17T08:55:48Z)
- To Revise or Not to Revise: Learning to Detect Improvable Claims for Argumentative Writing Support [20.905660642919052]
We explore the main challenges to identifying argumentative claims in need of specific revisions.
We propose a new sampling strategy based on revision distance.
We provide evidence that using contextual information and domain knowledge can further improve prediction results.
arXiv Detail & Related papers (2023-05-26T10:19:54Z)
- Conjunct Resolution in the Face of Verbal Omissions [51.220650412095665]
We propose a conjunct resolution task that operates directly on the text and makes use of a split-and-rephrase paradigm in order to recover the missing elements in the coordination structure.
We curate a large dataset, containing over 10K examples of naturally-occurring verbal omissions with crowd-sourced annotations.
We train various neural baselines for this task, and show that while our best method obtains decent performance, it leaves ample space for improvement.
arXiv Detail & Related papers (2023-05-26T08:44:02Z)
- PropSegmEnt: A Large-Scale Corpus for Proposition-Level Segmentation and Entailment Recognition [63.51569687229681]
We argue for the need to recognize the textual entailment relation of each proposition in a sentence individually.
We propose PropSegmEnt, a corpus of over 45K propositions annotated by expert human raters.
Our dataset structure resembles the tasks of (1) segmenting sentences within a document to the set of propositions, and (2) classifying the entailment relation of each proposition with respect to a different yet topically-aligned document.
arXiv Detail & Related papers (2022-12-21T04:03:33Z)
- SNaC: Coherence Error Detection for Narrative Summarization [73.48220043216087]
We introduce SNaC, a narrative coherence evaluation framework rooted in fine-grained annotations for long summaries.
We develop a taxonomy of coherence errors in generated narrative summaries and collect span-level annotations for 6.6k sentences across 150 book and movie screenplay summaries.
Our work provides the first characterization of coherence errors generated by state-of-the-art summarization models and a protocol for eliciting coherence judgments from crowd annotators.
arXiv Detail & Related papers (2022-05-19T16:01:47Z)
- A Dataset for Discourse Structure in Peer Review Discussions [33.621647816641925]
We show that discourse cues from rebuttals can shed light on the quality and interpretation of reviews.
This paper presents a new labeled dataset of 20k sentences contained in 506 review-rebuttal pairs in English, annotated by experts.
arXiv Detail & Related papers (2021-10-16T09:18:12Z)
- Reformulating Sentence Ordering as Conditional Text Generation [17.91448517871621]
We present Reorder-BART (RE-BART), a sentence ordering framework.
We reformulate the task as a conditional text-to-marker generation setup.
Our framework achieves state-of-the-art performance across six datasets in the Perfect Match Ratio (PMR) and Kendall's tau ($\tau$) metrics.
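A minimal sketch of the two sentence-ordering metrics named above (our illustrative implementation, not RE-BART's code), assuming predicted and gold orders are permutations of sentence indices:

```python
from itertools import combinations

def perfect_match_ratio(preds, golds):
    """Fraction of sequences whose predicted order matches the gold order exactly."""
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)

def kendall_tau(pred, gold):
    """Kendall's tau between a predicted and a gold order:
    1 - 2 * (number of inverted pairs) / (total number of pairs)."""
    pos = {s: i for i, s in enumerate(pred)}  # position of each sentence in the prediction
    n = len(gold)
    inversions = sum(1 for a, b in combinations(range(n), 2)
                     if pos[gold[a]] > pos[gold[b]])
    return 1 - 2 * inversions / (n * (n - 1) / 2)

gold = [0, 1, 2, 3]
pred = [0, 2, 1, 3]  # one adjacent pair swapped
tau = kendall_tau(pred, gold)
```

A single swapped pair out of six possible pairs yields tau = 1 - 2/6, so the metric degrades gracefully with near-correct orderings, while PMR only credits exact matches.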
arXiv Detail & Related papers (2021-04-14T18:16:47Z)
- Narrative Incoherence Detection [76.43894977558811]
We propose the task of narrative incoherence detection as a new arena for inter-sentential semantic understanding.
Given a multi-sentence narrative, the task is to decide whether any semantic discrepancies exist in the narrative flow.
arXiv Detail & Related papers (2020-12-21T07:18:08Z)
- Rewriting Meaningful Sentences via Conditional BERT Sampling and an application on fooling text classifiers [11.49508308643065]
Adversarial attack methods designed to deceive a text classifier change its prediction by modifying a few words or characters.
Few attempt to attack classifiers by rewriting a whole sentence, owing to the difficulty of sentence-level rephrasing and of setting criteria for legitimate rewriting.
In this paper, we explore the problem of creating adversarial examples with sentence-level rewriting.
We propose a new criterion for modification, called a sentence-level threat model. This criterion allows for both word- and sentence-level changes, and can be adjusted independently in two dimensions: semantic similarity and
arXiv Detail & Related papers (2020-10-22T17:03:13Z)
- Understanding Points of Correspondence between Sentences for Abstractive Summarization [39.7404761923196]
We present an investigation into fusing sentences drawn from a document by introducing the notion of points of correspondence.
We create a dataset containing the documents, source and fusion sentences, and human annotations of points of correspondence between sentences.
arXiv Detail & Related papers (2020-06-10T02:42:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences.