Simple-QE: Better Automatic Quality Estimation for Text Simplification
- URL: http://arxiv.org/abs/2012.12382v1
- Date: Tue, 22 Dec 2020 22:02:37 GMT
- Title: Simple-QE: Better Automatic Quality Estimation for Text Simplification
- Authors: Reno Kriz, Marianna Apidianaki, Chris Callison-Burch
- Abstract summary: We propose Simple-QE, a BERT-based quality estimation (QE) model adapted from prior summarization QE work.
We show that Simple-QE correlates well with human quality judgments.
We also show that we can adapt this approach to accurately predict the complexity of human-written texts.
- Score: 22.222195626377907
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Text simplification systems generate versions of texts that are easier to
understand for a broader audience. The quality of simplified texts is generally
estimated using metrics that compare to human references, which can be
difficult to obtain. We propose Simple-QE, a BERT-based quality estimation (QE)
model adapted from prior summarization QE work, and show that it correlates
well with human quality judgments. Simple-QE does not require human references,
which makes the model useful in a practical setting where users would need to
be informed about the quality of generated simplifications. We also show that
we can adapt this approach to accurately predict the complexity of
human-written texts.
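The abstract describes Simple-QE only at a high level: a BERT-based regressor that scores simplification quality without human references. As a rough illustration of what such a reference-less QE model can look like, the sketch below pairs a source sentence with its simplification and feeds them to a BERT encoder with a single-output regression head. The encoder name, the sentence-pair input format, and all hyperparameters are assumptions for illustration, not the authors' released implementation, and the head would still need fine-tuning on human quality judgments before its scores mean anything.
```python
# Minimal sketch of a BERT-based, reference-less quality-estimation regressor.
# NOT the authors' released Simple-QE code: the encoder name, the sentence-pair
# input format, and all hyperparameters are assumptions for illustration only.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-uncased"  # assumed; the paper only says "BERT-based"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
# num_labels=1 gives a single-output head; problem_type="regression" makes
# fine-tuning use a mean-squared-error loss against human quality scores.
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=1, problem_type="regression"
)
model.eval()

def predict_quality(source: str, simplification: str) -> float:
    """Return a scalar quality estimate for a simplification, with no references."""
    inputs = tokenizer(
        source,          # original (complex) sentence
        simplification,  # system output to be judged
        truncation=True,
        return_tensors="pt",
    )
    with torch.no_grad():
        score = model(**inputs).logits.squeeze(-1)
    return float(score)

# Example usage (meaningful scores require fine-tuning on human judgments first):
print(predict_quality(
    "The committee deliberated extensively before reaching a verdict.",
    "The committee talked for a long time before deciding.",
))
```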
Related papers
- Localizing Factual Inconsistencies in Attributable Text Generation [91.981439746404]
We introduce QASemConsistency, a new formalism for localizing factual inconsistencies in attributable text generation.
We first demonstrate the effectiveness of the QASemConsistency methodology for human annotation.
We then implement several methods for automatically detecting localized factual inconsistencies.
arXiv Detail & Related papers (2024-10-09T22:53:48Z)
- Analysing Zero-Shot Readability-Controlled Sentence Simplification [54.09069745799918]
We investigate how different types of contextual information affect a model's ability to generate sentences with the desired readability.
Results show that all tested models struggle to simplify sentences, due both to the models' limitations and to characteristics of the source sentences.
Our experiments also highlight the need for better automatic evaluation metrics tailored to RCTS.
arXiv Detail & Related papers (2024-09-30T12:36:25Z)
- PROXYQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models [72.57329554067195]
ProxyQA is an innovative framework dedicated to assessing long-form text generation.
It comprises in-depth human-curated meta-questions spanning various domains, each accompanied by specific proxy-questions with pre-annotated answers.
It assesses the generated content's quality through the evaluator's accuracy in addressing the proxy-questions.
arXiv Detail & Related papers (2024-01-26T18:12:25Z)
- Do Text Simplification Systems Preserve Meaning? A Human Evaluation via Reading Comprehension [22.154454849167077]
We introduce a human evaluation framework to assess whether simplified texts preserve meaning using reading comprehension questions.
We conduct a thorough human evaluation of texts produced by humans and by nine automatic systems.
arXiv Detail & Related papers (2023-12-15T14:26:06Z)
- Evaluation Metrics of Language Generation Models for Synthetic Traffic Generation Tasks [22.629816738693254]
We show that common NLG metrics, like BLEU, are not suitable for evaluating Synthetic Traffic Generation (STG).
We propose and evaluate several metrics designed to compare the generated traffic to the distribution of real user texts.
arXiv Detail & Related papers (2023-11-21T11:26:26Z)
- Simplicity Level Estimate (SLE): A Learned Reference-Less Metric for Sentence Simplification [8.479659578608233]
We propose a new learned evaluation metric (SLE) for sentence simplification.
SLE focuses on simplicity, outperforming almost all existing metrics in terms of correlation with human judgements.
arXiv Detail & Related papers (2023-10-12T09:49:10Z)
- LENS: A Learnable Evaluation Metric for Text Simplification [17.48383068498169]
We present LENS, a learnable evaluation metric for text simplification.
We also introduce Rank and Rate, a human evaluation framework that rates simplifications from several models in a list-wise manner.
arXiv Detail & Related papers (2022-12-19T18:56:52Z)
- Classifiers are Better Experts for Controllable Text Generation [63.17266060165098]
We show that the proposed method significantly outperforms the recent PPLM, GeDi, and DExperts approaches on perplexity (PPL) and on the sentiment accuracy of generated texts as measured by an external classifier.
At the same time, it is easier to implement and tune, and has significantly fewer restrictions and requirements.
arXiv Detail & Related papers (2022-05-15T12:58:35Z)
- Evaluating Factuality in Text Simplification [43.94402649899681]
We introduce a taxonomy of errors that we use to analyze both references drawn from standard simplification datasets and state-of-the-art model outputs.
We find that errors often appear in both, and that these errors are not captured by existing evaluation metrics.
arXiv Detail & Related papers (2022-04-15T17:37:09Z)
- Document-Level Text Simplification: Dataset, Criteria and Baseline [75.58761130635824]
We define and investigate a new task of document-level text simplification.
Based on Wikipedia dumps, we first construct a large-scale dataset named D-Wikipedia.
We propose a new automatic evaluation metric called D-SARI that is more suitable for the document-level simplification task.
arXiv Detail & Related papers (2021-10-11T08:15:31Z)
- Towards Question-Answering as an Automatic Metric for Evaluating the Content Quality of a Summary [65.37544133256499]
We propose a metric to evaluate the content quality of a summary using question-answering (QA).
We demonstrate the experimental benefits of QA-based metrics through an analysis of our proposed metric, QAEval.
arXiv Detail & Related papers (2020-10-01T15:33:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.