Eye Tracking Based Cognitive Evaluation of Automatic Readability Assessment Measures
- URL: http://arxiv.org/abs/2502.11150v2
- Date: Mon, 19 May 2025 14:16:00 GMT
- Title: Eye Tracking Based Cognitive Evaluation of Automatic Readability Assessment Measures
- Authors: Keren Gruteke Klein, Shachar Frenkel, Omer Shubi, Yevgeni Berzak
- Abstract summary: We propose an eye tracking-based cognitive framework which taps into a key aspect of readability: reading ease. We use this framework for evaluating a broad range of prominent readability measures, including two systems widely used in education. Our analyses suggest that existing readability measures are poor predictors of reading facilitation and reading ease, outperformed by word properties commonly used in psycholinguistics.
- Score: 1.2062053320259833
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated text readability prediction is widely used in many real-world scenarios. Over the past century, such measures have primarily been developed and evaluated on reading comprehension outcomes and on human annotations of text readability levels. In this work, we propose an alternative, eye tracking-based cognitive framework which directly taps into a key aspect of readability: reading ease. We use this framework for evaluating a broad range of prominent readability measures, including two systems widely used in education, by quantifying their ability to account for reading facilitation effects in text simplification, as well as text reading ease more broadly. Our analyses suggest that existing readability measures are poor predictors of reading facilitation and reading ease, outperformed by word properties commonly used in psycholinguistics, and in particular by surprisal.
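Surprisal, the measure that comes out on top in these analyses, is the negative log probability of a word given its preceding context. Below is a minimal sketch of how per-token surprisal can be computed with an off-the-shelf causal language model; the gpt2 checkpoint and the bits convention are illustrative assumptions, not the paper's exact setup.

```python
# Sketch: per-token surprisal from a causal LM (illustrative, not the paper's setup).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def token_surprisals(text: str) -> list[tuple[str, float]]:
    """Return (token, surprisal in bits) for every token after the first."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # log P(token_t | tokens_<t): predictions at position t-1 score token t
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    nats = -log_probs[torch.arange(len(targets)), targets]
    bits = nats / torch.log(torch.tensor(2.0))
    return [(tokenizer.decode(t), b.item()) for t, b in zip(targets, bits)]

for tok, s in token_surprisals("The physician administered the medication."):
    print(f"{tok!r}: {s:.2f} bits")
```

Under this framing, a text whose words carry lower average surprisal should, all else being equal, be easier to read.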
Related papers
- A Bayesian Approach to Harnessing the Power of LLMs in Authorship Attribution [57.309390098903]
Authorship attribution aims to identify the origin or author of a document.
Large Language Models (LLMs) with their deep reasoning capabilities and ability to maintain long-range textual associations offer a promising alternative.
Our results on the IMDb and blog datasets show an impressive 85% accuracy in one-shot authorship classification across ten authors.
arXiv Detail & Related papers (2024-10-29T04:14:23Z) - Analysing Zero-Shot Readability-Controlled Sentence Simplification [54.09069745799918]
We investigate how different types of contextual information affect a model's ability to generate sentences with the desired readability. Results show that all tested models struggle to simplify sentences, due both to model limitations and to characteristics of the source sentences. Our experiments also highlight the need for better automatic evaluation metrics tailored to RCTS.
arXiv Detail & Related papers (2024-09-30T12:36:25Z) - Free-text Rationale Generation under Readability Level Control [6.338124510580766]
We investigate how large language models (LLMs) perform rationale generation under the effects of readability level control.
We find that explanations are adaptable to such instruction, though the requested readability is often misaligned with the measured text complexity.
Our human annotators report a generally satisfactory impression of the rationales at all readability levels.
arXiv Detail & Related papers (2024-07-01T15:34:17Z) - Attention-aware semantic relevance predicting Chinese sentence reading [6.294658916880712]
This study proposes an "attention-aware" approach for computing contextual semantic relevance.
The attention-aware metrics of semantic relevance can more accurately predict fixation durations in Chinese reading tasks.
Our approach underscores the potential of these metrics to advance our comprehension of how humans understand and process language.
arXiv Detail & Related papers (2024-03-27T13:22:38Z) - Digital Comprehensibility Assessment of Simplified Texts among Persons with Intellectual Disabilities [2.446971913303003]
We conducted an evaluation study of text comprehensibility including participants with and without intellectual disabilities reading German texts on a tablet computer.
We explored four different approaches to measuring comprehensibility: multiple-choice comprehension questions, perceived difficulty ratings, response time, and reading speed.
For the target group of persons with intellectual disabilities, comprehension questions emerged as the most reliable measure, while analyzing reading speed provided valuable insights into participants' reading behavior.
arXiv Detail & Related papers (2024-02-20T15:37:08Z) - Previously on the Stories: Recap Snippet Identification for Story Reading [51.641565531840186]
We propose the first benchmark for this task, called Recap Snippet Identification, with a hand-crafted evaluation dataset.
Our experiments show that the proposed task is challenging for PLMs, LLMs, and the proposed methods, as it requires a deep understanding of the plot correlation between snippets.
arXiv Detail & Related papers (2024-02-11T18:27:14Z) - Improving Factual Consistency of News Summarization by Contrastive Preference Optimization [65.11227166319546]
Large language models (LLMs) often generate summaries that are factually inconsistent with the original articles. These hallucinations are challenging to detect through traditional methods. We propose Contrastive Preference Optimization (CPO) to disentangle the LLMs' propensities to generate faithful and fake content.
arXiv Detail & Related papers (2023-10-30T08:40:16Z) - Information Value: Measuring Utterance Predictability as Distance from Plausible Alternatives [4.446323294830542]
We present information value, a measure which quantifies the predictability of an utterance relative to a set of plausible alternatives.
We exploit its psychometric predictive power to investigate the dimensions of predictability that drive human comprehension behaviour.
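A rough sketch of the idea, not the authors' implementation: embed the observed utterance together with a set of plausible alternatives and score it by its mean distance to them. The sentence-transformers encoder and the hand-written alternatives below are assumptions for illustration; the paper obtains alternatives by sampling from a generation model.

```python
# Sketch of information value: mean embedding distance between an observed
# utterance and plausible alternatives (alternatives are hand-supplied here;
# the paper samples them from a generation model).
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice

def information_value(utterance: str, alternatives: list[str]) -> float:
    """Mean cosine distance from the utterance to its alternatives."""
    vecs = encoder.encode([utterance] + alternatives)
    u, alts = vecs[0], vecs[1:]
    cosine = alts @ u / (np.linalg.norm(alts, axis=1) * np.linalg.norm(u))
    return float(np.mean(1.0 - cosine))

alternatives = ["I'd love a coffee.", "A tea would be nice.", "Just water, thanks."]
print(information_value("Hand me the espresso machine.", alternatives))  # high: unexpected
print(information_value("A coffee, please.", alternatives))              # low: predictable
```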
arXiv Detail & Related papers (2023-10-20T17:25:36Z) - Generating Summaries with Controllable Readability Levels [67.34087272813821]
Several factors affect the readability level, such as the complexity of the text, its subject matter, and the reader's background knowledge.
Current text generation approaches lack refined control, resulting in texts that are not customized to readers' proficiency levels.
We develop three text generation techniques for controlling readability: instruction-based readability control, reinforcement learning to minimize the gap between requested and observed readability, and a decoding approach that uses look-ahead to estimate the readability of upcoming decoding steps.
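As a sketch of the third technique only (the textstat scorer, the linear penalty, and the hard-coded log-probabilities are illustrative assumptions, not the paper's setup): candidate continuations can be reranked by how far the readability of the resulting text falls from a requested grade level.

```python
# Sketch of readability look-ahead: rerank candidate continuations by the gap
# between the text's estimated grade level and a requested one.
# textstat's Flesch-Kincaid grade stands in for the paper's readability scorer.
import textstat

def lookahead_score(prefix: str, candidate: str, lm_logprob: float,
                    target_grade: float, alpha: float = 0.5) -> float:
    """Higher is better: LM likelihood minus a readability-mismatch penalty."""
    grade = textstat.flesch_kincaid_grade(prefix + candidate)
    return lm_logprob - alpha * abs(grade - target_grade)

prefix = "The committee reviewed the proposal and"
candidates = {  # log-probabilities are made up for illustration
    " promulgated an exhaustive adjudication of its provisions.": -9.1,
    " said yes to the plan.": -7.4,
}
best = max(candidates, key=lambda c: lookahead_score(prefix, c, candidates[c],
                                                     target_grade=6.0))
print(best)
```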
arXiv Detail & Related papers (2023-10-16T17:46:26Z) - LC-Score: Reference-less estimation of Text Comprehension Difficulty [0.0]
We present LC-Score, a simple approach for training a reference-less text comprehension metric for French texts.
Our objective is to quantitatively capture the extent to which a text conforms to the Langage Clair (LC, Clear Language) guidelines.
We explore two approaches: (i) using linguistically motivated indicators used to train statistical models, and (ii) neural learning directly from text leveraging pre-trained language models.
arXiv Detail & Related papers (2023-10-04T11:49:37Z) - Self-Verification Improves Few-Shot Clinical Information Extraction [73.6905567014859]
Large language models (LLMs) have shown the potential to accelerate clinical curation via few-shot in-context learning.
They still struggle with accuracy and interpretability, especially in mission-critical domains such as health.
Here, we explore a general mitigation framework using self-verification, which leverages the LLM to provide provenance for its own extraction and check its own outputs.
arXiv Detail & Related papers (2023-05-30T22:05:11Z) - Measuring Fairness of Text Classifiers via Prediction Sensitivity [63.56554964580627]
ACCUMULATED PREDICTION SENSITIVITY measures fairness in machine learning models based on the model's prediction sensitivity to perturbations in input features.
We show that the metric can be theoretically linked with a specific notion of group fairness (statistical parity) and individual fairness.
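A minimal sketch of the underlying quantity, under the assumption of a toy differentiable classifier: the gradient norm of the predicted probability with respect to the input features. The paper's accumulated metric additionally weights perturbations by their relation to protected attributes, which is omitted here.

```python
# Sketch: prediction sensitivity as the gradient norm of a classifier's output
# probability w.r.t. its input features (toy model; the paper's ACCUMULATED
# PREDICTION SENSITIVITY adds feature weights tied to protected attributes).
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1), nn.Sigmoid())

def prediction_sensitivity(x: torch.Tensor) -> float:
    """L2 norm of d p(y|x) / d x, evaluated at the point x."""
    x = x.clone().requires_grad_(True)
    model(x).squeeze().backward()
    return x.grad.norm().item()

print(prediction_sensitivity(torch.randn(4)))
```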
arXiv Detail & Related papers (2022-03-16T15:00:33Z) - Beyond the Tip of the Iceberg: Assessing Coherence of Text Classifiers [0.05857406612420462]
Large-scale, pre-trained language models achieve human-level and superhuman accuracy on existing language understanding tasks.
We propose evaluating systems through a novel measure of prediction coherence.
arXiv Detail & Related papers (2021-09-10T15:04:23Z) - Readability Research: An Interdisciplinary Approach [62.03595526230364]
We aim to provide a firm foundation and a comprehensive framework for readability research.
Readability refers to aspects of visual information design which impact information flow from the page to the reader.
These aspects can be modified on-demand, instantly improving the ease with which a reader can process and derive meaning from text.
arXiv Detail & Related papers (2021-07-20T16:52:17Z) - On the Interpretability and Significance of Bias Metrics in Texts: a PMI-based Approach [3.2326259807823026]
We analyze an alternative PMI-based metric to quantify biases in texts.
It can be expressed as a function of conditional probabilities, which provides a simple interpretation in terms of word co-occurrences.
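A minimal sketch under the standard PMI definition, with toy counts: because PMI(w, c) = log p(w, c) / (p(w) p(c)) = log p(w | c) - log p(w), the bias of a word between two context classes reduces to a difference of conditional log probabilities.

```python
# Sketch of a PMI-based bias score from co-occurrence counts (toy numbers).
# PMI(w, c1) - PMI(w, c2) simplifies to log p(w|c1) - log p(w|c2),
# so the word's marginal probability p(w) cancels out.
import math

def pmi_bias(count_w_c1: int, count_c1: int,
             count_w_c2: int, count_c2: int) -> float:
    """Positive values mean w co-occurs more with context class c1 than c2."""
    return math.log(count_w_c1 / count_c1) - math.log(count_w_c2 / count_c2)

# Toy example: "doctor" near male-context vs. female-context words.
print(pmi_bias(count_w_c1=120, count_c1=10_000,
               count_w_c2=60, count_c2=10_000))  # log(2) ≈ 0.69
```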
arXiv Detail & Related papers (2021-04-13T19:34:17Z) - Interactive Fiction Game Playing as Multi-Paragraph Reading Comprehension with Reinforcement Learning [94.50608198582636]
Interactive Fiction (IF) games with real human-written natural language texts provide a new natural evaluation for language understanding techniques.
We take a novel perspective on IF game solving and re-formulate it as Multi-Passage Reading Comprehension (MPRC) tasks.
arXiv Detail & Related papers (2020-10-05T23:09:20Z) - Improving Machine Reading Comprehension with Contextualized Commonsense
Knowledge [62.46091695615262]
We aim to extract commonsense knowledge to improve machine reading comprehension.
We propose to represent relations implicitly by situating structured knowledge in a context.
We employ a teacher-student paradigm to inject multiple types of contextualized knowledge into a student machine reader.
arXiv Detail & Related papers (2020-09-12T17:20:01Z) - Salience Estimation with Multi-Attention Learning for Abstractive Text
Summarization [86.45110800123216]
In the task of text summarization, salience estimation for words, phrases or sentences is a critical component.
We propose a Multi-Attention Learning framework which contains two new attention learning components for salience estimation.
arXiv Detail & Related papers (2020-04-07T02:38:56Z) - Text as Environment: A Deep Reinforcement Learning Text Readability
Assessment Model [2.826553192869411]
The efficiency of state-of-the-art text readability assessment models can be further improved using deep reinforcement learning models.
On the Weebit and Cambridge Exams datasets, the model matches the accuracy of state-of-the-art systems such as BERT-based readability models while requiring significantly less input text.
arXiv Detail & Related papers (2019-12-12T13:54:09Z)