Iterative Prompt Refinement for Dyslexia-Friendly Text Summarization Using GPT-4o
- URL: http://arxiv.org/abs/2602.22524v1
- Date: Thu, 26 Feb 2026 01:46:40 GMT
- Title: Iterative Prompt Refinement for Dyslexia-Friendly Text Summarization Using GPT-4o
- Authors: Samay Bhojwani, Swarnima Kain, Lisong Xu
- Abstract summary: This paper presents an empirical study on dyslexia-friendly text summarization using an iterative prompt-based refinement pipeline built on GPT-4o. We evaluate the pipeline on approximately 2,000 news article samples, applying a readability target of Flesch Reading Ease >= 90. Results show that the majority of summaries meet the readability threshold within four attempts, with many succeeding on the first try.
- Score: 1.4401311275746886
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dyslexia affects approximately 10% of the global population and presents persistent challenges in reading fluency and text comprehension. While existing assistive technologies address visual presentation, linguistic complexity remains a substantial barrier to equitable access. This paper presents an empirical study on dyslexia-friendly text summarization using an iterative prompt-based refinement pipeline built on GPT-4o. We evaluate the pipeline on approximately 2,000 news article samples, applying a readability target of Flesch Reading Ease >= 90. Results show that the majority of summaries meet the readability threshold within four attempts, with many succeeding on the first try. A composite score combining readability and semantic fidelity shows stable performance across the dataset, ranging from 0.13 to 0.73 with a typical value near 0.55. These findings establish an empirical baseline for accessibility-driven NLP summarization and motivate further human-centered evaluation with dyslexic readers.
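The pipeline described in the abstract can be sketched as a simple retry loop: score a candidate summary with the Flesch Reading Ease formula and re-prompt with feedback until the score reaches the target or the attempt budget (four, per the paper) is exhausted. The `summarize` callable and the feedback wording below are hypothetical stand-ins for the GPT-4o call, and the syllable counter is a rough vowel-group approximation, so this is a minimal sketch rather than the authors' implementation.

```python
import re

def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words).
    Syllables are approximated by counting vowel groups, so scores are rough."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    syllables = sum(max(1, len(re.findall(r"[aeiouy]+", w.lower()))) for w in words)
    return 206.835 - 1.015 * (n_words / sentences) - 84.6 * (syllables / n_words)

def refine_until_readable(article, summarize, target=90.0, max_attempts=4):
    """Re-prompt the summarizer until the summary meets the readability target."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        summary = summarize(article, feedback)  # stand-in for a GPT-4o call
        score = flesch_reading_ease(summary)
        if score >= target:
            return summary, score, attempt
        feedback = f"Previous summary scored FRE {score:.1f}; use shorter words and sentences."
    return summary, score, max_attempts
```

With a summarizer that already produces short, monosyllabic sentences, the loop succeeds on the first attempt; a summarizer stuck on polysyllabic output exhausts the budget.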
Related papers
- On Text Simplification Metrics and General-Purpose LLMs for Accessible Health Information, and A Potential Architectural Advantage of The Instruction-Tuned LLM class [2.568600731965475]
This report empirically assesses the performance of two major classes of general-purpose language models. We identify a potential architectural advantage in the instruction-tuned Mistral 24B and the reasoning-augmented QWen2.5 32B. Mistral exhibits a tempered lexical simplification strategy that enhances readability across a suite of metrics and the simplification-specific formula SARI. QWen also attains enhanced readability performance, but its operational strategy shows a disconnect in balancing readability and accuracy.
arXiv Detail & Related papers (2025-11-07T08:53:39Z)
- Diversity Boosts AI-Generated Text Detection [51.56484100374058]
DivEye is a novel framework that captures how unpredictability fluctuates across a text using surprisal-based features. Our method outperforms existing zero-shot detectors by up to 33.2% and achieves competitive performance with fine-tuned baselines.
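As a toy illustration of surprisal-based diversity features (not DivEye's actual feature set), one can score tokens with a smoothed unigram model standing in for an LM and summarize how surprisal fluctuates across a text:

```python
import math
from collections import Counter

def surprisal_profile(tokens, counts, total):
    """Per-token surprisal -log2 p(token) under a smoothed unigram model,
    a cheap stand-in for an LM's surprisal."""
    vocab = len(counts)
    return [-math.log2((counts.get(t, 0) + 1) / (total + vocab)) for t in tokens]

def diversity_features(text, ref_counts, ref_total):
    """Summarize how surprisal fluctuates: mean level, variance, and the
    average jump between adjacent tokens ("burstiness")."""
    toks = text.lower().split()
    s = surprisal_profile(toks, ref_counts, ref_total)
    mean = sum(s) / len(s)
    var = sum((x - mean) ** 2 for x in s) / len(s)
    burst = sum(abs(a - b) for a, b in zip(s, s[1:])) / max(1, len(s) - 1)
    return {"mean_surprisal": mean, "surprisal_variance": var, "burstiness": burst}
```

Repetitive text yields a flat surprisal profile (variance near zero), while varied text fluctuates, which is the kind of signal such detectors exploit.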
arXiv Detail & Related papers (2025-09-23T10:21:22Z)
- SEval-Ex: A Statement-Level Framework for Explainable Summarization Evaluation [2.0027415925559966]
Current approaches face a trade-off between performance and interpretability. We present SEval-Ex, a framework that bridges this gap by decomposing summarization evaluation into atomic statements. Experiments on the SummEval benchmark demonstrate that SEval-Ex achieves state-of-the-art performance, reaching 0.580 correlation with human consistency judgments.
arXiv Detail & Related papers (2025-05-04T20:16:08Z)
- Group-Adaptive Threshold Optimization for Robust AI-Generated Text Detection [58.419940585826744]
We introduce FairOPT, an algorithm for group-specific threshold optimization in probabilistic AI-text detectors. We partition data into subgroups based on attributes (e.g., text length and writing style) and apply FairOPT to learn decision thresholds for each group, reducing discrepancy across groups. Our framework paves the way for more robust classification in AI-generated content detection via post-processing.
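The group-specific-threshold idea can be illustrated with a toy grid search (illustrative only, not the FairOPT algorithm itself): for each subgroup, pick the threshold with the best in-group accuracy, so the decision boundary adapts to that group's score distribution instead of using one global cutoff.

```python
def group_thresholds(scores, labels, groups, grid=None):
    """Toy per-group threshold search: for each subgroup, pick the grid
    threshold with the best in-group classification accuracy."""
    grid = grid or [i / 100 for i in range(101)]
    out = {}
    for g in set(groups):
        pts = [(s, y) for s, y, gg in zip(scores, labels, groups) if gg == g]
        def acc(t):
            return sum((s >= t) == bool(y) for s, y in pts) / len(pts)
        out[g] = max(grid, key=acc)  # max() keeps the first best threshold
    return out
```

On data where one group's scores cluster high and another's cluster near the middle, the learned thresholds land in different ranges, which is the effect group-adaptive thresholding is after.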
arXiv Detail & Related papers (2025-02-06T21:58:48Z)
- SummExecEdit: A Factual Consistency Benchmark in Summarization with Executable Edits [31.98028879922584]
We introduce SummExecEdit, a novel pipeline and benchmark for assessing models on their ability to both detect factual errors and provide accurate explanations. The top-performing model, Claude3-Opus, achieves a joint detection-and-explanation score of only 0.49 on our benchmark. We identify four primary types of explanation errors, with 45.4% of them involving a focus on completely unrelated parts of the summary.
arXiv Detail & Related papers (2024-12-17T23:26:44Z)
- Measuring and Modifying the Readability of English Texts with GPT-4 [2.532202013576547]
We find readability estimates from GPT-4 Turbo and GPT-4o mini exhibit relatively high correlation with human judgments.
In a pre-registered human experiment, we ask whether Turbo can reliably make text easier or harder to read.
We find evidence to support this hypothesis, though considerable variance in human judgments remains unexplained.
arXiv Detail & Related papers (2024-10-17T21:04:28Z)
- Towards Enhancing Coherence in Extractive Summarization: Dataset and Experiments with LLMs [70.15262704746378]
We propose a systematically created, human-annotated dataset of coherent summaries for five publicly available datasets, paired with natural-language user feedback.
Preliminary experiments with Falcon-40B and Llama-2-13B show significant performance improvements (10% Rouge-L) in terms of producing coherent summaries.
arXiv Detail & Related papers (2024-07-05T20:25:04Z)
- Leveraging deep active learning to identify low-resource mobility functioning information in public clinical notes [0.157286095422595]
We present the first public annotated dataset specifically targeting the Mobility domain of the International Classification of Functioning, Disability and Health (ICF).
We utilize the National NLP Clinical Challenges (n2c2) research dataset to construct a pool of candidate sentences using keyword expansion.
Our final dataset consists of 4,265 sentences with a total of 11,784 entities, including 5,511 Action entities, 5,328 Mobility entities, 306 Assistance entities, and 639 Quantification entities.
arXiv Detail & Related papers (2023-11-27T15:53:11Z)
- Generating Summaries with Controllable Readability Levels [67.34087272813821]
Several factors affect the readability level, such as the complexity of the text, its subject matter, and the reader's background knowledge.
Current text generation approaches lack refined control, resulting in texts that are not customized to readers' proficiency levels.
We develop three text generation techniques for controlling readability: instruction-based readability control, reinforcement learning to minimize the gap between requested and observed readability, and a decoding approach that uses look-ahead to estimate the readability of upcoming decoding steps.
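The look-ahead technique in particular can be illustrated with a toy sketch: roll out a short continuation for each candidate next phrase and keep the candidate whose rollout best matches a readability proxy. The `continue_fn` rollout and the average-word-length proxy below are illustrative assumptions, not the paper's method.

```python
def avg_word_length(text):
    """Crude readability proxy: shorter words tend to be easier to read."""
    words = text.split()
    return sum(map(len, words)) / max(1, len(words))

def lookahead_pick(prefix, candidates, continue_fn, target_awl=4.0, depth=3):
    """For each candidate next phrase, roll out a short continuation and keep
    the candidate whose rollout stays closest to the target readability proxy."""
    best, best_gap = None, float("inf")
    for cand in candidates:
        rollout = continue_fn(prefix + " " + cand, depth)  # hypothetical LM rollout
        gap = abs(avg_word_length(prefix + " " + cand + " " + rollout) - target_awl)
        if gap < best_gap:
            best, best_gap = cand, gap
    return best
```

Given a low readability target, the selector prefers the shorter candidate, mirroring how look-ahead steers decoding toward the requested level.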
arXiv Detail & Related papers (2023-10-16T17:46:26Z)
- Automatically measuring speech fluency in people with aphasia: first achievements using read-speech data [55.84746218227712]
This study assesses the relevance of a signal-processing algorithm, initially developed in the field of language acquisition, for the automatic measurement of speech fluency.
arXiv Detail & Related papers (2023-08-09T07:51:40Z)
- Retrieval-based Disentangled Representation Learning with Natural Language Supervision [61.75109410513864]
We present Vocabulary Disentangled Retrieval (VDR), a retrieval-based framework that harnesses natural language as proxies of the underlying data variation to drive disentangled representation learning.
Our approach employs a bi-encoder model to represent both data and natural language in a shared vocabulary space, enabling the model to distinguish intrinsic dimensions that capture characteristics within the data through their natural-language counterparts, thereby achieving disentanglement.
arXiv Detail & Related papers (2022-12-15T10:20:42Z)
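The vocabulary-space idea can be illustrated with a toy bi-encoder sketch (the tiny vocabulary, `encode` function, and dot-product retrieval below are illustrative assumptions, not VDR's implementation): queries and documents are embedded as vectors whose coordinates are vocabulary words, so each dimension is directly interpretable.

```python
from collections import Counter

VOCAB = ["red", "blue", "car", "bike", "fast", "slow"]  # toy vocabulary axes

def encode(text):
    """Project text onto vocabulary dimensions: each coordinate is a term
    count, so every dimension is directly readable as a word."""
    counts = Counter(text.lower().split())
    return [counts.get(w, 0) for w in VOCAB]

def similarity(a, b):
    """Dot product in the shared vocabulary space."""
    return sum(x * y for x, y in zip(a, b))

query = encode("fast red car")
docs = {"d1": encode("a red car parked"), "d2": encode("a slow blue bike")}
best = max(docs, key=lambda k: similarity(query, docs[k]))
```

Because every dimension corresponds to a word, one can read off why a document was retrieved, which is the interpretability benefit such vocabulary-space representations aim for.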
This list is automatically generated from the titles and abstracts of the papers in this site.