Text as Environment: A Deep Reinforcement Learning Text Readability
Assessment Model
- URL: http://arxiv.org/abs/1912.05957v4
- Date: Mon, 23 Oct 2023 13:22:09 GMT
- Title: Text as Environment: A Deep Reinforcement Learning Text Readability
Assessment Model
- Authors: Hamid Mohammadi, Seyed Hossein Khasteh, Tahereh Firoozi, Taha Samavati
- Abstract summary: The efficiency of state-of-the-art text readability assessment models can be further improved using deep reinforcement learning models.
A comparison of the model on WeeBit and Cambridge Exams with state-of-the-art models, such as the BERT text readability model, shows that it achieves state-of-the-art accuracy with a significantly smaller amount of input text than other models.
- Score: 2.826553192869411
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Evaluating the readability of a text can significantly facilitate the precise
expression of information in written form. The formulation of text readability
assessment involves the identification of meaningful properties of the text
regardless of its length. Sophisticated features and models are used to
evaluate the comprehensibility of texts accurately. Despite this, the problem
of assessing texts' readability efficiently remains relatively untouched. The
efficiency of state-of-the-art text readability assessment models can be
further improved using deep reinforcement learning models. Using a hard
attention-based active inference technique, the proposed approach makes
efficient use of input text and computational resources. Through the use of
semi-supervised signals, the reinforcement learning model uses the minimum
amount of text needed to determine a text's readability. A comparison of the
model on WeeBit and Cambridge Exams with state-of-the-art models, such as the
BERT text readability model, shows that it is capable of achieving
state-of-the-art accuracy with a significantly smaller amount of input text
than other models.
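The "text as environment" framing can be read as a sequential decision process: at each step the agent consumes one sentence and chooses either to read more or to stop and emit a readability label, trading classification reward against a per-sentence reading cost. The sketch below is a minimal toy illustration of that episode loop, not the authors' implementation; the environment class, reward scheme, and the average-word-length heuristic standing in for a learned hard-attention policy are all assumptions made for illustration.

```python
# Toy sketch of "text as environment": an agent reads a text sentence by
# sentence and may stop early to classify its readability. All names, the
# reward scheme, and the scoring heuristic are illustrative assumptions,
# not the paper's actual model.

class ReadabilityEnv:
    """Presents a text one sentence per step (the 'environment')."""

    def __init__(self, sentences, label, read_cost=0.1):
        self.sentences = sentences
        self.label = label          # gold readability class: 0 = easy, 1 = hard
        self.read_cost = read_cost  # penalty for each extra sentence read
        self.pos = 0

    def observe(self):
        return self.sentences[self.pos]

    def step(self, action, guess=None):
        """'read' advances to the next sentence; 'stop' ends with a guess."""
        if action == "read":
            self.pos += 1
            done = self.pos >= len(self.sentences)
            return -self.read_cost, done
        reward = 1.0 if guess == self.label else -1.0
        return reward, True


def avg_word_len(sentence):
    words = sentence.split()
    return sum(len(w) for w in words) / max(len(words), 1)


def run_episode(env, midpoint=5.0, confidence_gap=1.5):
    """Hard-attention-style loop: stop as soon as the evidence is decisive.

    A learned policy would decide when to stop; here a fixed threshold on
    average word length plays that role.
    """
    total = 0.0
    while True:
        score = avg_word_len(env.observe())
        if abs(score - midpoint) > confidence_gap:
            # Decisive evidence: stop early and classify.
            guess = int(score > midpoint)
            reward, _ = env.step("stop", guess)
            return guess, env.pos + 1, total + reward
        reward, done = env.step("read")
        total += reward
        if done:
            # Text exhausted: classify with the last observation.
            guess = int(score > midpoint)
            total += 1.0 if guess == env.label else -1.0
            return guess, env.pos, total


easy = ReadabilityEnv(["The cat sat on a mat.", "It was a big red mat."], label=0)
print(run_episode(easy))  # stops after reading only the first sentence
```

In this toy run the agent classifies the easy text after a single sentence, mirroring the abstract's claim that a decisive stopping policy lets the model use far less input text than models that always read the full document.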
Related papers
- Analysing Zero-Shot Readability-Controlled Sentence Simplification [54.09069745799918]
We investigate how different types of contextual information affect a model's ability to generate sentences with the desired readability.
Results show that all tested models struggle to simplify sentences due to models' limitations and characteristics of the source sentences.
Our experiments also highlight the need for better automatic evaluation metrics tailored to RCTS.
arXiv Detail & Related papers (2024-09-30T12:36:25Z)
- Towards Unified Multi-granularity Text Detection with Interactive Attention [56.79437272168507]
"Detect Any Text" is an advanced paradigm that unifies scene text detection, layout analysis, and document page detection into a cohesive, end-to-end model.
A pivotal innovation in DAT is the across-granularity interactive attention module, which significantly enhances the representation learning of text instances.
Tests demonstrate that DAT achieves state-of-the-art performances across a variety of text-related benchmarks.
arXiv Detail & Related papers (2024-05-30T07:25:23Z)
- Efficiently Leveraging Linguistic Priors for Scene Text Spotting [63.22351047545888]
This paper proposes a method that leverages linguistic knowledge from a large text corpus to replace the traditional one-hot encoding used in auto-regressive scene text spotting and recognition models.
We generate text distributions that align well with scene text datasets, removing the need for in-domain fine-tuning.
Experimental results show that our method not only improves recognition accuracy but also enables more accurate localization of words.
arXiv Detail & Related papers (2024-02-27T01:57:09Z)
- How Well Do Text Embedding Models Understand Syntax? [50.440590035493074]
The ability of text embedding models to generalize across a wide range of syntactic contexts remains under-explored.
Our findings reveal that existing text embedding models have not sufficiently addressed these syntactic understanding challenges.
We propose strategies to augment the generalization ability of text embedding models in diverse syntactic scenarios.
arXiv Detail & Related papers (2023-11-14T08:51:00Z)
- Prompt-based Learning for Text Readability Assessment [0.4757470449749875]
We propose the novel adaptation of a pre-trained seq2seq model for readability assessment.
We prove that a seq2seq model can be adapted to discern which of two given texts is more difficult (pairwise comparison).
arXiv Detail & Related papers (2023-02-25T18:39:59Z)
- STA: Self-controlled Text Augmentation for Improving Text Classifications [2.9669250132689164]
A number of text augmentation techniques have emerged in the field of Natural Language Processing (NLP).
We introduce a state-of-the-art approach for Self-Controlled Text Augmentation (STA).
Our approach tightly controls the generation process by introducing a self-checking procedure to ensure that generated examples retain the semantic content of the original text.
arXiv Detail & Related papers (2023-02-24T17:54:12Z)
- A Transfer Learning Based Model for Text Readability Assessment in German [4.550811027560416]
We propose a new model for text complexity assessment for German text based on transfer learning.
The best model, based on the BERT pre-trained language model, achieved a Root Mean Square Error (RMSE) of 0.483.
arXiv Detail & Related papers (2022-07-13T15:15:44Z)
- Evaluating Factuality in Text Simplification [43.94402649899681]
We introduce a taxonomy of errors that we use to analyze both references drawn from standard simplification datasets and state-of-the-art model outputs.
We find that both references and model outputs often contain errors that are not captured by existing evaluation metrics.
arXiv Detail & Related papers (2022-04-15T17:37:09Z)
- Improving Text Generation with Student-Forcing Optimal Transport [122.11881937642401]
We propose using optimal transport (OT) to match the sequences generated in training and testing modes.
An extension is also proposed to improve the OT learning, based on the structural and contextual information of the text sequences.
The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.
arXiv Detail & Related papers (2020-10-12T19:42:25Z)
- Towards Accurate Scene Text Recognition with Semantic Reasoning Networks [52.86058031919856]
We propose a novel end-to-end trainable framework named semantic reasoning network (SRN) for accurate scene text recognition.
GSRM is introduced to capture global semantic context through multi-way parallel transmission.
Results on 7 public benchmarks, including regular text, irregular text and non-Latin long text, verify the effectiveness and robustness of the proposed method.
arXiv Detail & Related papers (2020-03-27T09:19:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.