Hierarchical Ranking Neural Network for Long Document Readability Assessment
- URL: http://arxiv.org/abs/2511.21473v1
- Date: Wed, 26 Nov 2025 15:05:22 GMT
- Title: Hierarchical Ranking Neural Network for Long Document Readability Assessment
- Authors: Yurui Zheng, Yijun Chen, Shaohong Zhang
- Abstract summary: This paper proposes a bidirectional readability assessment mechanism that captures contextual information to identify regions with rich semantic information in the text. These sentence-level labels are then used to assist in predicting the overall readability level of the document. A pairwise sorting algorithm is introduced to model the ordinal relationship between readability levels through label subtraction.
- Score: 2.160803573421694
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Readability assessment aims to evaluate the reading difficulty of a text. In recent years, while deep learning technology has been gradually applied to readability assessment, most approaches fail to consider either the length of the text or the ordinal relationship of readability labels. This paper proposes a bidirectional readability assessment mechanism that captures contextual information to identify regions with rich semantic information in the text, thereby predicting the readability level of individual sentences. These sentence-level labels are then used to assist in predicting the overall readability level of the document. Additionally, a pairwise sorting algorithm is introduced to model the ordinal relationship between readability levels through label subtraction. Experimental results on Chinese and English datasets demonstrate that the proposed model achieves competitive performance and outperforms other baseline models.
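The abstract mentions modeling the ordinal relationship between readability levels "through label subtraction" in a pairwise setting. As a rough illustration only, the sketch below shows one common way such an idea can be realized: the sign of the label difference for a pair of texts becomes the ranking target, and a logistic loss penalizes score differences that disagree with it. The function names, the choice of logistic loss, and the skipping of tied pairs are assumptions made for this example and are not taken from the paper.

```python
import math
from itertools import combinations


def pairwise_targets(labels):
    """Return (i, j, diff) for every pair, where diff = labels[i] - labels[j].

    The subtraction keeps the ordinal information: the sign says which text
    is harder, the magnitude says by how many levels. (Hypothetical helper.)
    """
    return [(i, j, labels[i] - labels[j])
            for i, j in combinations(range(len(labels)), 2)]


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def pairwise_logistic_loss(scores, labels):
    """Mean logistic ranking loss over pairs whose labels differ.

    Assumption: pairs with equal labels carry no ordering signal and are skipped.
    """
    losses = []
    for i, j, diff in pairwise_targets(labels):
        if diff == 0:
            continue
        # Positive margin means the predicted order agrees with the label order.
        margin = sign(diff) * (scores[i] - scores[j])
        losses.append(math.log1p(math.exp(-margin)))
    return sum(losses) / len(losses) if losses else 0.0
```

In this sketch, predicted scores that follow the label order yield a lower loss than scores in the reverse order, which is the behavior a pairwise ordinal objective is meant to encourage.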
Related papers
- Target-oriented Multimodal Sentiment Classification with Counterfactual-enhanced Debiasing [5.0175188046562385]
Target-oriented multimodal sentiment classification seeks to predict sentiment polarity for specific targets from image-text pairs. Existing works often over-rely on textual content and fail to consider dataset biases. We introduce a novel counterfactual-enhanced debiasing framework to reduce such spurious correlations.
arXiv Detail & Related papers (2025-09-11T05:40:53Z)
- GUM-SAGE: A Novel Dataset and Approach for Graded Entity Salience Prediction [12.172254885579706]
Graded entity salience assigns entities scores that reflect their relative importance in a text. We introduce a novel approach for graded entity salience that combines the strengths of both approaches. Our approach shows stronger correlation with scores based on human summaries and alignments, and outperforms existing techniques.
arXiv Detail & Related papers (2025-04-15T01:26:14Z)
- Readability Formulas, Systems and LLMs are Poor Predictors of Reading Ease [4.868319717279586]
We focus on a fundamental and understudied aspect of readability, real-time reading ease, captured with online reading measures using eye tracking. Applying this evaluation to prominent traditional readability formulas, modern machine learning systems and commercial systems used in education suggests that they are all poor predictors of reading ease in English.
arXiv Detail & Related papers (2025-02-16T14:51:44Z)
- Generating Summaries with Controllable Readability Levels [67.34087272813821]
Several factors affect the readability level, such as the complexity of the text, its subject matter, and the reader's background knowledge.
Current text generation approaches lack refined control, resulting in texts that are not customized to readers' proficiency levels.
We develop three text generation techniques for controlling readability: instruction-based readability control, reinforcement learning to minimize the gap between requested and observed readability, and a decoding approach that uses look-ahead to estimate the readability of upcoming decoding steps.
arXiv Detail & Related papers (2023-10-16T17:46:26Z)
- Conditional Supervised Contrastive Learning for Fair Text Classification [59.813422435604025]
We study learning fair representations that satisfy a notion of fairness known as equalized odds for text classification via contrastive learning.
Specifically, we first theoretically analyze the connections between learning representations with a fairness constraint and conditional supervised contrastive objectives.
arXiv Detail & Related papers (2022-05-23T17:38:30Z)
- Multitask Learning for Class-Imbalanced Discourse Classification [74.41900374452472]
We show that a multitask approach can improve Micro F1-score by 7% over current state-of-the-art benchmarks.
We also offer a comparative review of additional techniques proposed to address resource-poor problems in NLP.
arXiv Detail & Related papers (2021-01-02T07:13:41Z)
- Hierarchical Bi-Directional Self-Attention Networks for Paper Review Rating Recommendation [81.55533657694016]
We propose a Hierarchical bi-directional self-attention Network framework (HabNet) for paper review rating prediction and recommendation.
Specifically, we leverage the hierarchical structure of the paper reviews with three levels of encoders: sentence encoder (level one), intra-review encoder (level two) and inter-review encoder (level three).
We are able to identify useful predictors to make the final acceptance decision, as well as to help discover the inconsistency between numerical review ratings and text sentiment conveyed by reviewers.
arXiv Detail & Related papers (2020-11-02T08:07:50Z)
- Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn <sentiment, aspect> joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z)
- A Survey on Text Classification: From Shallow to Deep Learning [83.47804123133719]
The last decade has seen a surge of research in this area due to the unprecedented success of deep learning.
This paper fills the gap by reviewing the state-of-the-art approaches from 1961 to 2021.
We create a taxonomy for text classification according to the text involved and the models used for feature extraction and classification.
arXiv Detail & Related papers (2020-08-02T00:09:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.