Related papers: Deep learning for sentence clustering in essay grading support

URL: http://arxiv.org/abs/2104.11556v1
Date: Fri, 23 Apr 2021 12:32:51 GMT
Title: Deep learning for sentence clustering in essay grading support
Authors: Li-Hsin Chang, Iiro Rastas, Sampo Pyysalo, Filip Ginter
Abstract summary: We introduce two datasets of undergraduate student essays in Finnish, manually annotated for salient arguments on the sentence level. We evaluate several deep-learning embedding methods for their suitability to sentence clustering in support of essay grading.
Score: 1.7259867886009057
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Essays as a form of assessment test student knowledge on a deeper level than short answer and multiple-choice questions. However, the manual evaluation of essays is time- and labor-consuming. Automatic clustering of essays, or their fragments, prior to manual evaluation presents a possible solution to reducing the effort required in the evaluation process. Such clustering presents numerous challenges due to the variability and ambiguity of natural language. In this paper, we introduce two datasets of undergraduate student essays in Finnish, manually annotated for salient arguments on the sentence level. Using these datasets, we evaluate several deep-learning embedding methods for their suitability to sentence clustering in support of essay grading. We find that the choice of the most suitable method depends on the nature of the exam question and the answers, with deep-learning methods being capable of, but not guaranteeing, better performance over simpler methods based on lexical overlap.

Related papers

Question Generation for Assessing Early Literacy Reading Comprehension [7.209603871896803]
We propose a novel approach for generating comprehension questions geared to K-2 English learners. Our method ensures complete coverage of the underlying material and adaptation to the learner's specific proficiencies.
arXiv Detail & Related papers (2025-07-30T06:27:02Z)
Monocle: Hybrid Local-Global In-Context Evaluation for Long-Text Generation with Uncertainty-Based Active Learning [63.531262595858]
A divide-and-conquer approach breaks the comprehensive evaluation task into localized scoring tasks, followed by a final global assessment. We introduce a hybrid in-context learning approach that leverages human annotations to enhance the performance of both local and global evaluations. Finally, we develop an uncertainty-based active learning algorithm that efficiently selects data samples for human annotation.
arXiv Detail & Related papers (2025-05-26T16:39:41Z)
Improve LLM-based Automatic Essay Scoring with Linguistic Features [46.41475844992872]
This paper develops a scoring system capable of handling essays across diverse prompts. Existing methods typically fall into two categories: supervised feature-based approaches and large language model (LLM)-based methods.
arXiv Detail & Related papers (2025-02-13T17:09:52Z)
Likelihood as a Performance Gauge for Retrieval-Augmented Generation [78.28197013467157]
We show that likelihoods serve as an effective gauge for language model performance. We propose two methods that use question likelihood as a gauge for selecting and constructing prompts that lead to better performance.
arXiv Detail & Related papers (2024-11-12T13:14:09Z)
Paired Completion: Flexible Quantification of Issue-framing at Scale with LLMs [0.41436032949434404]
We develop and rigorously evaluate new detection methods for issue framing and narrative analysis within large text datasets. We show that issue framing can be reliably and efficiently detected in large corpora with only a few examples of either perspective on a given issue.
arXiv Detail & Related papers (2024-08-19T07:14:15Z)
Automating Easy Read Text Segmentation [2.7309692684728617]
Easy Read text is one of the main forms of access to information for people with reading difficulties. One of the key characteristics of this type of text is the requirement to split sentences into smaller grammatical segments. We study novel methods for the task, leveraging masked and generative language models, along with constituent parsing.
arXiv Detail & Related papers (2024-06-17T12:25:25Z)
GLIMPSE: Pragmatically Informative Multi-Document Summarization for Scholarly Reviews [25.291384842659397]
We introduce GLIMPSE, a summarization method designed to offer a concise yet comprehensive overview of scholarly reviews. Unlike traditional consensus-based methods, GLIMPSE extracts both common and unique opinions from the reviews.
arXiv Detail & Related papers (2024-06-11T15:27:01Z)
Graded Relevance Scoring of Written Essays with Dense Retrieval [4.021352247826289]
We propose a novel approach for graded relevance scoring of written essays that employs dense retrieval encoders. We leverage Contriever, which is pre-trained with contrastive learning and has demonstrated performance comparable to supervised dense retrieval models. Our method establishes a new state of the art in the task-specific scenario, while its extension to the cross-task scenario performs on par with the state-of-the-art model for that scenario.
arXiv Detail & Related papers (2024-05-08T16:37:58Z)
One-Shot Learning as Instruction Data Prospector for Large Language Models [108.81681547472138]
Nuggets uses one-shot learning to select high-quality instruction data from extensive datasets. We show that instruction tuning with the top 1% of examples curated by Nuggets substantially outperforms conventional methods employing the entire dataset.
arXiv Detail & Related papers (2023-12-16T03:33:12Z)
Teach model to answer questions after comprehending the document [1.4264737570114632]
Multi-choice Machine Reading (MRC) is a challenging extension of Natural Language Processing (NLP). We propose a two-stage knowledge distillation method that teaches the model to better comprehend the document by dividing the MRC task into two separate stages.
arXiv Detail & Related papers (2023-07-18T02:38:02Z)
RankCSE: Unsupervised Sentence Representations Learning via Learning to Rank [54.854714257687334]
We propose a novel approach, RankCSE, for unsupervised sentence representation learning. It incorporates ranking consistency and ranking distillation with contrastive learning into a unified framework. An extensive set of experiments is conducted on both semantic textual similarity (STS) and transfer (TR) tasks.
arXiv Detail & Related papers (2023-05-26T08:27:07Z)
Comparing Methods for Extractive Summarization of Call Centre Dialogue [77.34726150561087]
We experimentally compare several extractive summarisation methods by using them to produce summaries of calls and evaluating these summaries objectively. We found that TopicSum and Lead-N outperform the other summarisation methods, whilst BERTSum received comparatively lower scores in both subjective and objective evaluations.
arXiv Detail & Related papers (2022-09-06T13:16:02Z)
Learning Opinion Summarizers by Selecting Informative Reviews [81.47506952645564]
We collect a large dataset of summaries paired with user reviews for over 31,000 products, enabling supervised training. The content of many reviews is not reflected in the human-written summaries, and, thus, the summarizer trained on random review subsets hallucinates. We formulate the task as jointly learning to select informative subsets of reviews and summarizing the opinions expressed in these subsets.
arXiv Detail & Related papers (2021-09-09T15:01:43Z)
Toward the Understanding of Deep Text Matching Models for Information Retrieval [72.72380690535766]
This paper tests whether existing deep text matching methods satisfy some fundamental constraints in information retrieval. Specifically, four constraints are used in our study, i.e., the term frequency constraint, the term discrimination constraint, the length normalization constraints, and the TF-length constraint. Experimental results on LETOR 4.0 and MS Marco show that all the investigated deep text matching methods satisfy the above constraints with statistically high probability.
arXiv Detail & Related papers (2021-08-16T13:33:15Z)
Hierarchical Bi-Directional Self-Attention Networks for Paper Review Rating Recommendation [81.55533657694016]
We propose a Hierarchical bi-directional self-attention Network framework (HabNet) for paper review rating prediction and recommendation. Specifically, we leverage the hierarchical structure of the paper reviews with three levels of encoders: sentence encoder (level one), intra-review encoder (level two) and inter-review encoder (level three). We are able to identify useful predictors to make the final acceptance decision, as well as to help discover the inconsistency between numerical review ratings and text sentiment conveyed by reviewers.
arXiv Detail & Related papers (2020-11-02T08:07:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.