Deep learning for sentence clustering in essay grading support
- URL: http://arxiv.org/abs/2104.11556v1
- Date: Fri, 23 Apr 2021 12:32:51 GMT
- Title: Deep learning for sentence clustering in essay grading support
- Authors: Li-Hsin Chang, Iiro Rastas, Sampo Pyysalo, Filip Ginter
- Abstract summary: We introduce two datasets of undergraduate student essays in Finnish, manually annotated for salient arguments on the sentence level.
We evaluate several deep-learning embedding methods for their suitability to sentence clustering in support of essay grading.
- Score: 1.7259867886009057
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Essays as a form of assessment test student knowledge on a deeper level than
short answer and multiple-choice questions. However, the manual evaluation of
essays is time- and labor-consuming. Automatic clustering of essays, or their
fragments, prior to manual evaluation presents a possible solution to reducing
the effort required in the evaluation process. Such clustering presents
numerous challenges due to the variability and ambiguity of natural language.
In this paper, we introduce two datasets of undergraduate student essays in
Finnish, manually annotated for salient arguments on the sentence level. Using
these datasets, we evaluate several deep-learning embedding methods for their
suitability to sentence clustering in support of essay grading. We find that
the choice of the most suitable method depends on the nature of the exam
question and the answers, with deep-learning methods being capable of, but not
guaranteeing better performance over simpler methods based on lexical overlap.
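As a rough illustration of what a "simpler method based on lexical overlap" could look like, the sketch below groups sentences whose word sets overlap strongly. It is not the authors' method: the Jaccard measure, the single-link grouping, the 0.5 threshold, the English example sentences, and the function names `tokens`, `jaccard`, and `cluster` are all assumptions chosen for illustration.

```python
import re
from itertools import combinations


def tokens(sentence):
    """Lower-case a sentence and return its set of word tokens."""
    return set(re.findall(r"\w+", sentence.lower()))


def jaccard(a, b):
    """Lexical overlap of two sentences: |A & B| / |A | B| over word sets."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def cluster(sentences, threshold=0.5):
    """Single-link grouping: sentences join a cluster when their overlap
    with any member reaches the (hypothetical) threshold."""
    parent = list(range(len(sentences)))

    def find(i):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(sentences)), 2):
        if jaccard(sentences[i], sentences[j]) >= threshold:
            parent[find(j)] = find(i)

    groups = {}
    for i in range(len(sentences)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Made-up example answers (not taken from the paper's datasets).
answers = [
    "The cell membrane controls transport",
    "The cell membrane controls the transport",
    "Mitochondria produce energy",
]
clusters = cluster(answers)
```

An embedding-based variant would replace `jaccard` with a similarity between sentence vectors; which of the two works better is, per the abstract, dependent on the exam question and the answers.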
Related papers
- Likelihood as a Performance Gauge for Retrieval-Augmented Generation [78.28197013467157]
We show that likelihoods serve as an effective gauge for language model performance.
We propose two methods that use question likelihood as a gauge for selecting and constructing prompts that lead to better performance.
arXiv Detail & Related papers (2024-11-12T13:14:09Z)
- Paired Completion: Flexible Quantification of Issue-framing at Scale with LLMs [0.41436032949434404]
We develop and rigorously evaluate new detection methods for issue framing and narrative analysis within large text datasets.
We show that issue framing can be reliably and efficiently detected in large corpora with only a few examples of either perspective on a given issue.
arXiv Detail & Related papers (2024-08-19T07:14:15Z)
- Automating Easy Read Text Segmentation [2.7309692684728617]
Easy Read text is one of the main forms of access to information for people with reading difficulties.
One of the key characteristics of this type of text is the requirement to split sentences into smaller grammatical segments.
We study novel methods for the task, leveraging masked and generative language models, along with constituent parsing.
arXiv Detail & Related papers (2024-06-17T12:25:25Z)
- GLIMPSE: Pragmatically Informative Multi-Document Summarization for Scholarly Reviews [25.291384842659397]
We introduce GLIMPSE, a summarization method designed to offer a concise yet comprehensive overview of scholarly reviews.
Unlike traditional consensus-based methods, GLIMPSE extracts both common and unique opinions from the reviews.
arXiv Detail & Related papers (2024-06-11T15:27:01Z)
- Graded Relevance Scoring of Written Essays with Dense Retrieval [4.021352247826289]
We propose a novel approach for graded relevance scoring of written essays that employs dense retrieval encoders.
We leverage Contriever, which is pre-trained with contrastive learning and demonstrated comparable performance to supervised dense retrieval models.
Our method establishes a new state-of-the-art performance in the task-specific scenario, while its extension for the cross-task scenario exhibited a performance that is on par with the state-of-the-art model for that scenario.
arXiv Detail & Related papers (2024-05-08T16:37:58Z)
- One-Shot Learning as Instruction Data Prospector for Large Language Models [108.81681547472138]
Nuggets uses one-shot learning to select high-quality instruction data from extensive datasets.
We show that instruction tuning with the top 1% of examples curated by Nuggets substantially outperforms conventional methods employing the entire dataset.
arXiv Detail & Related papers (2023-12-16T03:33:12Z)
- Teach model to answer questions after comprehending the document [1.4264737570114632]
Multi-choice Machine Reading Comprehension (MRC) is a challenging extension of Natural Language Processing (NLP).
We propose a two-stage knowledge distillation method that teaches the model to better comprehend the document by dividing the MRC task into two separate stages.
arXiv Detail & Related papers (2023-07-18T02:38:02Z)
- Comparing Methods for Extractive Summarization of Call Centre Dialogue [77.34726150561087]
We experimentally compare several such methods by using them to produce summaries of calls, and evaluating these summaries objectively.
We found that TopicSum and Lead-N outperform the other summarisation methods, whilst BERTSum received comparatively lower scores in both subjective and objective evaluations.
arXiv Detail & Related papers (2022-09-06T13:16:02Z)
- Learning Opinion Summarizers by Selecting Informative Reviews [81.47506952645564]
We collect a large dataset of summaries paired with user reviews for over 31,000 products, enabling supervised training.
The content of many reviews is not reflected in the human-written summaries, and, thus, the summarizer trained on random review subsets hallucinates.
We formulate the task as jointly learning to select informative subsets of reviews and summarizing the opinions expressed in these subsets.
arXiv Detail & Related papers (2021-09-09T15:01:43Z)
- Toward the Understanding of Deep Text Matching Models for Information Retrieval [72.72380690535766]
This paper tests whether existing deep text matching methods satisfy some fundamental constraints in information retrieval.
Specifically, four constraints are used in our study: the term frequency constraint, the term discrimination constraint, the length normalization constraints, and the TF-length constraint.
Experimental results on LETOR 4.0 and MS MARCO show that all the investigated deep text matching methods satisfy the above constraints with high probability.
arXiv Detail & Related papers (2021-08-16T13:33:15Z)
- Hierarchical Bi-Directional Self-Attention Networks for Paper Review Rating Recommendation [81.55533657694016]
We propose a Hierarchical bi-directional self-attention Network framework (HabNet) for paper review rating prediction and recommendation.
Specifically, we leverage the hierarchical structure of the paper reviews with three levels of encoders: a sentence encoder (level one), an intra-review encoder (level two), and an inter-review encoder (level three).
We are able to identify useful predictors to make the final acceptance decision, as well as to help discover the inconsistency between numerical review ratings and text sentiment conveyed by reviewers.
arXiv Detail & Related papers (2020-11-02T08:07:50Z)