Towards Prompt Generalization: Grammar-aware Cross-Prompt Automated Essay Scoring
- URL: http://arxiv.org/abs/2502.08450v1
- Date: Wed, 12 Feb 2025 14:41:20 GMT
- Title: Towards Prompt Generalization: Grammar-aware Cross-Prompt Automated Essay Scoring
- Authors: Heejin Do, Taehee Park, Sangwon Ryu, Gary Geunbae Lee
- Abstract summary: We propose grammar-aware cross-prompt trait scoring (GAPS), which learns generic essay representations.
We obtain grammatical error-corrected versions of essays via grammatical error correction (GEC) and design the AES model to seamlessly integrate this information.
GAPS achieves notable quadratic weighted kappa (QWK) gains in the most challenging cross-prompt scenario, highlighting its strength in evaluating unseen prompts.
- Score: 5.906031288935515
- Abstract: In automated essay scoring (AES), recent efforts have shifted toward cross-prompt settings that score essays on unseen prompts for practical applicability. However, prior methods, trained on essay-score pairs from specific prompts, struggle to obtain prompt-generalized essay representations. In this work, we propose grammar-aware cross-prompt trait scoring (GAPS), which internally captures prompt-independent syntactic aspects to learn generic essay representations. We obtain grammatical error-corrected versions of essays via grammatical error correction (GEC) and design the AES model to seamlessly integrate this information. By internally referring to both the corrected and the original essays, the model can focus on generic features during training. Empirical experiments validate our method's generalizability, showing remarkable improvements in prompt-independent and grammar-related traits. Furthermore, GAPS achieves notable QWK gains in the most challenging cross-prompt scenario, highlighting its strength in evaluating unseen prompts.
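A minimal sketch of the dual-input idea described in the abstract, assuming a generic pretrained encoder and a GEC-corrected essay as a second input; the checkpoint name, trait count, and fusion-by-concatenation here are illustrative assumptions, not the authors' actual components:

```python
# Hypothetical sketch of a GAPS-style dual-input trait scorer: encode the original
# essay and its GEC-corrected version, fuse the two representations, and predict
# per-trait scores. An illustration of the idea, not the authors' model.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

ENCODER = "bert-base-uncased"  # placeholder encoder checkpoint
tokenizer = AutoTokenizer.from_pretrained(ENCODER)

class DualInputTraitScorer(nn.Module):
    def __init__(self, hidden: int = 768, num_traits: int = 9):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(ENCODER)
        # one regression output per trait (e.g., content, organization, conventions)
        self.trait_heads = nn.Linear(hidden * 2, num_traits)

    def forward(self, original: dict, corrected: dict) -> torch.Tensor:
        h_orig = self.encoder(**original).last_hidden_state[:, 0]   # [CLS] of original
        h_corr = self.encoder(**corrected).last_hidden_state[:, 0]  # [CLS] of corrected
        fused = torch.cat([h_orig, h_corr], dim=-1)  # model can compare both views
        return torch.sigmoid(self.trait_heads(fused))  # trait scores scaled to [0, 1]

essay = "She go to school every day ."
corrected_essay = "She goes to school every day ."  # output of a GEC system
model = DualInputTraitScorer()
scores = model(
    tokenizer(essay, return_tensors="pt"),
    tokenizer(corrected_essay, return_tensors="pt"),
)
print(scores.shape)  # torch.Size([1, 9])
```

QWK, the metric cited above, can be computed with scikit-learn's `cohen_kappa_score(y_true, y_pred, weights="quadratic")`.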
Related papers
- Eliciting Textual Descriptions from Representations of Continuous Prompts [11.489611613744724]
We propose a new approach to interpret continuous prompts that elicits textual descriptions from their representations during model inference.
We show our method often yields accurate task descriptions that become more faithful as task performance increases.
InSPEcT can be leveraged to debug unwanted properties in continuous prompts and inform developers of ways to mitigate them.
arXiv Detail & Related papers (2024-10-15T14:46:11Z)
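As a loose, hypothetical illustration of reading text out of a continuous prompt, the snippet below projects soft-prompt vectors onto the vocabulary embeddings and decodes the nearest tokens; this simple logit-lens-style probe is not the InSPEcT procedure itself:

```python
# Loose illustration: project a continuous (soft) prompt onto the vocabulary
# embedding space and read off the nearest tokens. A simple "logit lens" style
# probe, not the InSPEcT method.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Stand-in for a trained continuous prompt of 5 virtual tokens.
soft_prompt = torch.randn(5, model.config.n_embd)

# Similarity of each virtual token to every vocabulary embedding.
logits = soft_prompt @ model.transformer.wte.weight.T
print(tok.decode(logits.argmax(dim=-1)))  # nearest discrete tokens
```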
- Localizing Factual Inconsistencies in Attributable Text Generation [91.981439746404]
We introduce QASemConsistency, a new formalism for localizing factual inconsistencies in attributable text generation.
We first demonstrate the effectiveness of the QASemConsistency methodology for human annotation.
We then implement several methods for automatically detecting localized factual inconsistencies.
arXiv Detail & Related papers (2024-10-09T22:53:48Z)
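A hedged sketch of the general QA-based consistency idea (decompose generated text into question-answer pairs, then verify each against the source); the QA pairs here are handwritten, and this is not the paper's exact pipeline:

```python
# Hedged sketch of QA-based factual-consistency checking: each minimal fact in the
# generated text is phrased as a (question, expected answer) pair, then verified by
# answering the question against the source document.
from transformers import pipeline

qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

source = "The report was published in March 2021 by the national statistics office."
# In practice these QA pairs would be generated automatically from the model output.
facts = [
    ("When was the report published?", "March 2021"),
    ("Who published the report?", "the finance ministry"),  # inconsistent on purpose
]

for question, expected in facts:
    found = qa(question=question, context=source)["answer"]
    status = "consistent" if expected.lower() in found.lower() else "INCONSISTENT"
    print(f"{question} -> {found!r} ({status})")
```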
- Self-regulating Prompts: Foundational Model Adaptation without Forgetting [112.66832145320434]
We introduce a self-regularization framework for prompting called PromptSRC.
PromptSRC guides the prompts to optimize for both task-specific and task-agnostic general representations.
arXiv Detail & Related papers (2023-07-13T17:59:35Z)
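A minimal sketch of the self-regularization idea above, assuming a task loss plus a consistency term that keeps prompted features close to the frozen model's features; PromptSRC's exact losses differ:

```python
# Minimal sketch of self-regularized prompt learning: the prompted model is trained
# on the task while being pulled toward the frozen (un-prompted) model's features,
# so task-agnostic knowledge is not forgotten. The exact PromptSRC losses differ.
import torch
import torch.nn.functional as F

def prompt_src_style_loss(
    logits: torch.Tensor,          # prompted model's class logits
    labels: torch.Tensor,          # ground-truth labels
    prompted_feats: torch.Tensor,  # features from the prompted model
    frozen_feats: torch.Tensor,    # features from the frozen pretrained model
    lam: float = 10.0,             # weight of the consistency term (assumed value)
) -> torch.Tensor:
    task_loss = F.cross_entropy(logits, labels)
    consistency = F.l1_loss(prompted_feats, frozen_feats)  # stay near pretrained space
    return task_loss + lam * consistency

# Toy usage with random tensors standing in for real model outputs.
loss = prompt_src_style_loss(
    torch.randn(4, 10), torch.randint(0, 10, (4,)),
    torch.randn(4, 512), torch.randn(4, 512),
)
print(loss.item())
```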
- MetricPrompt: Prompting Model as a Relevance Metric for Few-shot Text Classification [65.51149771074944]
MetricPrompt eases verbalizer design by reformulating the few-shot text classification task as a text-pair relevance estimation task.
We conduct experiments on three widely used text classification datasets across four few-shot settings.
Results show that MetricPrompt outperforms manual verbalizers and other automatic verbalizer design methods across all few-shot settings.
arXiv Detail & Related papers (2023-06-15T06:51:35Z)
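A hedged sketch of the pair-relevance idea: classify a query by scoring its relevance to each class's few-shot examples and picking the class with the highest average score. This is a generic rendering with a stand-in cross-encoder, not MetricPrompt's exact setup, which prompts a masked language model as the relevance metric:

```python
# Hedged sketch of relevance-based few-shot classification: pair the query with each
# labeled example, score the pair's relevance, and pick the class whose examples are
# most relevant on average.
from sentence_transformers import CrossEncoder

scorer = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")  # stand-in relevance model

few_shot = {
    "sports": ["The team won the championship game.", "He scored twice in the final."],
    "finance": ["Stocks fell after the earnings report.", "The bank raised interest rates."],
}
query = "The striker signed a new contract with the club."

def class_relevance(label: str) -> float:
    pairs = [(query, example) for example in few_shot[label]]
    return float(scorer.predict(pairs).mean())  # average pair relevance for this class

print(max(few_shot, key=class_relevance))  # expected: "sports"
```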
- Prompt- and Trait Relation-aware Cross-prompt Essay Trait Scoring [3.6825890616838066]
Automated essay scoring (AES) aims to score essays written for a given prompt, which defines the writing topic.
Most existing AES systems assume that the essays to be graded come from the same prompt used in training and assign only a holistic score.
We propose a robust model: a prompt- and trait relation-aware cross-prompt essay trait scorer.
arXiv Detail & Related papers (2023-05-26T11:11:19Z)
- Fairness-guided Few-shot Prompting for Large Language Models [93.05624064699965]
In-context learning can suffer from high instability due to variations in training examples, example order, and prompt formats.
We introduce a metric to evaluate the predictive bias of a fixed prompt against labels or given attributes.
We propose a novel search strategy based on greedy search to identify near-optimal prompts that improve the performance of in-context learning.
arXiv Detail & Related papers (2023-03-23T12:28:25Z)
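A hedged sketch of measuring a prompt's predictive bias: query the model with a content-free input under the prompt and measure how far the induced label distribution drifts from uniform. The verbalizer tokens and prompt template below are assumptions, and the paper's exact metric and greedy-search details may differ:

```python
# Hedged sketch: estimate a prompt's predictive bias via the KL divergence from
# uniform of the label distribution induced on a content-free input. Prompts with
# lower bias would be preferred during the greedy search over demonstrations.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

label_words = [" positive", " negative"]          # assumed verbalizer tokens
label_ids = [tok.encode(w)[0] for w in label_words]

def prompt_bias(prompt: str, neutral_input: str = "N/A") -> float:
    """KL divergence from uniform of the label distribution on a content-free input."""
    text = f"{prompt}\nReview: {neutral_input}\nSentiment:"
    with torch.no_grad():
        logits = model(**tok(text, return_tensors="pt")).logits[0, -1]
    probs = torch.softmax(logits[label_ids], dim=-1)
    uniform = 1.0 / len(label_ids)
    return sum(p.item() * math.log(p.item() / uniform) for p in probs)

demo = "Review: A wonderful film.\nSentiment: positive"
print(prompt_bias(demo))  # lower means less biased; use it to rank candidate prompts
```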
- TEMPERA: Test-Time Prompting via Reinforcement Learning [57.48657629588436]
We propose Test-time Prompt Editing using Reinforcement Learning (TEMPERA).
In contrast to prior prompt generation methods, TEMPERA can efficiently leverage prior knowledge.
Our method achieves an average 5.33x improvement in sample efficiency compared to traditional fine-tuning methods.
arXiv Detail & Related papers (2022-11-21T22:38:20Z)
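A loose illustration of test-time prompt editing as a search over discrete edits scored by a reward; the greedy loop and placeholder reward below are stand-ins, since TEMPERA trains an RL policy over such edits:

```python
# Loose illustration of test-time prompt editing: apply discrete edits (here, the
# single edit action is reordering in-context examples) and keep the candidate with
# the highest reward. TEMPERA trains an RL policy; this greedy loop is a stand-in.
import itertools
import random

examples = [
    ("Great movie!", "positive"),
    ("Terrible plot.", "negative"),
    ("Loved every minute.", "positive"),
]

def build_prompt(order: tuple) -> str:
    lines = [f"Review: {t}\nSentiment: {l}" for t, l in (examples[i] for i in order)]
    return "\n".join(lines) + "\nReview: {query}\nSentiment:"

def reward(prompt: str) -> float:
    # Placeholder reward; in practice, accuracy of the LM with this prompt on held-out data.
    random.seed(hash(prompt) % (2 ** 32))
    return random.random()

best = max(itertools.permutations(range(len(examples))),
           key=lambda order: reward(build_prompt(order)))
print("best example order:", best)
```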
- Prompt Agnostic Essay Scorer: A Domain Generalization Approach to Cross-prompt Automated Essay Scoring [61.21967763569547]
Cross-prompt automated essay scoring (AES) requires the system to use non-target-prompt essays to award scores to a target-prompt essay.
This paper introduces Prompt Agnostic Essay Scorer (PAES) for cross-prompt AES.
Our method requires no access to labelled or unlabelled target-prompt data during training and is a single-stage approach.
arXiv Detail & Related papers (2020-08-04T10:17:38Z)
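A hedged sketch of the prompt-agnostic intuition behind PAES: represent essays through features that carry no topic information, such as part-of-speech patterns and shallow statistics, so that they transfer across prompts. PAES's actual feature set and syntactic-embedding architecture are richer than this:

```python
# Hedged sketch of prompt-agnostic essay features: POS patterns and shallow
# statistics do not depend on the writing topic, so they generalize across prompts.
import nltk

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

def prompt_agnostic_features(essay: str) -> dict:
    tokens = nltk.word_tokenize(essay)
    tags = [tag for _, tag in nltk.pos_tag(tokens)]
    sentences = nltk.sent_tokenize(essay)
    return {
        "num_tokens": len(tokens),
        "avg_sentence_len": len(tokens) / max(len(sentences), 1),
        "noun_ratio": sum(t.startswith("NN") for t in tags) / max(len(tags), 1),
        "verb_ratio": sum(t.startswith("VB") for t in tags) / max(len(tags), 1),
        "type_token_ratio": len(set(w.lower() for w in tokens)) / max(len(tokens), 1),
    }

print(prompt_agnostic_features("She goes to school every day. The lessons are long."))
```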