Evaluation of ChatGPT Feedback on ELL Writers' Coherence and Cohesion
- URL: http://arxiv.org/abs/2310.06505v1
- Date: Tue, 10 Oct 2023 10:25:56 GMT
- Title: Evaluation of ChatGPT Feedback on ELL Writers' Coherence and Cohesion
- Authors: Su-Youn Yoon, Eva Miszoglad, Lisa R. Pierce
- Abstract summary: ChatGPT has had a transformative effect on education where students are using it to help with homework assignments and teachers are actively employing it in their teaching practices.
This study evaluated the quality of the feedback generated by ChatGPT regarding the coherence and cohesion of essays written by English Language Learner (ELL) students.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Since its launch in November 2022, ChatGPT has had a transformative effect on
education where students are using it to help with homework assignments and
teachers are actively employing it in their teaching practices. This includes
using ChatGPT as a tool for writing teachers to grade and generate feedback on
students' essays. In this study, we evaluated the quality of the feedback
generated by ChatGPT regarding the coherence and cohesion of the essays written
by English Language Learner (ELL) students. We selected 50 argumentative
essays and generated feedback on coherence and cohesion using the ELLIPSE
rubric. During the feedback evaluation, we used a two-step approach: first,
each sentence in the feedback was classified into subtypes based on its
function (e.g., positive reinforcement, problem statement). Next, we evaluated
its accuracy and usability according to these types. Both the analysis of
feedback types and the evaluation of accuracy and usability revealed that most
feedback sentences were highly abstract and generic, failing to provide
concrete suggestions for improvement. The accuracy in detecting major problems,
such as repetitive ideas and the inaccurate use of cohesive devices, depended
on superficial linguistic features and was often incorrect. In conclusion,
ChatGPT, without specific training for the feedback generation task, does not
offer effective feedback on ELL students' coherence and cohesion.
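The two-step evaluation described above (classify each feedback sentence by function, then tally and judge by type) can be sketched as follows. The keyword cues and subtype labels here are illustrative assumptions; the paper's classification was done by human annotators, not by rules like these.

```python
# Minimal sketch of the paper's two-step feedback evaluation.
# Step 1 classifies each feedback sentence into a functional subtype;
# step 2 aggregates per subtype so accuracy/usability can be judged by type.
# The keyword cues below are hypothetical stand-ins for manual annotation.
FEEDBACK_TYPES = {
    "positive reinforcement": ("well done", "good job", "effectively"),
    "problem statement": ("however", "lacks", "repetitive", "unclear"),
    "suggestion": ("consider", "try", "could add"),
}

def classify_sentence(sentence: str) -> str:
    """Step 1: assign a feedback sentence a functional subtype."""
    lowered = sentence.lower()
    for subtype, cues in FEEDBACK_TYPES.items():
        if any(cue in lowered for cue in cues):
            return subtype
    return "generic"

def evaluate_feedback(sentences):
    """Step 2: tally subtypes across all feedback sentences."""
    counts = {}
    for sentence in sentences:
        subtype = classify_sentence(sentence)
        counts[subtype] = counts.get(subtype, 0) + 1
    return counts

feedback = [
    "The essay is organized effectively.",
    "However, several ideas are repetitive.",
    "Consider adding transitions between paragraphs.",
]
print(evaluate_feedback(feedback))
```

A distribution dominated by "generic" sentences would mirror the paper's finding that most ChatGPT feedback was abstract and offered no concrete suggestions.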
Related papers
- Beyond Thumbs Up/Down: Untangling Challenges of Fine-Grained Feedback for Text-to-Image Generation [67.88747330066049]
Fine-grained feedback captures nuanced distinctions in image quality and prompt-alignment.
We show that its superiority over coarse-grained feedback is not automatic.
We identify key challenges in eliciting and utilizing fine-grained feedback.
arXiv Detail & Related papers (2024-06-24T17:19:34Z)
- Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs [57.16442740983528]
In ad-hoc retrieval, evaluation relies heavily on user actions, including implicit feedback.
The role of user feedback in annotators' assessment of turns in a conversation has been little studied.
We focus on how the evaluation of task-oriented dialogue systems (TDSs) is affected by considering user feedback, explicit or implicit, provided through the follow-up utterance of the turn being evaluated.
arXiv Detail & Related papers (2024-04-19T16:45:50Z)
- Improving the Validity of Automatically Generated Feedback via Reinforcement Learning [50.067342343957876]
We propose a framework for feedback generation that optimizes both correctness and alignment using reinforcement learning (RL).
Specifically, we use GPT-4's annotations to create preferences over feedback pairs in an augmented dataset for training via direct preference optimization (DPO).
arXiv Detail & Related papers (2024-03-02T20:25:50Z)
- Learning from Implicit User Feedback, Emotions and Demographic Information in Task-Oriented and Document-Grounded Dialogues [52.95506649193427]
We introduce FEDI, the first English task-oriented and document-grounded dialogue dataset annotated with this information.
Experiments with Flan-T5, GPT-2 and Llama 2 show a particularly positive impact on task completion and factual consistency.
arXiv Detail & Related papers (2024-01-17T14:52:26Z)
- Can ChatGPT Play the Role of a Teaching Assistant in an Introductory Programming Course? [1.8197265299982013]
This paper explores the potential of using ChatGPT, an LLM, as a virtual Teaching Assistant (TA) in an introductory programming course.
We evaluate ChatGPT's capabilities by comparing its performance with that of human TAs in some of the important TA functions.
arXiv Detail & Related papers (2023-12-12T15:06:44Z)
- Scalable Two-Minute Feedback: Digital, Lecture-Accompanying Survey as a Continuous Feedback Instrument [0.0]
Detailed feedback on courses and lecture content is essential for their improvement and also serves as a tool for reflection.
The article describes a digital survey format used as formative feedback, which measures student stress in a quantitative part and addresses the participants' reflection in a qualitative part.
The results show a low but constant rate of feedback. Responses mostly covered lecture content or organizational aspects and were used intensively to report issues within the lecture.
arXiv Detail & Related papers (2023-10-30T08:14:26Z)
- Exploring the effectiveness of ChatGPT-based feedback compared with teacher feedback and self-feedback: Evidence from Chinese to English translation [1.25097469793837]
ChatGPT, a cutting-edge AI-powered tool, can quickly generate responses to given commands.
This study compared revised Chinese-to-English translation texts produced by Chinese Master of Translation and Interpretation (MTI) students under ChatGPT-based feedback, teacher feedback, and self-feedback.
arXiv Detail & Related papers (2023-09-04T14:54:39Z)
- Distilling ChatGPT for Explainable Automated Student Answer Assessment [19.604476650824516]
We introduce a novel framework that explores using ChatGPT, a cutting-edge large language model, for the concurrent tasks of student answer scoring and rationale generation.
Our experiments show that the proposed method improves the overall QWK score by 11% compared to ChatGPT.
arXiv Detail & Related papers (2023-05-22T12:11:39Z)
- Exploring the Efficacy of ChatGPT in Analyzing Student Teamwork Feedback with an Existing Taxonomy [0.0]
ChatGPT can achieve over 90% accuracy in labeling student comments.
This study contributes to the growing body of research on the use of AI models in educational contexts.
arXiv Detail & Related papers (2023-05-09T19:55:50Z)
- DeltaScore: Fine-Grained Story Evaluation with Perturbations [69.33536214124878]
We introduce DELTASCORE, a novel methodology that employs perturbation techniques for the evaluation of nuanced story aspects.
Our central proposition posits that the extent to which a story excels in a specific aspect (e.g., fluency) correlates with the magnitude of its susceptibility to particular perturbations.
We measure the quality of an aspect by calculating the likelihood difference between pre- and post-perturbation states using pre-trained language models.
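The likelihood-difference measurement above can be sketched as follows. The `toy_log_likelihood` scorer is a stand-in assumption for illustration; the paper computes log-likelihoods with pre-trained language models.

```python
import math

def delta_score(score_fn, original: str, perturbed: str) -> float:
    """DELTASCORE-style signal: the log-likelihood drop caused by a
    targeted perturbation. A larger drop suggests the original text was
    stronger on the perturbed aspect (e.g., fluency)."""
    return score_fn(original) - score_fn(perturbed)

# Stand-in scorer: a toy unigram model plays the role of the pre-trained
# language model that supplies log-likelihoods in the paper.
COMMON = {"the": 0.07, "story": 0.01, "was": 0.04, "clear": 0.005}

def toy_log_likelihood(text: str) -> float:
    return sum(math.log(COMMON.get(word, 1e-6)) for word in text.lower().split())

# A fluency perturbation (misspelling "clear") lowers the likelihood,
# so the delta is positive.
print(delta_score(toy_log_likelihood, "the story was clear",
                  "the story was claer"))
```

With a real language model, different perturbation types (shuffling sentences, injecting typos, swapping entities) target different story aspects, and the resulting deltas give the fine-grained per-aspect scores.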
arXiv Detail & Related papers (2023-03-15T23:45:54Z)
- A Unified Dual-view Model for Review Summarization and Sentiment Classification with Inconsistency Loss [51.448615489097236]
Acquiring accurate summarization and sentiment from user reviews is an essential component of modern e-commerce platforms.
We propose a novel dual-view model that jointly improves the performance of these two tasks.
Experiment results on four real-world datasets from different domains demonstrate the effectiveness of our model.
arXiv Detail & Related papers (2020-06-02T13:34:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.