Evaluating Trust in AI, Human, and Co-produced Feedback Among Undergraduate Students
- URL: http://arxiv.org/abs/2504.10961v1
- Date: Tue, 15 Apr 2025 08:06:36 GMT
- Title: Evaluating Trust in AI, Human, and Co-produced Feedback Among Undergraduate Students
- Authors: Audrey Zhang, Yifei Gao, Wannapon Suraworachet, Tanya Nazaretsky, Mutlu Cukurova
- Abstract summary: Students generally preferred AI and co-produced feedback over human feedback in terms of perceived usefulness and objectivity. Male students consistently rated all feedback types as less valuable than their female and non-binary counterparts. These insights inform evidence-based guidelines for integrating AI into higher education feedback systems.
- Score: 2.935250567679577
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: As generative AI transforms educational feedback practices, understanding students' perceptions of different feedback providers becomes crucial for effective implementation. This study addresses a critical gap by comparing undergraduate students' trust in AI-generated, human-created, and human-AI co-produced feedback, informing how institutions can adapt feedback practices in this new era. Through a within-subject experiment with 91 participants, we investigated factors predicting students' ability to distinguish between feedback types, perception of feedback quality, and potential biases to AI involvement. Findings revealed that students generally preferred AI and co-produced feedback over human feedback in terms of perceived usefulness and objectivity. Only AI feedback suffered a decline in perceived genuineness when feedback sources were revealed, while co-produced feedback maintained its positive perception. Educational AI experience improved students' ability to identify AI feedback and increased their trust in all feedback types, while general AI experience decreased perceived usefulness and credibility. Male students consistently rated all feedback types as less valuable than their female and non-binary counterparts. These insights inform evidence-based guidelines for integrating AI into higher education feedback systems while addressing trust concerns and fostering AI literacy among students.
Related papers
- Understanding and Supporting Peer Review Using AI-reframed Positive Summary [18.686807993563168]
This study explored the impact of appending an automatically generated positive summary to the peer reviews of a writing task. We found that adding an AI-reframed positive summary to otherwise harsh feedback increased authors' critique acceptance. We discuss the implications of using AI in peer feedback, focusing on how it can influence critique acceptance and support research communities.
arXiv Detail & Related papers (2025-03-13T11:22:12Z)
- Personalised Feedback Framework for Online Education Programmes Using Generative AI [0.0]
This paper presents an alternative feedback framework which extends the capabilities of ChatGPT by integrating embeddings.
As part of the study, we proposed and developed a proof-of-concept solution, achieving an efficacy rate of 90% and 100% for open-ended and multiple-choice questions, respectively.
arXiv Detail & Related papers (2024-10-14T22:35:40Z)
- Integrating AI for Enhanced Feedback in Translation Revision: A Mixed-Methods Investigation of Student Engagement [0.0]
The application of Artificial Intelligence (AI)-generated feedback, particularly from language models like ChatGPT, remains understudied in translation education.
This study investigates the engagement of master's students in translation with ChatGPT-generated feedback during their revision process.
arXiv Detail & Related papers (2024-10-11T07:21:29Z)
- Aligning Large Language Models from Self-Reference AI Feedback with one General Principle [61.105703857868775]
We propose a self-reference-based AI feedback framework that enables a 13B Llama2-Chat to provide high-quality feedback.
Specifically, we allow the AI to first respond to the user's instructions, then generate criticism of other answers based on its own response as a reference.
Finally, we determine which answer better fits human preferences according to the criticism.
arXiv Detail & Related papers (2024-06-17T03:51:46Z)
- ConSiDERS-The-Human Evaluation Framework: Rethinking Human Evaluation for Generative Large Language Models [53.00812898384698]
We argue that human evaluation of generative large language models (LLMs) should be a multidisciplinary undertaking.
We highlight how cognitive biases can conflate fluent information and truthfulness, and how cognitive uncertainty affects the reliability of rating scores such as Likert.
We propose the ConSiDERS-The-Human evaluation framework consisting of 6 pillars -- Consistency, Scoring Criteria, Differentiating, User Experience, Responsible, and Scalability.
arXiv Detail & Related papers (2024-05-28T22:45:28Z)
- Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs [57.16442740983528]
In ad-hoc retrieval, evaluation relies heavily on user actions, including implicit feedback.
The role of user feedback in annotators' assessment of turns in a conversational perception has been little studied.
We focus on how the evaluation of task-oriented dialogue systems (TDSs) is affected by considering user feedback, explicit or implicit, as provided through the follow-up utterance of a turn being evaluated.
arXiv Detail & Related papers (2024-04-19T16:45:50Z)
- Evaluating and Optimizing Educational Content with Large Language Model Judgments [52.33701672559594]
We use Language Models (LMs) as educational experts to assess the impact of various instructions on learning outcomes.
We introduce an instruction optimization approach in which one LM generates instructional materials using the judgments of another LM as a reward function.
Human teachers' evaluations of these LM-generated worksheets show a significant alignment between the LM judgments and human teacher preferences.
arXiv Detail & Related papers (2024-03-05T09:09:15Z)
- The Responsible Development of Automated Student Feedback with Generative AI [6.008616775722921]
Recent advancements in AI, particularly with large language models (LLMs), present new opportunities to deliver scalable, repeatable, and instant feedback. However, implementing these technologies also introduces a host of ethical considerations that must thoughtfully be addressed. One of the core advantages of AI systems is their ability to automate routine and mundane tasks, potentially freeing up human educators for more nuanced work. However, the ease of automation risks a "tyranny of the majority", where the diverse needs of minority or unique learners are overlooked.
arXiv Detail & Related papers (2023-08-29T14:29:57Z)
- Effects of Human vs. Automatic Feedback on Students' Understanding of AI Concepts and Programming Style [0.0]
The use of automatic grading tools has become nearly ubiquitous in large undergraduate programming courses.
There is a relative lack of data directly comparing student outcomes when receiving computer-generated feedback and human-written feedback.
This paper addresses this gap by splitting one 90-student class into two feedback groups and analyzing differences in the two cohorts' performance.
arXiv Detail & Related papers (2020-11-20T21:40:32Z)
- Facial Feedback for Reinforcement Learning: A Case Study and Offline Analysis Using the TAMER Framework [51.237191651923666]
We investigate the potential of agent learning from trainers' facial expressions via interpreting them as evaluative feedback.
With a designed CNN-RNN model, our analysis shows that telling trainers to use facial expressions and competition can improve the accuracy of estimating positive and negative feedback.
Our results with a simulation experiment show that learning solely from predicted feedback based on facial expressions is possible.
arXiv Detail & Related papers (2020-01-23T17:50:57Z)
- Artificial Artificial Intelligence: Measuring Influence of AI 'Assessments' on Moral Decision-Making [48.66982301902923]
We examined the effect of feedback from false AI on moral decision-making about donor kidney allocation.
We found some evidence that judgments about whether a patient should receive a kidney can be influenced by feedback about participants' own decision-making perceived to be given by AI.
arXiv Detail & Related papers (2020-01-13T14:15:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.