Which Feedback Works for Whom? Differential Effects of LLM-Generated Feedback Elements Across Learner Profiles
- URL: http://arxiv.org/abs/2602.11650v1
- Date: Thu, 12 Feb 2026 07:02:33 GMT
- Title: Which Feedback Works for Whom? Differential Effects of LLM-Generated Feedback Elements Across Learner Profiles
- Authors: Momoka Furuhashi, Kouta Nakayama, Noboru Kawai, Takashi Kodama, Saku Sugawara, Kyosuke Takami
- Abstract summary: We define six feedback elements and generate feedback for biology questions using GPT-5. We evaluate feedback effectiveness using two learning-outcome measures and subjective evaluations across six criteria. Our results show that effective feedback elements share common patterns supporting learning outcomes, while learners' subjective preferences differ across personality-based clusters.
- Score: 9.104700955592568
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) show promise for automatically generating feedback in educational settings. However, it remains unclear how specific feedback elements, such as tone and information coverage, contribute to learning outcomes and learner acceptance, particularly across learners with different personality traits. In this study, we define six feedback elements and generate feedback for multiple-choice biology questions using GPT-5. We conduct a learning experiment with 321 first-year high school students and evaluate feedback effectiveness using two learning-outcome measures and subjective evaluations across six criteria. We further analyze how feedback acceptance varies across learners based on Big Five personality traits. Our results show that effective feedback elements share common patterns supporting learning outcomes, while learners' subjective preferences differ across personality-based clusters. These findings highlight the importance of selecting and adapting feedback elements according to learners' personality traits when designing LLM-generated feedback, and provide practical implications for personalized feedback design in education.
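The personality-based analysis above implies a concrete pipeline: score each learner on the Big Five, cluster those scores, and compare feedback preferences across clusters. A minimal sketch, assuming standardized questionnaire scores and k-means; the paper does not state its clustering method, and all data and parameter choices here are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# One row per learner: openness, conscientiousness, extraversion,
# agreeableness, neuroticism (illustrative 1-5 questionnaire scores).
big_five = np.array([
    [3.8, 4.1, 2.9, 3.5, 2.2],
    [2.1, 3.0, 4.4, 4.0, 3.1],
    [4.5, 2.8, 3.2, 2.9, 4.0],
    [3.0, 3.9, 3.1, 3.6, 2.5],
])

# Standardize traits so no single scale dominates the Euclidean distances.
scaled = StandardScaler().fit_transform(big_five)

# Partition learners into personality-based clusters (k chosen arbitrarily here).
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(scaled)
print(labels)  # cluster label per learner, used to split the preference analysis
```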
Related papers
- Student Engagement with GenAI's Tutoring Feedback: A Mixed Methods Study [0.0]
The research aims to: (1) identify what students think when they engage with the tutoring feedback components, and (2) explore the relations between the feedback components, students' visual attention, verbalized thoughts, and their immediate actions as part of the problem-solving process. The analysis of students' thoughts while engaging with 380 feedback components revealed four main themes: students express understanding, students express disagreement, students need additional information, and students explicitly judge the feedback.
arXiv Detail & Related papers (2025-09-26T22:17:20Z)
- Evaluating Trust in AI, Human, and Co-produced Feedback Among Undergraduate Students [2.935250567679577]
This study compares undergraduate students' trust in large language model (LLM), human, and human-AI co-produced feedback in their authentic HE context. Findings revealed students preferred AI and co-produced feedback over human feedback regarding perceived usefulness and objectivity. Educational AI experience improved students' ability to identify LLM-generated feedback and increased their trust in all types of feedback.
arXiv Detail & Related papers (2025-04-15T08:06:36Z)
- You're (Not) My Type -- Can LLMs Generate Feedback of Specific Types for Introductory Programming Tasks? [0.4779196219827508]
This paper aims to generate specific types of feedback for programming tasks using Large Language Models (LLMs). We revisit existing feedback types to capture the specifics of the generated feedback, such as randomness, uncertainty, and degrees of variation. Results have implications for future feedback research with regard to, for example, feedback effects and learners' informational needs.
arXiv Detail & Related papers (2024-12-04T17:57:39Z)
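One plausible setup for the type-specific generation above is a prompt template per feedback type plus repeated sampling, which makes the randomness and degrees of variation mentioned in the abstract measurable. A sketch under those assumptions; the type definitions are generic ones from the programming-feedback literature, not necessarily the paper's taxonomy, and `call_llm` is a stub.

```python
FEEDBACK_TYPES = {
    "knowledge_of_result": "State only whether the submission is correct.",
    "knowledge_of_mistakes": "Point out where the error is, without correcting it.",
    "knowledge_of_how_to_proceed": "Hint at the next step without giving the solution.",
}

def call_llm(prompt: str) -> str:
    """Stub for a completion call; replace with a real LLM client."""
    return "placeholder feedback"

def build_prompt(task: str, submission: str, feedback_type: str) -> str:
    return (
        "You are a tutor for an introductory programming course.\n"
        f"Task: {task}\n"
        f"Student submission:\n{submission}\n"
        f"Instruction: {FEEDBACK_TYPES[feedback_type]}"
    )

def sample_feedback(task: str, submission: str, feedback_type: str, n: int = 5) -> list[str]:
    # Sampling the same prompt n times exposes the variation across generations.
    prompt = build_prompt(task, submission, feedback_type)
    return [call_llm(prompt) for _ in range(n)]
```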
- Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs [57.16442740983528]
In ad-hoc retrieval, evaluation relies heavily on user actions, including implicit feedback.
The role user feedback plays in annotators' assessments of conversational turns has been little studied.
We focus on how the evaluation of task-oriented dialogue systems (TDSs) is affected by considering user feedback, explicit or implicit, as provided through the follow-up utterance of the turn being evaluated.
arXiv Detail & Related papers (2024-04-19T16:45:50Z)
- Evaluating and Optimizing Educational Content with Large Language Model Judgments [52.33701672559594]
We use Language Models (LMs) as educational experts to assess the impact of various instructions on learning outcomes.
We introduce an instruction optimization approach in which one LM generates instructional materials using the judgments of another LM as a reward function.
Human teachers' evaluations of these LM-generated worksheets show a significant alignment between the LM judgments and human teacher preferences.
arXiv Detail & Related papers (2024-03-05T09:09:15Z)
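The generate-then-judge loop above amounts to best-of-n selection with the judge LM's score as the reward. A minimal illustration; both model calls are random stubs standing in for the paired LMs, and nothing here is the paper's actual prompt or scoring rubric.

```python
import random

def generate_worksheet(topic: str) -> str:
    """Stub for the generator LM; replace with a real completion call."""
    return f"Worksheet on {topic}, variant {random.randint(0, 9999)}"

def judge_score(worksheet: str) -> float:
    """Stub for the judge LM acting as a reward function over materials."""
    return random.random()  # a real judge would rate expected learning value

def optimize_instruction(topic: str, n_candidates: int = 8) -> str:
    # Propose several candidate materials and keep the one the judge prefers.
    candidates = [generate_worksheet(topic) for _ in range(n_candidates)]
    return max(candidates, key=judge_score)

print(optimize_instruction("photosynthesis"))
```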
- Improving the Validity of Automatically Generated Feedback via Reinforcement Learning [46.667783153759636]
We propose a framework for feedback generation that optimizes both correctness and alignment using reinforcement learning (RL). Specifically, we use GPT-4's annotations to create preferences over feedback pairs in an augmented dataset for training via direct preference optimization (DPO).
arXiv Detail & Related papers (2024-03-02T20:25:50Z)
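The DPO step in the entry above has a compact objective. A worked sketch of the standard published DPO loss on a single preference pair, with illustrative log-probabilities; this shows the formula, not the paper's training code.

```python
import math

def dpo_loss(logp_chosen, logp_chosen_ref, logp_rejected, logp_rejected_ref, beta=0.1):
    """-log sigmoid(beta * [(chosen policy-vs-reference margin) - (rejected margin)])."""
    margin = beta * ((logp_chosen - logp_chosen_ref) - (logp_rejected - logp_rejected_ref))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Illustrative sequence log-probs for preferred (correct, well-aligned) feedback
# vs. rejected feedback, under the policy and frozen reference models.
print(dpo_loss(-12.3, -13.0, -15.8, -14.9))  # ~0.62; shrinks as the preference margin grows
```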
- Constructive Large Language Models Alignment with Diverse Feedback [76.9578950893839]
We introduce Constructive and Diverse Feedback (CDF) as a novel method to enhance large language model alignment.
We exploit critique feedback for easy problems, refinement feedback for medium problems, and preference feedback for hard problems.
By training our model with this diversified feedback, we achieve enhanced alignment performance while using less training data.
arXiv Detail & Related papers (2023-10-10T09:20:14Z)
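The difficulty-based routing described above reduces to a small dispatch rule. A sketch assuming a scalar difficulty estimate in [0, 1]; the thresholds are illustrative, not values from the paper.

```python
def select_feedback_kind(difficulty: float) -> str:
    """Route a training problem to the feedback kind CDF applies at that difficulty."""
    if difficulty < 1 / 3:
        return "critique"    # easy problems: critique feedback
    if difficulty < 2 / 3:
        return "refinement"  # medium problems: refinement feedback
    return "preference"      # hard problems: preference feedback

for d in (0.1, 0.5, 0.9):
    print(d, select_feedback_kind(d))
```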
- UltraFeedback: Boosting Language Models with Scaled AI Feedback [99.4633351133207]
We present UltraFeedback, a large-scale, high-quality, and diversified AI feedback dataset.
Our work validates the effectiveness of scaled AI feedback data in constructing strong open-source chat language models.
arXiv Detail & Related papers (2023-10-02T17:40:01Z)
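The dataset itself can be inspected with the Hugging Face `datasets` library; the hub id `openbmb/UltraFeedback` and the `train` split are assumptions based on the public release.

```python
from datasets import load_dataset

# Hub id and split name are assumed; adjust if the release differs.
ds = load_dataset("openbmb/UltraFeedback", split="train")
print(len(ds), ds.column_names)  # record count and the fields of each feedback record
```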
- Exploring the Relationship Between Personality Traits and User Feedback [9.289846887298852]
We present a preliminary study about the effect of personality traits on user feedback.
56 university students provided feedback on different software features of an e-learning tool used in the course.
Results suggest that sensitivity to frustration and lower stress tolerance may negatively affect the feedback that users provide.
arXiv Detail & Related papers (2023-07-22T10:10:27Z) - Training Language Models with Language Feedback at Scale [50.70091340506957]
We introduce Imitation learning from Language Feedback (ILF), a new approach that utilizes more informative language feedback.
ILF consists of three steps that are applied iteratively: first, conditioning the language model on the input, an initial LM output, and feedback to generate refinements; second, selecting the refinement that incorporates the most feedback; and third, finetuning the language model on the selected refinements.
We show theoretically that ILF can be viewed as Bayesian inference, similar to reinforcement learning from human feedback (RLHF).
arXiv Detail & Related papers (2023-03-28T17:04:15Z)
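A minimal sketch of one ILF iteration following the three steps above; the completion call, the refinement selector, and the finetuning step are stubs, since the paper uses an LM-based selector and standard supervised finetuning.

```python
def complete(prompt: str) -> str:
    """Stub for a language-model completion; replace with a real client."""
    return "a refined output"

def best_refinement(refinements: list[str], feedback: str) -> str:
    """Stub selector; ILF picks the refinement incorporating the most feedback."""
    return refinements[0]

def finetune(pairs: list[tuple[str, str]]) -> None:
    """Stub for supervised finetuning on (input, refinement) pairs."""
    print(f"finetuning on {len(pairs)} pairs")

def ilf_iteration(examples: list[tuple[str, str, str]], k: int = 4) -> None:
    selected = []
    for x, draft, feedback in examples:
        # Step 1: condition on input, initial output, and feedback to sample refinements.
        prompt = f"Input: {x}\nInitial output: {draft}\nFeedback: {feedback}\nRefinement:"
        refinements = [complete(prompt) for _ in range(k)]
        # Step 2: keep the refinement that best incorporates the feedback.
        selected.append((x, best_refinement(refinements, feedback)))
    # Step 3: finetune on the selected refinements, then repeat with the updated model.
    finetune(selected)

ilf_iteration([("Summarize the article ...", "A first draft ...", "Too vague; name the key result.")])
```

Iterating this loop is the process the abstract's Bayesian-inference view formalizes.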