Using Generative AI and Multi-Agents to Provide Automatic Feedback
- URL: http://arxiv.org/abs/2411.07407v1
- Date: Mon, 11 Nov 2024 22:27:36 GMT
- Title: Using Generative AI and Multi-Agents to Provide Automatic Feedback
- Authors: Shuchen Guo, Ehsan Latif, Yifan Zhou, Xuan Huang, Xiaoming Zhai,
- Abstract summary: This study investigates the use of generative AI and multi-agent systems to provide automatic feedback in educational contexts.
The research addresses a key gap in the field by exploring how multi-agent systems, called AutoFeedback, can improve the quality of GenAI-generated feedback.
- Score: 4.883570605293337
- License:
- Abstract: This study investigates the use of generative AI and multi-agent systems to provide automatic feedback in educational contexts, particularly for student constructed responses in science assessments. The research addresses a key gap in the field by exploring how multi-agent systems, called AutoFeedback, can improve the quality of GenAI-generated feedback, overcoming known issues such as over-praise and over-inference that are common in single-agent large language models (LLMs). The study developed a multi-agent system consisting of two AI agents: one for generating feedback and another for validating and refining it. The system was tested on a dataset of 240 student responses, and its performance was compared to that of a single-agent LLM. Results showed that AutoFeedback significantly reduced the occurrence of over-praise and over-inference errors, providing more accurate and pedagogically sound feedback. The findings suggest that multi-agent systems can offer a more reliable solution for generating automated feedback in educational settings, highlighting their potential for scalable and personalized learning support. These results have important implications for educators and researchers seeking to leverage AI in formative assessments, offering a pathway to more effective feedback mechanisms that enhance student learning outcomes.
Related papers
- Automated Feedback in Math Education: A Comparative Analysis of LLMs for Open-Ended Responses [0.0]
This study aims to explore the potential of Large Language Models (LLMs) in facilitating automated feedback in math education.
We employ Mistral, a version of Llama catered to math, and fine-tune this model for evaluating student responses by leveraging a dataset of student responses and teacher-written feedback for middle-school math problems.
We evaluate the model's performance in scoring accuracy and the quality of feedback by utilizing judgments from 2 teachers.
arXiv Detail & Related papers (2024-10-29T16:57:45Z) - MIRROR: A Novel Approach for the Automated Evaluation of Open-Ended Question Generation [0.4857223913212445]
We propose a novel system, MIRROR, to automate the evaluation process for questions generated by automated question generation systems.
We observed that the scores of human evaluation metrics, namely relevance, appropriateness, novelty, complexity, and grammaticality, improved when using the feedback-based approach called MIRROR.
arXiv Detail & Related papers (2024-10-16T12:24:42Z) - ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning [78.42927884000673]
ExACT is an approach to combine test-time search and self-learning to build o1-like models for agentic applications.
We first introduce Reflective Monte Carlo Tree Search (R-MCTS), a novel test time algorithm designed to enhance AI agents' ability to explore decision space on the fly.
Next, we introduce Exploratory Learning, a novel learning strategy to teach agents to search at inference time without relying on any external search algorithms.
arXiv Detail & Related papers (2024-10-02T21:42:35Z) - To Err Is AI! Debugging as an Intervention to Facilitate Appropriate Reliance on AI Systems [11.690126756498223]
Vision for optimal human-AI collaboration requires 'appropriate reliance' of humans on AI systems.
In practice, the performance disparity of machine learning models on out-of-distribution data makes dataset-specific performance feedback unreliable.
arXiv Detail & Related papers (2024-09-22T09:43:27Z) - Re-ReST: Reflection-Reinforced Self-Training for Language Agents [101.22559705696885]
Self-training in language agents can generate supervision from the agent itself.
We present Reflection-Reinforced Self-Training (Re-ReST), which uses a textitreflector to refine low-quality generated samples.
arXiv Detail & Related papers (2024-06-03T16:21:38Z) - Self-Improving Customer Review Response Generation Based on LLMs [1.9274286238176854]
SCRABLE represents an adaptive customer review response automation that enhances itself with self-optimizing prompts.
We introduce an automatic scoring mechanism that mimics the role of a human evaluator to assess the quality of responses generated in customer review domains.
arXiv Detail & Related papers (2024-05-06T20:50:17Z) - A Multimodal Automated Interpretability Agent [63.8551718480664]
MAIA is a system that uses neural models to automate neural model understanding tasks.
We first characterize MAIA's ability to describe (neuron-level) features in learned representations of images.
We then show that MAIA can aid in two additional interpretability tasks: reducing sensitivity to spurious features, and automatically identifying inputs likely to be mis-classified.
arXiv Detail & Related papers (2024-04-22T17:55:11Z) - Improving the Validity of Automatically Generated Feedback via
Reinforcement Learning [50.067342343957876]
We propose a framework for feedback generation that optimize both correctness and alignment using reinforcement learning (RL)
Specifically, we use GPT-4's annotations to create preferences over feedback pairs in an augmented dataset for training via direct preference optimization (DPO)
arXiv Detail & Related papers (2024-03-02T20:25:50Z) - ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent [50.508669199496474]
We develop a ReAct-style LLM agent with the ability to reason and act upon external knowledge.
We refine the agent through a ReST-like method that iteratively trains on previous trajectories.
Starting from a prompted large model and after just two iterations of the algorithm, we can produce a fine-tuned small model.
arXiv Detail & Related papers (2023-12-15T18:20:15Z) - Lessons Learned from EXMOS User Studies: A Technical Report Summarizing
Key Takeaways from User Studies Conducted to Evaluate The EXMOS Platform [5.132827811038276]
Two user studies aimed at illuminating the influence of different explanation types on three key dimensions: trust, understandability, and model improvement.
Results show that global model-centric explanations alone are insufficient for effectively guiding users during the intricate process of data configuration.
We present essential implications for developing interactive machine-learning systems driven by explanations.
arXiv Detail & Related papers (2023-10-03T14:04:45Z) - Empowering Private Tutoring by Chaining Large Language Models [87.76985829144834]
This work explores the development of a full-fledged intelligent tutoring system powered by state-of-the-art large language models (LLMs)
The system is into three inter-connected core processes-interaction, reflection, and reaction.
Each process is implemented by chaining LLM-powered tools along with dynamically updated memory modules.
arXiv Detail & Related papers (2023-09-15T02:42:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.