Related papers: Using Generative AI and Multi-Agents to Provide Automatic Feedback

Related papers

LLM-based Multimodal Feedback Produces Equivalent Learning and Better Student Perceptions than Educator Feedback [4.225232488376583]
This study introduces a real-time AI-facilitated multimodal feedback system that integrates structured textual explanations with dynamic multimedia resources.<n>In an online crowdsourcing experiment, we compared this system against fixed business-as-usual feedback by educators across three dimensions.<n>Results showed that AI multimodal feedback achieved learning gains equivalent to original educator feedback while significantly outperforming it on perceived clarity, specificity, conciseness, motivation, satisfaction, and reducing cognitive load.
arXiv Detail & Related papers (2026-01-21T18:58:08Z)
Let the Barbarians In: How AI Can Accelerate Systems Performance Research [80.43506848683633]
We term this iterative cycle of generation, evaluation, and refinement AI-Driven Research for Systems.<n>We demonstrate that ADRS-generated solutions can match or even outperform human state-of-the-art designs.
arXiv Detail & Related papers (2025-12-16T18:51:23Z)
SelfAI: Building a Self-Training AI System with LLM Agents [79.10991818561907]
SelfAI is a general multi-agent platform that combines a User Agent for translating high-level research objectives into standardized experimental configurations.<n>An Experiment Manager orchestrates parallel, fault-tolerant training across heterogeneous hardware while maintaining a structured knowledge base for continuous feedback.<n>Across regression, computer vision, scientific computing, medical imaging, and drug discovery benchmarks, SelfAI consistently achieves strong performance and reduces redundant trials.
arXiv Detail & Related papers (2025-11-29T09:18:39Z)
AgentEvolver: Towards Efficient Self-Evolving Agent System [51.54882384204726]
We present AgentEvolver, a self-evolving agent system that drives autonomous agent learning.<n>AgentEvolver introduces three synergistic mechanisms: self-questioning, self-navigating, and self-attributing.<n>Preliminary experiments indicate that AgentEvolver achieves more efficient exploration, better sample utilization, and faster adaptation compared to traditional RL-based baselines.
arXiv Detail & Related papers (2025-11-13T15:14:47Z)
A Survey on Feedback Types in Automated Programming Assessment Systems [3.9845307287664973]
This study investigates how different feedback mechanisms in APASs are perceived by students, and how effective they are in supporting problem-solving.<n>Results indicate that while students rate unit test feedback as the most helpful, AI-generated feedback leads to significantly better performances.
arXiv Detail & Related papers (2025-10-21T09:08:22Z)
Human-in-the-Loop Systems for Adaptive Learning Using Generative AI [0.6780998887296331]
Student-driven feedback loops can modify AI-generated responses for improved student retention and engagement.<n>Preliminary findings from a study with STEM students indicate improved learning outcomes and confidence compared to traditional AI tools.
arXiv Detail & Related papers (2025-08-14T20:44:34Z)
Can Language Models Critique Themselves? Investigating Self-Feedback for Retrieval Augmented Generation at BioASQ 2025 [1.6819960041696331]
RAG and 'deep research' systems aim to enable autonomous search processes where Large Language Models (LLMs) iteratively refine outputs.<n>Applying these systems to domain-specific professional search, such as biomedical research, presents challenges.<n>We investigated whether this iterative self-correction improves performance and if reasoning models are more capable of generating useful feedback.
arXiv Detail & Related papers (2025-08-07T13:13:19Z)
UniConv: Unifying Retrieval and Response Generation for Large Language Models in Conversations [71.79210031338464]
We show how to unify dense retrieval and response generation for large language models in conversation.<n>We conduct joint fine-tuning with different objectives and design two mechanisms to reduce the inconsistency risks.<n>The evaluations on five conversational search datasets demonstrate that our unified model can mutually improve both tasks and outperform the existing baselines.
arXiv Detail & Related papers (2025-07-09T17:02:40Z)
AutoGen Driven Multi Agent Framework for Iterative Crime Data Analysis and Prediction [0.3749861135832073]
This paper introduces LUCID-MA, an innovative AI powered framework where multiple AI agents collaboratively analyze and understand crime data.<n>Our system that consists of three core components: an analysis assistant highlights crime patterns; a feedback component that reviews and refines analytical results; and a prediction component that forecasts future crime trends.<n>A scoring function is incorporated to evaluate agent performance, providing visual plots to track learning progress.
arXiv Detail & Related papers (2025-06-13T05:39:28Z)
Evolution of AI in Education: Agentic Workflows [2.1681971652284857]
Artificial intelligence (AI) has transformed various aspects of education. Large language models (LLMs) are driving advancements in automated tutoring, assessment, and content generation. To address these limitations and foster more sustainable technological practices, AI agents have emerged as a promising new avenue for educational innovation.
arXiv Detail & Related papers (2025-04-25T13:44:57Z)
On the Role of Feedback in Test-Time Scaling of Agentic AI Workflows [71.92083784393418]
Agentic AI (systems that autonomously plan and act) are becoming widespread, yet their task success rate on complex tasks remains low.<n>Inference-time alignment relies on three components: sampling, evaluation, and feedback.<n>We introduce Iterative Agent Decoding (IAD), a procedure that repeatedly inserts feedback extracted from different forms of critiques.
arXiv Detail & Related papers (2025-04-02T17:40:47Z)
Memento No More: Coaching AI Agents to Master Multiple Tasks via Hints Internalization [56.674356045200696]
We propose a novel method to train AI agents to incorporate knowledge and skills for multiple tasks without the need for cumbersome note systems or prior high-quality demonstration data. Our approach employs an iterative process where the agent collects new experiences, receives corrective feedback from humans in the form of hints, and integrates this feedback into its weights. We demonstrate the efficacy of our approach by implementing it in a Llama-3-based agent which, after only a few rounds of feedback, outperforms advanced models GPT-4o and DeepSeek-V3 in a taskset.
arXiv Detail & Related papers (2025-02-03T17:45:46Z)
A Zero-Shot LLM Framework for Automatic Assignment Grading in Higher Education [0.6141800972050401]
We propose a Zero-Shot Large Language Model (LLM)-Based Automated Assignment Grading (AAG) system. This framework leverages prompt engineering to evaluate both computational and explanatory student responses without requiring additional training or fine-tuning. The AAG system delivers tailored feedback that highlights individual strengths and areas for improvement, thereby enhancing student learning outcomes.
arXiv Detail & Related papers (2025-01-24T08:01:41Z)
Automated Feedback in Math Education: A Comparative Analysis of LLMs for Open-Ended Responses [0.0]
This study aims to explore the potential of Large Language Models (LLMs) in facilitating automated feedback in math education. We employ Mistral, a version of Llama catered to math, and fine-tune this model for evaluating student responses by leveraging a dataset of student responses and teacher-written feedback for middle-school math problems. We evaluate the model's performance in scoring accuracy and the quality of feedback by utilizing judgments from 2 teachers.
arXiv Detail & Related papers (2024-10-29T16:57:45Z)
MIRROR: A Novel Approach for the Automated Evaluation of Open-Ended Question Generation [0.4857223913212445]
We propose a novel system, MIRROR, to automate the evaluation process for questions generated by automated question generation systems. We observed that the scores of human evaluation metrics, namely relevance, appropriateness, novelty, complexity, and grammaticality, improved when using the feedback-based approach called MIRROR.
arXiv Detail & Related papers (2024-10-16T12:24:42Z)
ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning [78.42927884000673]
ExACT is an approach to combine test-time search and self-learning to build o1-like models for agentic applications. We first introduce Reflective Monte Carlo Tree Search (R-MCTS), a novel test time algorithm designed to enhance AI agents' ability to explore decision space on the fly. Next, we introduce Exploratory Learning, a novel learning strategy to teach agents to search at inference time without relying on any external search algorithms.
arXiv Detail & Related papers (2024-10-02T21:42:35Z)
To Err Is AI! Debugging as an Intervention to Facilitate Appropriate Reliance on AI Systems [11.690126756498223]
Vision for optimal human-AI collaboration requires 'appropriate reliance' of humans on AI systems. In practice, the performance disparity of machine learning models on out-of-distribution data makes dataset-specific performance feedback unreliable.
arXiv Detail & Related papers (2024-09-22T09:43:27Z)
Re-ReST: Reflection-Reinforced Self-Training for Language Agents [101.22559705696885]
Self-training in language agents can generate supervision from the agent itself. We present Reflection-Reinforced Self-Training (Re-ReST), which uses a textitreflector to refine low-quality generated samples.
arXiv Detail & Related papers (2024-06-03T16:21:38Z)
Self-Improving Customer Review Response Generation Based on LLMs [1.9274286238176854]
SCRABLE represents an adaptive customer review response automation that enhances itself with self-optimizing prompts. We introduce an automatic scoring mechanism that mimics the role of a human evaluator to assess the quality of responses generated in customer review domains.
arXiv Detail & Related papers (2024-05-06T20:50:17Z)
A Multimodal Automated Interpretability Agent [63.8551718480664]
MAIA is a system that uses neural models to automate neural model understanding tasks. We first characterize MAIA's ability to describe (neuron-level) features in learned representations of images. We then show that MAIA can aid in two additional interpretability tasks: reducing sensitivity to spurious features, and automatically identifying inputs likely to be mis-classified.
arXiv Detail & Related papers (2024-04-22T17:55:11Z)
ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models [56.08917291606421]
ResearchAgent is an AI-based system for ideation and operationalization of novel work. ResearchAgent automatically defines novel problems, proposes methods and designs experiments, while iteratively refining them. We experimentally validate our ResearchAgent on scientific publications across multiple disciplines.
arXiv Detail & Related papers (2024-04-11T13:36:29Z)
Improving the Validity of Automatically Generated Feedback via Reinforcement Learning [50.067342343957876]
We propose a framework for feedback generation that optimize both correctness and alignment using reinforcement learning (RL) Specifically, we use GPT-4's annotations to create preferences over feedback pairs in an augmented dataset for training via direct preference optimization (DPO)
arXiv Detail & Related papers (2024-03-02T20:25:50Z)
ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent [50.508669199496474]
We develop a ReAct-style LLM agent with the ability to reason and act upon external knowledge. We refine the agent through a ReST-like method that iteratively trains on previous trajectories. Starting from a prompted large model and after just two iterations of the algorithm, we can produce a fine-tuned small model.
arXiv Detail & Related papers (2023-12-15T18:20:15Z)
Lessons Learned from EXMOS User Studies: A Technical Report Summarizing Key Takeaways from User Studies Conducted to Evaluate The EXMOS Platform [5.132827811038276]
Two user studies aimed at illuminating the influence of different explanation types on three key dimensions: trust, understandability, and model improvement. Results show that global model-centric explanations alone are insufficient for effectively guiding users during the intricate process of data configuration. We present essential implications for developing interactive machine-learning systems driven by explanations.
arXiv Detail & Related papers (2023-10-03T14:04:45Z)
Empowering Private Tutoring by Chaining Large Language Models [87.76985829144834]
This work explores the development of a full-fledged intelligent tutoring system powered by state-of-the-art large language models (LLMs) The system is into three inter-connected core processes-interaction, reflection, and reaction. Each process is implemented by chaining LLM-powered tools along with dynamically updated memory modules.
arXiv Detail & Related papers (2023-09-15T02:42:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.