Generating Language Corrections for Teaching Physical Control Tasks
- URL: http://arxiv.org/abs/2306.07012v1
- Date: Mon, 12 Jun 2023 10:31:16 GMT
- Title: Generating Language Corrections for Teaching Physical Control Tasks
- Authors: Megha Srivastava, Noah Goodman, Dorsa Sadigh
- Abstract summary: CORGI is a model trained to generate language corrections for physical control tasks.
We show that CORGI can (i) generate valid feedback for novel student trajectories, (ii) outperform baselines on domains with novel control dynamics, and (iii) improve student learning in an interactive drawing task.
- Score: 21.186109830294072
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: AI assistance continues to help advance applications in education, from
language learning to intelligent tutoring systems, yet current methods for
providing students feedback are still quite limited. Most automatic feedback
systems either provide binary correctness feedback, which may not help a
student understand how to improve, or require hand-coding feedback templates,
which may not generalize to new domains. This can be particularly challenging
for physical control tasks, where the rich diversity in student behavior and
specialized domains make it difficult to leverage general-purpose assistive
tools for providing feedback. We design and build CORGI, a model trained to
generate language corrections for physical control tasks, such as learning to
ride a bike. CORGI takes in as input a pair of student and expert trajectories,
and then generates natural language corrections to help the student improve. We
collect and train CORGI over data from three diverse physical control tasks
(drawing, steering, and joint movement). Through both automatic and human
evaluations, we show that CORGI can (i) generate valid feedback for novel
student trajectories, (ii) outperform baselines on domains with novel control
dynamics, and (iii) improve student learning in an interactive drawing task.
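The abstract specifies CORGI at the interface level: a (student, expert) trajectory pair in, a natural language correction out. The minimal sketch below illustrates just that contract; the `Trajectory` type, the `generate_correction` name, and the placeholder heuristic standing in for the trained model are all our own assumptions, not the paper's implementation.

```python
from typing import List, Tuple

# A trajectory as a time-ordered list of (x, y) positions. Real control
# tasks (steering, joint movement) would carry richer state.
Trajectory = List[Tuple[float, float]]

def generate_correction(student: Trajectory, expert: Trajectory) -> str:
    """Stand-in for the trained model; only the contract matches the paper:
    a (student, expert) trajectory pair in, a language correction out."""
    # Toy heuristic in place of the learned generator: compare the mean
    # horizontal offset between student and expert, then phrase it.
    n = min(len(student), len(expert))
    dx = sum(e[0] - s[0] for s, e in zip(student, expert)) / n
    if dx > 0.1:
        return "try moving further to the right"
    if dx < -0.1:
        return "try moving further to the left"
    return "good: stay on this path"

student = [(0.0, t / 10) for t in range(10)]   # student drifts left of the reference
expert = [(0.5, t / 10) for t in range(10)]
print(generate_correction(student, expert))    # -> "try moving further to the right"
```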
Related papers
- Training Turn-by-Turn Verifiers for Dialogue Tutoring Agents: The Curious Case of LLMs as Your Coding Tutors [29.04639728020965]
We propose a novel agent workflow, Trace-and-Verify (TRAVER), which combines knowledge tracing to estimate a student's knowledge state and turn-by-turn verification to ensure effective guidance toward task completion.
Experiments reveal the challenges of coding tutoring and demonstrate that TRAVER achieves a significantly higher success rate.
arXiv Detail & Related papers (2025-02-18T22:13:00Z)
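As a rough illustration of the trace-and-verify loop described above, the sketch below tracks a scalar knowledge estimate and screens candidate tutor turns with a verifier before sending one; every rule here (the tracing update, the verifier check, the fallback question) is an invented stand-in for TRAVER's learned components.

```python
def update_knowledge(estimate: float, last_attempt_passed: bool) -> float:
    # Toy knowledge tracing: move the mastery estimate toward the outcome.
    target = 1.0 if last_attempt_passed else 0.0
    return estimate + 0.3 * (target - estimate)

def verifies(turn: str, knowledge: float) -> bool:
    # Toy turn-level verifier: once the student is near mastery, reject
    # tutor turns that would hand over a full solution.
    return not (knowledge > 0.7 and "here is the solution" in turn.lower())

def next_tutor_turn(candidates: list[str], knowledge: float) -> str:
    # Candidates would come from an LLM; keep the first turn that verifies.
    for turn in candidates:
        if verifies(turn, knowledge):
            return turn
    return "What have you tried so far?"   # safe fallback

k = update_knowledge(0.6, last_attempt_passed=True)      # -> 0.72
print(next_tutor_turn(["Here is the solution: ...",
                       "Check your loop bound."], k))    # -> "Check your loop bound."
```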
- Dynamic Skill Adaptation for Large Language Models [78.31322532135272]
We present Dynamic Skill Adaptation (DSA), an adaptive and dynamic framework for adapting novel and complex skills to Large Language Models (LLMs).
For every skill, we use LLMs to generate both textbook-like data, which contains detailed descriptions of the skills for pre-training, and exercise-like data, which targets explicit use of the skills to solve problems for instruction tuning.
Experiments on large language models such as LLaMA and Mistral demonstrate the effectiveness of our proposed methods in adapting math reasoning skills and social study skills.
arXiv Detail & Related papers (2024-12-26T22:04:23Z)
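A hedged sketch of the two-pronged data generation just described, assuming a generic text-generation call; the prompt wording and the `call_llm` stub are ours, not from the paper.

```python
def call_llm(prompt: str) -> str:
    # Stand-in for any text-generation API; returns canned text here.
    return f"<generated text for: {prompt[:40]}...>"

def generate_skill_data(skill: str) -> dict:
    # One prompt elicits textbook-like material for pre-training, the
    # other exercise-like material for instruction tuning.
    textbook = call_llm(f"Write a detailed textbook-style explanation of {skill}.")
    exercise = call_llm(f"Write a problem that requires {skill}, with a worked solution.")
    return {"skill": skill, "pretrain": textbook, "instruction_tune": exercise}

print(generate_skill_data("multi-step equation solving"))
```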
- WIP: A Unit Testing Framework for Self-Guided Personalized Online Robotics Learning [3.613641107321095]
This paper focuses on creating a system for unit testing and integrating it into the course workflow.
In line with the framework's personalized, student-centered approach, this method makes it easier for students to revise and debug their programming work.
Updating the course workflow to include unit tests will strengthen the learning environment and make it more interactive, so that students can learn to program robots in a self-guided fashion.
arXiv Detail & Related papers (2024-05-18T00:56:46Z)
- Improving the Validity of Automatically Generated Feedback via Reinforcement Learning [46.667783153759636]
We propose a framework for feedback generation that optimizes both correctness and alignment using reinforcement learning (RL).
Specifically, we use GPT-4's annotations to create preferences over feedback pairs in an augmented dataset for training via direct preference optimization (DPO).
arXiv Detail & Related papers (2024-03-02T20:25:50Z)
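To make the DPO step concrete, here is a small sketch of assembling (prompt, chosen, rejected) preference triples from pairwise annotations; the record fields and helper names are illustrative, not the paper's schema.

```python
from dataclasses import dataclass

@dataclass
class PreferencePair:
    prompt: str    # e.g. the question plus the student's incorrect answer
    chosen: str    # the feedback message the annotator preferred
    rejected: str  # the feedback message judged invalid or misaligned

def build_pairs(annotations: list[dict]) -> list[PreferencePair]:
    pairs = []
    for rec in annotations:
        a, b = rec["feedback_a"], rec["feedback_b"]
        chosen, rejected = (a, b) if rec["a_preferred"] else (b, a)
        pairs.append(PreferencePair(rec["prompt"], chosen, rejected))
    return pairs

annotations = [{"prompt": "Solve 2x + 3 = 9. Student answered x = 6.",
                "feedback_a": "Right idea, but divide by 2 after subtracting 3.",
                "feedback_b": "Correct!",
                "a_preferred": True}]
print(build_pairs(annotations)[0].chosen)
```

Triples in this shape are the standard input format for common DPO trainers such as TRL's `DPOTrainer`.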
- YODA: Teacher-Student Progressive Learning for Language Models [82.0172215948963]
This paper introduces YODA, a teacher-student progressive learning framework.
It emulates the teacher-student education process to improve the efficacy of model fine-tuning.
Experiments show that training LLaMA2 with data from YODA improves on standard SFT, with significant performance gains.
arXiv Detail & Related papers (2024-01-28T14:32:15Z)
- Distilling and Retrieving Generalizable Knowledge for Robot Manipulation via Language Corrections [45.420679219101245]
We present Distillation and Retrieval of Online Corrections (DROC)
DROC is a large language model (LLM)-based system that can respond to arbitrary forms of language feedback.
We demonstrate that DROC effectively distills the relevant information from the sequence of online corrections into a knowledge base.
arXiv Detail & Related papers (2023-11-17T18:00:20Z)
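A toy sketch of the distill-and-retrieve pattern described above; the string-joined "rules" and flat keyword retrieval are deliberately naive stand-ins for DROC's LLM-based distillation and retrieval.

```python
knowledge_base: list[dict] = []

def distill(corrections: list[str], task: str) -> None:
    # A real system would ask an LLM to condense the corrections into a
    # generalizable rule; here we just store them verbatim.
    knowledge_base.append({"task": task, "rule": " ; ".join(corrections)})

def retrieve(new_task: str) -> list[str]:
    # Deliberately naive retrieval by word overlap with past task names.
    words = set(new_task.lower().split())
    return [entry["rule"] for entry in knowledge_base
            if words & set(entry["task"].lower().split())]

distill(["grasp the mug by the handle"], task="pick up the mug")
print(retrieve("place the mug on the shelf"))   # finds the mug-handling rule
```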
- Empowering Private Tutoring by Chaining Large Language Models [87.76985829144834]
This work explores the development of a full-fledged intelligent tutoring system powered by state-of-the-art large language models (LLMs).
The system is divided into three interconnected core processes: interaction, reflection, and reaction.
Each process is implemented by chaining LLM-powered tools along with dynamically updated memory modules.
arXiv Detail & Related papers (2023-09-15T02:42:03Z)
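The sketch below illustrates, under our own assumptions, what chaining three such processes around a shared, dynamically updated memory might look like; none of the function bodies reflect the actual system.

```python
memory: dict = {"history": [], "learner_profile": {}}   # dynamically updated

def interaction(student_msg: str) -> str:
    # Stand-in for an LLM call that produces the tutor's visible reply.
    memory["history"].append(("student", student_msg))
    return f"Tutor reply to: {student_msg}"

def reflection() -> None:
    # Summarize the dialogue so far back into memory (an LLM call in a
    # real system; a turn count suffices for the sketch).
    memory["learner_profile"]["turns_so_far"] = len(memory["history"])

def reaction() -> str:
    # Choose the next pedagogical move from the reflected state.
    if memory["learner_profile"]["turns_so_far"] < 3:
        return "ask a probing question"
    return "give a worked example"

print(interaction("I don't understand fractions."))
reflection()
print(reaction())   # -> "ask a probing question"
```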
- Teachable Reinforcement Learning via Advice Distillation [161.43457947665073]
We propose a new supervision paradigm for interactive learning based on "teachable" decision-making systems that learn from structured advice provided by an external teacher.
We show that agents that learn from advice can acquire new skills with significantly less human supervision than standard reinforcement learning algorithms.
arXiv Detail & Related papers (2022-03-19T03:22:57Z)
- ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback [54.142719510638614]
In this paper, we frame the problem of providing feedback as few-shot classification.
A meta-learner adapts to give feedback on student code for a new programming question from just a few instructor-provided examples.
Our approach was successfully deployed to deliver feedback on 16,000 student exam solutions in a programming course offered by a tier 1 university.
arXiv Detail & Related papers (2021-07-23T22:41:28Z)
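Few-shot feedback classification of this kind is commonly implemented prototypical-network style: embed the handful of instructor-labeled examples per feedback class, average them into prototypes, and label new work by nearest prototype. A minimal sketch under that assumption (the embeddings here are random placeholders):

```python
import numpy as np

def prototypes(support: dict[str, np.ndarray]) -> dict[str, np.ndarray]:
    # support maps a feedback label to a (k, d) array of k example embeddings.
    return {label: embs.mean(axis=0) for label, embs in support.items()}

def classify(query: np.ndarray, protos: dict[str, np.ndarray]) -> str:
    # Label a new embedding by its nearest class prototype.
    return min(protos, key=lambda label: np.linalg.norm(query - protos[label]))

rng = np.random.default_rng(0)                     # random stand-in embeddings
support = {"off-by-one error": rng.normal(0.0, 1.0, (3, 8)),
           "missing base case": rng.normal(3.0, 1.0, (3, 8))}
print(classify(rng.normal(3.0, 1.0, 8), prototypes(support)))  # -> "missing base case"
```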