Improving Assessment of Tutoring Practices using Retrieval-Augmented
Generation
- URL: http://arxiv.org/abs/2402.14594v1
- Date: Sun, 4 Feb 2024 20:42:30 GMT
- Title: Improving Assessment of Tutoring Practices using Retrieval-Augmented
Generation
- Authors: Zifei (FeiFei) Han, Jionghao Lin, Ashish Gurung, Danielle R. Thomas,
Eason Chen, Conrad Borchers, Shivang Gupta, Kenneth R. Koedinger
- Abstract summary: One-on-one tutoring is an effective instructional method for enhancing learning, yet its efficacy hinges on tutor competencies.
This study aims to harness Generative Pre-trained Transformers (GPT), such as GPT-3.5 and GPT-4 models, to automatically assess tutors' ability to use social-emotional tutoring strategies.
- Score: 10.419430731115405
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One-on-one tutoring is an effective instructional method for enhancing
learning, yet its efficacy hinges on tutor competencies. Novice math tutors
often prioritize content-specific guidance, neglecting aspects such as
social-emotional learning. Social-emotional learning promotes equity and
inclusion and nurturing relationships with students, which is crucial for
holistic student development. Assessing the competencies of tutors accurately
and efficiently can drive the development of tailored tutor training programs.
However, evaluating novice tutor ability during real-time tutoring remains
challenging as it typically requires experts-in-the-loop. To address this
challenge, this preliminary study aims to harness Generative Pre-trained
Transformers (GPT), such as GPT-3.5 and GPT-4 models, to automatically assess
tutors' ability to use social-emotional tutoring strategies. Moreover, this
study also reports on the financial dimensions and considerations of employing
these models in real-time and at scale for automated assessment. The current
study examined four prompting strategies: two basic zero-shot prompt
strategies, a Tree of Thought prompt, and a Retrieval-Augmented Generation (RAG)
based prompt. The results indicate that the RAG prompt demonstrated more
accurate performance (assessed by the level of hallucination and correctness in
the generated assessment texts) and lower financial costs than the other
strategies evaluated. These findings inform the development of personalized
tutor training interventions to enhance the educational effectiveness of
tutored learning.
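To make the idea of a RAG-based prompt concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not say how examples were retrieved or how the prompt was worded. The sketch assumes a small pool of expert-labeled reference responses (`EXAMPLES`, invented for illustration), a bag-of-words cosine similarity as the retriever, and hypothetical helper names (`retrieve`, `build_rag_prompt`). A real system would more likely use embedding-based retrieval and then send the resulting prompt to a GPT model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, examples, k=2):
    """Return the k reference examples most similar to the query."""
    q = Counter(tokenize(query))
    return sorted(
        examples,
        key=lambda ex: cosine(q, Counter(tokenize(ex["response"]))),
        reverse=True,
    )[:k]


def build_rag_prompt(tutor_response, examples, k=2):
    """Assemble an assessment prompt grounded in retrieved examples."""
    lines = [
        "Assess whether the tutor's response uses a social-emotional "
        "tutoring strategy.",
        "Reference examples with expert labels:",
    ]
    for ex in retrieve(tutor_response, examples, k):
        lines.append(f'- Response: "{ex["response"]}" | Label: {ex["label"]}')
    lines.append(f'Tutor response to assess: "{tutor_response}"')
    lines.append("Give a label and a brief justification based on the examples.")
    return "\n".join(lines)


# Hypothetical labeled pool; the responses and labels are illustrative only.
EXAMPLES = [
    {"response": "You worked hard on that problem and your effort is paying off.",
     "label": "effective"},
    {"response": "Great job, you are so smart.", "label": "less effective"},
    {"response": "That answer is wrong, try again.", "label": "less effective"},
]

QUERY = "I can see how much effort you put into that problem."
prompt = build_rag_prompt(QUERY, EXAMPLES, k=2)
print(prompt)
```

Because only the `k` retrieved examples enter the prompt, its length stays bounded as the reference pool grows, which is one plausible reason a retrieval-based prompt could cost less per call than a prompt that always includes every example.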
Related papers
- Multi-Modal Self-Supervised Learning for Surgical Feedback Effectiveness Assessment [66.6041949490137]
We propose a method that integrates information from transcribed verbal feedback and corresponding surgical video to predict feedback effectiveness.
Our findings show that both transcribed feedback and surgical video are individually predictive of trainee behavior changes.
Our results demonstrate the potential of multi-modal learning to advance the automated assessment of surgical feedback.
arXiv Detail & Related papers (2024-11-17T00:13:00Z)
- Towards Responsible Development of Generative AI for Education: An Evaluation-Driven Approach [25.903775277417267]
Recent advances in generative AI (gen AI) have created excitement about the potential of new technologies to offer a personal tutor for every learner and a teaching assistant for every teacher.
We argue that this is primarily due to the difficulties with verbalising pedagogical intuitions into gen AI prompts and the lack of good evaluation practices.
Here we present our work collaborating with learners and educators to translate high level principles from learning science into a pragmatic set of seven diverse educational benchmarks.
arXiv Detail & Related papers (2024-05-21T19:27:59Z)
- Evaluating and Optimizing Educational Content with Large Language Model Judgments [52.33701672559594]
We use Language Models (LMs) as educational experts to assess the impact of various instructions on learning outcomes.
We introduce an instruction optimization approach in which one LM generates instructional materials using the judgments of another LM as a reward function.
Human teachers' evaluations of these LM-generated worksheets show a significant alignment between the LM judgments and human teacher preferences.
arXiv Detail & Related papers (2024-03-05T09:09:15Z)
- Improving the Validity of Automatically Generated Feedback via Reinforcement Learning [50.067342343957876]
We propose a framework for feedback generation that optimizes both correctness and alignment using reinforcement learning (RL).
Specifically, we use GPT-4's annotations to create preferences over feedback pairs in an augmented dataset for training via direct preference optimization (DPO).
arXiv Detail & Related papers (2024-03-02T20:25:50Z)
- Using Large Language Models to Assess Tutors' Performance in Reacting to Students Making Math Errors [2.099922236065961]
We investigate the capacity of generative AI to evaluate real-life tutors' performance in responding to students making math errors.
By analyzing 50 real-life tutoring dialogues, we find both GPT-3.5-Turbo and GPT-4 demonstrate proficiency in assessing the criteria related to reacting to students making errors.
GPT-4 tends to overidentify instances of students making errors, often attributing student uncertainty or inferring potential errors where human evaluators did not.
arXiv Detail & Related papers (2024-01-06T15:34:27Z)
- Towards Goal-oriented Intelligent Tutoring Systems in Online Education [69.06930979754627]
We propose a new task, named Goal-oriented Intelligent Tutoring Systems (GITS).
GITS aims to enable the student's mastery of a designated concept by strategically planning a customized sequence of exercises and assessment.
We propose a novel graph-based reinforcement learning framework, named Planning-Assessment-Interaction (PAI).
arXiv Detail & Related papers (2023-12-03T12:37:16Z)
- Implementing Learning Principles with a Personal AI Tutor: A Case Study [2.94944680995069]
This research demonstrates the ability of personal AI tutors to model human learning processes and effectively enhance academic performance.
By integrating AI tutors into their programs, educators can offer students personalized learning experiences grounded in the principles of learning sciences.
arXiv Detail & Related papers (2023-09-10T15:35:47Z)
- Comparative Analysis of GPT-4 and Human Graders in Evaluating Praise Given to Students in Synthetic Dialogues [2.3361634876233817]
Large language models, such as the AI-chatbot ChatGPT, hold potential for offering constructive feedback to tutors in practical settings.
The accuracy of AI-generated feedback remains uncertain, with scant research investigating the ability of models like ChatGPT to deliver effective feedback.
arXiv Detail & Related papers (2023-07-05T04:14:01Z)
- A Machine Learning system to monitor student progress in educational institutes [0.0]
We propose a data-driven approach that uses Machine Learning techniques to generate a classifier called credit score.
Using the credit score as a progress indicator is well suited to a Learning Management System.
arXiv Detail & Related papers (2022-11-02T08:24:08Z)
- Teachable Reinforcement Learning via Advice Distillation [161.43457947665073]
We propose a new supervision paradigm for interactive learning based on "teachable" decision-making systems that learn from structured advice provided by an external teacher.
We show that agents that learn from advice can acquire new skills with significantly less human supervision than standard reinforcement learning algorithms.
arXiv Detail & Related papers (2022-03-19T03:22:57Z)
- Distribution Matching for Machine Teaching [64.39292542263286]
Machine teaching is an inverse problem of machine learning that aims at steering the student learner towards its target hypothesis.
Previous studies on machine teaching focused on balancing the teaching risk and cost to find the best teaching examples.
This paper presents a distribution matching-based machine teaching strategy.
arXiv Detail & Related papers (2021-05-06T09:32:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.