AutoTutor meets Large Language Models: A Language Model Tutor with Rich Pedagogy and Guardrails
- URL: http://arxiv.org/abs/2402.09216v3
- Date: Thu, 25 Apr 2024 13:15:55 GMT
- Authors: Sankalan Pal Chowdhury, Vilém Zouhar, Mrinmaya Sachan
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have found several use cases in education, ranging from automatic question generation to essay evaluation. In this paper, we explore the potential of using Large Language Models (LLMs) to author Intelligent Tutoring Systems. A common pitfall of LLMs is that they stray from desired pedagogical strategies, for example by leaking the answer to the student, and in general they provide no guarantees. We posit that while LLMs with certain guardrails can take the place of subject experts, the overall pedagogical design still needs to be handcrafted for the best learning results. Based on this principle, we create a sample end-to-end tutoring system named MWPTutor, which uses LLMs to fill in the state space of a pre-defined finite state transducer. This approach retains the structure and pedagogy of traditional tutoring systems that have been developed over the years by learning scientists, but brings in the additional flexibility of LLM-based approaches. Through a human evaluation study on two datasets based on math word problems, we show that our hybrid approach achieves a better overall tutoring score than an instructed, but otherwise free-form, GPT-4. MWPTutor is completely modular and opens up the scope for the community to improve its performance by improving individual modules or using different teaching strategies that it can follow.
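The core architecture described in the abstract, a handcrafted finite state transducer whose state content is filled in by an LLM, can be illustrated with a minimal sketch. The state names, transitions, and `llm` callback below are hypothetical illustrations of the general pattern, not MWPTutor's actual code.

```python
# Minimal sketch of an FST-driven tutor: the pedagogy (states and
# transitions) is hand-designed, while an LLM only verbalizes each move.
# All names here are illustrative, not taken from MWPTutor.

def llm(prompt):
    # Stand-in for a real LLM call; returns canned text for this sketch.
    return f"[LLM output for: {prompt}]"

# Handcrafted pedagogy: the transducer decides the next tutoring move.
TRANSITIONS = {
    ("ask_problem", "correct"): "praise",
    ("ask_problem", "incorrect"): "give_hint",
    ("give_hint", "correct"): "praise",
    ("give_hint", "incorrect"): "show_step",
    ("show_step", "any"): "ask_problem",
    ("praise", "any"): "done",
}

def tutor_turn(state, student_signal):
    """Advance the tutor one turn: the FST picks the next pedagogical
    state; the LLM generates the wording for that state only, so a
    state whose prompt forbids revealing the answer cannot leak it."""
    key = (state, student_signal)
    if key not in TRANSITIONS:
        key = (state, "any")  # fall back to the wildcard transition
    next_state = TRANSITIONS[key]
    utterance = llm(f"Write a '{next_state}' message for this math word problem")
    return next_state, utterance

state = "ask_problem"
state, msg = tutor_turn(state, "incorrect")  # FST moves to "give_hint"
```

The guardrail comes from the structure: the LLM never chooses the tutoring strategy, it only fills in the utterance for the state the transducer selected.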
Related papers
- Developing a Tutoring Dialog Dataset to Optimize LLMs for Educational Use
Large language models (LLMs) have shown promise for scalable educational applications.
Our study explores the use of smaller, more affordable LLMs for one-on-one tutoring in the context of solving reading comprehension problems.
arXiv Detail & Related papers (2024-10-25T00:40:21Z)
- Towards the Pedagogical Steering of Large Language Models for Tutoring: A Case Study with Modeling Productive Failure
One-to-one tutoring is one of the most efficient methods of teaching.
We create a prototype tutor for high school math following Productive Failure (PF), an advanced and effective learning design.
We quantitatively show that StratL succeeds in steering the LLM to follow a Productive Failure tutoring strategy.
arXiv Detail & Related papers (2024-10-03T16:15:41Z)
- BIPED: Pedagogically Informed Tutoring System for ESL Education
Large Language Models (LLMs) have great potential to serve as readily available and cost-efficient Conversational Intelligent Tutoring Systems (CITS).
Existing CITS are designed to teach only simple concepts or lack the pedagogical depth necessary to address diverse learning strategies.
We construct a BIlingual PEDagogically-informed Tutoring dataset of one-on-one, human-to-human English tutoring interactions.
arXiv Detail & Related papers (2024-06-05T17:49:24Z)
- The Life Cycle of Large Language Models: A Review of Biases in Education
Large Language Models (LLMs) are increasingly adopted in educational contexts to provide personalized support to students and teachers.
The integration of LLMs in education technology has renewed concerns over algorithmic bias which may exacerbate educational inequities.
This review aims to clarify the complex nature of bias in LLM applications and provide practical guidance for their evaluation to promote educational equity.
arXiv Detail & Related papers (2024-06-03T18:00:28Z)
- From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems
Large language model (LLM) empowered agents are able to solve decision-making problems in the physical world.
Under this model, the LLM Planner navigates a partially observable Markov decision process (POMDP) by iteratively generating language-based subgoals via prompting.
We prove that the pretrained LLM Planner effectively performs Bayesian aggregated imitation learning (BAIL) through in-context learning.
arXiv Detail & Related papers (2024-05-30T09:42:54Z)
- Supervised Knowledge Makes Large Language Models Better In-context Learners
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs by: 1) generalizing to out-of-distribution data, 2) elucidating how LLMs benefit from discriminative models, and 3) minimizing hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z)
- Democratizing Reasoning Ability: Tailored Learning from Large Language Model
We propose a tailored learning approach to distill such reasoning ability to smaller LMs.
We exploit the potential of LLM as a reasoning teacher by building an interactive multi-round learning paradigm.
To exploit the reasoning potential of the smaller LM, we propose self-reflection learning to motivate the student to learn from self-made mistakes.
arXiv Detail & Related papers (2023-10-20T07:50:10Z)
- A Survey on Large Language Models for Recommendation
Large Language Models (LLMs) have emerged as powerful tools in the field of Natural Language Processing (NLP).
This survey presents a taxonomy that categorizes these models into two major paradigms: Discriminative LLMs for Recommendation (DLLM4Rec) and Generative LLMs for Recommendation (GLLM4Rec).
arXiv Detail & Related papers (2023-05-31T13:51:26Z)
- Learning Multi-Objective Curricula for Deep Reinforcement Learning
Various automatic curriculum learning (ACL) methods have been proposed to improve the sample efficiency and final performance of deep reinforcement learning (DRL).
In this paper, we propose a unified automatic curriculum learning framework to create multi-objective but coherent curricula.
In addition to existing hand-designed curriculum paradigms, we further design a flexible memory mechanism to learn an abstract curriculum.
arXiv Detail & Related papers (2021-10-06T19:30:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.