Leveraging Lecture Content for Improved Feedback: Explorations with GPT-4 and Retrieval Augmented Generation
- URL: http://arxiv.org/abs/2405.06681v1
- Date: Sun, 5 May 2024 18:32:06 GMT
- Title: Leveraging Lecture Content for Improved Feedback: Explorations with GPT-4 and Retrieval Augmented Generation
- Authors: Sven Jacobs, Steffen Jaschke
- Abstract summary: This paper presents the use of Retrieval Augmented Generation to improve the feedback generated by Large Language Models for programming tasks.
Corresponding lecture recordings were transcribed and made available to the Large Language Model GPT-4 as an external knowledge source.
The purpose of this is to prevent hallucinations and to enforce the use of technical terms and phrases from the lecture.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents the use of Retrieval Augmented Generation (RAG) to improve the feedback generated by Large Language Models for programming tasks. For this purpose, the corresponding lecture recordings were transcribed and made available to the Large Language Model GPT-4 as an external knowledge source, together with timestamps as meta-information, by using RAG. The purpose of this is to prevent hallucinations and to enforce the use of technical terms and phrases from the lecture. In an exercise platform developed for an introductory programming lecture, students can request GPT-4-generated feedback on their solutions to programming problems. For this task, GPT-4 receives the student's code solution, the compiler output, the results of the unit tests, and the relevant passages from the lecture notes retrieved via RAG as additional context. The generated feedback should guide students to solve problems independently and link to the lecture content, using the timestamps of the transcript as meta-information so that the lecture videos can be opened immediately at the relevant positions. For the evaluation, students worked with the tool in a workshop and decided for each piece of feedback whether it should be extended by RAG. First results, based on a questionnaire and the collected usage data, show that the use of RAG can improve feedback generation and is preferred by students in some situations. Due to the slower speed of feedback generation with RAG, the benefits are situation-dependent.
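The pipeline described in the abstract (retrieve timestamped transcript passages, then bundle them with the student's code, compiler output, and unit test results into the model's context) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the transcript chunks, the bag-of-words similarity (standing in for whatever retriever the authors used), and the prompt layout are all assumptions.

```python
import math
from collections import Counter

# Hypothetical transcript chunks: (start timestamp in seconds, text).
# The paper keeps timestamps as metadata so feedback can link into the video;
# the chunk contents here are invented for illustration.
TRANSCRIPT_CHUNKS = [
    (120, "A for loop repeats a block of statements a fixed number of times"),
    (305, "Unit tests check that a method returns the expected value"),
    (610, "The compiler reports a syntax error when a semicolon is missing"),
]

def score(query: str, text: str) -> float:
    """Cosine similarity over bag-of-words counts (a stand-in for an embedding model)."""
    q, t = Counter(query.lower().split()), Counter(text.lower().split())
    dot = sum(q[w] * t[w] for w in set(q) & set(t))
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in t.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 1):
    """Return the k most relevant chunks, keeping timestamps as metadata."""
    return sorted(TRANSCRIPT_CHUNKS, key=lambda c: score(query, c[1]), reverse=True)[:k]

def build_prompt(code: str, compiler_output: str, test_result: str, query: str) -> str:
    """Assemble the additional context the model receives, as described in the abstract."""
    passages = "\n".join(f"[t={ts}s] {text}" for ts, text in retrieve(query))
    return (
        "Student code:\n" + code + "\n"
        "Compiler output:\n" + compiler_output + "\n"
        "Unit test result:\n" + test_result + "\n"
        "Relevant lecture passages (with video timestamps):\n" + passages
    )
```

With the hypothetical chunks above, a query about a missing semicolon retrieves the third chunk, so the generated feedback can point the student to the lecture video at second 610.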
Related papers
- Retriever-and-Memory: Towards Adaptive Note-Enhanced Retrieval-Augmented Generation [72.70046559930555]
We propose a generic RAG approach called Adaptive Note-Enhanced RAG (Adaptive-Note) for complex QA tasks.
Specifically, Adaptive-Note introduces an overarching view of knowledge growth, iteratively gathering new information in the form of notes.
In addition, we employ an adaptive, note-based stop-exploration strategy to decide "what to retrieve and when to stop" to encourage sufficient knowledge exploration.
arXiv Detail & Related papers (2024-10-11T14:03:29Z) - GPT-4 as a Homework Tutor can Improve Student Engagement and Learning Outcomes [80.60912258178045]
We developed a prompting strategy that enables GPT-4 to conduct interactive homework sessions for high-school students learning English as a second language.
We carried out a Randomized Controlled Trial (RCT) in four high-school classes, replacing traditional homework with GPT-4 homework sessions for the treatment group.
We observed significant improvements in learning outcomes, specifically a greater gain in grammar, as well as increased student engagement.
arXiv Detail & Related papers (2024-09-24T11:22:55Z) - Evaluating the Application of Large Language Models to Generate Feedback in Programming Education [0.0]
This study investigates the application of large language models, specifically GPT-4, to enhance programming education.
The research outlines the design of a web application that uses GPT-4 to provide feedback on programming tasks, without giving away the solution.
arXiv Detail & Related papers (2024-03-13T23:14:35Z) - Feedback-Generation for Programming Exercises With GPT-4 [0.0]
This paper explores the quality of GPT-4 Turbo's generated output for prompts containing both the programming task specification and a student's submission as input.
The output was qualitatively analyzed regarding correctness, personalization, fault localization, and other features identified in the material.
arXiv Detail & Related papers (2024-03-07T12:37:52Z) - Improving the Validity of Automatically Generated Feedback via Reinforcement Learning [50.067342343957876]
We propose a framework for feedback generation that optimizes both correctness and alignment using reinforcement learning (RL).
Specifically, we use GPT-4's annotations to create preferences over feedback pairs in an augmented dataset for training via direct preference optimization (DPO).
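The DPO objective mentioned above can be written down compactly. The sketch below is a generic, stdlib-only rendering of the standard DPO loss for a single preference pair, not code from that paper; the log-probability values in the test are invented numbers.

```python
import math

def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair:
    -log(sigmoid(beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)))).
    The loss shrinks as the policy prefers the chosen feedback more strongly
    than the reference model does."""
    margin = beta * ((logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When policy and reference agree exactly, the margin is zero and the loss is log 2; preferring the GPT-4-chosen feedback lowers it, preferring the rejected one raises it.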
arXiv Detail & Related papers (2024-03-02T20:25:50Z) - CritiqueLLM: Towards an Informative Critique Generation Model for Evaluation of Large Language Model Generation [87.44350003888646]
Eval-Instruct can acquire pointwise grading critiques with pseudo references and revise these critiques via multi-path prompting.
CritiqueLLM is empirically shown to outperform ChatGPT and all the open-source baselines.
arXiv Detail & Related papers (2023-11-30T16:52:42Z) - Question-Answering Approach to Evaluating Legal Summaries [0.43512163406551996]
GPT-4 is used to generate a set of question-answer pairs that cover main points and information in the reference summary.
GPT-4 is then used to generate answers based on the generated summary for the questions from the reference summary.
GPT-4 grades the answers from the reference summary and the generated summary.
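The three-step protocol above (generate QA pairs from the reference summary, answer them from the generated summary, grade the answers) can be sketched as a small pipeline. The `ask` callable is a placeholder for a GPT-4 call, and the prompt wording is invented for illustration; neither comes from the paper.

```python
def evaluate_summary(reference: str, generated: str, ask) -> str:
    """QA-based summary evaluation; `ask` is any prompt -> text function (e.g. a GPT-4 call)."""
    # Step 1: derive question-answer pairs covering the reference summary's main points.
    qa_pairs = ask(f"Write question-answer pairs covering the main points of:\n{reference}")
    # Step 2: answer the same questions using only the generated summary.
    answers = ask(f"Answer these questions using only this summary.\n"
                  f"Summary:\n{generated}\nQuestions:\n{qa_pairs}")
    # Step 3: grade the candidate answers against the reference answers.
    return ask(f"Grade each answer against the reference answer.\n"
               f"Reference QA:\n{qa_pairs}\nCandidate answers:\n{answers}")
```

Injecting `ask` as a parameter rather than hard-coding an API client keeps the pipeline testable with a stub function.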
arXiv Detail & Related papers (2023-09-26T15:36:29Z) - Large Language Models (GPT) for automating feedback on programming assignments [0.0]
We employ OpenAI's GPT-3.5 model to generate personalized hints for students solving programming assignments.
Students rated the usefulness of GPT-generated hints positively.
arXiv Detail & Related papers (2023-06-30T21:57:40Z) - Instruction Tuning with GPT-4 [107.55078894215798]
We present the first attempt to use GPT-4 to generate instruction-following data for finetuning large language models.
Our early experiments on instruction-tuned LLaMA models show that the 52K English and Chinese instruction-following data generated by GPT-4 leads to superior zero-shot performance on new tasks.
arXiv Detail & Related papers (2023-04-06T17:58:09Z) - Error syntax aware augmentation of feedback comment generation dataset [116.73173348201341]
This paper presents a solution to the GenChal 2022 shared task dedicated to feedback comment generation for writing learners.
In this task, given a text containing an error and the span of that error, a system generates an explanatory note that helps the writer (a language learner) improve their writing skills.
arXiv Detail & Related papers (2022-12-29T12:57:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.