Leveraging Lecture Content for Improved Feedback: Explorations with GPT-4 and Retrieval Augmented Generation
- URL: http://arxiv.org/abs/2405.06681v1
- Date: Sun, 5 May 2024 18:32:06 GMT
- Title: Leveraging Lecture Content for Improved Feedback: Explorations with GPT-4 and Retrieval Augmented Generation
- Authors: Sven Jacobs, Steffen Jaschke
- Abstract summary: This paper presents the use of Retrieval Augmented Generation to improve the feedback generated by Large Language Models for programming tasks.
Corresponding lecture recordings were transcribed and made available to the Large Language Model GPT-4 as an external knowledge source.
The purpose of this is to prevent hallucinations and to enforce the use of technical terms and phrases from the lecture.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents the use of Retrieval Augmented Generation (RAG) to improve the feedback generated by Large Language Models for programming tasks. For this purpose, the corresponding lecture recordings were transcribed and made available to the Large Language Model GPT-4 as an external knowledge source, together with timestamps as meta-information, by using RAG. The purpose of this is to prevent hallucinations and to enforce the use of technical terms and phrases from the lecture. In an exercise platform developed for an introductory programming lecture, students can request GPT-4-generated feedback on their solutions to programming problems. For this task, GPT-4 receives the student's code solution, the compiler output, the results of the unit tests, and the relevant passages from the lecture notes retrieved via RAG as additional context. The generated feedback should guide students to solve problems independently and link to the lecture content, using the timestamps of the transcript as meta-information so that the lecture videos can be opened immediately at the relevant positions. For the evaluation, students worked with the tool in a workshop and decided for each piece of feedback whether it should be extended by RAG. First results, based on a questionnaire and the collected usage data, show that the use of RAG can improve feedback generation and is preferred by students in some situations. Due to the slower speed of feedback generation with RAG, the benefits are situation-dependent.
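The pipeline described in the abstract (retrieve timestamped transcript passages, then bundle them with the student's code, compiler output, and unit test results into the model's context) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the transcript chunks, the bag-of-words similarity (standing in for whatever retriever the authors used), and the prompt layout are all assumptions.

```python
import math
from collections import Counter

# Hypothetical transcript chunks: (start timestamp in seconds, text).
# The paper keeps timestamps as metadata so feedback can link into the video;
# the chunk contents here are invented for illustration.
TRANSCRIPT_CHUNKS = [
    (120, "A for loop repeats a block of statements a fixed number of times"),
    (305, "Unit tests check that a method returns the expected value"),
    (610, "The compiler reports a syntax error when a semicolon is missing"),
]

def score(query: str, text: str) -> float:
    """Cosine similarity over bag-of-words counts (a stand-in for an embedding model)."""
    q, t = Counter(query.lower().split()), Counter(text.lower().split())
    dot = sum(q[w] * t[w] for w in set(q) & set(t))
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in t.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 1):
    """Return the k most relevant chunks, keeping timestamps as metadata."""
    return sorted(TRANSCRIPT_CHUNKS, key=lambda c: score(query, c[1]), reverse=True)[:k]

def build_prompt(code: str, compiler_output: str, test_result: str, query: str) -> str:
    """Assemble the additional context the model receives, as described in the abstract."""
    passages = "\n".join(f"[t={ts}s] {text}" for ts, text in retrieve(query))
    return (
        "Student code:\n" + code + "\n"
        "Compiler output:\n" + compiler_output + "\n"
        "Unit test result:\n" + test_result + "\n"
        "Relevant lecture passages (with video timestamps):\n" + passages
    )
```

With the hypothetical chunks above, a query about a missing semicolon retrieves the third chunk, so the generated feedback can point the student to the lecture video at second 610.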
Related papers
- Retriever-and-Memory: Towards Adaptive Note-Enhanced Retrieval-Augmented Generation [72.70046559930555]
We propose a generic RAG approach called Adaptive Note-Enhanced RAG (Adaptive-Note) for complex QA tasks.
Specifically, Adaptive-Note introduces an overarching view of knowledge growth, iteratively gathering new information in the form of notes.
In addition, we employ an adaptive, note-based stop-exploration strategy to decide "what to retrieve and when to stop" to encourage sufficient knowledge exploration.
arXiv Detail & Related papers (2024-10-11T14:03:29Z) - GPT-4 as a Homework Tutor can Improve Student Engagement and Learning Outcomes [80.60912258178045]
We developed a prompting strategy that enables GPT-4 to conduct interactive homework sessions for high-school students learning English as a second language.
We carried out a Randomized Controlled Trial (RCT) in four high-school classes, replacing traditional homework with GPT-4 homework sessions for the treatment group.
We observed significant improvements in learning outcomes, specifically a greater gain in grammar, as well as increased student engagement.
arXiv Detail & Related papers (2024-09-24T11:22:55Z) - Evaluating the Application of Large Language Models to Generate Feedback in Programming Education [0.0]
This study investigates the application of large language models, specifically GPT-4, to enhance programming education.
The research outlines the design of a web application that uses GPT-4 to provide feedback on programming tasks, without giving away the solution.
arXiv Detail & Related papers (2024-03-13T23:14:35Z) - Feedback-Generation for Programming Exercises With GPT-4 [0.0]
This paper explores the quality of GPT-4 Turbo's generated output for prompts containing both the programming task specification and a student's submission as input.
The output was qualitatively analyzed regarding correctness, personalization, fault localization, and other features identified in the material.
arXiv Detail & Related papers (2024-03-07T12:37:52Z) - Improving the Validity of Automatically Generated Feedback via Reinforcement Learning [50.067342343957876]
We propose a framework for feedback generation that optimizes both correctness and alignment using reinforcement learning (RL).
Specifically, we use GPT-4's annotations to create preferences over feedback pairs in an augmented dataset for training via direct preference optimization (DPO).
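The DPO objective mentioned above can be written down compactly. The sketch below is a generic, stdlib-only rendering of the standard DPO loss for a single preference pair, not code from that paper; the log-probability values in the test are invented numbers.

```python
import math

def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair:
    -log(sigmoid(beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)))).
    The loss shrinks as the policy prefers the chosen feedback more strongly
    than the reference model does."""
    margin = beta * ((logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When policy and reference agree exactly, the margin is zero and the loss is log 2; preferring the GPT-4-chosen feedback lowers it, preferring the rejected one raises it.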
arXiv Detail & Related papers (2024-03-02T20:25:50Z) - CritiqueLLM: Towards an Informative Critique Generation Model for Evaluation of Large Language Model Generation [87.44350003888646]
Eval-Instruct can acquire pointwise grading critiques with pseudo references and revise these critiques via multi-path prompting.
CritiqueLLM is empirically shown to outperform ChatGPT and all the open-source baselines.
arXiv Detail & Related papers (2023-11-30T16:52:42Z) - Question-Answering Approach to Evaluating Legal Summaries [0.43512163406551996]
GPT-4 is used to generate a set of question-answer pairs that cover main points and information in the reference summary.
GPT-4 is then used to generate answers based on the generated summary for the questions from the reference summary.
GPT-4 grades the answers from the reference summary and the generated summary.
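The three-step protocol above (generate QA pairs from the reference summary, answer them from the generated summary, grade the answers) can be sketched as a small pipeline. The `ask` callable is a placeholder for a GPT-4 call, and the prompt wording is invented for illustration; neither comes from the paper.

```python
def evaluate_summary(reference: str, generated: str, ask) -> str:
    """QA-based summary evaluation; `ask` is any prompt -> text function (e.g. a GPT-4 call)."""
    # Step 1: derive question-answer pairs covering the reference summary's main points.
    qa_pairs = ask(f"Write question-answer pairs covering the main points of:\n{reference}")
    # Step 2: answer the same questions using only the generated summary.
    answers = ask(f"Answer these questions using only this summary.\n"
                  f"Summary:\n{generated}\nQuestions:\n{qa_pairs}")
    # Step 3: grade the candidate answers against the reference answers.
    return ask(f"Grade each answer against the reference answer.\n"
               f"Reference QA:\n{qa_pairs}\nCandidate answers:\n{answers}")
```

Injecting `ask` as a parameter rather than hard-coding an API client keeps the pipeline testable with a stub function.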
arXiv Detail & Related papers (2023-09-26T15:36:29Z) - Large Language Models (GPT) for automating feedback on programming assignments [0.0]
We employ OpenAI's GPT-3.5 model to generate personalized hints for students solving programming assignments.
Students rated the usefulness of GPT-generated hints positively.
arXiv Detail & Related papers (2023-06-30T21:57:40Z) - Instruction Tuning with GPT-4 [107.55078894215798]
We present the first attempt to use GPT-4 to generate instruction-following data for finetuning large language models.
Our early experiments on instruction-tuned LLaMA models show that the 52K English and Chinese instruction-following data generated by GPT-4 leads to superior zero-shot performance on new tasks.
arXiv Detail & Related papers (2023-04-06T17:58:09Z) - Error syntax aware augmentation of feedback comment generation dataset [116.73173348201341]
This paper presents a solution to the GenChal 2022 shared task dedicated to feedback comment generation for writing learners.
In this task, given a text containing an error and the span of that error, a system generates an explanatory note that helps the writer (a language learner) improve their writing skills.
arXiv Detail & Related papers (2022-12-29T12:57:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.