Automated Feedback on Student-Generated UML and ER Diagrams Using Large Language Models
- URL: http://arxiv.org/abs/2507.23470v1
- Date: Thu, 31 Jul 2025 11:49:01 GMT
- Title: Automated Feedback on Student-Generated UML and ER Diagrams Using Large Language Models
- Authors: Sebastian Gürtl, Gloria Schimetta, David Kerschbaumer, Michael Liut, Alexander Steinmaurer
- Abstract summary: We introduce DUET (Diagrammatic UML & ER Tutor), a prototype of an LLM-based tool. It converts a reference diagram and a student-submitted diagram into a textual representation and provides structured feedback based on the differences. It uses a multi-stage LLM pipeline to compare diagrams and generate reflective feedback. It also enables analytical insights for educators, aiming to foster self-directed learning and inform instructional strategies.
- Score: 39.58317527488534
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: UML and ER diagrams are foundational in computer science education but come with challenges for learners due to the need for abstract thinking, contextual understanding, and mastery of both syntax and semantics. These complexities are difficult to address through traditional teaching methods, which often struggle to provide scalable, personalized feedback, especially in large classes. We introduce DUET (Diagrammatic UML & ER Tutor), a prototype of an LLM-based tool, which converts a reference diagram and a student-submitted diagram into a textual representation and provides structured feedback based on the differences. It uses a multi-stage LLM pipeline to compare diagrams and generate reflective feedback. Furthermore, the tool enables analytical insights for educators, aiming to foster self-directed learning and inform instructional strategies. We evaluated DUET through semi-structured interviews with six participants, including two educators and four teaching assistants. They identified strengths such as accessibility, scalability, and learning support alongside limitations, including reliability concerns and potential misuse. Participants also suggested potential improvements, such as bulk upload functionality and interactive clarification features. DUET presents a promising direction for integrating LLMs into modeling education and offers a foundation for future classroom integration and empirical evaluation.
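The multi-stage pipeline described in the abstract (diagram-to-text conversion, LLM-based comparison, reflective feedback generation) can be illustrated with a minimal sketch. This assumes an OpenAI-compatible chat API; the model name, prompts, function names, input file names, and the use of PlantUML text as the diagram representation are illustrative assumptions, not the authors' published implementation.

```python
"""Minimal sketch of a DUET-style feedback pipeline (illustrative only)."""
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def diagram_to_text(diagram_source: str) -> str:
    """Stage 1: normalize a diagram into a textual representation.
    Here we assume diagrams are already exported as PlantUML text."""
    return diagram_source.strip()


def compare_diagrams(reference: str, submission: str) -> str:
    """Stage 2: ask the LLM to enumerate structural differences."""
    resp = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": "List missing, extra, and mislabeled elements "
                        "between a reference and a student UML/ER diagram."},
            {"role": "user",
             "content": f"Reference:\n{reference}\n\nSubmission:\n{submission}"},
        ],
    )
    return resp.choices[0].message.content


def generate_feedback(differences: str) -> str:
    """Stage 3: turn raw differences into reflective feedback that
    prompts the student to reconsider rather than revealing the solution."""
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system",
             "content": "Rewrite these diagram differences as reflective, "
                        "non-revealing feedback questions for a student."},
            {"role": "user", "content": differences},
        ],
    )
    return resp.choices[0].message.content


if __name__ == "__main__":
    # "reference.puml" and "student.puml" are hypothetical input files.
    ref = diagram_to_text(open("reference.puml").read())
    sub = diagram_to_text(open("student.puml").read())
    print(generate_feedback(compare_diagrams(ref, sub)))
```

Separating the comparison stage from the feedback stage mirrors the abstract's distinction between diffing diagrams and producing reflective (rather than solution-revealing) feedback; a single-prompt variant would also be possible but conflates the two concerns.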
Related papers
- ELMES: An Automated Framework for Evaluating Large Language Models in Educational Scenarios [23.549720214649476]
Large Language Models (LLMs) present transformative opportunities for education, generating numerous novel application scenarios. Current benchmarks predominantly measure general intelligence rather than pedagogical capabilities. We introduce ELMES, an open-source automated evaluation framework specifically designed for assessing LLMs in educational settings.
arXiv Detail & Related papers (2025-07-27T15:20:19Z)
- Decoupled Visual Interpretation and Linguistic Reasoning for Math Problem Solving [57.22004912994658]
Current large vision-language models (LVLMs) typically employ a connector module to link visual features with text embeddings of large language models (LLMs). This paper proposes a paradigm shift: instead of training end-to-end vision-language reasoning models, we advocate for developing a decoupled reasoning framework.
arXiv Detail & Related papers (2025-05-23T08:18:00Z)
- From Problem-Solving to Teaching Problem-Solving: Aligning LLMs with Pedagogy using Reinforcement Learning [76.09281171131941]
Large language models (LLMs) can transform education, but their optimization for direct question-answering often undermines effective pedagogy. We propose an online reinforcement learning (RL)-based alignment framework that can quickly adapt LLMs into effective tutors.
arXiv Detail & Related papers (2025-05-21T15:00:07Z)
- Explain with Visual Keypoints Like a Real Mentor! A Benchmark for Multimodal Solution Explanation [19.4261670152456]
We introduce the multimodal solution explanation task, designed to evaluate whether models can identify visual keypoints, such as auxiliary lines, points, and angles, and generate explanations that incorporate these elements essential for understanding. Our empirical results show that, aside from recent large-scale open-source and closed-source models, most generalist open-source models and even math-specialist models struggle with the multimodal solution explanation task. This highlights a significant gap in current LLMs' ability to reason and explain with visual grounding in educational contexts.
arXiv Detail & Related papers (2025-04-04T06:03:13Z)
- Enhanced Bloom's Educational Taxonomy for Fostering Information Literacy in the Era of Large Language Models [16.31527042425208]
This paper proposes an LLM-driven Bloom's Educational Taxonomy that aims to recognize and evaluate students' information literacy (IL) with Large Language Models (LLMs). The framework delineates the IL corresponding to the cognitive abilities required to use LLMs into two distinct stages: Exploration & Action and Creation & Metacognition.
arXiv Detail & Related papers (2025-03-25T08:23:49Z)
- LLMs as Educational Analysts: Transforming Multimodal Data Traces into Actionable Reading Assessment Reports [6.523137821124204]
This study investigates the use of multimodal data sources to derive meaningful reading insights. We employ unsupervised learning techniques to identify distinct reading behavior patterns. A large language model (LLM) synthesizes the derived information into actionable reports for educators.
arXiv Detail & Related papers (2025-03-03T22:34:08Z)
- When LLMs Learn to be Students: The SOEI Framework for Modeling and Evaluating Virtual Student Agents in Educational Interaction [12.070907646464537]
We propose the SOEI framework for constructing and evaluating personality-aligned Virtual Student Agents (LVSAs) in classroom scenarios. We generate five LVSAs based on Big Five traits through LoRA fine-tuning and expert-informed prompt design. Our results provide: (1) an educationally and psychologically grounded generation pipeline for LLM-based student agents; (2) a hybrid, scalable evaluation framework for behavioral realism; and (3) empirical insights into the pedagogical utility of LVSAs in shaping instructional adaptation.
arXiv Detail & Related papers (2024-10-21T07:18:24Z)
- CLOVA: A Closed-Loop Visual Assistant with Tool Usage and Update [69.59482029810198]
CLOVA is a Closed-Loop Visual Assistant that operates within a framework encompassing inference, reflection, and learning phases.
Results demonstrate that CLOVA surpasses existing tool-usage methods by 5% in visual question answering and multiple-image reasoning, by 10% in knowledge tagging, and by 20% in image editing.
arXiv Detail & Related papers (2023-12-18T03:34:07Z)
- From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning [63.63840740526497]
We investigate how instruction tuning adjusts pre-trained models with a focus on intrinsic changes.
The impact of instruction tuning is then studied by comparing the explanations derived from the pre-trained and instruction-tuned models.
Our findings reveal three significant impacts of instruction tuning.
arXiv Detail & Related papers (2023-09-30T21:16:05Z)
- Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions [126.3136109870403]
We introduce a generic and lightweight Visual Prompt Generator Complete module (VPG-C).
VPG-C infers and completes the missing details essential for comprehending demonstrative instructions.
We build DEMON, a comprehensive benchmark for demonstrative instruction understanding.
arXiv Detail & Related papers (2023-08-08T09:32:43Z)
- Multimodal Lecture Presentations Dataset: Understanding Multimodality in Educational Slides [57.86931911522967]
We test the capabilities of machine learning models in multimodal understanding of educational content.
Our dataset contains aligned slides and spoken language for 180+ hours of video and 9000+ slides, from 10 lecturers across various subjects.
We introduce PolyViLT, a multimodal transformer trained with a multi-instance learning loss that is more effective than current approaches.
arXiv Detail & Related papers (2022-08-17T05:30:18Z)
- Object Relational Graph with Teacher-Recommended Learning for Video Captioning [92.48299156867664]
We propose a complete video captioning system including both a novel model and an effective training strategy.
Specifically, we propose an object relational graph (ORG) based encoder, which captures more detailed interaction features to enrich visual representation.
Meanwhile, we design a teacher-recommended learning (TRL) method to make full use of the successful external language model (ELM) to integrate the abundant linguistic knowledge into the caption model.
arXiv Detail & Related papers (2020-02-26T15:34:52Z)