An Automated Explainable Educational Assessment System Built on LLMs
- URL: http://arxiv.org/abs/2412.13381v1
- Date: Tue, 17 Dec 2024 23:29:18 GMT
- Title: An Automated Explainable Educational Assessment System Built on LLMs
- Authors: Jiazheng Li, Artem Bobrov, David West, Cesare Aloisi, Yulan He
- Abstract summary: AERA Chat is an automated educational assessment system designed for interactive and visual evaluations of student responses.
Our system allows users to input questions and student answers, providing educators and researchers with insights into assessment accuracy.
- Score: 12.970776782360366
- Abstract: In this demo, we present AERA Chat, an automated and explainable educational assessment system designed for interactive and visual evaluations of student responses. This system leverages large language models (LLMs) to generate automated marking and rationale explanations, addressing the challenge of limited explainability in automated educational assessment and the high costs associated with annotation. Our system allows users to input questions and student answers, providing educators and researchers with insights into assessment accuracy and the quality of LLM-assessed rationales. Additionally, it offers advanced visualization and robust evaluation tools, enhancing the usability for educational assessment and facilitating efficient rationale verification. Our demo video can be found at https://youtu.be/qUSjz-sxlBc.
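The abstract describes the core workflow (a question and a student answer go in; a mark and a rationale come out) without implementation details. The following is a minimal sketch of that single marking step, assuming an OpenAI-compatible chat API, a placeholder model name, and an illustrative 0-3 mark scale; none of these are specified in the paper.

```python
# Hedged sketch, not the authors' implementation: one LLM call that returns
# a mark and a rationale for a student answer.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def assess_answer(question: str, reference_answer: str, student_answer: str) -> dict:
    """Ask the model for a mark and a rationale.

    The prompt wording, model name, and 0-3 scale are assumptions for
    illustration only; they are not taken from the paper.
    """
    prompt = (
        "You are marking a student's short answer.\n"
        f"Question: {question}\n"
        f"Reference answer: {reference_answer}\n"
        f"Student answer: {student_answer}\n"
        "Respond in JSON with keys 'mark' (integer 0-3) and "
        "'rationale' (one short paragraph)."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; the paper does not name a specific LLM
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content)
```

In a system like AERA Chat, the returned rationale would then be displayed alongside the mark so that educators can verify or overrule the automated assessment.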
Related papers
- A Zero-Shot LLM Framework for Automatic Assignment Grading in Higher Education [0.6141800972050401]
We propose a Zero-Shot Large Language Model (LLM)-Based Automated Assignment Grading (AAG) system.
This framework leverages prompt engineering to evaluate both computational and explanatory student responses without requiring additional training or fine-tuning.
The AAG system delivers tailored feedback that highlights individual strengths and areas for improvement, thereby enhancing student learning outcomes.
arXiv Detail & Related papers (2025-01-24T08:01:41Z)
- Human-Centered Design for AI-based Automatically Generated Assessment Reports: A Systematic Review [4.974197456441281]
This study emphasizes the importance of reducing teachers' cognitive demands through user-centered and intuitive designs.
It highlights the potential of diverse information presentation formats such as text, visual aids, and plots and advanced functionalities such as live and interactive features to enhance usability.
The framework aims to address challenges in engaging teachers with technology-enhanced assessment results, facilitating data-driven decision-making, and providing personalized feedback to improve the teaching and learning process.
arXiv Detail & Related papers (2024-12-30T16:20:07Z)
- AutoBench-V: Can Large Vision-Language Models Benchmark Themselves? [65.92331309449015]
We introduce AutoBench-V, an automated framework for serving evaluation on demand, i.e., benchmarking LVLMs based on specific aspects of model capability.
Through an extensive evaluation of nine popular LVLMs across five user-specified evaluation demands, the framework demonstrates its effectiveness and reliability.
arXiv Detail & Related papers (2024-10-28T17:55:08Z)
- AERA Chat: An Interactive Platform for Automated Explainable Student Answer Assessment [12.970776782360366]
AERA Chat is an interactive platform that provides visually explained assessments of student answers.
Users can input questions and student answers to obtain automated, explainable assessment results from large language models.
arXiv Detail & Related papers (2024-10-12T11:57:53Z)
- Large Language Model as an Assignment Evaluator: Insights, Feedback, and Challenges in a 1000+ Student Course [49.296957552006226]
Using large language models (LLMs) for automatic evaluation has become an important evaluation method in NLP research.
This report shares how we use GPT-4 as an automatic assignment evaluator in a university course with 1,028 students.
arXiv Detail & Related papers (2024-07-07T00:17:24Z)
- LOVA3: Learning to Visual Question Answering, Asking and Assessment [61.51687164769517]
Question answering, asking, and assessment are three innate human traits crucial for understanding the world and acquiring knowledge.
Current Multimodal Large Language Models (MLLMs) primarily focus on question answering, often neglecting the full potential of questioning and assessment skills.
We introduce LOVA3, an innovative framework named "Learning tO Visual question Answering, Asking and Assessment".
arXiv Detail & Related papers (2024-05-23T18:21:59Z)
- Lessons Learned from Designing an Open-Source Automated Feedback System for STEM Education [5.326069675013602]
We present RATsApp, an open-source automated feedback system (AFS) that incorporates research-based features such as formative feedback.
The system focuses on core STEM competencies such as mathematical competence, representational competence, and data literacy.
As an open-source platform, RATsApp encourages public contributions to its ongoing development, fostering a collaborative approach to improve educational tools.
arXiv Detail & Related papers (2024-01-19T07:13:07Z)
- Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels [95.44077384918725]
We propose to teach large multi-modality models (LMMs) with text-defined rating levels instead of scores.
The proposed Q-Align achieves state-of-the-art performance on image quality assessment (IQA), image aesthetic assessment (IAA) and video quality assessment (VQA) tasks.
arXiv Detail & Related papers (2023-12-28T16:10:25Z)
- Empowering Private Tutoring by Chaining Large Language Models [87.76985829144834]
This work explores the development of a full-fledged intelligent tutoring system powered by state-of-the-art large language models (LLMs).
The system is decomposed into three inter-connected core processes: interaction, reflection, and reaction.
Each process is implemented by chaining LLM-powered tools along with dynamically updated memory modules.
arXiv Detail & Related papers (2023-09-15T02:42:03Z)
- Modelling Assessment Rubrics through Bayesian Networks: a Pragmatic Approach [40.06500618820166]
This paper presents an approach to deriving a learner model directly from an assessment rubric.
We illustrate how the approach can be applied to automate the human assessment of an activity developed for testing computational thinking skills.
arXiv Detail & Related papers (2022-09-07T10:09:12Z)
- Towards Automatic Evaluation of Dialog Systems: A Model-Free Off-Policy Evaluation Approach [84.02388020258141]
We propose a new framework named ENIGMA for estimating human evaluation scores based on off-policy evaluation in reinforcement learning.
ENIGMA requires only a small amount of pre-collected experience data, and therefore does not involve human interaction with the target policy during evaluation.
Our experiments show that ENIGMA significantly outperforms existing methods in terms of correlation with human evaluation scores.
arXiv Detail & Related papers (2021-02-20T03:29:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.