Perspective on Code Submission and Automated Evaluation Platforms for
University Teaching
- URL: http://arxiv.org/abs/2201.13222v1
- Date: Tue, 25 Jan 2022 10:06:45 GMT
- Title: Perspective on Code Submission and Automated Evaluation Platforms for
University Teaching
- Authors: Florian Auer, Johann Frei, Dominik Müller and Frank Kramer
- Abstract summary: We present a perspective on platforms for code submission and automated evaluation in the context of university teaching.
We identify relevant technical and non-technical requirements for such platforms in terms of practical applicability and secure code submission environments.
We conclude that submission and automated evaluation involves continuous maintenance yet lowers the required workload for teachers and provides better evaluation transparency for students.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a perspective on platforms for code submission and automated
evaluation in the context of university teaching. Due to the COVID-19 pandemic,
such platforms have become an essential asset for remote courses and a
reasonable standard for structured code submission concerning increasing
numbers of students in computer sciences. Utilizing automated code evaluation
techniques exhibits notable positive impacts for both students and teachers in
terms of quality and scalability. We identified relevant technical and
non-technical requirements for such platforms in terms of practical
applicability and secure code submission environments. Furthermore, a survey
among students was conducted to obtain empirical data on general perception. We
conclude that submission and automated evaluation involves continuous
maintenance yet lowers the required workload for teachers and provides better
evaluation transparency for students.
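To make the idea concrete, here is a minimal sketch of what the automated-evaluation step of such a platform can look like: a submission is executed in a separate process with a timeout and compared against instructor-provided test cases. This is an illustrative sketch, not the paper's implementation; all names (`grade`, the example submission, the test cases) are hypothetical.

```python
# Minimal sketch: grade a submitted program against instructor test cases.
# Process isolation plus a per-run timeout is a very basic form of the
# "secure code submission environment" discussed in the abstract.
import subprocess
import sys
import tempfile

def grade(submission_code: str, tests: list[tuple[str, str]], timeout: float = 2.0) -> int:
    """Run the submission once per test case in a child process and count
    how many runs produce the expected stdout."""
    passed = 0
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(submission_code)
        path = f.name
    for stdin_data, expected in tests:
        try:
            result = subprocess.run(
                [sys.executable, path],
                input=stdin_data,
                capture_output=True,
                text=True,
                timeout=timeout,
            )
            if result.stdout.strip() == expected:
                passed += 1
        except subprocess.TimeoutExpired:
            pass  # an infinite loop fails the test instead of hanging the grader
    return passed

# Example: a submission that should print the square of its input.
submission = "n = int(input()); print(n * n)"
print(grade(submission, [("3", "9"), ("5", "25"), ("2", "5")]))  # 2 of 3 tests pass
```

A production platform would add stronger sandboxing (containers, resource limits) and per-test feedback, but the control flow is essentially the loop above.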
Related papers
- Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset [94.13848736705575]
We introduce Facial Identity Unlearning Benchmark (FIUBench), a novel VLM unlearning benchmark designed to robustly evaluate the effectiveness of unlearning algorithms.
We apply a two-stage evaluation pipeline that is designed to precisely control the sources of information and their exposure levels.
Through the evaluation of four baseline VLM unlearning algorithms within FIUBench, we find that all methods remain limited in their unlearning performance.
arXiv Detail & Related papers (2024-11-05T23:26:10Z)
- The virtual CAT: A tool for algorithmic thinking assessment in Swiss compulsory education [0.0]
This paper introduces the virtual Cross Array Task (CAT), a digital adaptation of an unplugged assessment activity designed to evaluate algorithmic skills in Swiss compulsory education.
The platform offers scalable and automated assessment, reducing human involvement and mitigating potential data collection errors.
The findings show the platform's usability, proficiency and suitability for assessing AT skills among students of diverse ages, development stages, and educational backgrounds.
arXiv Detail & Related papers (2024-08-02T13:36:17Z)
- A Benchmark for Fairness-Aware Graph Learning [58.515305543487386]
We present an extensive benchmark on ten representative fairness-aware graph learning methods.
Our in-depth analysis reveals key insights into the strengths and limitations of existing methods.
arXiv Detail & Related papers (2024-07-16T18:43:43Z)
- Identifying Student Profiles Within Online Judge Systems Using Explainable Artificial Intelligence [6.638206014723678]
Online Judge (OJ) systems are typically considered within programming-related courses as they yield fast and objective assessments of the code developed by the students.
This work aims to tackle this limitation by considering the further exploitation of the information gathered by the OJ and automatically inferring feedback for both the student and the instructor.
arXiv Detail & Related papers (2024-01-29T12:11:30Z)
- A Design and Development of Rubrics System for Android Applications [0.0]
This application aims to provide a user-friendly interface for viewing students' performance.
Our application promises to make the grading system easier and to enhance the effectiveness in terms of time and resources.
arXiv Detail & Related papers (2023-09-23T16:14:27Z)
- UKP-SQuARE: An Interactive Tool for Teaching Question Answering [61.93372227117229]
The exponential growth of question answering (QA) has made it an indispensable topic in any Natural Language Processing (NLP) course.
We introduce UKP-SQuARE as a platform for QA education.
Students can run, compare, and analyze various QA models from different perspectives.
arXiv Detail & Related papers (2023-05-31T11:29:04Z)
- Building an Effective Automated Assessment System for C/C++ Introductory Programming Courses in ODL Environment [0.0]
Traditional ways of assessing students' work are becoming insufficient in terms of both time and effort.
In a distance-education environment, such assessments become even more challenging given the hefty remuneration required to hire a large number of tutors.
We identify different components that we believe are necessary in building an effective automated assessment system.
arXiv Detail & Related papers (2022-05-24T09:20:43Z)
- ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback [54.142719510638614]
In this paper, we frame the problem of providing feedback as few-shot classification.
A meta-learner adapts to give feedback to student code on a new programming question from just a few examples by instructors.
Our approach was successfully deployed to deliver feedback on 16,000 student exam solutions in a programming course offered by a tier 1 university.
arXiv Detail & Related papers (2021-07-23T22:41:28Z)
- Hierarchical Bi-Directional Self-Attention Networks for Paper Review Rating Recommendation [81.55533657694016]
We propose a Hierarchical bi-directional self-attention Network framework (HabNet) for paper review rating prediction and recommendation.
Specifically, we leverage the hierarchical structure of the paper reviews with three levels of encoders: a sentence encoder (level one), an intra-review encoder (level two), and an inter-review encoder (level three).
We are able to identify useful predictors to make the final acceptance decision, as well as to help discover the inconsistency between numerical review ratings and text sentiment conveyed by reviewers.
arXiv Detail & Related papers (2020-11-02T08:07:50Z)
- Value Cards: An Educational Toolkit for Teaching Social Impacts of Machine Learning through Deliberation [32.74513588794863]
Value Cards is an educational toolkit to inform students and practitioners of the social impacts of different machine learning models via deliberation.
Our results suggest that the use of the Value Cards toolkit can improve students' understanding of both the technical definitions and trade-offs of performance metrics.
arXiv Detail & Related papers (2020-10-22T03:27:19Z)
- SelfAugment: Automatic Augmentation Policies for Self-Supervised Learning [98.2036247050674]
We show that evaluating the learned representations with a self-supervised image rotation task is highly correlated with a standard set of supervised evaluations.
We provide an algorithm (SelfAugment) to automatically and efficiently select augmentation policies without using supervised evaluations.
arXiv Detail & Related papers (2020-09-16T14:49:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.