An Approach to Detect Abnormal Submissions for CodeWorkout Dataset
- URL: http://arxiv.org/abs/2407.17475v1
- Date: Fri, 28 Jun 2024 00:26:15 GMT
- Title: An Approach to Detect Abnormal Submissions for CodeWorkout Dataset
- Authors: Alex Hicks, Yang Shi, Arun-Balajiee Lekshmi-Narayanan, Wei Yan, Samiha Marwan,
- Abstract summary: This paper presents a preliminary study to analyze log data with anomalies.
The goal of our work is to overcome the abnormal instances when modeling personalizable recommendations in programming learning environments.
- Score: 8.142354661558752
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Students' interactions while solving problems in learning environments (i.e., log data) are often used to support students' learning. For example, researchers use log data to develop systems that can provide students with personalized problem recommendations based on their knowledge level. However, anomalies in the students' log data, such as cheating to solve programming problems, could introduce a hidden bias in the log data. As a result, these systems may provide inaccurate problem recommendations and therefore defeat their purpose. Classical cheating detection methods, such as MOSS, can be used to detect code plagiarism. However, these methods cannot detect other abnormal events, such as a student gaming a system with multiple attempts of similar solutions to a particular programming problem. This paper presents a preliminary study to analyze log data with anomalies. The goal of our work is to overcome the abnormal instances when modeling personalizable recommendations in programming learning environments.
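The abstract mentions one kind of abnormal event: a student submitting many similar solutions to the same problem. As an illustration only, the following is a minimal sketch of one way such a pattern could be flagged in log data. It is not the method used in the paper; the function name `flag_repetitive_attempts`, the `(student_id, problem_id, code)` record layout, and the `sim_threshold` and `min_run` values are all assumptions chosen for this example.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def flag_repetitive_attempts(submissions, sim_threshold=0.9, min_run=3):
    """Flag (student, problem) pairs with a streak of near-identical attempts.

    submissions: iterable of (student_id, problem_id, code) tuples,
    assumed to be in chronological order (hypothetical record layout).
    sim_threshold and min_run are illustrative values, not tuned ones.
    """
    # Group each student's attempts per problem, preserving order.
    by_key = defaultdict(list)
    for student, problem, code in submissions:
        by_key[(student, problem)].append(code)

    flagged = []
    for key, attempts in by_key.items():
        run = 1  # length of the current streak of near-identical attempts
        for prev, curr in zip(attempts, attempts[1:]):
            # Character-level similarity between consecutive attempts.
            if SequenceMatcher(None, prev, curr).ratio() >= sim_threshold:
                run += 1
                if run >= min_run:
                    flagged.append(key)
                    break
            else:
                run = 1
    return flagged


if __name__ == "__main__":
    log = [
        ("s1", "p1", "def add(a, b):\n    return a + b\n"),
        ("s1", "p1", "def add(a, b):\n    return a - b\n"),
        ("s1", "p1", "def add(a, b):\n    return a + b\n"),
        ("s2", "p1", "def add(a, b):\n    pass\n"),
        ("s2", "p1", "def add(a, b):\n    return a + b\n"),
    ]
    print(flag_repetitive_attempts(log))
```

Character-level similarity is a deliberately simple stand-in here. A flagged pair is only a candidate for review, since small edits between attempts can also be ordinary debugging.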
Related papers
- Interactive System-wise Anomaly Detection [66.3766756452743]
Anomaly detection plays a fundamental role in various applications.
It is challenging for existing methods to handle the scenarios where the instances are systems whose characteristics are not readily observed as data.
We develop an end-to-end approach which includes an encoder-decoder module that learns system embeddings.
arXiv Detail & Related papers (2023-04-21T02:20:24Z)
- PULL: Reactive Log Anomaly Detection Based On Iterative PU Learning [58.85063149619348]
We propose PULL, an iterative log analysis method for reactive anomaly detection based on estimated failure time windows.
Our evaluation shows that PULL consistently outperforms ten benchmark baselines across three different datasets.
arXiv Detail & Related papers (2023-01-25T16:34:43Z)
- Deep Learning for Anomaly Detection in Log Data: A Survey [3.508620069426877]
Self-learning anomaly detection techniques capture patterns in log data and report unexpected log event occurrences.
Deep learning neural networks for this purpose have been presented.
There exist many different architectures for deep learning and it is non-trivial to encode raw and unstructured log data.
arXiv Detail & Related papers (2022-07-08T10:58:28Z)
- Experience Report: Deep Learning-based System Log Analysis for Anomaly Detection [30.52620190783608]
We provide a review and evaluation on five popular models used by six state-of-the-art anomaly detectors.
Four of the selected methods are unsupervised and the remaining two are supervised.
We believe our work can serve as a basis in this field and contribute to future academic research and industrial applications.
arXiv Detail & Related papers (2021-07-13T08:10:47Z)
- Making Sense of Moodle Log Data [2.66512000865131]
Risk of training machine learning algorithms on biased datasets is always around the corner.
This paper tries to focus on these issues showing some examples of learning log data extracted from Moodle.
arXiv Detail & Related papers (2021-06-16T10:15:09Z)
- Toward Semi-Automatic Misconception Discovery Using Code Embeddings [4.369757255496184]
We present a novel method for the semi-automated discovery of problem-specific misconceptions from students' program code in computing courses.
We trained the model on a block-based programming dataset and used the learned embedding to cluster incorrect student submissions.
arXiv Detail & Related papers (2021-03-07T20:32:41Z)
- Meta-learning One-class Classifiers with Eigenvalue Solvers for Supervised Anomaly Detection [55.888835686183995]
We propose a neural network-based meta-learning method for supervised anomaly detection.
We experimentally demonstrate that the proposed method achieves better performance than existing anomaly detection and few-shot learning methods.
arXiv Detail & Related papers (2021-03-01T01:43:04Z)
- Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models [59.04636530383049]
Anomalies or failures in large computer systems, such as the cloud, have an impact on a large number of users.
We propose a framework for anomaly detection in log data, as a major troubleshooting source of system information.
arXiv Detail & Related papers (2021-02-23T09:17:05Z)
- A Novel Anomaly Detection Algorithm for Hybrid Production Systems based on Deep Learning and Timed Automata [73.38551379469533]
DAD:DeepAnomalyDetection is a new approach for automatic model learning and anomaly detection in hybrid production systems.
It combines deep learning and timed automata to create a behavioral model from observations.
The algorithm has been applied to a few data sets, including two from real systems, and has shown promising results.
arXiv Detail & Related papers (2020-10-29T08:27:43Z)
- Self-Attentive Classification-Based Anomaly Detection in Unstructured Logs [59.04636530383049]
We propose Logsy, a classification-based method to learn log representations.
We show an average improvement of 0.25 in the F1 score, compared to the previous methods.
arXiv Detail & Related papers (2020-08-21T07:26:55Z)
- A Review of Meta-level Learning in the Context of Multi-component, Multi-level Evolving Prediction Systems [6.810856082577402]
The exponential growth of volume, variety and velocity of data is raising the need for investigations of automated or semi-automated ways to extract useful patterns from the data.
It requires deep expert knowledge and extensive computational resources to find the most appropriate mapping of learning methods for a given problem.
There is a need for an intelligent recommendation engine that can advise what is the best learning algorithm for a dataset.
arXiv Detail & Related papers (2020-07-17T14:14:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.