Refining Student Marks based on Enrolled Modules Assessment Methods
using Data Mining Techniques
- URL: http://arxiv.org/abs/2009.06381v1
- Date: Sun, 30 Aug 2020 19:47:45 GMT
- Title: Refining Student Marks based on Enrolled Modules Assessment Methods
using Data Mining Techniques
- Authors: Mohammed A. Alsuwaiket, Anas H. Blasi, Khawla Altarawneh
- Abstract summary: We propose a different data preparation process by investigating more than 230,000 student records for the preparation of scores.
The effect of the Module Assessment Index on the prediction process using Random Forest and Naive Bayes classification techniques was investigated.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Choosing the right and effective way to assess students is one of the most
important tasks in higher education. Many studies have shown that students tend
to receive higher scores during their studies when assessed by modules that are
wholly assessed by coursework, or by a combination of coursework and exams,
than when assessed by exams alone. Many Educational Data Mining (EDM) studies
preprocess data through traditional data extraction, including the data
preparation process. In this paper, we propose a different data preparation
process by investigating more than 230,000 student records for the preparation
of scores. The data have been processed through diverse stages in order to
extract a categorical factor, the Module Assessment Index (MAI), through which
students' module marks are refined during the data preparation stage. The
results of this work show that students' final marks should not be isolated
from the nature of the enrolled modules' assessment methods; rather, these
methods must be investigated thoroughly and considered during the EDM data
preprocessing stage. More generally, educational data should not be prepared in
the same way as ordinary data, because of differences in data sources,
applications, and error types. The effect of the MAI on the prediction process
using Random Forest and Naive Bayes classification techniques was investigated.
It was shown that considering the MAI as an attribute increases the accuracy of
predicting students' second-year averages from their first-year averages.
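The abstract describes the MAI only at a high level, so the sketch below is an assumption-laden illustration rather than the paper's method: a hypothetical categorical assessment-method factor is attached to synthetic student records, and classifier accuracy with and without it is compared using scikit-learn's Random Forest and Naive Bayes, mirroring the comparison the abstract reports.

```python
# Hedged sketch: effect of a categorical "Module Assessment Index" (MAI)
# attribute on predicting second-year performance from first-year averages.
# The MAI coding below (0 = exam-only, 1 = mixed, 2 = coursework-only) and
# the synthetic data are assumptions; the paper only states that a
# categorical factor derived from module assessment methods refines marks.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

mai = rng.integers(0, 3, size=n)                       # hypothetical MAI category
year1_avg = rng.normal(60, 10, size=n) + 3 * mai       # coursework-heavy modules score higher
year2_avg = year1_avg + rng.normal(0, 5, size=n) + 2 * mai
y = (year2_avg >= np.median(year2_avg)).astype(int)    # above/below median second-year average

X_base = year1_avg.reshape(-1, 1)                      # first-year average only
X_mai = np.column_stack([year1_avg, mai])              # first-year average + MAI attribute

def accuracy(model, X):
    """Hold-out accuracy on a fixed 70/30 split."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    return model.fit(X_tr, y_tr).score(X_te, y_te)

for name, model in [("RandomForest", RandomForestClassifier(random_state=0)),
                    ("NaiveBayes", GaussianNB())]:
    print(name,
          "without MAI:", round(accuracy(model, X_base), 3),
          "with MAI:", round(accuracy(model, X_mai), 3))
```

On data where the assessment method genuinely shifts marks, as the paper argues, the MAI column carries signal that the first-year average alone cannot.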
Related papers
- A Fair Post-Processing Method based on the MADD Metric for Predictive Student Models [1.055551340663609]
A new metric has been developed to evaluate algorithmic fairness in predictive student models.
In this paper, we develop a post-processing method that aims at improving the fairness while preserving the accuracy of relevant predictive models' results.
We experiment with our approach on the task of predicting student success in an online course, using both simulated and real-world educational data.
arXiv Detail & Related papers (2024-07-07T14:53:41Z) - Extracting Training Data from Unconditional Diffusion Models [76.85077961718875]
diffusion probabilistic models (DPMs) are being employed as mainstream models for generative artificial intelligence (AI)
We aim to establish a theoretical understanding of memorization in DPMs with 1) a memorization metric for theoretical analysis, 2) an analysis of conditional memorization with informative and random labels, and 3) two better evaluation metrics for measuring memorization.
Based on the theoretical analysis, we propose a novel data extraction method called Surrogate condItional Data Extraction (SIDE) that leverages a model trained on generated data as a surrogate condition to extract training data directly from unconditional diffusion models.
arXiv Detail & Related papers (2024-06-18T16:20:12Z) - A Survey on Data Selection for Language Models [148.300726396877]
Data selection methods aim to determine which data points to include in a training dataset.
Deep learning is mostly driven by empirical evidence, and experimentation on large-scale data is expensive.
Few organizations have the resources for extensive data selection research.
arXiv Detail & Related papers (2024-02-26T18:54:35Z) - How to Train Data-Efficient LLMs [56.41105687693619]
We study data-efficient approaches for pre-training large language models (LLMs).
In our comparison of 19 samplers, involving hundreds of evaluation tasks and pre-training runs, we find that Ask-LLM and Density are the best methods in their respective categories.
arXiv Detail & Related papers (2024-02-15T02:27:57Z) - Students Success Modeling: Most Important Factors [0.47829670123819784]
The model undertakes to identify students likely to graduate, those likely to transfer to a different school, and those likely to drop out, leaving their higher education unfinished.
Our experiments demonstrate that distinguishing between to-be-graduate and at-risk students is reasonably achievable in the earliest stages.
The model foresees, with remarkable accuracy, the outcomes of students who stay in the school for three years.
arXiv Detail & Related papers (2023-09-06T19:23:10Z) - A Predictive Model using Machine Learning Algorithm in Identifying
Students Probability on Passing Semestral Course [0.0]
This study employs classification as the data mining technique and a decision tree as the algorithm.
With the newly discovered predictive model, predicting students' probability of passing their current courses achieves 0.7619 accuracy, 0.8333 precision, 0.8823 recall, and a 0.8571 F1 score.
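As a quick sanity check (not code from the paper), the reported precision and recall are consistent with the reported F1 score under the standard harmonic-mean definition:

```python
# F1 is the harmonic mean of precision and recall.
precision, recall = 0.8333, 0.8823
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.8571, matching the reported F1 score
```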
arXiv Detail & Related papers (2023-04-12T01:57:08Z) - Revisiting Long-tailed Image Classification: Survey and Benchmarks with
New Evaluation Metrics [88.39382177059747]
A corpus of metrics is designed for measuring the accuracy, robustness, and bounds of algorithms for learning with long-tailed distribution.
Based on our benchmarks, we re-evaluate the performance of existing methods on CIFAR10 and CIFAR100 datasets.
arXiv Detail & Related papers (2023-02-03T02:40:54Z) - Multi-Layer Personalized Federated Learning for Mitigating Biases in Student Predictive Analytics [8.642174401125263]
We propose a Multi-Layer Personalized Federated Learning (MLPFL) methodology to optimize inference accuracy over different layers of student grouping criteria.
In our approach, personalized models for individual student subgroups are derived from a global model.
Experiments on three real-world online course datasets show significant improvements achieved by our approach over existing student modeling benchmarks.
arXiv Detail & Related papers (2022-12-05T17:27:28Z) - Process-BERT: A Framework for Representation Learning on Educational
Process Data [68.8204255655161]
We propose a framework for learning representations of educational process data.
Our framework consists of a pre-training step that uses BERT-type objectives to learn representations from sequential process data.
We apply our framework to the 2019 nation's report card data mining competition dataset.
arXiv Detail & Related papers (2022-04-28T16:07:28Z) - Formulating Module Assessment for Improved Academic Performance
Predictability in Higher Education [0.0]
This paper proposes a different data preparation process through investigating more than 230,000 student records.
The effect of CAR on the prediction process using the random forest classification technique has been investigated.
arXiv Detail & Related papers (2020-08-30T19:42:31Z) - Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
To tackle the size of the resulting dataset, we propose applying a dataset distillation strategy to compress the created dataset into several informative class-wise images.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.