Analyzing the Capabilities of Nature-inspired Feature Selection
Algorithms in Predicting Student Performance
- URL: http://arxiv.org/abs/2308.08574v2
- Date: Sat, 7 Oct 2023 23:55:29 GMT
- Title: Analyzing the Capabilities of Nature-inspired Feature Selection Algorithms in Predicting Student Performance
- Authors: Thomas Trask
- Abstract summary: In this paper, an analysis was conducted to determine the relative performance of a suite of nature-inspired algorithms in the feature-selection portion of ensemble algorithms used to predict student performance.
It was found that leveraging an ensemble approach using nature-inspired algorithms for feature selection and traditional ML algorithms for classification significantly increased predictive accuracy while also reducing feature set size by up to 65 percent.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Predicting student performance is key in leveraging effective pre-failure
interventions for at-risk students. As educational data grows larger, more
effective means of analyzing student data in a timely manner are needed in
order to provide useful predictions and interventions. In this paper, an
analysis was conducted to determine the relative performance of a suite of
nature-inspired algorithms in the feature-selection portion of ensemble
algorithms used to predict student performance. A Swarm Intelligence ML engine
(SIMLe) was developed to run this suite in tandem with a series of traditional
ML classification algorithms to analyze three student datasets: instance-based
clickstream data, hybrid single-course performance, and student
meta-performance when taking multiple courses simultaneously. These results
were then compared to previous predictive algorithms and, for all datasets
analyzed, it was found that leveraging an ensemble approach using
nature-inspired algorithms for feature selection and traditional ML algorithms
for classification significantly increased predictive accuracy while also
reducing feature set size by up to 65 percent.
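The abstract describes a wrapper-style pipeline: a nature-inspired optimizer searches over feature subsets, and a conventional classifier scores each candidate subset. The following is a minimal sketch of that idea, not the SIMLe engine itself. It assumes binary particle swarm optimization as one representative nature-inspired algorithm and a nearest-centroid classifier as a stand-in for the traditional ML classifiers. All function names, parameter values, and the synthetic toy data are hypothetical choices made for illustration only.

```python
import math
import random


def nearest_centroid_accuracy(X_train, y_train, X_test, y_test, mask):
    """Accuracy of a nearest-centroid classifier using only features with mask[j] == 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    sums, counts = {}, {}
    for row, label in zip(X_train, y_train):
        acc = sums.setdefault(label, [0.0] * len(cols))
        for k, j in enumerate(cols):
            acc[k] += row[j]
        counts[label] = counts.get(label, 0) + 1
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
    correct = 0
    for row, label in zip(X_test, y_test):
        pred = min(
            centroids,
            key=lambda c: sum((row[j] - centroids[c][k]) ** 2 for k, j in enumerate(cols)),
        )
        correct += pred == label
    return correct / len(y_test)


def binary_pso_select(X_train, y_train, X_val, y_val,
                      n_particles=12, n_iters=25, penalty=0.01, seed=0):
    """Binary PSO over feature masks; fitness = accuracy minus a small size penalty."""
    rng = random.Random(seed)
    n_feat = len(X_train[0])

    def fitness(mask):
        acc = nearest_centroid_accuracy(X_train, y_train, X_val, y_val, mask)
        return acc - penalty * sum(mask) / n_feat

    positions = [[rng.randint(0, 1) for _ in range(n_feat)] for _ in range(n_particles)]
    velocities = [[0.0] * n_feat for _ in range(n_particles)]
    best_pos = [p[:] for p in positions]
    best_fit = [fitness(p) for p in positions]
    g = max(range(n_particles), key=lambda i: best_fit[i])
    gbest, gbest_fit = best_pos[g][:], best_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(n_feat):
                r1, r2 = rng.random(), rng.random()
                v = (0.7 * velocities[i][j]
                     + 1.5 * r1 * (best_pos[i][j] - positions[i][j])
                     + 1.5 * r2 * (gbest[j] - positions[i][j]))
                v = max(-6.0, min(6.0, v))  # clamp so the sigmoid stays well-behaved
                velocities[i][j] = v
                # The sigmoid of the velocity is the probability that the bit is set.
                positions[i][j] = 1 if rng.random() < 1 / (1 + math.exp(-v)) else 0
            f = fitness(positions[i])
            if f > best_fit[i]:
                best_fit[i], best_pos[i] = f, positions[i][:]
                if f > gbest_fit:
                    gbest_fit, gbest = f, positions[i][:]
    return gbest, gbest_fit


def make_toy_data(n, n_feat=8, seed=1):
    """Synthetic two-class data: features 0 and 1 carry signal, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(0, 1) for _ in range(n_feat)]
        row[0] += 3 * label
        row[1] -= 3 * label
        X.append(row)
        y.append(label)
    return X, y


X_train, y_train = make_toy_data(200, seed=1)
X_val, y_val = make_toy_data(100, seed=2)
mask, score = binary_pso_select(X_train, y_train, X_val, y_val)
print("selected features:", [j for j, bit in enumerate(mask) if bit])
print("penalized validation accuracy:", round(score, 3))
```

The size penalty in the fitness function is what pushes the search toward smaller feature sets. In a real study, the selected mask should be evaluated on a held-out test set that the optimizer never saw, since the validation accuracy used as fitness is optimistically biased.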
Related papers
- Utilizing Data Fingerprints for Privacy-Preserving Algorithm Selection in Time Series Classification: Performance and Uncertainty Estimation on Unseen Datasets [4.2193475197905705]
We introduce a novel data fingerprint that describes any time series classification dataset in a privacy-preserving manner.
By decomposing the multi-target regression problem, only our data fingerprints are used to estimate algorithm performance and uncertainty.
Our approach is evaluated on the 112 University of California Riverside benchmark datasets.
arXiv Detail & Related papers (2024-09-13T08:43:42Z)
- Learning-Augmented Algorithms with Explicit Predictors [67.02156211760415]
Recent advances in algorithmic design show how to utilize predictions obtained by machine learning models from past and present data.
Prior research in this context was focused on a paradigm where the predictor is pre-trained on past data and then used as a black box.
In this work, we unpack the predictor and integrate the learning problem it gives rise to within the algorithmic challenge.
arXiv Detail & Related papers (2024-03-12T08:40:21Z)
- Improving prediction of students' performance in intelligent tutoring systems using attribute selection and ensembles of different multimodal data sources [0.0]
The aim of this study was to predict university students' learning performance using different sources of data from an Intelligent Tutoring System.
We collected and preprocessed data from 40 students from different multimodal sources.
arXiv Detail & Related papers (2024-02-10T09:31:39Z)
- Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z)
- Fair Feature Subset Selection using Multiobjective Genetic Algorithm [0.0]
We present a feature subset selection approach that improves both fairness and accuracy objectives.
We use statistical disparity as a fairness metric and F1-Score as a metric for model performance.
Our experiments on the most commonly used fairness benchmark datasets show that using the evolutionary algorithm we can effectively explore the trade-off between fairness and accuracy.
arXiv Detail & Related papers (2022-04-30T22:51:19Z)
- Non-Clairvoyant Scheduling with Predictions Revisited [77.86290991564829]
In non-clairvoyant scheduling, the task is to find an online strategy for scheduling jobs with a priori unknown processing requirements.
We revisit this well-studied problem in a recently popular learning-augmented setting that integrates (untrusted) predictions in algorithm design.
We show that these predictions have desired properties and admit a natural error measure as well as algorithms with strong performance guarantees.
arXiv Detail & Related papers (2022-02-21T13:18:11Z)
- Learning Predictions for Algorithms with Predictions [49.341241064279714]
We introduce a general design approach for algorithms that learn predictors.
We apply techniques from online learning to learn against adversarial instances, tune robustness-consistency trade-offs, and obtain new statistical guarantees.
We demonstrate the effectiveness of our approach at deriving learning algorithms by analyzing methods for bipartite matching, page migration, ski-rental, and job scheduling.
arXiv Detail & Related papers (2022-02-18T17:25:43Z)
- Estimating leverage scores via rank revealing methods and randomization [50.591267188664666]
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank.
Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms.
arXiv Detail & Related papers (2021-05-23T19:21:55Z)
- Generalization in portfolio-based algorithm selection [97.74604695303285]
We provide the first provable guarantees for portfolio-based algorithm selection.
We show that if the portfolio is large, overfitting is inevitable, even with an extremely simple algorithm selector.
arXiv Detail & Related papers (2020-12-24T16:33:17Z)
- Computational Models for Academic Performance Estimation [21.31653695065347]
This paper presents an in-depth analysis of deep learning and machine learning approaches for the formulation of an automated students' performance estimation system.
Our main contributions are (a) a large dataset with fifteen courses (shared publicly for academic research) and (b) statistical analysis and ablations on the estimation problem for this dataset.
Unlike previous approaches that rely on feature engineering or logical function deduction, our approach is fully data-driven and thus highly generic with better performance across different prediction tasks.
arXiv Detail & Related papers (2020-09-06T07:31:37Z)
- Multi-split Optimized Bagging Ensemble Model Selection for Multi-class Educational Data Mining [8.26773636337474]
This work analyzes two different undergraduate datasets at two different universities.
It aims to predict the students' performance at two stages of course delivery (20% and 50%, respectively).
arXiv Detail & Related papers (2020-06-09T03:22:33Z)