Automatically Assessing Students Performance with Smartphone Data
- URL: http://arxiv.org/abs/2209.05596v1
- Date: Wed, 6 Jul 2022 10:05:23 GMT
- Title: Automatically Assessing Students Performance with Smartphone Data
- Authors: J. Fernandes, J. Sá Silva, A. Rodrigues, S. Sinche, F. Boavida
- Abstract summary: We present a dataset collected using a smartphone application (ISABELA).
We present several tests with different machine learning models, in order to classify students' performance.
It is shown that the created models can predict student performance even with data collected from different contexts.
- Score: 0.7069200904392647
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As the number of smart devices that surround us increases, so do the
opportunities to create smart socially-aware systems. In this context, mobile
devices can be used to collect data about students and to better understand how
their day-to-day routines can influence their academic performance. Moreover,
the Covid-19 pandemic led to new challenges and difficulties, also for
students, with considerable impact on their lifestyle. In this paper we present
a dataset collected using a smartphone application (ISABELA), which includes
passive data (e.g., activity and location) as well as self-reported data from
questionnaires. We present several tests with different machine learning
models, in order to classify students' performance. These tests were carried
out using different time windows, showing that weekly time windows lead to
better prediction and classification results than monthly time windows.
Furthermore, it is shown that the created models can predict student
performance even with data collected from different contexts, namely before and
during the Covid-19 pandemic. SVMs, XGBoost and AdaBoost-SAMME with Random
Forest were found to be the best algorithms, showing an accuracy greater than
78%. Additionally, we propose a pipeline that uses a decision level median
voting algorithm to further improve the models' performance, by using historic
data from the students to further improve the prediction. Using this pipeline,
it is possible to further increase the performance of the models, with some of
them obtaining an accuracy greater than 90%.
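The decision-level median voting step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class encoding (0 = lower performance, 1 = higher performance), the number of weekly windows, and the function name are all illustrative assumptions; the paper only specifies that historic per-window predictions are fused by taking their median.

```python
# Hedged sketch of decision-level median voting: fuse a student's
# historic weekly classifier outputs into one final decision.
# Labels (0 = lower performance, 1 = higher performance) are assumed.
from statistics import median

def median_vote(weekly_predictions):
    """Return the median of the per-week class predictions,
    rounded to the nearest class label."""
    return round(median(weekly_predictions))

# Example: five weekly predictions for one student.
history = [1, 0, 1, 1, 0]
final_decision = median_vote(history)  # median of [0, 0, 1, 1, 1] -> 1
```

Using the median (rather than, say, the latest prediction) makes the final decision robust to a minority of misclassified weeks, which is consistent with the reported accuracy improvement from incorporating historic data.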
Related papers
- LESS: Selecting Influential Data for Targeted Instruction Tuning [64.78894228923619]
We propose LESS, an efficient algorithm to estimate data influences and perform Low-rank gradiEnt Similarity Search for instruction data selection.
We show that training on a LESS-selected 5% of the data can often outperform training on the full dataset across diverse downstream tasks.
Our method goes beyond surface form cues to identify data with the necessary reasoning skills for the intended downstream application.
arXiv Detail & Related papers (2024-02-06T19:18:04Z) - Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines [83.65380507372483]
Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box.
This paper shows how to leverage recent advances in NLP and multi-modal learning to augment a pre-trained model with search engine retrieval.
arXiv Detail & Related papers (2023-11-29T05:33:28Z) - Multi-granulariy Time-based Transformer for Knowledge Tracing [9.788039182463768]
We leverage students' historical data, including their past test scores, to create a personalized model for each student.
We then use these models to predict their future performance on a given test.
arXiv Detail & Related papers (2023-04-11T14:46:38Z) - ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z) - Automatic Data Augmentation via Invariance-Constrained Learning [94.27081585149836]
Underlying data structures are often exploited to improve the solution of learning tasks.
Data augmentation induces these symmetries during training by applying multiple transformations to the input data.
This work tackles these issues by automatically adapting the data augmentation while solving the learning task.
arXiv Detail & Related papers (2022-09-29T18:11:01Z) - Quality of Data in Machine Learning [3.9998518782208774]
The study refutes the starting assumption, concluding that in this case the significance lies in the quality of the data rather than its quantity.
arXiv Detail & Related papers (2021-12-17T09:22:46Z) - SelfHAR: Improving Human Activity Recognition through Self-training with Unlabeled Data [9.270269467155547]
SelfHAR is a semi-supervised model that learns to leverage unlabeled datasets to complement small labeled datasets.
Our approach combines teacher-student self-training, which distills the knowledge of unlabeled and labeled datasets.
SelfHAR is data-efficient, reaching similar performance using up to 10 times less labeled data compared to supervised approaches.
arXiv Detail & Related papers (2021-02-11T15:40:35Z) - Predicting student performance using data from an auto-grading system [0.0]
We build decision-tree and linear-regression models with various features extracted from the Marmoset auto-grading system.
We show that the linear-regression model using submission time intervals performs the best among all models in terms of Precision and F-Measure.
We also show that students misclassified as poor-performing by the linear-regression model have the lowest actual grades among all models.
arXiv Detail & Related papers (2021-02-02T03:02:39Z) - Diverse Complexity Measures for Dataset Curation in Self-driving [80.55417232642124]
We propose a new data selection method that exploits a diverse set of criteria to quantify the interestingness of traffic scenes.
Our experiments show that the proposed curation pipeline is able to select datasets that lead to better generalization and higher performance.
arXiv Detail & Related papers (2021-01-16T23:45:02Z) - Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
To tackle this, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.