Scalable and Equitable Math Problem Solving Strategy Prediction in Big
Educational Data
- URL: http://arxiv.org/abs/2308.03892v1
- Date: Mon, 7 Aug 2023 19:51:10 GMT
- Title: Scalable and Equitable Math Problem Solving Strategy Prediction in Big
Educational Data
- Authors: Anup Shakya, Vasile Rus, Deepak Venugopal
- Abstract summary: We develop an embedding called MVec where we learn a representation based on the mastery of students.
We then cluster these embeddings with a non-parametric clustering method.
We show that our approach can scale up to achieve high accuracy by training on a small sample of a large dataset.
- Score: 2.86829428083307
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding a student's problem-solving strategy can have a significant
impact on effective math learning using Intelligent Tutoring Systems (ITSs) and
Adaptive Instructional Systems (AISs). For instance, the ITS/AIS can better
personalize itself to correct specific misconceptions that are indicated by
incorrect strategies, specific problems can be designed to improve strategies
and frustration can be minimized by adapting to a student's natural way of
thinking rather than trying to fit a standard strategy for all. While it may be
possible for human experts to identify strategies manually in classroom
settings with sufficient student interaction, it is not possible to scale this
up to big data. Therefore, we leverage advances in Machine Learning and AI
methods to perform scalable strategy prediction that is also fair to students
at all skill levels. Specifically, we develop an embedding called MVec where we
learn a representation based on the mastery of students. We then cluster these
embeddings with a non-parametric clustering method where we progressively learn
clusters such that we group together instances that have approximately
symmetrical strategies. The strategy prediction model is trained on instances
sampled from these clusters. This ensures that we train the model over diverse
strategies and also that strategies from a particular group do not bias the DNN
model, thus allowing it to optimize its parameters over all groups. Using real-world
large-scale student interaction datasets from MATHia, we implement our
approach using transformers and Node2Vec for learning the mastery embeddings
and LSTMs for predicting strategies. We show that our approach can scale up to
achieve high accuracy by training on a small sample of a large dataset and also
has predictive equality, i.e., it can predict strategies equally well for
learners at diverse skill levels.
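
The cluster-then-sample step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the clustering below is a simple cosine-threshold "leader" clustering used as a stand-in for the paper's non-parametric, symmetry-based method, the embeddings are made-up toy vectors rather than MVec outputs, and the names `leader_cluster`, `balanced_sample`, `threshold`, and `per_cluster` are hypothetical.

```python
import math
import random
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def leader_cluster(embeddings, threshold=0.9):
    """Stand-in clustering (assumption, not the paper's method).

    Each vector joins the first cluster whose leader is similar enough;
    otherwise it opens a new cluster, so the number of clusters is not
    fixed in advance.
    """
    leaders, labels = [], []
    for vec in embeddings:
        for cid, leader in enumerate(leaders):
            if cosine(vec, leader) >= threshold:
                labels.append(cid)
                break
        else:
            leaders.append(vec)
            labels.append(len(leaders) - 1)
    return labels


def balanced_sample(labels, per_cluster, seed=0):
    """Draw up to `per_cluster` instance indices from every cluster."""
    rng = random.Random(seed)
    members = defaultdict(list)
    for idx, cid in enumerate(labels):
        members[cid].append(idx)
    chosen = []
    for cid in sorted(members):
        pool = members[cid]
        chosen.extend(rng.sample(pool, min(per_cluster, len(pool))))
    return sorted(chosen)


# Toy embeddings (hypothetical): five near [1, 0], one near [0, 1].
toy = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.05], [0.95, 0.0], [0.99, 0.02], [0.0, 1.0]]
labels = leader_cluster(toy, threshold=0.9)
train_idx = balanced_sample(labels, per_cluster=1)
```

Sampling the same number of instances from each cluster keeps the rare group (the single vector near [0, 1]) in the training set instead of letting the large group dominate, which is the intuition behind training on a small but diverse sample.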
Related papers
- Decentralized Learning Strategies for Estimation Error Minimization with Graph Neural Networks [94.2860766709971]
We address the challenge of sampling and remote estimation for autoregressive Markovian processes in a wireless network with statistically-identical agents.
Our goal is to minimize time-average estimation error and/or age of information with decentralized scalable sampling and transmission policies.
arXiv Detail & Related papers (2024-04-04T06:24:11Z)
- Learnability Gaps of Strategic Classification [68.726857356532]
We focus on addressing a fundamental question: the learnability gaps between strategic classification and standard learning.
We provide nearly tight sample complexity and regret bounds, offering significant improvements over prior results.
Notably, our algorithm in this setting is of independent interest and can be applied to other problems such as multi-label learning.
arXiv Detail & Related papers (2024-02-29T16:09:19Z)
- Discriminative Adversarial Unlearning [40.30974185546541]
We introduce a novel machine unlearning framework founded upon the established principles of the min-max optimization paradigm.
We capitalize on the capabilities of strong Membership Inference Attacks (MIA) to facilitate the unlearning of specific samples from a trained model.
Our proposed algorithm closely approximates the ideal benchmark of retraining from scratch for both random sample forgetting and class-wise forgetting schemes.
arXiv Detail & Related papers (2024-02-10T03:04:57Z)
- Mastery Guided Non-parametric Clustering to Scale-up Strategy Prediction [1.1049608786515839]
We learn a representation based on Node2Vec that encodes symmetries over mastery or skill level.
We apply our model to learn strategies for Math learning from large-scale datasets from MATHia.
arXiv Detail & Related papers (2024-01-04T17:57:21Z)
- Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs a local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships.
A second strategy leverages a novel Re-Ranking technique, which has a lower time upper bound complexity and reduces the memory complexity from O(n^2) to O(kn) with k ≪ n.
arXiv Detail & Related papers (2023-07-26T16:19:19Z)
- Fast Context Adaptation in Cost-Aware Continual Learning [10.515324071327903]
5G and Beyond networks require more complex learning agents and the learning process itself might end up competing with users for communication and computational resources.
This creates friction: on the one hand, the learning process needs resources to quickly converge to an effective strategy; on the other hand, the learning process needs to be efficient, i.e., take as few resources as possible from the user's data plane, so as not to throttle users' resources.
In this paper, we propose a dynamic strategy to balance the resources assigned to the data plane and those reserved for learning.
arXiv Detail & Related papers (2023-06-06T17:46:48Z)
- A Survey of Learning on Small Data: Generalization, Optimization, and Challenge [101.27154181792567]
Learning on small data that approximates the generalization ability of big data is one of the ultimate purposes of AI.
This survey follows the active sampling theory under a PAC framework to analyze the generalization error and label complexity of learning on small data.
Multiple data applications that may benefit from efficient small data representation are surveyed.
arXiv Detail & Related papers (2022-07-29T02:34:19Z)
- Leveraging Ensembles and Self-Supervised Learning for Fully-Unsupervised Person Re-Identification and Text Authorship Attribution [77.85461690214551]
Learning from fully-unlabeled data is challenging in Multimedia Forensics problems, such as Person Re-Identification and Text Authorship Attribution.
Recent self-supervised learning methods have been shown to be effective when dealing with fully-unlabeled data in cases where the underlying classes have significant semantic differences.
We propose a strategy to tackle Person Re-Identification and Text Authorship Attribution by enabling learning from unlabeled data even when samples from different classes are not prominently diverse.
arXiv Detail & Related papers (2022-02-07T13:08:11Z)
- Nonparametric Estimation of Heterogeneous Treatment Effects: From Theory to Learning Algorithms [91.3755431537592]
We analyze four broad meta-learning strategies which rely on plug-in estimation and pseudo-outcome regression.
We highlight how this theoretical reasoning can be used to guide principled algorithm design and translate our analyses into practice.
arXiv Detail & Related papers (2021-01-26T17:11:40Z)
- Curriculum Learning with Diversity for Supervised Computer Vision Tasks [1.5229257192293197]
We introduce a novel curriculum sampling strategy which takes into consideration the diversity of the training data together with the difficulty of the inputs.
We prove that our strategy is very efficient for unbalanced data sets, leading to faster convergence and more accurate results.
arXiv Detail & Related papers (2020-09-22T15:32:49Z)