Discovering an Aid Policy to Minimize Student Evasion Using Offline Reinforcement Learning
- URL: http://arxiv.org/abs/2104.10258v1
- Date: Tue, 20 Apr 2021 21:45:19 GMT
- Title: Discovering an Aid Policy to Minimize Student Evasion Using Offline Reinforcement Learning
- Authors: Leandro M. de Lima, Renato A. Krohling
- Abstract summary: We propose a decision support method for selecting aid actions for students using offline reinforcement learning.
Our experiments on logged data of real students show, through off-policy evaluation, that the method should achieve roughly 1.0 to 1.5 times as much cumulative reward as the logged policy.
- Score: 2.2344764434954256
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: High dropout rates in tertiary education expose a lack of efficiency that
causes frustrated expectations and financial waste. Predicting students at
risk is not enough to avoid student dropout. Usually, an appropriate aid action
must be discovered and applied at the proper time for each student. To tackle
this sequential decision-making problem, we propose a decision support method
for selecting aid actions for students, using offline reinforcement learning
to help decision-makers effectively avoid student dropout.
Additionally, we evaluate a discretization of the student state space using two
different clustering methods. Our experiments on logged data of real
students show, through off-policy evaluation, that the method should achieve
roughly 1.0 to 1.5 times as much cumulative reward as the logged policy. Thus, it
is feasible to help decision-makers apply appropriate aid actions and,
possibly, reduce student dropout.
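The pipeline the abstract describes (cluster student features into discrete states, learn an aid policy from logged transitions, then score it with off-policy evaluation) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the k-means routine, tabular Q-learning, the plain importance-sampling estimator, and all function names are assumptions made for the sketch.

```python
# Rough sketch of the abstract's pipeline: discretize student states by
# clustering, learn a tabular aid policy from logged transitions, and
# estimate its value with importance-sampling off-policy evaluation.
# All names and toy shapes here are illustrative assumptions.

def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(p, centers):
    """Index of the nearest cluster center (the discrete state id)."""
    return min(range(len(centers)), key=lambda c: dist2(p, centers[c]))

def kmeans(points, k, iters=20):
    """Minimal k-means used to discretize continuous student features."""
    centers = [list(p) for p in points[:k]]  # deterministic init: first k points
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [assign(p, centers) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels

def q_from_logs(transitions, n_states, n_actions, gamma=0.9, sweeps=50, alpha=0.1):
    """Tabular Q-learning over logged (state, action, reward, next_state) tuples."""
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(sweeps):
        for s, a, r, s2 in transitions:
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
    return Q

def importance_sampling_value(trajectories, target_pi, behavior_pi, gamma=0.9):
    """Importance-sampling estimate of the target policy's value from logged data."""
    total = 0.0
    for traj in trajectories:
        weight, ret, discount = 1.0, 0.0, 1.0
        for s, a, r in traj:  # (state, action, reward) steps
            weight *= target_pi[s][a] / behavior_pi[s][a]
            ret += discount * r
            discount *= gamma
        total += weight * ret
    return total / len(trajectories)
```

In use, `kmeans` maps each student's feature vector to a discrete state, `q_from_logs` fits action values from the logged interactions, and `importance_sampling_value` compares the resulting greedy policy against the logged (behavior) policy, mirroring the "1.0 to 1.5 times the cumulative reward" comparison the abstract reports.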
Related papers
- Distantly-Supervised Named Entity Recognition with Adaptive Teacher Learning and Fine-grained Student Ensemble [56.705249154629264]
Self-training teacher-student frameworks are proposed to improve the robustness of NER models.
In this paper, we propose an adaptive teacher learning comprised of two teacher-student networks.
Fine-grained student ensemble updates each fragment of the teacher model with a temporal moving average of the corresponding fragment of the student, which enhances consistent predictions on each model fragment against noise.
arXiv Detail & Related papers (2022-12-13T12:14:09Z)
- When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning [57.53138994155612]
A long-term goal of reinforcement learning is to design agents that can autonomously interact and learn in the world.
A critical challenge is the presence of irreversible states which require external assistance to recover from, such as when a robot arm has pushed an object off of a table.
We propose an algorithm that efficiently learns to detect and avoid states that are irreversible, and proactively asks for help in case the agent does enter them.
arXiv Detail & Related papers (2022-10-19T17:57:24Z)
- A Framework for Undergraduate Data Collection Strategies for Student Support Recommendation Systems in Higher Education [12.358921226358133]
This paper outlines a data collection framework specific to recommender systems within higher education.
arXiv Detail & Related papers (2022-10-16T13:39:11Z)
- Enhancing a Student Productivity Model for Adaptive Problem-Solving Assistance [7.253181280137071]
We present a novel data-driven approach to incorporate students' hint usage in predicting their need for help.
We show empirical evidence to support that such a policy can save students a significant amount of time in training.
We conclude with suggestions on the domains that can benefit from this approach as well as the requirements for adoption.
arXiv Detail & Related papers (2022-07-07T00:41:00Z)
- Plagiarism deterrence for introductory programming [11.612194979331179]
A class-wide statistical characterization can be clearly shared with students via an intuitive new p-value.
A pairwise, compression-based similarity detection algorithm captures relationships between assignments more accurately.
An unbiased scoring system aids students and the instructor in understanding true independence of effort.
arXiv Detail & Related papers (2022-06-06T18:47:25Z)
- The Paradox of Choice: Using Attention in Hierarchical Reinforcement Learning [59.777127897688594]
We present an online, model-free algorithm to learn affordances that can be used to further learn subgoal options.
We investigate the role of hard versus soft attention in training data collection, abstract value learning in long-horizon tasks, and handling a growing number of choices.
arXiv Detail & Related papers (2022-01-24T13:18:02Z)
- Deterministic and Discriminative Imitation (D2-Imitation): Revisiting Adversarial Imitation for Sample Efficiency [61.03922379081648]
We propose an off-policy sample efficient approach that requires no adversarial training or min-max optimization.
Our empirical results show that D2-Imitation is effective in achieving good sample efficiency, outperforming several off-policy extension approaches of adversarial imitation.
arXiv Detail & Related papers (2021-12-11T19:36:19Z)
- Off-policy Reinforcement Learning with Optimistic Exploration and Distribution Correction [73.77593805292194]
We train a separate exploration policy to maximize an approximate upper confidence bound of the critics in an off-policy actor-critic framework.
To mitigate the off-policy-ness, we adapt the recently introduced DICE framework to learn a distribution correction ratio for off-policy actor-critic training.
arXiv Detail & Related papers (2021-10-22T22:07:51Z)
- Extending the Hint Factory for the assistance dilemma: A novel, data-driven HelpNeed Predictor for proactive problem-solving help [6.188683567894372]
We present a set of data-driven methods to classify, predict, and prevent unproductive problem-solving steps.
We present a HelpNeed classification, that uses prior student data to determine when students are likely to be unproductive.
We conclude with suggestions on how these HelpNeed methods could be applied in other well-structured open-ended domains.
arXiv Detail & Related papers (2020-10-08T17:04:03Z)
- Student-Initiated Action Advising via Advice Novelty [0.14323566945483493]
Student-initiated techniques that utilise state novelty and uncertainty estimations have obtained promising results.
We propose a student-initiated algorithm that alleviates these by employing Random Network Distillation (RND) to measure the novelty of a piece of advice.
arXiv Detail & Related papers (2020-10-01T13:20:28Z)
- Off-policy Evaluation in Infinite-Horizon Reinforcement Learning with Latent Confounders [62.54431888432302]
We study an OPE problem in an infinite-horizon, ergodic Markov decision process with unobserved confounders.
We show how, given only a latent variable model for states and actions, policy value can be identified from off-policy data.
arXiv Detail & Related papers (2020-07-27T22:19:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.