Process-BERT: A Framework for Representation Learning on Educational
Process Data
- URL: http://arxiv.org/abs/2204.13607v1
- Date: Thu, 28 Apr 2022 16:07:28 GMT
- Title: Process-BERT: A Framework for Representation Learning on Educational
Process Data
- Authors: Alexander Scarlatos, Christopher Brinton, Andrew Lan
- Abstract summary: We propose a framework for learning representations of educational process data.
Our framework consists of a pre-training step that uses BERT-type objectives to learn representations from sequential process data.
We apply our framework to the 2019 Nation's Report Card Data Mining Competition dataset.
- Score: 68.8204255655161
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Educational process data, i.e., logs of detailed student activities in
computerized or online learning platforms, has the potential to offer deep
insights into how students learn. Process data can support many downstream
tasks, such as learning outcome prediction and the automatic delivery of
personalized interventions. However, analyzing process data is challenging because
its specific format varies considerably across
learning and testing scenarios. In this paper, we propose a framework for learning
representations of educational process data that is applicable across many
different learning scenarios. Our framework consists of a pre-training step
that uses BERT-type objectives to learn representations from sequential process
data and a fine-tuning step that further adjusts these representations on
downstream prediction tasks. We apply our framework to the 2019 Nation's Report
Card Data Mining Competition dataset, which consists of student problem-solving
process data, and detail the specific models we use in this scenario. We conduct
both quantitative and qualitative experiments to show that our framework
results in process data representations that are both predictive and
informative.
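The BERT-type pre-training step mentioned in the abstract can be illustrated with a minimal masking sketch. This is not the authors' implementation: the event names, the `mask_events` helper, and the masking rate are all hypothetical, and the sketch covers only the input-corruption step of a masked-prediction objective. It omits the encoder, the loss, and the random-replacement variants used in the original BERT recipe.

```python
import random

MASK = "[MASK]"  # placeholder token; the name is an assumption for this sketch


def mask_events(events, mask_prob=0.15, seed=None):
    """Corrupt an event sequence for a BERT-style masked-prediction objective.

    Returns (inputs, labels). labels[i] holds the original event when
    position i was masked and None otherwise, so a model would be trained
    to recover only the masked positions.
    """
    rng = random.Random(seed)
    inputs, labels = [], []
    for event in events:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            labels.append(event)
        else:
            inputs.append(event)
            labels.append(None)
    return inputs, labels


# Hypothetical activity log for one student on one problem.
log = ["enter_item", "open_calculator", "type_answer", "erase", "type_answer", "next"]
masked, targets = mask_events(log, mask_prob=0.3, seed=0)
print(masked)
print(targets)
```

In a fine-tuning step, the same encoder would instead receive the uncorrupted sequence and be trained on a downstream label such as a learning outcome.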
Related papers
- LESS: Selecting Influential Data for Targeted Instruction Tuning [64.78894228923619]
We propose LESS, an efficient algorithm to estimate data influences and perform Low-rank gradiEnt Similarity Search for instruction data selection.
We show that training on a LESS-selected 5% of the data can often outperform training on the full dataset across diverse downstream tasks.
Our method goes beyond surface form cues to identify data that exemplifies the necessary reasoning skills for the intended downstream application.
arXiv Detail & Related papers (2024-02-06T19:18:04Z)
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the orders of all the multi-task data for training.
At the task level, we aim to find the optimal task order to minimize the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task, then divide them into easy-to-difficult mini-batches for training.
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
- ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP).
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
This approach, however, does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production-grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- What Averages Do Not Tell -- Predicting Real Life Processes with Sequential Deep Learning [0.1376408511310322]
Process Mining concerns discovering insights on business processes from their execution data that are logged by systems.
Many Deep Learning techniques have been successfully adapted for predictive Process Mining that aims to predict process outcomes.
Traces in Process Mining are multimodal sequences and are structured very differently from natural language sentences or images.
arXiv Detail & Related papers (2021-10-19T19:45:05Z)
- ProcK: Machine Learning for Knowledge-Intensive Processes [30.371382331613532]
ProcK (Process & Knowledge) is a novel pipeline to build business process prediction models.
Components to extract inter-linked event logs and knowledge bases from relational databases are part of the pipeline.
We demonstrate the power of ProcK by training it for prediction tasks on the OULAD e-learning dataset.
arXiv Detail & Related papers (2021-09-10T13:51:59Z)
- Multivariate Business Process Representation Learning utilizing Gramian Angular Fields and Convolutional Neural Networks [0.0]
Learning meaningful representations of data is an important aspect of machine learning.
For predictive process analytics, it is essential to have all explanatory characteristics of a process instance available.
We propose a novel approach for representation learning of business process instances.
arXiv Detail & Related papers (2021-06-15T10:21:14Z)
- Data-driven modelling and characterisation of task completion sequences in online courses [0.0]
We show how data-driven analysis of temporal sequences of task completion in online courses can be used.
We identify critical junctures and differences among types of tasks within the course design.
We find that non-rote learning tasks, such as interactive tasks or discussion posts, are correlated with higher performance.
arXiv Detail & Related papers (2020-07-14T12:39:03Z)
- Analyzing Student Strategies In Blended Courses Using Clickstream Data [32.81171098036632]
We use pattern mining and models borrowed from Natural Language Processing to understand student interactions.
Fine-grained clickstream data is collected through Diderot, a non-commercial educational support system.
Our results suggest that the proposed hybrid NLP methods can provide valuable insights even in the low-data setting of blended courses.
arXiv Detail & Related papers (2020-05-31T03:01:00Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
To tackle this, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)