Subtask Analysis of Process Data Through a Predictive Model
- URL: http://arxiv.org/abs/2009.00717v1
- Date: Sat, 29 Aug 2020 21:11:01 GMT
- Title: Subtask Analysis of Process Data Through a Predictive Model
- Authors: Zhi Wang, Xueying Tang, Jingchen Liu and Zhiliang Ying
- Abstract summary: This paper develops a computationally efficient method for exploratory analysis of such process data.
The new approach segments a lengthy individual process into a sequence of short subprocesses to achieve complexity reduction.
We use the process data from PIAAC 2012 to demonstrate how exploratory analysis of process data can be done with the new approach.
- Score: 5.7668512557707166
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Response process data collected from human-computer interactive items contain
rich information about respondents' behavioral patterns and cognitive
processes. Their irregular formats as well as their large sizes make standard
statistical tools difficult to apply. This paper develops a computationally
efficient method for exploratory analysis of such process data. The new
approach segments a lengthy individual process into a sequence of short
subprocesses to achieve complexity reduction, easy clustering and meaningful
interpretation. Each subprocess is considered a subtask. The segmentation is
based on sequential action predictability using a parsimonious predictive model
combined with the Shannon entropy. Simulation studies are conducted to assess
performance of the new methods. We use the process data from PIAAC 2012 to
demonstrate how exploratory analysis of process data can be done with the new
approach.
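The segmentation idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the predictive model or the exact splitting rule, so everything below is an assumption for illustration: a first-order (bigram) frequency model stands in for the paper's parsimonious predictive model, and a hypothetical fixed `threshold` on the Shannon entropy of the predicted next-action distribution decides where one subprocess ends. The function names are hypothetical and are not taken from the paper or from any package.

```python
import math
from collections import Counter, defaultdict

def fit_bigram(sequences):
    """Count next-action frequencies per current action.

    Assumption: a bigram model is used here only as a simple stand-in
    for the paper's predictive model.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    return counts

def next_action_entropy(counts, action):
    """Shannon entropy (in bits) of the predicted next-action distribution."""
    total = sum(counts[action].values())
    if total == 0:
        return 0.0  # no observed successor, treat as fully predictable
    return -sum((c / total) * math.log2(c / total)
                for c in counts[action].values())

def segment(seq, counts, threshold=0.5):
    """Close the current subprocess after any action whose successor is hard to predict.

    Assumption: a fixed entropy threshold is an illustrative splitting rule.
    """
    segments, current = [], []
    for i, action in enumerate(seq):
        current.append(action)
        is_last = i == len(seq) - 1
        if not is_last and next_action_entropy(counts, action) > threshold:
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments

# Toy action sequences (made up for illustration, not PIAAC data).
toy = [list("ABCABCDE"), list("ABCABCDE")]
model = fit_bigram(toy)
print(segment(toy[0], model, threshold=0.5))
# → [['A', 'B', 'C'], ['A', 'B', 'C'], ['D', 'E']]
```

In the toy data, the action after `C` is equally likely to be `A` or `D` (entropy of 1 bit), while every other action has a deterministic successor (entropy of 0), so the process is cut after each `C`. Each resulting short subprocess plays the role of a subtask that could then be clustered.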
Related papers
- Distilled Datamodel with Reverse Gradient Matching [74.75248610868685]
We introduce an efficient framework for assessing data impact, comprising offline training and online evaluation stages.
Our proposed method achieves comparable model behavior evaluation while significantly speeding up the process compared to the direct retraining method.
arXiv Detail & Related papers (2024-04-22T09:16:14Z)
- Mining a Minimal Set of Behavioral Patterns using Incremental Evaluation [3.16536213610547]
Existing approaches to behavioral pattern mining suffer from two limitations.
First, they show limited scalability as incremental computation is incorporated only in the generation of pattern candidates.
Second, process analysis based on mined patterns shows limited effectiveness due to an overwhelmingly large number of patterns obtained in practical application scenarios.
arXiv Detail & Related papers (2024-02-05T11:41:37Z)
- A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis [128.0532113800092]
We present a mechanistic interpretation of Transformer-based LMs on arithmetic questions.
This provides insights into how information related to arithmetic is processed by LMs.
arXiv Detail & Related papers (2023-05-24T11:43:47Z)
- ALMERIA: Boosting pairwise molecular contrasts with scalable methods [0.0]
ALMERIA is a tool for estimating compound similarities and activity prediction based on pairwise molecular contrasts.
It has been implemented using scalable software and methods to exploit large volumes of data.
Experiments show state-of-the-art performance for molecular activity prediction.
arXiv Detail & Related papers (2023-04-28T16:27:06Z)
- Clustering Object-Centric Event Logs [0.36748639131154304]
We propose a clustering-based approach to cluster similar objects in OCELs to simplify the obtained process models.
Our approach reduces the complexity of the process models and generates coherent subsets of objects which help the end-users gain insights into the process.
arXiv Detail & Related papers (2022-07-26T09:16:39Z)
- Process-BERT: A Framework for Representation Learning on Educational Process Data [68.8204255655161]
We propose a framework for learning representations of educational process data.
Our framework consists of a pre-training step that uses BERT-type objectives to learn representations from sequential process data.
We apply our framework to the 2019 nation's report card data mining competition dataset.
arXiv Detail & Related papers (2022-04-28T16:07:28Z)
- What Averages Do Not Tell -- Predicting Real Life Processes with Sequential Deep Learning [0.1376408511310322]
Process Mining concerns discovering insights on business processes from their execution data that are logged by systems.
Many Deep Learning techniques have been successfully adapted for predictive Process Mining that aims to predict process outcomes.
Traces in Process Mining are multimodal sequences and very differently structured than natural language sentences or images.
arXiv Detail & Related papers (2021-10-19T19:45:05Z)
- Combining Feature and Instance Attribution to Detect Artifacts [62.63504976810927]
We propose methods to facilitate identification of training data artifacts.
We show that this proposed training-feature attribution approach can be used to uncover artifacts in training data.
We execute a small user study to evaluate whether these methods are useful to NLP researchers in practice.
arXiv Detail & Related papers (2021-07-01T09:26:13Z)
- Rissanen Data Analysis: Examining Dataset Characteristics via Description Length [78.42578316883271]
We introduce a method to determine if a certain capability helps to achieve an accurate model of given data.
Since minimum program length is uncomputable, we estimate the labels' minimum description length (MDL) as a proxy.
We call the method Rissanen Data Analysis (RDA) after the father of MDL.
arXiv Detail & Related papers (2021-03-05T18:58:32Z)
- Process Discovery for Structured Program Synthesis [70.29027202357385]
A core task in process mining is process discovery which aims to learn an accurate process model from event log data.
In this paper, we propose to use (block-) structured programs directly as target process models.
We develop a novel bottom-up agglomerative approach to the discovery of such structured program process models.
arXiv Detail & Related papers (2020-08-13T10:33:10Z)
- ProcData: An R Package for Process Data Analysis [5.278929511653198]
The R package ProcData presented in this article is designed to provide tools for processing, describing, and analyzing process data.
Two feature extraction methods for process data are implemented in the package for compressing information in the irregular response processes into regular numeric vectors.
In addition, several response process generators and a real dataset of response processes of the climate control item in the 2012 Programme for International Student Assessment are included in the package.
arXiv Detail & Related papers (2020-06-09T05:44:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.