Subtask Analysis of Process Data Through a Predictive Model
- URL: http://arxiv.org/abs/2009.00717v1
- Date: Sat, 29 Aug 2020 21:11:01 GMT
- Title: Subtask Analysis of Process Data Through a Predictive Model
- Authors: Zhi Wang, Xueying Tang, Jingchen Liu and Zhiliang Ying
- Abstract summary: This paper develops a computationally efficient method for exploratory analysis of response process data from human-computer interactive items.
The new approach segments a lengthy individual process into a sequence of short subprocesses to achieve complexity reduction.
We use the process data from PIAAC 2012 to demonstrate how exploratory analysis of process data can be done with the new approach.
- Score: 5.7668512557707166
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Response process data collected from human-computer interactive items contain
rich information about respondents' behavioral patterns and cognitive
processes. Their irregular formats as well as their large sizes make standard
statistical tools difficult to apply. This paper develops a computationally
efficient method for exploratory analysis of such process data. The new
approach segments a lengthy individual process into a sequence of short
subprocesses to achieve complexity reduction, easy clustering and meaningful
interpretation. Each subprocess is considered a subtask. The segmentation is
based on sequential action predictability using a parsimonious predictive model
combined with the Shannon entropy. Simulation studies are conducted to assess
performance of the new methods. We use the process data from PIAAC 2012 to
demonstrate how exploratory analysis of process data can be done with the new
approach.
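The idea of segmenting a process at points where the next action is hard to predict can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' model: a first-order (bigram) count model stands in for the paper's parsimonious predictive model, and the names `fit_bigram`, `entropy`, `segment`, and the `threshold` parameter are hypothetical. The sketch computes the Shannon entropy of the predicted next-action distribution and starts a new subprocess after any action whose entropy exceeds the threshold.

```python
import math
from collections import Counter, defaultdict


def fit_bigram(sequences):
    """Count next-action frequencies for each current action (stand-in model)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    return counts


def entropy(counter):
    """Shannon entropy, in bits, of the empirical distribution in a Counter."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counter.values())


def segment(seq, counts, threshold):
    """Split seq after each non-final action whose next-action entropy exceeds threshold."""
    segments, current = [], []
    for i, action in enumerate(seq):
        current.append(action)
        is_last = i == len(seq) - 1
        if not is_last and entropy(counts.get(action, Counter())) > threshold:
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments


# Toy action sequences (made up for illustration, not PIAAC data).
toy_sequences = [
    ["start", "a", "b", "end"],
    ["start", "c", "d", "end"],
]
toy_counts = fit_bigram(toy_sequences)
toy_segments = segment(toy_sequences[0], toy_counts, threshold=0.5)
```

In this toy data the action after `start` is equally likely to be `a` or `c`, so its entropy is 1 bit and a boundary is placed there, while the remaining transitions are deterministic and stay in one segment. The threshold rule is only one possible way to turn entropy into boundaries; the paper's actual segmentation criterion may differ.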
Related papers
- Aspect-Oriented Summarization for Psychiatric Short-Term Readmission Prediction [1.3563640142303988]
Large language models (LLMs) can process lengthy documents even without supervised training on a task-specific dataset.
One feasible approach for tasks with lengthy, complex input is to first summarize the document and then apply supervised fine-tuning to the summary.
We present a method for processing the summaries of long documents that aims to capture different important aspects of the original document.
arXiv Detail & Related papers (2025-02-14T18:59:28Z)
- Learning Task Representations from In-Context Learning [73.72066284711462]
Large language models (LLMs) have demonstrated remarkable proficiency in in-context learning.
We introduce an automated formulation for encoding task information in ICL prompts as a function of attention heads.
We show that our method's effectiveness stems from aligning the distribution of the last hidden state with that of an optimally performing in-context-learned model.
arXiv Detail & Related papers (2025-02-08T00:16:44Z)
- Multi-Agent Sampling: Scaling Inference Compute for Data Synthesis with Tree Search-Based Agentic Collaboration [81.45763823762682]
This work aims to bridge the gap by investigating the problem of data synthesis through multi-agent sampling.
We introduce Tree Search-based Orchestrated Agents (TOA), where the workflow evolves iteratively during the sequential sampling process.
Our experiments on alignment, machine translation, and mathematical reasoning demonstrate that multi-agent sampling significantly outperforms single-agent sampling as inference compute scales.
arXiv Detail & Related papers (2024-12-22T15:16:44Z)
- Mining a Minimal Set of Behavioral Patterns using Incremental Evaluation [3.16536213610547]
Existing approaches to behavioral pattern mining suffer from two limitations.
First, they show limited scalability as incremental computation is incorporated only in the generation of pattern candidates.
Second, process analysis based on mined patterns shows limited effectiveness due to an overwhelmingly large number of patterns obtained in practical application scenarios.
arXiv Detail & Related papers (2024-02-05T11:41:37Z)
- A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis [128.0532113800092]
We present a mechanistic interpretation of Transformer-based LMs on arithmetic questions.
This provides insights into how information related to arithmetic is processed by LMs.
arXiv Detail & Related papers (2023-05-24T11:43:47Z)
- Clustering Object-Centric Event Logs [0.36748639131154304]
We propose a clustering-based approach to cluster similar objects in OCELs to simplify the obtained process models.
Our approach reduces the complexity of the process models and generates coherent subsets of objects which help the end-users gain insights into the process.
arXiv Detail & Related papers (2022-07-26T09:16:39Z)
- Process-BERT: A Framework for Representation Learning on Educational Process Data [68.8204255655161]
We propose a framework for learning representations of educational process data.
Our framework consists of a pre-training step that uses BERT-type objectives to learn representations from sequential process data.
We apply our framework to the 2019 Nation's Report Card Data Mining Competition dataset.
arXiv Detail & Related papers (2022-04-28T16:07:28Z)
- What Averages Do Not Tell -- Predicting Real Life Processes with Sequential Deep Learning [0.1376408511310322]
Process Mining concerns discovering insights on business processes from their execution data that are logged by systems.
Many Deep Learning techniques have been successfully adapted for predictive Process Mining that aims to predict process outcomes.
Traces in Process Mining are multimodal sequences and very differently structured than natural language sentences or images.
arXiv Detail & Related papers (2021-10-19T19:45:05Z)
- Combining Feature and Instance Attribution to Detect Artifacts [62.63504976810927]
We propose methods to facilitate identification of training data artifacts.
We show that this proposed training-feature attribution approach can be used to uncover artifacts in training data.
We execute a small user study to evaluate whether these methods are useful to NLP researchers in practice.
arXiv Detail & Related papers (2021-07-01T09:26:13Z)
- Process Discovery for Structured Program Synthesis [70.29027202357385]
A core task in process mining is process discovery which aims to learn an accurate process model from event log data.
In this paper, we propose to use (block-) structured programs directly as target process models.
We develop a novel bottom-up agglomerative approach to the discovery of such structured program process models.
arXiv Detail & Related papers (2020-08-13T10:33:10Z)
- ProcData: An R Package for Process Data Analysis [5.278929511653198]
The R package ProcData presented in this article is designed to provide tools for processing, describing, and analyzing process data.
Two feature extraction methods for process data are implemented in the package for compressing information in the irregular response processes into regular numeric vectors.
In addition, several response process generators and a real dataset of response processes of the climate control item in the 2012 Programme for International Student Assessment are included in the package.
arXiv Detail & Related papers (2020-06-09T05:44:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.