Trace Encoding in Process Mining: a survey and benchmarking
- URL: http://arxiv.org/abs/2301.02167v1
- Date: Thu, 5 Jan 2023 17:25:30 GMT
- Title: Trace Encoding in Process Mining: a survey and benchmarking
- Authors: Sylvio Barbon Jr., Paolo Ceravolo, Rafael S. Oyamada, Gabriel M. Tavares
- Abstract summary: Encoding methods are employed across several process mining tasks, including predictive process monitoring, anomalous case detection, trace clustering, etc.
Most papers choose existing encoding methods arbitrarily or employ a strategy based on a specific expert knowledge domain.
This work aims at providing a comprehensive survey on event log encoding by comparing 27 methods.
- Score: 0.34410212782758054
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Encoding methods are employed across several process mining tasks, including
predictive process monitoring, anomalous case detection, trace clustering, etc.
These methods are usually performed as preprocessing steps and are responsible
for transforming complex information into a numerical feature space. Most
papers choose existing encoding methods arbitrarily or employ a strategy based
on a specific expert knowledge domain. Moreover, existing methods are employed
by using their default hyperparameters without evaluating other options. This
practice can lead to several drawbacks, such as suboptimal performance and
unfair comparisons with the state-of-the-art. Therefore, this work aims at
providing a comprehensive survey on event log encoding by comparing 27 methods
of different natures in terms of expressivity, scalability, correlation, and
domain agnosticism. To the best of our knowledge, this is the most
comprehensive study so far focusing on trace encoding in process mining. It
contributes to maturing awareness about the role of trace encoding in process
mining pipelines and sheds light on issues, concerns, and future research
directions regarding the use of encoding methods to bridge the gap between
machine learning models and process mining.
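As an illustration of what "transforming complex information into a numerical feature space" can mean, here is a minimal sketch of a count-based (frequency) trace encoding. It is an assumption chosen for simplicity and is not claimed to reproduce any specific method benchmarked in the survey; the function name `count_encode` and the toy log are hypothetical.

```python
from collections import Counter


def count_encode(traces):
    """Map each trace (a list of activity labels) to a vector of activity counts."""
    # Build a fixed, sorted vocabulary so every vector uses the same column order.
    vocabulary = sorted({activity for trace in traces for activity in trace})
    vectors = []
    for trace in traces:
        counts = Counter(trace)
        # Counter returns 0 for activities that do not occur in this trace.
        vectors.append([counts[activity] for activity in vocabulary])
    return vocabulary, vectors


# Hypothetical toy event log: each inner list is one case's ordered activities.
toy_log = [
    ["register", "check", "approve"],
    ["register", "check", "check", "reject"],
]

vocabulary, vectors = count_encode(toy_log)
print(vocabulary)
print(vectors)
```

Each trace becomes one fixed-length row, which is the form most machine learning models expect. Note that this particular encoding discards the order of activities, one example of the expressivity trade-offs the abstract refers to.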
Related papers
- Boosting CNN-based Handwriting Recognition Systems with Learnable Relaxation Labeling [48.78361527873024]
We propose a novel approach to handwriting recognition that integrates the strengths of two distinct methodologies.
We introduce a sparsification technique that accelerates the convergence of the algorithm and enhances the overall system's performance.
arXiv Detail & Related papers (2024-09-09T15:12:28Z)
- A Scalable and Near-Optimal Conformance Checking Approach for Long Traces [3.3170150440851485]
Conformance checking, a key task in process mining, can become computationally infeasible due to the exponential complexity of finding an optimal alignment.
This paper introduces a novel sliding window approach to address these scalability challenges.
By breaking down traces into manageable subtraces and iteratively aligning each with the process model, our method significantly reduces the search space.
arXiv Detail & Related papers (2024-06-08T11:04:42Z)
- Process Variant Analysis Across Continuous Features: A Novel Framework [0.0]
This research addresses the challenge of effectively segmenting cases within operational processes.
We present a novel approach employing a sliding window technique combined with the earth mover's distance to detect changes in control flow behavior.
We validate our methodology through a real-life case study in collaboration with UWV, the Dutch employee insurance agency.
arXiv Detail & Related papers (2024-05-06T16:10:13Z)
- A Thorough Examination of Decoding Methods in the Era of LLMs [72.65956436513241]
Decoding methods play an indispensable role in converting language models from next-token predictors into practical task solvers.
This paper provides a comprehensive and multifaceted analysis of various decoding methods within the context of large language models.
Our findings reveal that decoding method performance is notably task-dependent and influenced by factors such as alignment, model size, and quantization.
arXiv Detail & Related papers (2024-02-10T11:14:53Z)
- Discovering Hierarchical Process Models: an Approach Based on Events Clustering [0.0]
We present an algorithm for discovering hierarchical process models represented as two-level workflow nets.
Unlike existing solutions, our algorithm does not impose restrictions on the process control flow and allows for iteration.
arXiv Detail & Related papers (2023-03-12T11:05:40Z)
- Feature Recommendation for Structural Equation Model Discovery in Process Mining [0.0]
We propose a method for finding the set of (aggregated) features with a possible effect on the problem.
We have implemented the proposed method as a plugin in ProM and we have evaluated it using two real and synthetic event logs.
arXiv Detail & Related papers (2021-08-13T12:23:01Z)
- CoCoMoT: Conformance Checking of Multi-Perspective Processes via SMT (Extended Version) [62.96267257163426]
We introduce the CoCoMoT (Computing Conformance Modulo Theories) framework.
First, we show how SAT-based encodings studied in the pure control-flow setting can be lifted to our data-aware case.
Second, we introduce a novel preprocessing technique based on a notion of property-preserving clustering.
arXiv Detail & Related papers (2021-03-18T20:22:50Z)
- Process Comparison Using Object-Centric Process Cubes [69.68068088508505]
In real-life business processes, different behaviors exist that make the overall process too complex to interpret.
Process comparison is a branch of process mining that isolates different behaviors of the process from each other by using process cubes.
We propose a process cube framework, which supports process cube operations such as slice and dice on object-centric event logs.
arXiv Detail & Related papers (2021-03-12T10:08:28Z)
- Process Discovery for Structured Program Synthesis [70.29027202357385]
A core task in process mining is process discovery which aims to learn an accurate process model from event log data.
In this paper, we propose to use (block-) structured programs directly as target process models.
We develop a novel bottom-up agglomerative approach to the discovery of such structured program process models.
arXiv Detail & Related papers (2020-08-13T10:33:10Z)
- A Transformer-based Approach for Source Code Summarization [86.08359401867577]
We learn code representation for summarization by modeling the pairwise relationship between code tokens.
We show that, despite its simplicity, the approach outperforms state-of-the-art techniques by a significant margin.
arXiv Detail & Related papers (2020-05-01T23:29:36Z)