Bootstrapping Generalization of Process Models Discovered From Event Data
- URL: http://arxiv.org/abs/2107.03876v1
- Date: Thu, 8 Jul 2021 14:35:56 GMT
- Authors: Artem Polyvyanyy, Alistair Moffat, Luciano García-Bañuelos
- Abstract summary: Generalization seeks to quantify how well a discovered model describes future executions of the system.
We employ a bootstrap approach to estimate properties of a population based on a sample.
Experiments demonstrate the feasibility of the approach in industrial settings.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Process mining studies ways to derive value from process executions recorded
in event logs of IT-systems, with process discovery the task of inferring a
process model for an event log emitted by some unknown system. One quality
criterion for discovered process models is generalization. Generalization seeks
to quantify how well the discovered model describes future executions of the
system, and is perhaps the least understood quality criterion in process
mining. The lack of understanding is primarily a consequence of generalization
seeking to measure properties over the entire future behavior of the system,
when the only available sample of behavior is that provided by the event log
itself. In this paper, we draw inspiration from computational statistics, and
employ a bootstrap approach to estimate properties of a population based on a
sample. Specifically, we define an estimator of the model's generalization
based on the event log it was discovered from, and then use bootstrapping to
measure the generalization of the model with respect to the system, and its
statistical significance. Experiments demonstrate the feasibility of the
approach in industrial settings.
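
The bootstrap idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the estimator defined in the paper: it is a plain nonparametric bootstrap, assuming a toy setting in which an event log is a list of traces, a "model" is simply the set of traces it accepts, and the statistic of interest is the fraction of traces the model accepts. The names `accepted_fraction` and `bootstrap_estimate`, the example log, and all parameter values are hypothetical.

```python
import random
from statistics import mean


def accepted_fraction(model, log):
    """Fraction of traces in `log` accepted by the toy model (a set of traces)."""
    return sum(trace in model for trace in log) / len(log)


def bootstrap_estimate(model, log, replicates=1000, alpha=0.05, seed=0):
    """Resample the log with replacement and summarize the statistic.

    Returns the mean over all replicates and a percentile interval.
    Assumption: a plain nonparametric bootstrap, not the paper's procedure.
    """
    rng = random.Random(seed)
    n = len(log)
    # Each replicate is a log of the same size drawn with replacement.
    stats = sorted(
        accepted_fraction(model, rng.choices(log, k=n))
        for _ in range(replicates)
    )
    low = stats[int((alpha / 2) * replicates)]
    high = stats[min(replicates - 1, int((1 - alpha / 2) * replicates))]
    return mean(stats), (low, high)


# Hypothetical event log: ten traces over activities a, b, c, d.
log = [("a", "b", "c")] * 6 + [("a", "c", "b")] * 3 + [("a", "d")]
# Hypothetical model: accepts two of the three observed trace variants.
model = {("a", "b", "c"), ("a", "c", "b")}

estimate, (low, high) = bootstrap_estimate(model, log)
print(f"estimate={estimate:.3f}, interval=({low:.2f}, {high:.2f})")
```

The fixed seed makes the resampling repeatable. The width of the interval gives a rough sense of how much the statistic could vary across logs drawn from the same unknown system, which is the kind of question the abstract raises.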
Related papers
- Discovery and Simulation of Data-Aware Business Processes [0.28675177318965045]
This paper introduces a data-aware BPS modeling approach and a method to discover data-aware BPS models from event logs.
The resulting BPS models more closely replicate the process execution control flow relative to data-unaware BPS models.
arXiv Detail & Related papers (2024-08-24T20:13:00Z)
- Detecting Anomalous Events in Object-centric Business Processes via Graph Neural Networks [55.583478485027]
This study proposes a novel framework for anomaly detection in business processes.
We first reconstruct the process dependencies of the object-centric event logs as attributed graphs.
We then employ a graph convolutional autoencoder architecture to detect anomalous events.
arXiv Detail & Related papers (2024-02-14T14:17:56Z)
- Generalizing Backpropagation for Gradient-Based Interpretability [103.2998254573497]
We show that the gradient of a model is a special case of a more general formulation using semirings.
This observation allows us to generalize the backpropagation algorithm to efficiently compute other interpretable statistics.
arXiv Detail & Related papers (2023-07-06T15:19:53Z)
- SimSCOOD: Systematic Analysis of Out-of-Distribution Generalization in Fine-tuned Source Code Models [58.78043959556283]
We study the behaviors of models under different fine-tuning methodologies, including full fine-tuning and Low-Rank Adaptation (LoRA) fine-tuning methods.
Our analysis uncovers that LoRA fine-tuning consistently exhibits significantly better OOD generalization performance than full fine-tuning across various scenarios.
arXiv Detail & Related papers (2022-10-10T16:07:24Z)
- Generalization Properties of Retrieval-based Models [50.35325326050263]
Retrieval-based machine learning methods have enjoyed success on a wide range of problems.
Despite growing literature showcasing the promise of these models, the theoretical underpinning for such models remains underexplored.
We present a formal treatment of retrieval-based models to characterize their generalization ability.
arXiv Detail & Related papers (2022-10-06T00:33:01Z)
- Generating Hidden Markov Models from Process Models Through Nonnegative Tensor Factorization [0.0]
We introduce a novel mathematically sound method that integrates theoretical process models with interrelated minimal Hidden Markov Models.
Our method consolidates: (a) theoretical process models, (b) HMMs, (c) coupled nonnegative matrix-tensor factorizations, and (d) custom model selection.
arXiv Detail & Related papers (2022-10-03T16:19:27Z)
- Accessing and Interpreting OPC UA Event Traces based on Semantic Process Descriptions [69.9674326582747]
This paper proposes an approach to access a production system's event data based on the event data's context.
The approach extracts filtered event logs from a database system by combining: 1) a semantic model of a production system's hierarchical structure, 2) a formalized process description and 3) an OPC UA information model.
arXiv Detail & Related papers (2022-07-25T15:13:44Z)
- Generalization in Automated Process Discovery: A Framework based on Event Log Patterns [0.03222802562733786]
Existing generalization measures exhibit several shortcomings that severely hinder their applicability in practice.
We propose a framework that generalizes a set of patterns discovered from an event log with representative traces.
We show that our measure can be efficiently computed for datasets two orders of magnitude larger than the largest dataset the baseline generalization measures can handle.
arXiv Detail & Related papers (2022-03-26T13:49:11Z)
- On the Performance Analysis of the Adversarial System Variant Approximation Method to Quantify Process Model Generalization [0.0]
This paper experimentally investigates the performance of Adversarial System Variant Approximation under non-ideal conditions.
The results confirm the need to raise awareness about the working conditions of the method.
arXiv Detail & Related papers (2021-07-13T18:27:09Z)
- How Faithful is your Synthetic Data? Sample-level Metrics for Evaluating and Auditing Generative Models [95.8037674226622]
We introduce a 3-dimensional evaluation metric that characterizes the fidelity, diversity and generalization performance of any generative model in a domain-agnostic fashion.
Our metric unifies statistical divergence measures with precision-recall analysis, enabling sample- and distribution-level diagnoses of model fidelity and diversity.
arXiv Detail & Related papers (2021-02-17T18:25:30Z)
- Adversarial System Variant Approximation to Quantify Process Model Generalization [2.538209532048867]
In process mining, process models are extracted from event logs and are commonly assessed using multiple quality dimensions.
A novel deep learning-based methodology called Adversarial System Variant Approximation (AVATAR) is proposed to overcome this issue.
arXiv Detail & Related papers (2020-03-26T22:06:18Z)