Stochastic Directly-Follows Process Discovery Using Grammatical Inference
- URL: http://arxiv.org/abs/2312.05433v2
- Date: Tue, 16 Apr 2024 07:29:16 GMT
- Title: Stochastic Directly-Follows Process Discovery Using Grammatical Inference
- Authors: Hanan Alkhammash, Artem Polyvyanyy, Alistair Moffat,
- Abstract summary: We propose a new approach for discovering sound Directly-Follows Graphs that is grounded in grammatical inference over the input traces.
Experiments over real-world datasets confirm that our new approach can construct smaller models that represent the input traces and their frequencies more accurately than the state-of-the-art technique.
- Score: 8.196011179587304
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Starting with a collection of traces generated by process executions, process discovery is the task of constructing a simple model that describes the process, where simplicity is often measured in terms of model size. The challenge of process discovery is that the process of interest is unknown, and that while the input traces constitute positive examples of process executions, no negative examples are available. Many commercial tools discover Directly-Follows Graphs, in which nodes represent the observable actions of the process, and directed arcs indicate execution order possibilities over the actions. We propose a new approach for discovering sound Directly-Follows Graphs that is grounded in grammatical inference over the input traces. To promote the discovery of small graphs that also describe the process accurately we design and evaluate a genetic algorithm that supports the convergence of the inference parameters to the areas that lead to the discovery of interesting models. Experiments over real-world datasets confirm that our new approach can construct smaller models that represent the input traces and their frequencies more accurately than the state-of-the-art technique. Reasoning over the frequencies of encoded traces also becomes possible, due to the stochastic semantics of the action graphs we propose, which, for the first time, are interpreted as models that describe the stochastic languages of action traces.
Related papers
- Mining a Minimal Set of Behavioral Patterns using Incremental Evaluation [3.16536213610547]
Existing approaches to behavioral pattern mining suffer from two limitations.
First, they show limited scalability as incremental computation is incorporated only in the generation of pattern candidates.
Second, process analysis based on mined patterns shows limited effectiveness due to an overwhelmingly large number of patterns obtained in practical application scenarios.
arXiv Detail & Related papers (2024-02-05T11:41:37Z) - Learning minimal representations of stochastic processes with
variational autoencoders [52.99137594502433]
We introduce an unsupervised machine learning approach to determine the minimal set of parameters required to describe a process.
Our approach enables for the autonomous discovery of unknown parameters describing processes.
arXiv Detail & Related papers (2023-07-21T14:25:06Z) - Similarity-aware Positive Instance Sampling for Graph Contrastive
Pre-training [82.68805025636165]
We propose to select positive graph instances directly from existing graphs in the training set.
Our selection is based on certain domain-specific pair-wise similarity measurements.
Besides, we develop an adaptive node-level pre-training method to dynamically mask nodes to distribute them evenly in the graph.
arXiv Detail & Related papers (2022-06-23T20:12:51Z) - Score matching enables causal discovery of nonlinear additive noise
models [63.93669924730725]
We show how to design a new generation of scalable causal discovery methods.
We propose a new efficient method for approximating the score's Jacobian, enabling to recover the causal graph.
arXiv Detail & Related papers (2022-03-08T21:34:46Z) - Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective of graph contrastive learning methods showing random augmentations leads to encoders.
Our proposed method represents each node by a distribution in the latent space in contrast to existing techniques which embed each node to a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-12-15T01:45:32Z) - Process Discovery Using Graph Neural Networks [2.6381163133447836]
We introduce a technique for training an ML-based model D using graphal neural networks.
D translates a given input event log into a sound Petri net.
We show that training D on synthetically generated pairs of input logs and output models allows D to translate previously unseen synthetic and several real-life event logs into sound.
arXiv Detail & Related papers (2021-09-13T10:04:34Z) - On Contrastive Representations of Stochastic Processes [53.21653429290478]
Learning representations of processes is an emerging problem in machine learning.
We show that our methods are effective for learning representations of periodic functions, 3D objects and dynamical processes.
arXiv Detail & Related papers (2021-06-18T11:00:24Z) - Process Discovery for Structured Program Synthesis [70.29027202357385]
A core task in process mining is process discovery which aims to learn an accurate process model from event log data.
In this paper, we propose to use (block-) structured programs directly as target process models.
We develop a novel bottom-up agglomerative approach to the discovery of such structured program process models.
arXiv Detail & Related papers (2020-08-13T10:33:10Z) - An Entropic Relevance Measure for Stochastic Conformance Checking in
Process Mining [9.302180124254338]
We present an entropic relevance measure for conformance checking, computed as the average number of bits required to compress each of the log's traces.
We show that entropic relevance is computable in time linear in the size of the log, and provide evaluation outcomes that demonstrate the feasibility of using the new approach in industrial settings.
arXiv Detail & Related papers (2020-07-18T02:25:33Z) - Adversarial System Variant Approximation to Quantify Process Model
Generalization [2.538209532048867]
In process mining, process models are extracted from event logs and are commonly assessed using multiple quality dimensions.
A novel deep learning-based methodology called Adversarial System Variant Approximation (AVATAR) is proposed to overcome this issue.
arXiv Detail & Related papers (2020-03-26T22:06:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.