Efficient Conformance Checking using Approximate Alignment Computation
with Tandem Repeats
- URL: http://arxiv.org/abs/2004.01781v2
- Date: Sat, 26 Mar 2022 13:28:47 GMT
- Title: Efficient Conformance Checking using Approximate Alignment Computation
with Tandem Repeats
- Authors: Daniel Rei{\ss}ner, Abel Armas-Cervantes, Marcello La Rosa
- Abstract summary: Conformance checking aims to find and describe the differences between a process model capturing the expected process behavior and a corresponding event log recording the observed behavior.
Alignments are an established technique to compute the distance between a trace in the event log and the closest execution trace of a corresponding process model.
We propose a novel approximate technique that uses pre- and post-processing steps to compress the length of a trace and recompute the alignment cost.
- Score: 0.03222802562733786
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conformance checking encompasses a body of process mining techniques which
aim to find and describe the differences between a process model capturing the
expected process behavior and a corresponding event log recording the observed
behavior. Alignments are an established technique to compute the distance
between a trace in the event log and the closest execution trace of a
corresponding process model. Given a cost function, an alignment is optimal
when it contains the least number of mismatches between a log trace and a model
trace. Determining optimal alignments, however, is computationally expensive,
especially in light of the growing size and complexity of event logs from
practice, which can easily exceed one million events with traces of several
hundred activities. A common limitation of existing alignment techniques is the
inability to exploit repetitions in the log. By exploiting a specific form of
sequential pattern in traces, namely tandem repeats, we propose a novel
approximate technique that uses pre- and post-processing steps to compress the
length of a trace and recomputes the alignment cost while guaranteeing that the
cost result never under-approximates the optimal cost. In an extensive
empirical evaluation with 50 real-life model log pairs and against six
state-of-the-art alignment techniques, we show that the proposed compression
approach systematically outperforms the baselines by up to an order of
magnitude in the presence of traces with repetitions, and that the cost
over-approximation, when it occurs, is negligible.
Related papers
- Efficient Online Computation of Business Process State From Trace Prefixes via N-Gram Indexing [0.5141137421503899]
Given a process model and an event log containing trace prefixes of ongoing cases of a process, map each case to its corresponding state in the model.
An approach to this state computation problem is to perform a token-based replay of each trace prefix against the model.
This paper proposes a method that, given a trace prefix of an ongoing case, computes its state in constant time using an index that represents states as n-grams.
arXiv Detail & Related papers (2024-09-09T14:28:24Z) - A Scalable and Near-Optimal Conformance Checking Approach for Long Traces [3.3170150440851485]
Conformity checking, a key task in process mining, can become computationally infeasible due to the exponential complexity of finding an optimal alignment.
This paper introduces a novel sliding window approach to address these scalability challenges.
By breaking down traces into manageable subtraces and iteratively aligning each with the process model, our method significantly reduces the search space.
arXiv Detail & Related papers (2024-06-08T11:04:42Z) - Bisimulation Learning [55.859538562698496]
We compute finite bisimulations of state transition systems with large, possibly infinite state space.
Our technique yields faster verification results than alternative state-of-the-art tools in practice.
arXiv Detail & Related papers (2024-05-24T17:11:27Z) - Pipelined correlated minimum weight perfect matching of the surface code [56.01788646782563]
We describe a pipeline approach to decoding the surface code using minimum weight perfect matching.
An independent no-communication parallelizable processing stage reweights the graph according to likely correlations.
A later general stage finishes the matching.
We validate the new algorithm on the fully fault-tolerant toric, unrotated, and rotated surface codes.
arXiv Detail & Related papers (2022-05-19T19:58:02Z) - FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality
Assessment [93.09267863425492]
We argue that understanding both high-level semantics and internal temporal structures of actions in competitive sports videos is the key to making predictions accurate and interpretable.
We construct a new fine-grained dataset, called FineDiving, developed on diverse diving events with detailed annotations on action procedures.
arXiv Detail & Related papers (2022-04-07T17:59:32Z) - Conformance Checking Over Stochastically Known Logs [7.882975068446842]
Data logs may become uncertain due to, e.g., sensor reading inaccuracies or incorrect interpretation of readings by processing programs.
In this work we focus on conformance checking, which compares a process model with an event log.
We mathematically define a trace model, a synchronous product, and a cost function that reflects the uncertainty of events in a log.
arXiv Detail & Related papers (2022-03-14T21:33:06Z) - CoCoMoT: Conformance Checking of Multi-Perspective Processes via SMT
(Extended Version) [62.96267257163426]
We introduce the CoCoMoT (Computing Conformance Modulo Theories) framework.
First, we show how SAT-based encodings studied in the pure control-flow setting can be lifted to our data-aware case.
Second, we introduce a novel preprocessing technique based on a notion of property-preserving clustering.
arXiv Detail & Related papers (2021-03-18T20:22:50Z) - Adaptive Sampling for Best Policy Identification in Markov Decision
Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z) - Process Discovery for Structured Program Synthesis [70.29027202357385]
A core task in process mining is process discovery which aims to learn an accurate process model from event log data.
In this paper, we propose to use (block-) structured programs directly as target process models.
We develop a novel bottom-up agglomerative approach to the discovery of such structured program process models.
arXiv Detail & Related papers (2020-08-13T10:33:10Z) - An Entropic Relevance Measure for Stochastic Conformance Checking in
Process Mining [9.302180124254338]
We present an entropic relevance measure for conformance checking, computed as the average number of bits required to compress each of the log's traces.
We show that entropic relevance is computable in time linear in the size of the log, and provide evaluation outcomes that demonstrate the feasibility of using the new approach in industrial settings.
arXiv Detail & Related papers (2020-07-18T02:25:33Z) - Partial Order Resolution of Event Logs for Process Conformance Checking [10.58705988536919]
A key assumption of existing conformance checking techniques is that all events are associated with timestamps that allow to infer a total order of events per process instance.
We present several estimators for this task, incorporating different notions of behavioral abstraction.
Our experiments with real-world and synthetic data reveal that our approach improves accuracy over the state-of-the-art considerably.
arXiv Detail & Related papers (2020-07-05T18:43:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.