Related papers: Efficient Conformance Checking using Approximate Alignment Computation with Tandem Repeats

Efficient Conformance Checking using Approximate Alignment Computation with Tandem Repeats

URL: http://arxiv.org/abs/2004.01781v2
Date: Sat, 26 Mar 2022 13:28:47 GMT
Title: Efficient Conformance Checking using Approximate Alignment Computation with Tandem Repeats
Authors: Daniel Rei{\ss}ner, Abel Armas-Cervantes, Marcello La Rosa
Abstract summary: Conformance checking aims to find and describe the differences between a process model capturing the expected process behavior and a corresponding event log recording the observed behavior. Alignments are an established technique to compute the distance between a trace in the event log and the closest execution trace of a corresponding process model. We propose a novel approximate technique that uses pre- and post-processing steps to compress the length of a trace and recompute the alignment cost.
Score: 0.03222802562733786
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Conformance checking encompasses a body of process mining techniques which aim to find and describe the differences between a process model capturing the expected process behavior and a corresponding event log recording the observed behavior. Alignments are an established technique to compute the distance between a trace in the event log and the closest execution trace of a corresponding process model. Given a cost function, an alignment is optimal when it contains the least number of mismatches between a log trace and a model trace. Determining optimal alignments, however, is computationally expensive, especially in light of the growing size and complexity of event logs from practice, which can easily exceed one million events with traces of several hundred activities. A common limitation of existing alignment techniques is the inability to exploit repetitions in the log. By exploiting a specific form of sequential pattern in traces, namely tandem repeats, we propose a novel approximate technique that uses pre- and post-processing steps to compress the length of a trace and recomputes the alignment cost while guaranteeing that the cost result never under-approximates the optimal cost. In an extensive empirical evaluation with 50 real-life model log pairs and against six state-of-the-art alignment techniques, we show that the proposed compression approach systematically outperforms the baselines by up to an order of magnitude in the presence of traces with repetitions, and that the cost over-approximation, when it occurs, is negligible.

Related papers

FoldA: Computing Partial-Order Alignments Using Directed Net Unfoldings [0.6906005491572401]
This paper proposes a new technique for computing partial-order alignments on the fly using directed Petri net unfoldings, named FoldA.<n>We evaluate our technique on 485 synthetic model-log pairs and compare it against Astar- and Dijkstra-alignments on 13 real-life model-log pairs and 6 benchmark pairs.
arXiv Detail & Related papers (2025-06-10T09:44:05Z)
Fractured Chain-of-Thought Reasoning [61.647243580650446]
We introduce Fractured Sampling, a unified inference-time strategy that interpolates between full CoT and solution-only sampling.<n>We show that Fractured Sampling consistently achieves superior accuracy-cost trade-offs, yielding steep log-linear scaling gains in Pass@k versus token budget.
arXiv Detail & Related papers (2025-05-19T11:30:41Z)
Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling [90.86991492288487]
evaluating constraint on every token can be prohibitively expensive. LCD can distort the global distribution over strings, sampling tokens based only on local information. We show that our approach is superior to state-of-the-art baselines.
arXiv Detail & Related papers (2025-04-07T18:30:18Z)
Conformance Checking for Less: Efficient Conformance Checking for Long Event Sequences [3.3170150440851485]
ConLES is a sliding-window conformance checking approach for long event sequences.<n>It partitions traces into manageable subtraces and aligns each against the expected behavior.<n>We use global information that captures structural properties of both the trace and the process model.
arXiv Detail & Related papers (2025-03-09T16:42:59Z)
Efficient Online Computation of Business Process State From Trace Prefixes via N-Gram Indexing [0.5141137421503899]
Given a process model and an event log containing trace prefixes of ongoing cases of a process, map each case to its corresponding state in the model. An approach to this state computation problem is to perform a token-based replay of each trace prefix against the model. This paper proposes a method that, given a trace prefix of an ongoing case, computes its state in constant time using an index that represents states as n-grams.
arXiv Detail & Related papers (2024-09-09T14:28:24Z)
A Scalable and Near-Optimal Conformance Checking Approach for Long Traces [3.3170150440851485]
Conformity checking, a key task in process mining, can become computationally infeasible due to the exponential complexity of finding an optimal alignment. This paper introduces a novel sliding window approach to address these scalability challenges. By breaking down traces into manageable subtraces and iteratively aligning each with the process model, our method significantly reduces the search space.
arXiv Detail & Related papers (2024-06-08T11:04:42Z)
Bisimulation Learning [55.859538562698496]
We compute finite bisimulations of state transition systems with large, possibly infinite state space. Our technique yields faster verification results than alternative state-of-the-art tools in practice.
arXiv Detail & Related papers (2024-05-24T17:11:27Z)
An Efficient Rehearsal Scheme for Catastrophic Forgetting Mitigation during Multi-stage Fine-tuning [55.467047686093025]
A common approach to alleviate such forgetting is to rehearse samples from prior tasks during fine-tuning. We propose a sampling scheme, textttbf mix-cd, that prioritizes rehearsal of collateral damage'' samples. Our approach is computationally efficient, easy to implement, and outperforms several leading continual learning methods in compute-constrained settings.
arXiv Detail & Related papers (2024-02-12T22:32:12Z)
Pipelined correlated minimum weight perfect matching of the surface code [56.01788646782563]
We describe a pipeline approach to decoding the surface code using minimum weight perfect matching. An independent no-communication parallelizable processing stage reweights the graph according to likely correlations. A later general stage finishes the matching. We validate the new algorithm on the fully fault-tolerant toric, unrotated, and rotated surface codes.
arXiv Detail & Related papers (2022-05-19T19:58:02Z)
FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality Assessment [93.09267863425492]
We argue that understanding both high-level semantics and internal temporal structures of actions in competitive sports videos is the key to making predictions accurate and interpretable. We construct a new fine-grained dataset, called FineDiving, developed on diverse diving events with detailed annotations on action procedures.
arXiv Detail & Related papers (2022-04-07T17:59:32Z)
Conformance Checking Over Stochastically Known Logs [7.882975068446842]
Data logs may become uncertain due to, e.g., sensor reading inaccuracies or incorrect interpretation of readings by processing programs. In this work we focus on conformance checking, which compares a process model with an event log. We mathematically define a trace model, a synchronous product, and a cost function that reflects the uncertainty of events in a log.
arXiv Detail & Related papers (2022-03-14T21:33:06Z)
CoCoMoT: Conformance Checking of Multi-Perspective Processes via SMT (Extended Version) [62.96267257163426]
We introduce the CoCoMoT (Computing Conformance Modulo Theories) framework. First, we show how SAT-based encodings studied in the pure control-flow setting can be lifted to our data-aware case. Second, we introduce a novel preprocessing technique based on a notion of property-preserving clustering.
arXiv Detail & Related papers (2021-03-18T20:22:50Z)
Adaptive Sampling for Best Policy Identification in Markov Decision Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision (MDPs) when the learner has access to a generative model. The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z)
Process Discovery for Structured Program Synthesis [70.29027202357385]
A core task in process mining is process discovery which aims to learn an accurate process model from event log data. In this paper, we propose to use (block-) structured programs directly as target process models. We develop a novel bottom-up agglomerative approach to the discovery of such structured program process models.
arXiv Detail & Related papers (2020-08-13T10:33:10Z)
An Entropic Relevance Measure for Stochastic Conformance Checking in Process Mining [9.302180124254338]
We present an entropic relevance measure for conformance checking, computed as the average number of bits required to compress each of the log's traces. We show that entropic relevance is computable in time linear in the size of the log, and provide evaluation outcomes that demonstrate the feasibility of using the new approach in industrial settings.
arXiv Detail & Related papers (2020-07-18T02:25:33Z)
Partial Order Resolution of Event Logs for Process Conformance Checking [10.58705988536919]
A key assumption of existing conformance checking techniques is that all events are associated with timestamps that allow to infer a total order of events per process instance. We present several estimators for this task, incorporating different notions of behavioral abstraction. Our experiments with real-world and synthetic data reveal that our approach improves accuracy over the state-of-the-art considerably.
arXiv Detail & Related papers (2020-07-05T18:43:57Z)

This list is automatically generated from the titles and abstracts of the papers in this site.