Related papers: Efficient Conformance Checking of Rich Data-Aware Declare Specifications (Extended)

Efficient Conformance Checking of Rich Data-Aware Declare Specifications (Extended)

URL: http://arxiv.org/abs/2507.00094v1
Date: Mon, 30 Jun 2025 10:16:21 GMT
Title: Efficient Conformance Checking of Rich Data-Aware Declare Specifications (Extended)
Authors: Jacobo Casas-Ramos, Sarah Winkler, Alessandro Gianola, Marco Montali, Manuel Mucientes, Manuel Lama,
Abstract summary: We show that it is possible to compute data-aware optimal alignments in a rich setting with general data types and data conditions.<n>This is achieved by carefully combining the two best-known approaches to deal with control flow and data dependencies.
Score: 49.46686813437884
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Despite growing interest in process analysis and mining for data-aware specifications, alignment-based conformance checking for declarative process models has focused on pure control-flow specifications, or mild data-aware extensions limited to numerical data and variable-to-constant comparisons. This is not surprising: finding alignments is computationally hard, even more so in the presence of data dependencies. In this paper, we challenge this problem in the case where the reference model is captured using data-aware Declare with general data types and data conditions. We show that, unexpectedly, it is possible to compute data-aware optimal alignments in this rich setting, enjoying at once efficiency and expressiveness. This is achieved by carefully combining the two best-known approaches to deal with control flow and data dependencies when computing alignments, namely A* search and SMT solving. Specifically, we introduce a novel algorithmic technique that efficiently explores the search space, generating descendant states through the application of repair actions aiming at incrementally resolving constraint violations. We prove the correctness of our algorithm and experimentally show its efficiency. The evaluation witnesses that our approach matches or surpasses the performance of the state of the art while also supporting significantly more expressive data dependencies, showcasing its potential to support real-world applications.

Related papers

Conformance Checking for Less: Efficient Conformance Checking for Long Event Sequences [3.3170150440851485]
ConLES is a sliding-window conformance checking approach for long event sequences.<n>It partitions traces into manageable subtraces and aligns each against the expected behavior.<n>We use global information that captures structural properties of both the trace and the process model.
arXiv Detail & Related papers (2025-03-09T16:42:59Z)
Conformal Validity Guarantees Exist for Any Data Distribution (and How to Find Them) [14.396431159723297]
We show that conformal prediction can theoretically be extended to textitany joint data distribution. Although the most general case is exceedingly impractical to compute, for concrete practical applications we outline a procedure for deriving specific conformal algorithms.
arXiv Detail & Related papers (2024-05-10T17:40:24Z)
$\ exttt{causalAssembly}$: Generating Realistic Production Data for Benchmarking Causal Discovery [1.3048920509133808]
We build a system for generation of semisynthetic manufacturing data that supports benchmarking of causal discovery methods. We employ distributional random forests to flexibly estimate and represent conditional distributions. Using the library, we showcase how to benchmark several well-known causal discovery algorithms.
arXiv Detail & Related papers (2023-06-19T10:05:54Z)
LAVA: Data Valuation without Pre-Specified Learning Algorithms [20.578106028270607]
We introduce a new framework that can value training data in a way that is oblivious to the downstream learning algorithm. We develop a proxy for the validation performance associated with a training set based on a non-conventional class-wise Wasserstein distance between training and validation sets. We show that the distance characterizes the upper bound of the validation performance for any given model under certain Lipschitz conditions.
arXiv Detail & Related papers (2023-04-28T19:05:16Z)
Automatic Data Augmentation via Invariance-Constrained Learning [94.27081585149836]
Underlying data structures are often exploited to improve the solution of learning tasks. Data augmentation induces these symmetries during training by applying multiple transformations to the input data. This work tackles these issues by automatically adapting the data augmentation while solving the learning task.
arXiv Detail & Related papers (2022-09-29T18:11:01Z)
Soundness of Data-Aware Processes with Arithmetic Conditions [8.914271888521652]
Data Petri nets (DPNs) have gained increasing popularity thanks to their ability to balance simplicity with expressiveness. The interplay of data and control-flow makes checking the correctness of such models, specifically the well-known property of soundness, crucial and challenging. We provide a framework for assessing soundness of DPNs enriched with arithmetic data conditions.
arXiv Detail & Related papers (2022-03-28T14:46:10Z)
DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation [57.358212277226315]
In imitation learning from observation IfO, a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior without access to the control signals generated by the demonstrator. Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms. This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk. We propose a more data-efficient IfO algorithm
arXiv Detail & Related papers (2021-03-31T23:46:32Z)
Correlation-wise Smoothing: Lightweight Knowledge Extraction for HPC Monitoring Data [1.802439717192088]
We propose a novel method, called Correlation-wise Smoothing (CS), to extract descriptive signatures from time-series monitoring data. Our method exploits correlations between data dimensions to form groups and produces image-like signatures that can be easily manipulated, visualized and compared. We evaluate the CS method on HPC-ODA, a collection of datasets that we release with this work, and show that it leads to the same performance as most state-of-the-art methods.
arXiv Detail & Related papers (2020-10-13T05:22:47Z)
Provably Efficient Causal Reinforcement Learning with Confounded Observational Data [135.64775986546505]
We study how to incorporate the dataset (observational data) collected offline, which is often abundantly available in practice, to improve the sample efficiency in the online setting. We propose the deconfounded optimistic value iteration (DOVI) algorithm, which incorporates the confounded observational data in a provably efficient manner.
arXiv Detail & Related papers (2020-06-22T14:49:33Z)
How Training Data Impacts Performance in Learning-based Control [67.7875109298865]
This paper derives an analytical relationship between the density of the training data and the control performance. We formulate a quality measure for the data set, which we refer to as $rho$-gap. We show how the $rho$-gap can be applied to a feedback linearizing control law.
arXiv Detail & Related papers (2020-05-25T12:13:49Z)
New advances in enumerative biclustering algorithms with online partitioning [80.22629846165306]
This paper further extends RIn-Close_CVC, a biclustering algorithm capable of performing an efficient, complete, correct and non-redundant enumeration of maximal biclusters with constant values on columns in numerical datasets. The improved algorithm is called RIn-Close_CVC3, keeps those attractive properties of RIn-Close_CVC, and is characterized by: a drastic reduction in memory usage; a consistent gain in runtime.
arXiv Detail & Related papers (2020-03-07T14:54:26Z)

This list is automatically generated from the titles and abstracts of the papers in this site.