SHAining on Process Mining: Explaining Event Log Characteristics Impact on Algorithms
- URL: http://arxiv.org/abs/2509.08482v1
- Date: Wed, 10 Sep 2025 10:47:51 GMT
- Title: SHAining on Process Mining: Explaining Event Log Characteristics Impact on Algorithms
- Authors: Andrea Maldonado, Christian M. M. Frey, Sai Anirudh Aryasomayajula, Ludwig Zellner, Stephan A. Fahrenkrog-Petersen, Thomas Seidl
- Abstract summary: We introduce SHAining, the first approach to quantify the marginal contribution of varying event log characteristics to process mining algorithms' metrics. We analyze over 22,000 event logs covering a wide span of characteristics to uncover which affect algorithms across metrics (e.g., fitness, precision, complexity) the most.
- Score: 5.092742009996173
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Process mining aims to extract and analyze insights from event logs, yet algorithm metric results vary widely depending on structural event log characteristics. Existing work often evaluates algorithms on a fixed set of real-world event logs but lacks a systematic analysis of how event log characteristics impact algorithms individually. Moreover, since event logs are generated from processes, where characteristics co-occur, we focus on associational rather than causal effects to assess how strongly each overlapping individual characteristic affects evaluation metrics without assuming isolated causal effects, a factor often neglected by prior work. We introduce SHAining, the first approach to quantify the marginal contribution of varying event log characteristics to process mining algorithms' metrics. Using process discovery as a downstream task, we analyze over 22,000 event logs covering a wide span of characteristics to uncover which affect algorithms across metrics (e.g., fitness, precision, complexity) the most. Furthermore, we offer novel insights about how the value of event log characteristics correlates with their contributed impact, assessing the algorithm's robustness.
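The phrase "marginal contribution" and the name SHAining suggest a Shapley-value-style attribution. Assuming that reading, the following is a minimal sketch of exact Shapley values on a toy problem, not the authors' implementation. The characteristic names (`n_variants`, `trace_length`, `n_activities`) and the `toy_fitness` function are hypothetical and chosen only for illustration; practical tools typically approximate these values rather than enumerating every subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley values for a set function `value` over `features`.

    `value` maps a frozenset of feature names to a number. The cost is
    exponential in len(features), so this is only practical for small sets.
    """
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_fitness(known):
    """Hypothetical metric: a baseline plus effects of 'known' characteristics."""
    score = 0.5
    if "n_variants" in known:
        score -= 0.2
    if "trace_length" in known:
        score += 0.1
    if "n_variants" in known and "trace_length" in known:
        score -= 0.1  # interaction effect, shared between the two
    return score


features = ["n_variants", "trace_length", "n_activities"]
phi = shapley_values(features, toy_fitness)
for name in features:
    print(f"{name}: {phi[name]:+.3f}")
```

In this toy setup the interaction term is split evenly between the two characteristics involved, the unused characteristic receives a contribution of zero, and the contributions sum to the difference between the metric with all characteristics and the baseline.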
Related papers
- A unified framework for detecting point and collective anomalies in operating system logs via collaborative transformers [0.9558392439655014]
CoLog is a framework that collaboratively encodes logs utilizing various modalities. In detecting both point and collective anomalies, CoLog achieves a mean precision of 99.63%, a mean recall of 99.59%, and a mean F1 score of 99.61%.
arXiv Detail & Related papers (2025-12-29T11:18:34Z)
- Explainable Verification of Hierarchical Workflows Mined from Event Logs with Shapley Values [0.0]
We translate mined process trees into logical specifications and analyze properties such as satisfiability, liveness, and safety with automated theorem provers. This outlines a novel direction for explainable workflow analysis with direct relevance to software engineering practice, supporting compliance checks, process optimization, redundancy reduction, and the design of next-generation process mining tools.
arXiv Detail & Related papers (2025-12-10T11:57:08Z)
- Ranking the Top-K Realizations of Stochastically Known Event Logs [0.0]
We implement an efficient algorithm to calculate a top-K realization ranking of an event log under event independence within O(Kn).
We show that a top-K ranking depends on the length of the input event log and the distribution of the probabilities.
arXiv Detail & Related papers (2024-09-30T08:53:09Z)
- Log Summarisation for Defect Evolution Analysis [14.055261850785456]
We suggest an online semantic-based clustering approach to error logs.
We also introduce a novel metric to evaluate the performance of temporal log clusters.
arXiv Detail & Related papers (2024-03-13T09:18:46Z)
- ACE : Off-Policy Actor-Critic with Causality-Aware Entropy Regularization [52.5587113539404]
We introduce a causality-aware entropy term that effectively identifies and prioritizes actions with high potential impacts for efficient exploration.
Our proposed algorithm, ACE: Off-policy Actor-critic with Causality-aware Entropy regularization, demonstrates a substantial performance advantage across 29 diverse continuous control tasks.
arXiv Detail & Related papers (2024-02-22T13:22:06Z)
- Detecting Anomalous Events in Object-centric Business Processes via Graph Neural Networks [55.583478485027]
This study proposes a novel framework for anomaly detection in business processes.
We first reconstruct the process dependencies of the object-centric event logs as attributed graphs.
We then employ a graph convolutional autoencoder architecture to detect anomalous events.
arXiv Detail & Related papers (2024-02-14T14:17:56Z)
- Multi-modal Causal Structure Learning and Root Cause Analysis [67.67578590390907]
We propose Mulan, a unified multi-modal causal structure learning method for root cause localization.
We leverage a log-tailored language model to facilitate log representation learning, converting log sequences into time-series data.
We also introduce a novel key performance indicator-aware attention mechanism for assessing modality reliability and co-learning a final causal graph.
arXiv Detail & Related papers (2024-02-04T05:50:38Z)
- GLAD: Content-aware Dynamic Graphs For Log Anomaly Detection [49.9884374409624]
GLAD is a Graph-based Log Anomaly Detection framework designed to detect anomalies in system logs.
arXiv Detail & Related papers (2023-09-12T04:21:30Z)
- VAC2: Visual Analysis of Combined Causality in Event Sequences [6.145427901944597]
We develop a combined causality visual analysis system to help users explore combined causes as well as an individual cause.
This interactive system supports multi-level causality exploration with diverse ordering strategies and a focus and context technique.
The usefulness and effectiveness of the system are further evaluated by conducting a pilot user study and two case studies on event sequence data.
arXiv Detail & Related papers (2022-06-11T04:53:23Z)
- PROVED: A Tool for Graph Representation and Analysis of Uncertain Event Data [0.966840768820136]
The discipline of process mining aims to study processes in a data-driven manner by analyzing historical process executions.
Recent novel types of event data have gathered interest among the process mining community, including uncertain event data.
The PROVED tool helps to explore, navigate and analyze such uncertain event data.
arXiv Detail & Related papers (2021-03-09T17:11:54Z)
- Self-Attentive Classification-Based Anomaly Detection in Unstructured Logs [59.04636530383049]
We propose Logsy, a classification-based method to learn log representations.
We show an average improvement of 0.25 in the F1 score, compared to the previous methods.
arXiv Detail & Related papers (2020-08-21T07:26:55Z)
- Process Discovery for Structured Program Synthesis [70.29027202357385]
A core task in process mining is process discovery which aims to learn an accurate process model from event log data.
In this paper, we propose to use (block-) structured programs directly as target process models.
We develop a novel bottom-up agglomerative approach to the discovery of such structured program process models.
arXiv Detail & Related papers (2020-08-13T10:33:10Z)