Self-Explaining Neural Networks for Business Process Monitoring
- URL: http://arxiv.org/abs/2503.18067v1
- Date: Sun, 23 Mar 2025 13:28:34 GMT
- Title: Self-Explaining Neural Networks for Business Process Monitoring
- Authors: Shahaf Bassan, Shlomit Gur, Sergey Zeltyn, Konstantinos Mavrogiorgos, Ron Eliav, Dimosthenis Kyriazis
- Abstract summary: We introduce, to the best of our knowledge, the first *self-explaining neural network* architecture for predictive process monitoring. Our framework trains an LSTM model that not only provides predictions but also outputs a concise explanation for each prediction. We show that our method outperforms post-hoc approaches in terms of both the faithfulness of the generated explanations and substantial improvements in efficiency.
- Score: 2.8499886197917443
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Tasks in Predictive Business Process Monitoring (PBPM), such as Next Activity Prediction, focus on generating useful business predictions from historical case logs. Recently, Deep Learning methods, particularly sequence-to-sequence models like Long Short-Term Memory (LSTM), have become a dominant approach for tackling these tasks. However, to enhance model transparency, build trust in the predictions, and gain a deeper understanding of business processes, it is crucial to explain the decisions made by these models. Existing explainability methods for PBPM decisions are typically *post-hoc*, meaning they provide explanations only after the model has been trained. Unfortunately, these post-hoc approaches have been shown to face various challenges, including lack of faithfulness, high computational costs, and a significant sensitivity to out-of-distribution samples. In this work, we introduce, to the best of our knowledge, the first *self-explaining neural network* architecture for predictive process monitoring. Our framework trains an LSTM model that not only provides predictions but also outputs a concise explanation for each prediction, while adapting the optimization objective to improve the reliability of the explanation. We first demonstrate that incorporating explainability into the training process does not hurt model performance, and in some cases, actually improves it. Additionally, we show that our method outperforms post-hoc approaches in terms of both the faithfulness of the generated explanations and substantial improvements in efficiency.
Related papers
- Explanatory Model Monitoring to Understand the Effects of Feature Shifts on Performance [61.06245197347139]
We propose a novel approach to explain the behavior of a black-box model under feature shifts.
We refer to our method that combines concepts from Optimal Transport and Shapley Values as Explanatory Performance Estimation.
arXiv Detail & Related papers (2024-08-24T18:28:19Z)
- Low-rank finetuning for LLMs: A fairness perspective [54.13240282850982]
Low-rank approximation techniques have become the de facto standard for fine-tuning Large Language Models.
This paper investigates the effectiveness of these methods in capturing the shift of fine-tuning datasets from the initial pre-trained data distribution.
We show that low-rank fine-tuning inadvertently preserves undesirable biases and toxic behaviors.
arXiv Detail & Related papers (2024-05-28T20:43:53Z)
- Generating Feasible and Plausible Counterfactual Explanations for Outcome Prediction of Business Processes [45.502284864662585]
We introduce a data-driven approach, REVISEDplus, to generate plausible counterfactual explanations.
First, we restrict the counterfactual algorithm to generate counterfactuals that lie within a high-density region of the process data.
We also ensure plausibility by learning sequential patterns between the activities in the process cases.
arXiv Detail & Related papers (2024-03-14T09:56:35Z)
- Knowledge-Driven Modulation of Neural Networks with Attention Mechanism for Next Activity Prediction [8.552757384215813]
We present a Symbolic[Neuro] system that leverages background knowledge expressed in terms of a procedural process model to offset the under-sampling in the training data.
More specifically, we make predictions using NNs with attention mechanism, an emerging technology in the NN field.
The system has been tested on several real-life logs showing an improvement in the performance of the prediction task.
arXiv Detail & Related papers (2023-12-14T12:02:35Z)
- Faithful Explanations of Black-box NLP Models Using LLM-generated Counterfactuals [67.64770842323966]
Causal explanations of predictions of NLP systems are essential to ensure safety and establish trust.
Existing methods often fall short of explaining model predictions effectively or efficiently.
We propose two approaches for counterfactual (CF) approximation.
arXiv Detail & Related papers (2023-10-01T07:31:04Z)
- Evaluating and Explaining Large Language Models for Code Using Syntactic Structures [74.93762031957883]
This paper introduces ASTxplainer, an explainability method specific to Large Language Models for code.
At its core, ASTxplainer provides an automated method for aligning token predictions with AST nodes.
We perform an empirical evaluation on 12 popular LLMs for code using a curated dataset of the most popular GitHub projects.
arXiv Detail & Related papers (2023-08-07T18:50:57Z)
- Explaining Language Models' Predictions with High-Impact Concepts [11.47612457613113]
We propose a complete framework for extending concept-based interpretability methods to NLP.
We optimize for features whose existence causes the output predictions to change substantially.
Our method achieves superior results on predictive impact, usability, and faithfulness compared to the baselines.
arXiv Detail & Related papers (2023-05-03T14:48:27Z)
- Explainability in Process Outcome Prediction: Guidelines to Obtain Interpretable and Faithful Models [77.34726150561087]
We define explainability through the interpretability of the explanations and the faithfulness of the explainability model in the field of process outcome prediction.
This paper contributes a set of guidelines named X-MOP which allows selecting the appropriate model based on the event log specifications.
arXiv Detail & Related papers (2022-03-30T05:59:50Z)
- Explaining and Improving Model Behavior with k Nearest Neighbor Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
arXiv Detail & Related papers (2020-10-18T16:55:25Z)
- Introduction to Rare-Event Predictive Modeling for Inferential Statisticians -- A Hands-On Application in the Prediction of Breakthrough Patents [0.0]
We introduce a machine learning (ML) approach to quantitative analysis geared towards optimizing the predictive performance.
We discuss the potential synergies between the two fields against the backdrop of this, at first glance, target-incompatibility.
We are providing a hands-on predictive modeling introduction for a quantitative social science audience while aiming at demystifying computer science jargon.
arXiv Detail & Related papers (2020-03-30T13:06:25Z)