Building Interpretable Models for Business Process Prediction using
Shared and Specialised Attention Mechanisms
- URL: http://arxiv.org/abs/2109.01419v1
- Date: Fri, 3 Sep 2021 10:17:05 GMT
- Title: Building Interpretable Models for Business Process Prediction using
Shared and Specialised Attention Mechanisms
- Authors: Bemali Wickramanayake, Zhipeng He, Chun Ouyang, Catarina Moreira, Yue
Xu, Renuka Sindhgatta
- Abstract summary: We address the "black-box" problem in predictive process analytics by building interpretable models.
We propose two types of attention: event attention, which captures the impact of specific process events on a prediction, and attribute attention, which reveals which attribute(s) of an event influenced the prediction.
- Score: 5.607831842909669
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we address the "black-box" problem in predictive process
analytics by building interpretable models that can inform both what a prediction
is and why it was made. Predictive process analytics is a recently emerged
discipline dedicated to providing business process intelligence in modern
organisations. It uses event logs, which capture process execution traces in
the form of multi-dimensional sequence data, as the key input to train
predictive models. These predictive models, often built upon deep learning
techniques, can be used to make predictions about the future states of business
process execution. We apply attention mechanisms to achieve model
interpretability. We propose i) two types of attention: event attention, which
captures the impact of specific process events on a prediction, and attribute
attention, which reveals which attribute(s) of an event influenced the prediction;
and ii) two attention mechanisms, shared and specialised, which reflect different
design decisions about whether to construct attribute attention on individual
input features (specialised) or on the concatenated feature tensor of all input
feature vectors (shared). These lead
to two distinct attention-based models, and both are interpretable models that
incorporate interpretability directly into the structure of a process
predictive model. We conduct an experimental evaluation of the proposed models
on a real-life dataset, compare the two models in terms of accuracy and
interpretability, and draw insights from the evaluation and analysis results.
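To make the design distinction concrete, below is a minimal PyTorch sketch of the shared variant. This is not the authors' implementation: the LSTM encoder, the embedding sizes, and the next-activity classification head are all assumptions made for illustration. Attribute attention is computed from the concatenated feature tensor of all attribute embeddings (the "shared" design decision); a specialised variant would instead attach a separate attention module to each attribute's embedding before concatenation. Event attention then weights the encoded timesteps of a trace.

```python
# Minimal sketch of shared attribute attention plus event attention.
# Hypothetical dimensions and encoder choice; NOT the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedAttentionModel(nn.Module):
    def __init__(self, attr_vocab_sizes, embed_dim=16, hidden_dim=64, n_classes=10):
        super().__init__()
        n_attrs = len(attr_vocab_sizes)
        # One embedding table per event attribute (e.g. activity, resource).
        self.embeds = nn.ModuleList(
            [nn.Embedding(v, embed_dim) for v in attr_vocab_sizes])
        self.rnn = nn.LSTM(n_attrs * embed_dim, hidden_dim, batch_first=True)
        # Attribute attention scores come from the shared (concatenated) tensor.
        self.attr_score = nn.Linear(n_attrs * embed_dim, n_attrs)
        # Event attention scores come from the encoded timesteps.
        self.event_score = nn.Linear(hidden_dim, 1)
        self.out = nn.Linear(hidden_dim, n_classes)

    def forward(self, x):  # x: (batch, trace_len, n_attrs) integer-coded events
        embs = [emb(x[..., i]) for i, emb in enumerate(self.embeds)]
        feats = torch.cat(embs, dim=-1)                         # shared tensor
        attr_attn = F.softmax(self.attr_score(feats), dim=-1)   # which attributes
        weighted = torch.cat(
            [e * attr_attn[..., i:i + 1] for i, e in enumerate(embs)], dim=-1)
        h, _ = self.rnn(weighted)                               # (B, T, hidden)
        event_attn = F.softmax(self.event_score(h), dim=1)      # which events
        context = (event_attn * h).sum(dim=1)                   # pooled trace
        return self.out(context), event_attn, attr_attn

# Hypothetical usage: 32 traces of 12 events, each with 3 categorical attributes.
model = SharedAttentionModel(attr_vocab_sizes=[20, 8, 5])
logits, event_attn, attr_attn = model(torch.randint(0, 5, (32, 12, 3)))
```

Returning event_attn and attr_attn alongside the prediction is what makes such a model interpretable by construction: the attention weights themselves indicate which events, and which attributes of those events, drove a given prediction, with no post-hoc explainer required.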
Related papers
- Explanatory Model Monitoring to Understand the Effects of Feature Shifts on Performance [61.06245197347139]
We propose a novel approach to explain the behavior of a black-box model under feature shifts.
We refer to our method that combines concepts from Optimal Transport and Shapley Values as Explanatory Performance Estimation.
arXiv Detail & Related papers (2024-08-24T18:28:19Z)
- Attention Please: What Transformer Models Really Learn for Process Prediction [0.0]
This paper examines whether the attention scores of a transformer based next-activity prediction model can serve as an explanation for its decision-making.
We find that attention scores in next-activity prediction models can serve as explainers and exploit this fact in two proposed graph-based explanation approaches.
arXiv Detail & Related papers (2024-08-12T08:20:38Z)
- Predictive Churn with the Set of Good Models [64.05949860750235]
We study the effect of conflicting predictions over the set of near-optimal machine learning models.
We present theoretical results on the expected churn between models within the Rashomon set.
We show how our approach can be used to better anticipate, reduce, and avoid churn in consumer-facing applications.
arXiv Detail & Related papers (2024-02-12T16:15:25Z)
- Causal Analysis for Robust Interpretability of Neural Networks [0.2519906683279152]
We develop a robust interventional-based method to capture cause-effect mechanisms in pre-trained neural networks.
We apply our method to vision models trained on classification tasks.
arXiv Detail & Related papers (2023-05-15T18:37:24Z)
- Pathologies of Pre-trained Language Models in Few-shot Fine-tuning [50.3686606679048]
We show that pre-trained language models fine-tuned with few examples exhibit strong prediction bias across labels.
Although few-shot fine-tuning can mitigate the prediction bias, our analysis shows that models gain performance improvements by capturing non-task-related features.
These observations warn that pursuing model performance with fewer examples may incur pathological prediction behavior.
arXiv Detail & Related papers (2022-04-17T15:55:18Z)
- Joint Forecasting of Panoptic Segmentations with Difference Attention [72.03470153917189]
We study a new panoptic segmentation forecasting model that jointly forecasts all object instances in a scene.
We evaluate the proposed model on the Cityscapes and AIODrive datasets.
arXiv Detail & Related papers (2022-04-14T17:59:32Z)
- Explainability in Process Outcome Prediction: Guidelines to Obtain Interpretable and Faithful Models [77.34726150561087]
We define explainability through the interpretability of the explanations and the faithfulness of the explainability model in the field of process outcome prediction.
This paper contributes a set of guidelines named X-MOP which allows selecting the appropriate model based on the event log specifications.
arXiv Detail & Related papers (2022-03-30T05:59:50Z)
- Explainable AI Enabled Inspection of Business Process Prediction Models [2.5229940062544496]
We present an approach that uses model explanations to investigate the reasoning applied by machine-learned predictive models.
A novel contribution of our approach is the proposal of model inspection that leverages both the explanations generated by interpretable machine learning mechanisms and the contextual or domain knowledge extracted from event logs that record historical process execution.
arXiv Detail & Related papers (2021-07-16T06:51:18Z)
- Forethought and Hindsight in Credit Assignment [62.05690959741223]
We work to understand the gains and peculiarities of planning employed as forethought via forward models or as hindsight operating with backward models.
We investigate the best use of models in planning, primarily focusing on the selection of states in which predictions should be (re)-evaluated.
arXiv Detail & Related papers (2020-10-26T16:00:47Z)
- Introduction to Rare-Event Predictive Modeling for Inferential Statisticians -- A Hands-On Application in the Prediction of Breakthrough Patents [0.0]
We introduce a machine learning (ML) approach to quantitative analysis geared towards optimizing predictive performance.
We discuss the potential synergies between the two fields against the backdrop of this, at first glance, incompatibility of targets.
We provide a hands-on introduction to predictive modeling for a quantitative social science audience while aiming to demystify computer science jargon.
arXiv Detail & Related papers (2020-03-30T13:06:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.