A Distance Measure for Privacy-preserving Process Mining based on
Feature Learning
- URL: http://arxiv.org/abs/2107.06578v1
- Date: Wed, 14 Jul 2021 09:44:28 GMT
- Authors: Fabian Rösel, Stephan A. Fahrenkrog-Petersen, Han van der Aa,
Matthias Weidlich
- Abstract summary: We show how embeddings of events enable the definition of a distance measure for traces to guide event log anonymization.
Our experiments with real-world data indicate that anonymization using this measure, compared to a syntactic one, yields logs that are closer to the original log in various dimensions.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To enable process analysis based on an event log without compromising the
privacy of individuals involved in process execution, a log may be anonymized.
Such anonymization strives to transform a log so that it satisfies provable
privacy guarantees, while largely maintaining its utility for process analysis.
Existing techniques perform anonymization using simple, syntactic measures to
identify suitable transformation operations. This way, the semantics of the
activities referenced by the events in a trace are neglected, potentially
leading to transformations in which events of unrelated activities are merged.
To avoid this and incorporate the semantics of activities during anonymization,
we propose to instead use a distance measure based on feature learning.
Specifically, we show how embeddings of events enable the definition of a
distance measure for traces to guide event log anonymization. Our experiments
with real-world data indicate that anonymization using this measure, compared
to a syntactic one, yields logs that are closer to the original log in various
dimensions and, hence, have higher utility for process analysis.
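
Illustrative sketch (not the authors' implementation): the contrast between a syntactic and an embedding-based trace distance can be shown with a small edit distance whose substitution cost is either a constant or the cosine distance between activity embeddings. The embedding values, the activity names, and the helper names `cosine_distance`, `substitution_cost`, and `trace_distance` are all assumptions made for this example; the abstract does not specify how the measure is constructed.

```python
import math

# Hypothetical 2-d activity embeddings. In practice these would be learned
# from the event log; the values below are made up for illustration.
EMBEDDINGS = {
    "register": (1.0, 0.0),
    "triage": (0.9, 0.1),
    "x-ray": (0.0, 1.0),
    "mri": (0.1, 0.9),
}


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def substitution_cost(a, b, semantic):
    """Cost of replacing activity a with activity b."""
    if a == b:
        return 0.0
    if semantic:
        # Assumed semantic cost: distance between the activity embeddings.
        return cosine_distance(EMBEDDINGS[a], EMBEDDINGS[b])
    # Syntactic cost: any two different labels are equally far apart.
    return 1.0


def trace_distance(t1, t2, semantic=True):
    """Edit distance between two traces (sequences of activity names)."""
    prev = [float(j) for j in range(len(t2) + 1)]
    for i, a in enumerate(t1, start=1):
        curr = [float(i)]
        for j, b in enumerate(t2, start=1):
            curr.append(min(
                prev[j] + 1.0,      # delete a
                curr[j - 1] + 1.0,  # insert b
                prev[j - 1] + substitution_cost(a, b, semantic),
            ))
        prev = curr
    return prev[-1]


base = ["register", "x-ray"]
related = ["register", "mri"]       # differs in a related activity
unrelated = ["register", "triage"]  # differs in an unrelated activity

print(trace_distance(base, related, semantic=False),
      trace_distance(base, unrelated, semantic=False))
print(trace_distance(base, related), trace_distance(base, unrelated))
```

Under these made-up embeddings, the syntactic variant rates both candidate traces as equally distant from `base`, while the embedding-based variant rates the trace that swaps in a related activity as closer. This is the kind of distinction that could steer an anonymization step away from merging events of unrelated activities.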
Related papers
- Multiple Object Tracking as ID Prediction [14.890192237433771]
In Multiple Object Tracking (MOT), tracking-by-detection methods have stood the test for a long time.
They leverage single-frame detectors and treat object association as a post-processing step through hand-crafted algorithms and surrogate tasks.
However, the nature of these techniques prevents end-to-end exploitation of training data, leading to increasingly cumbersome and challenging manual modification.
arXiv Detail & Related papers (2024-03-25T15:09:54Z) - Detecting Anomalous Events in Object-centric Business Processes via
Graph Neural Networks [55.583478485027]
This study proposes a novel framework for anomaly detection in business processes.
We first reconstruct the process dependencies of the object-centric event logs as attributed graphs.
We then employ a graph convolutional autoencoder architecture to detect anomalous events.
arXiv Detail & Related papers (2024-02-14T14:17:56Z) - LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection [73.69399219776315]
We propose a unified Transformer-based framework for Log anomaly detection (LogFormer) to improve the generalization ability across different domains.
Specifically, our model is first pre-trained on the source domain to obtain shared semantic knowledge of log data.
Then, we transfer such knowledge to the target domain via shared parameters.
arXiv Detail & Related papers (2024-01-09T12:55:21Z) - Measuring Rule-based LTLf Process Specifications: A Probabilistic
Data-driven Approach [2.5407767658470726]
Declarative process specifications define the behavior of processes by means of rules based on Linear Temporal Logic on Finite Traces.
In a mining context, these specifications are inferred from, and checked on, multi-sets of runs recorded by information systems.
We propose a technique that measures the degree of satisfaction of specifications over event logs.
arXiv Detail & Related papers (2023-05-09T13:07:01Z) - Attribute-preserving Face Dataset Anonymization via Latent Code
Optimization [64.4569739006591]
We present a task-agnostic anonymization procedure that directly optimizes the images' latent representation in the latent space of a pre-trained GAN.
We demonstrate through a series of experiments that our method is capable of anonymizing the identity of the images whilst, crucially, better preserving the facial attributes.
arXiv Detail & Related papers (2023-03-20T17:34:05Z) - Tag-Based Attention Guided Bottom-Up Approach for Video Instance
Segmentation [83.13610762450703]
Video instance segmentation is a fundamental computer vision task that deals with segmenting and tracking object instances across a video sequence.
We introduce a simple end-to-end trainable bottom-up approach to achieve instance mask predictions at pixel-level granularity, instead of the typical region-proposal-based approach.
Our method provides competitive results on the YouTube-VIS and DAVIS-19 datasets, and has minimum run-time compared to other contemporary state-of-the-art methods.
arXiv Detail & Related papers (2022-04-22T15:32:46Z) - FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality
Assessment [93.09267863425492]
We argue that understanding both high-level semantics and internal temporal structures of actions in competitive sports videos is the key to making predictions accurate and interpretable.
We construct a new fine-grained dataset, called FineDiving, developed on diverse diving events with detailed annotations on action procedures.
arXiv Detail & Related papers (2022-04-07T17:59:32Z) - Reliable Shot Identification for Complex Event Detection via
Visual-Semantic Embedding [72.9370352430965]
We propose a visual-semantic guided loss method for event detection in videos.
Motivated by curriculum learning, we introduce a negative elastic regularization term to start training the classifier with instances of high reliability.
An alternative optimization algorithm is developed to solve the proposed challenging non-convex regularization problem.
arXiv Detail & Related papers (2021-10-12T11:46:56Z) - SaCoFa: Semantics-aware Control-flow Anonymization for Process Mining [4.806322013167162]
We argue for privacy preservation that incorporates process semantics.
We show how, based on the exponential mechanism, semantic constraints are incorporated to ensure differential privacy of the query result.
arXiv Detail & Related papers (2021-09-17T12:26:49Z) - PROVED: A Tool for Graph Representation and Analysis of Uncertain Event
Data [0.966840768820136]
The discipline of process mining aims to study processes in a data-driven manner by analyzing historical process executions.
Recent novel types of event data have gathered interest among the process mining community, including uncertain event data.
The PROVED tool helps to explore, navigate and analyze such uncertain event data.
arXiv Detail & Related papers (2021-03-09T17:11:54Z)