The Impact of Event Data Partitioning on Privacy-aware Process Discovery
- URL: http://arxiv.org/abs/2507.06008v1
- Date: Tue, 08 Jul 2025 14:13:44 GMT
- Title: The Impact of Event Data Partitioning on Privacy-aware Process Discovery
- Authors: Jungeun Lim, Stephan A. Fahrenkrog-Petersen, Xixi Lu, Jan Mendling, Minseok Song
- Abstract summary: We propose a pipeline that combines anonymization and event data partitioning. We study the impact of event partitioning on two anonymization techniques.
- Score: 4.578440119454756
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Information systems support the execution of business processes. The event logs of these executions generally contain sensitive information about customers, patients, and employees. The corresponding privacy challenges can be addressed by anonymizing the event logs while still retaining utility for process discovery. However, trading off utility and privacy is difficult: the higher the complexity of an event log, the higher the loss of utility caused by anonymization. In this work, we propose a pipeline that combines anonymization and event data partitioning, where event abstraction is utilized for partitioning. By leveraging event abstraction, event logs can be segmented into multiple parts, allowing each sub-log to be anonymized separately. This pipeline preserves privacy while mitigating the loss of utility. To validate our approach, we study the impact of event partitioning on two anonymization techniques using three real-world event logs and two process discovery techniques. Our results demonstrate that event partitioning can bring improvements in process discovery utility for directly-follows-based anonymization techniques.
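The partition-then-anonymize idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy log, the activity-to-group mapping standing in for event abstraction, and the Laplace-noised directly-follows counts standing in for an anonymization technique. It is not the paper's pipeline or either of the two techniques it evaluates, and it gives no formal privacy guarantee.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy event log: case id -> ordered list of activities.
LOG = {
    "c1": ["register", "triage", "x-ray", "discharge"],
    "c2": ["register", "triage", "blood test", "x-ray", "discharge"],
    "c3": ["register", "blood test", "discharge"],
}

# Hypothetical event abstraction: low-level activity -> high-level group.
ABSTRACTION = {
    "register": "admin",
    "discharge": "admin",
    "triage": "clinical",
    "x-ray": "clinical",
    "blood test": "clinical",
}


def partition_log(log, abstraction):
    """Split every trace into one sub-trace per group, keeping event order."""
    sub_logs = defaultdict(dict)
    for case_id, trace in log.items():
        for activity in trace:
            group = abstraction[activity]
            sub_logs[group].setdefault(case_id, []).append(activity)
    return dict(sub_logs)


def directly_follows_counts(log):
    """Count how often activity a is directly followed by activity b."""
    counts = Counter()
    for trace in log.values():
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def anonymize_dfg(counts, epsilon, rng):
    """Toy stand-in for anonymization: perturb each observed count with
    Laplace noise of scale 1/epsilon, then clamp to a non-negative integer."""
    return {
        pair: max(0, round(count + laplace_noise(1 / epsilon, rng)))
        for pair, count in counts.items()
    }


# Anonymize each sub-log separately instead of the full log at once.
rng = random.Random(0)
for group, sub_log in partition_log(LOG, ABSTRACTION).items():
    noisy = anonymize_dfg(directly_follows_counts(sub_log), epsilon=1.0, rng=rng)
    print(group, noisy)
```

Each sub-log has fewer distinct directly-follows pairs than the full log, which is the intuition behind the abstract's claim that lower log complexity reduces the utility lost to anonymization.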
Related papers
- Privacy-Preserving Anonymization of System and Network Event Logs Using Salt-Based Hashing and Temporal Noise [5.85293491327449]
Event logs contain Personally Identifiable Information (PII). Overly aggressive anonymization can destroy contextual integrity, while weak techniques risk re-identification through linkage or inference attacks. This paper introduces novel field-specific anonymization methods that address this trade-off.
arXiv Detail & Related papers (2025-07-29T15:16:42Z) - Collaborative Inference over Wireless Channels with Feature Differential Privacy [57.68286389879283]
Collaborative inference among multiple wireless edge devices has the potential to significantly enhance Artificial Intelligence (AI) applications.
Transmitting extracted features poses a significant privacy risk, as sensitive personal data can be exposed during the process.
We propose a novel privacy-preserving collaborative inference mechanism, wherein each edge device in the network secures the privacy of extracted features before transmitting them to a central server for inference.
arXiv Detail & Related papers (2024-10-25T18:11:02Z) - Adaptive Differentially Private Structural Entropy Minimization for Unsupervised Social Event Detection [29.13690542566747]
Social event detection is important in numerous areas, such as opinion analysis, social safety, and decision-making.
Most current methods are supervised and require access to large amounts of data.
We propose ADP-SEMEvent, an unsupervised social event detection method that prioritizes privacy.
arXiv Detail & Related papers (2024-07-23T11:19:22Z) - Double Mixture: Towards Continual Event Detection from Speech [60.33088725100812]
Speech event detection is crucial for multimedia retrieval, involving the tagging of both semantic and acoustic events.
This paper tackles two primary challenges in speech event detection: the continual integration of new events without forgetting previous ones, and the disentanglement of semantic from acoustic events.
We propose a novel method, 'Double Mixture,' which merges speech expertise with robust memory mechanisms to enhance adaptability and prevent forgetting.
arXiv Detail & Related papers (2024-04-20T06:32:00Z) - Detecting Anomalous Events in Object-centric Business Processes via Graph Neural Networks [55.583478485027]
This study proposes a novel framework for anomaly detection in business processes.
We first reconstruct the process dependencies of the object-centric event logs as attributed graphs.
We then employ a graph convolutional autoencoder architecture to detect anomalous events.
arXiv Detail & Related papers (2024-02-14T14:17:56Z) - Resolving Uncertain Case Identifiers in Interaction Logs: A User Study [0.4014524824655105]
We propose a neural network-based technique to determine a case notion for click data.
We validate its efficacy through a user study based on the segmented event log resulting from interaction data of a mobility sharing company.
arXiv Detail & Related papers (2022-11-21T16:13:04Z) - Avoiding Post-Processing with Event-Based Detection in Biomedical Signals [69.34035527763916]
We propose an event-based modeling framework that directly works with events as learning targets.
We show that event-based modeling (without post-processing) performs on par with or better than epoch-based modeling with extensive post-processing.
arXiv Detail & Related papers (2022-09-22T13:44:13Z) - Accessing and Interpreting OPC UA Event Traces based on Semantic Process Descriptions [69.9674326582747]
This paper proposes an approach to access a production system's event data based on the event data's context.
The approach extracts filtered event logs from a database system by combining: 1) a semantic model of a production system's hierarchical structure, 2) a formalized process description and 3) an OPC UA information model.
arXiv Detail & Related papers (2022-07-25T15:13:44Z) - SaCoFa: Semantics-aware Control-flow Anonymization for Process Mining [4.806322013167162]
We argue for privacy preservation that incorporates process semantics.
We show how, based on the exponential mechanism, semantic constraints are incorporated to ensure differential privacy of the query result.
arXiv Detail & Related papers (2021-09-17T12:26:49Z) - A Distance Measure for Privacy-preserving Process Mining based on Feature Learning [5.250561515565923]
We show how embeddings of events enable the definition of a distance measure for traces to guide event log anonymization.
Our experiments with real-world data indicate that anonymization using this measure, compared to a syntactic one, yields logs that are closer to the original log in various dimensions.
arXiv Detail & Related papers (2021-07-14T09:44:28Z) - "What Are You Trying to Do?" Semantic Typing of Event Processes [94.3499255880101]
This paper studies a new cognitively motivated semantic typing task, multi-axis event process typing.
We develop a large dataset containing over 60k event processes, featuring ultra fine-grained typing on both the action and object type axes.
We propose a hybrid learning framework, P2GT, which addresses the challenging typing problem with indirect supervision from glosses and a joint learning-to-rank framework.
arXiv Detail & Related papers (2020-10-13T22:37:29Z) - BeeTrace: A Unified Platform for Secure Contact Tracing that Breaks Data Silos [73.84437456144994]
Contact tracing is an important method to control the spread of an infectious disease such as COVID-19.
Current solutions do not utilize the huge volume of data stored in business databases and individual digital devices.
We propose BeeTrace, a unified platform that breaks data silos and deploys state-of-the-art cryptographic protocols to guarantee privacy goals.
arXiv Detail & Related papers (2020-07-05T10:33:45Z)