IoT Miner: Intelligent Extraction of Event Logs from Sensor Data for Process Mining
- URL: http://arxiv.org/abs/2509.05769v1
- Date: Sat, 06 Sep 2025 16:50:33 GMT
- Title: IoT Miner: Intelligent Extraction of Event Logs from Sensor Data for Process Mining
- Authors: Edyta Brzychczy, Urszula Jessen, Krzysztof Kluza, Sridhar Sriram, Manuel Vargas Nettelnstroth
- Abstract summary: IoT Miner is a framework for creating high-level event logs from raw industrial sensor data to support process mining. By combining AI with domain-aware data processing, IoT Miner offers a scalable and interpretable method for generating event logs from IoT data.
- Score: 1.118478900782898
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents IoT Miner, a novel framework for automatically creating high-level event logs from raw industrial sensor data to support process mining. In many real-world settings, such as mining or manufacturing, standard event logs are unavailable, and sensor data lacks the structure and semantics needed for analysis. IoT Miner addresses this gap using a four-stage pipeline: data preprocessing, unsupervised clustering, large language model (LLM)-based labeling, and event log construction. A key innovation is the use of LLMs to generate meaningful activity labels from cluster statistics, guided by domain-specific prompts. We evaluate the approach on sensor data from a Load-Haul-Dump (LHD) mining machine and introduce a new metric, Similarity-Weighted Accuracy, to assess labeling quality. Results show that richer prompts lead to more accurate and consistent labels. By combining AI with domain-aware data processing, IoT Miner offers a scalable and interpretable method for generating event logs from IoT data, enabling process mining in settings where traditional logs are missing.
Related papers
- A Domain-specific Language and Architecture for Detecting Process Activities from Sensor Streams in IoT [0.0]
Internet of Things (IoT) systems are equipped with a plethora of sensors providing real-time data about the current operations of their components. These data are often too fine-grained to derive useful insights into the execution of the larger processes an IoT system might be part of. Process mining has developed advanced approaches for the analysis of business processes that may also be used in the context of IoT.
arXiv Detail & Related papers (2025-07-01T11:38:33Z)
- An object-centric core metamodel for IoT-enhanced event logs [1.092202156339801]
We present a core model synthesizing the most important features of existing data models. A prototypical Python implementation is used to evaluate the model against various use cases.
arXiv Detail & Related papers (2025-06-26T14:19:44Z)
- Exploring Microstructural Dynamics in Cryptocurrency Limit Order Books: Better Inputs Matter More Than Stacking Another Hidden Layer [9.2463347238923]
We aim to examine whether adding extra hidden layers or parameters to "blackbox ish" neural networks genuinely enhances short term price forecasting. We benchmark a spectrum of models from interpretable baselines, logistic regression, XGBoost to deep architectures (DeepLOB, Conv1D+LSTM) on BTC/USDT LOB snapshots sampled at 100 ms to multi second intervals using publicly available Bybit data.
arXiv Detail & Related papers (2025-06-06T05:43:30Z)
- SnipGen: A Mining Repository Framework for Evaluating LLMs for Code [51.07471575337676]
Language Models (LLMs) are trained on extensive datasets that include code repositories. Evaluating their effectiveness poses significant challenges due to the potential overlap between the datasets used for training and those employed for evaluation. We introduce SnipGen, a comprehensive repository mining framework designed to leverage prompt engineering across various downstream tasks for code generation.
arXiv Detail & Related papers (2025-02-10T21:28:15Z)
- LLM-based event abstraction and integration for IoT-sourced logs [2.6811507121199325]
In this paper, we shed light on the potential of leveraging Large Language Models (LLMs) in event abstraction and integration.
Our approach aims to create event records from raw sensor readings and merge the logs from multiple IoT sources into a single event log.
We demonstrate the capabilities of LLMs in event abstraction considering a case study for IoT application in elderly care and longitudinal health monitoring.
arXiv Detail & Related papers (2024-09-05T12:38:13Z)
- An Automated Approach to Collecting and Labeling Time Series Data for Event Detection Using Elastic Node Hardware [18.15754187896287]
This paper introduces a novel embedded system designed to autonomously label sensor data directly on IoT devices.
We present an integrated hardware and software solution equipped with specialized labeling sensors that streamline the capture and labeling of diverse types of sensor data.
arXiv Detail & Related papers (2024-07-06T15:19:16Z)
- From Internet of Things Data to Business Processes: Challenges and a Framework [2.9799866120078935]
The IoT and Business Process Management (BPM) communities co-exist in many shared application domains, such as manufacturing and healthcare.
This work proposes a framework to perform a set of structured steps to convert low-level IoT sensor data into higher-level process events.
arXiv Detail & Related papers (2024-05-14T12:07:07Z)
- LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection [73.69399219776315]
We propose a unified Transformer-based framework for Log anomaly detection (LogFormer) to improve the generalization ability across different domains.
Specifically, our model is first pre-trained on the source domain to obtain shared semantic knowledge of log data.
Then, we transfer such knowledge to the target domain via shared parameters.
arXiv Detail & Related papers (2024-01-09T12:55:21Z)
- Deep Learning based pipeline for anomaly detection and quality enhancement in industrial binder jetting processes [68.8204255655161]
Anomaly detection describes methods of finding abnormal states, instances or data points that differ from a normal value space.
This paper contributes to a data-centric way of approaching artificial intelligence in industrial production.
arXiv Detail & Related papers (2022-09-21T08:14:34Z)
- Robust Event Classification Using Imperfect Real-world PMU Data [58.26737360525643]
We study robust event classification using imperfect real-world phasor measurement unit (PMU) data.
We develop a novel machine learning framework for training robust event classifiers.
arXiv Detail & Related papers (2021-10-19T17:41:43Z)
- Federated Learning with Correlated Data: Taming the Tail for Age-Optimal Industrial IoT [55.62157530259969]
We study a sensor's transmit power minimization subject to the peak-AoI requirement and a probabilistic constraint on queuing latency.
We propose a local-model selection approach which accounts for correlation among the sensor's training data.
Numerical results show the tradeoff between the transmit power, peak AoI, and delay's tail distribution.
arXiv Detail & Related papers (2021-08-17T08:38:31Z)
- Contextual-Bandit Anomaly Detection for IoT Data in Distributed Hierarchical Edge Computing [65.78881372074983]
IoT devices can hardly afford complex deep neural network (DNN) models, and offloading anomaly detection tasks to the cloud incurs long delays.
We propose and build a demo for an adaptive anomaly detection approach for distributed hierarchical edge computing (HEC) systems.
We show that our proposed approach significantly reduces detection delay without sacrificing accuracy, as compared to offloading detection tasks to the cloud.
arXiv Detail & Related papers (2020-04-15T06:13:33Z)