Related papers: ProcK: Machine Learning for Knowledge-Intensive Processes

ProcK: Machine Learning for Knowledge-Intensive Processes

URL: http://arxiv.org/abs/2109.04881v1
Date: Fri, 10 Sep 2021 13:51:59 GMT
Title: ProcK: Machine Learning for Knowledge-Intensive Processes
Authors: Tobias Jacobs, Jingyi Yu, Julia Gastinger, Timo Sztyler
Abstract summary: ProcK (Process & Knowledge) is a novel pipeline to build business process prediction models. Components to extract inter-linked event logs and knowledge bases from relational databases are part of the pipeline. We demonstrate the power of ProcK by training it for prediction tasks on the OULAD e-learning dataset.
Score: 30.371382331613532
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Process mining deals with extraction of knowledge from business process execution logs. Traditional process mining tasks, like process model generation or conformance checking, rely on a minimalistic feature set where each event is characterized only by its case identifier, activity type, and timestamp. In contrast, the success of modern machine learning is based on models that take any available data as direct input and build layers of features automatically during training. In this work, we introduce ProcK (Process & Knowledge), a novel pipeline to build business process prediction models that take into account both sequential data in the form of event logs and rich semantic information represented in a graph-structured knowledge base. The hybrid approach enables ProcK to flexibly make use of all information residing in the databases of organizations. Components to extract inter-linked event logs and knowledge bases from relational databases are part of the pipeline. We demonstrate the power of ProcK by training it for prediction tasks on the OULAD e-learning dataset, where we achieve state-of-the-art performance on the tasks of predicting student dropout from courses and predicting their success. We also apply our method on a number of additional machine learning tasks, including exam score prediction and early predictions that only take into account data recorded during the first weeks of the courses.

Related papers

Predicting Large Language Model Capabilities on Closed-Book QA Tasks Using Only Information Available Prior to Training [51.60874286674908]
We focus on predicting performance on Closed-book Question Answering (CBQA) tasks, which are closely tied to pre-training data and knowledge retention. We address three major challenges: 1) mastering the entire pre-training process, especially data construction; 2) evaluating a model's knowledge retention; and 3) predicting task-specific knowledge retention using only information available prior to training. We introduce the SMI metric, an information-theoretic measure that quantifies the relationship between pre-training data, model size, and task-specific knowledge retention.
arXiv Detail & Related papers (2025-02-06T13:23:53Z)
Process-aware Human Activity Recognition [1.912429179274357]
We propose a novel approach that incorporates process information from context to enhance the HAR performance. Specifically, we align probabilistic events generated by machine learning models with process models derived from contextual information. This alignment adaptively weighs these two sources of information to optimise HAR accuracy.
arXiv Detail & Related papers (2024-11-13T17:53:23Z)
PILOT: A Pre-Trained Model-Based Continual Learning Toolbox [71.63186089279218]
This paper introduces a pre-trained model-based continual learning toolbox known as PILOT. On the one hand, PILOT implements some state-of-the-art class-incremental learning algorithms based on pre-trained models, such as L2P, DualPrompt, and CODA-Prompt. On the other hand, PILOT fits typical class-incremental learning algorithms within the context of pre-trained models to evaluate their effectiveness.
arXiv Detail & Related papers (2023-09-13T17:55:11Z)
Process-BERT: A Framework for Representation Learning on Educational Process Data [68.8204255655161]
We propose a framework for learning representations of educational process data. Our framework consists of a pre-training step that uses BERT-type objectives to learn representations from sequential process data. We apply our framework to the 2019 nation's report card data mining competition dataset.
arXiv Detail & Related papers (2022-04-28T16:07:28Z)
SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines. This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
Multivariate Business Process Representation Learning utilizing Gramian Angular Fields and Convolutional Neural Networks [0.0]
Learning meaningful representations of data is an important aspect of machine learning. For predictive process analytics, it is essential to have all explanatory characteristics of a process instance available. We propose a novel approach for representation learning of business process instances.
arXiv Detail & Related papers (2021-06-15T10:21:14Z)
PROVED: A Tool for Graph Representation and Analysis of Uncertain Event Data [0.966840768820136]
The discipline of process mining aims to study processes in a data-driven manner by analyzing historical process executions. Recent novel types of event data have gathered interest among the process mining community, including uncertain event data. The PROVED tool helps to explore, navigate and analyze such uncertain event data.
arXiv Detail & Related papers (2021-03-09T17:11:54Z)
Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials. We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z)
Towards Intelligent Risk-based Customer Segmentation in Banking [0.0]
We present an intelligent data-driven pipeline composed of a set of processing elements to move customers' data from one system to another. The goal is to present a novel intelligent customer segmentation process which automates the feature engineering, i.e., the process of using (banking) domain knowledge to extract features from raw data. Our proposed method is able to achieve accuracy of 91% compared to classical approaches in terms of detecting, identifying and classifying transaction to the right classification.
arXiv Detail & Related papers (2020-09-29T11:22:04Z)
Knowledge-Aware Procedural Text Understanding with Multi-Stage Training [110.93934567725826]
We focus on the task of procedural text understanding, which aims to comprehend such documents and track entities' states and locations during a process. Two challenges, the difficulty of commonsense reasoning and data insufficiency, still remain unsolved. We propose a novel KnOwledge-Aware proceduraL text understAnding (KOALA) model, which effectively leverages multiple forms of external knowledge.
arXiv Detail & Related papers (2020-09-28T10:28:40Z)
Process Discovery for Structured Program Synthesis [70.29027202357385]
A core task in process mining is process discovery which aims to learn an accurate process model from event log data. In this paper, we propose to use (block-) structured programs directly as target process models. We develop a novel bottom-up agglomerative approach to the discovery of such structured program process models.
arXiv Detail & Related papers (2020-08-13T10:33:10Z)

This list is automatically generated from the titles and abstracts of the papers in this site.