Process Mining for Unstructured Data: Challenges and Research Directions
- URL: http://arxiv.org/abs/2401.13677v1
- Date: Thu, 30 Nov 2023 12:09:14 GMT
- Title: Process Mining for Unstructured Data: Challenges and Research Directions
- Authors: Agnes Koschmider, Milda Aleknonytė-Resch, Frederik Fonger, Christian
Imenkamp, Arvid Lepsien, Kaan Apaydin, Maximilian Harms, Dominik Janssen,
Dominic Langhammer, Tobias Ziolkowski, Yorck Zisgen
- Abstract summary: Applying process mining to unstructured data could yield novel insights in disciplines where unstructured data is a common data format.
Analyzing unstructured data efficiently with process mining, and conveying confidence in the analysis results, requires overcoming multiple challenges.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Applying process mining to unstructured data could yield
novel insights in disciplines where unstructured data is a common
data format. Analyzing unstructured data efficiently with process mining, and
conveying confidence in the analysis results, requires overcoming multiple
challenges. The purpose of this paper is to discuss these challenges, present
initial solutions and describe future research directions. We hope that this
article lays the foundations for future collaboration on this topic.
Related papers
- Benchmarking Data Science Agents [11.582116078653968]
Large Language Models (LLMs) have emerged as promising aids as data science agents, assisting humans in data analysis and processing.
Yet their practical efficacy remains constrained by the varied demands of real-world applications and complicated analytical process.
We introduce DSEval -- a novel evaluation paradigm, as well as a series of innovative benchmarks tailored for assessing the performance of these agents.
arXiv Detail & Related papers (2024-02-27T03:03:06Z) - Capture the Flag: Uncovering Data Insights with Large Language Models [90.47038584812925]
This study explores the potential of using Large Language Models (LLMs) to automate the discovery of insights in data.
We propose a new evaluation methodology based on a "capture the flag" principle, measuring the ability of such models to recognize meaningful and pertinent information (flags) in a dataset.
arXiv Detail & Related papers (2023-12-21T14:20:06Z) - Incremental hierarchical text clustering methods: a review [49.32130498861987]
This study aims to analyze various hierarchical and incremental clustering techniques.
The main contribution of this research is the organization and comparison of the techniques used by studies published between 2010 and 2018 that addressed text document clustering.
arXiv Detail & Related papers (2023-12-12T22:27:29Z) - Resolving the Imbalance Issue in Hierarchical Disciplinary Topic
Inference via LLM-based Data Augmentation [5.98277339029019]
This study leverages large language models (Llama V1) as data generators to augment research proposals categorized within intricate disciplinary hierarchies.
Our experiments attest to the efficacy of the generated data, demonstrating that research proposals produced using the prompts can effectively address the aforementioned issues.
arXiv Detail & Related papers (2023-10-09T00:45:20Z) - Instruction Tuning for Large Language Models: A Survey [52.86322823501338]
We make a systematic review of the literature, including the general methodology of IT, the construction of IT datasets, the training of IT models, and applications to different modalities, domains and applications.
We also review the potential pitfalls of IT and the criticism against it, point out current deficiencies of existing strategies, and suggest some avenues for fruitful research.
arXiv Detail & Related papers (2023-08-21T15:35:16Z) - Boosting Event Extraction with Denoised Structure-to-Text Augmentation [52.21703002404442]
Event extraction aims to recognize pre-defined event triggers and arguments from texts.
Recent data augmentation methods often neglect the problem of grammatical incorrectness.
We propose DAEE, a denoised structure-to-text augmentation framework for event extraction.
arXiv Detail & Related papers (2023-05-16T16:52:07Z) - Controllable Data Generation by Deep Learning: A Review [22.582082771890974]
Controllable deep data generation is a promising research area.
This article introduces exciting applications of controllable deep data generation, and experimentally analyzes and compares existing works.
arXiv Detail & Related papers (2022-07-19T20:44:42Z) - Research Trends and Applications of Data Augmentation Algorithms [77.34726150561087]
We identify the main areas of application of data augmentation algorithms, the types of algorithms used, significant research trends, their progression over time and research gaps in data augmentation literature.
We expect readers to understand the potential of data augmentation, as well as identify future research directions and open questions within data augmentation research.
arXiv Detail & Related papers (2022-07-18T11:38:32Z) - Deep Learning Schema-based Event Extraction: Literature Review and
Current Trends [60.29289298349322]
Event extraction technology based on deep learning has become a research hotspot.
This paper fills the gap by reviewing the state-of-the-art approaches, focusing on deep learning-based models.
arXiv Detail & Related papers (2021-07-05T16:32:45Z) - Complex Sequential Data Analysis: A Systematic Literature Review of
Existing Algorithms [0.9649642656207869]
This paper reviews past approaches to the use of deep-learning frameworks for the analysis of irregular-patterned datasets.
Traditional deep-learning methods perform poorly or even fail when trying to analyse these datasets.
The performance of deep-learning frameworks was found to be evaluated mainly using mean absolute error and root mean square error accuracy metrics.
arXiv Detail & Related papers (2020-07-22T17:53:00Z) - Towards an Integrated Platform for Big Data Analysis [4.5257812998381315]
This paper presents the vision of an integrated platform for big data analysis that combines all these aspects.
Main benefits of this approach are an enhanced scalability of the whole platform, a better parameterization of algorithms, and an improved usability during the end-to-end data analysis process.
arXiv Detail & Related papers (2020-04-27T03:15:23Z)