A Query Language for Summarizing and Analyzing Business Process Data
- URL: http://arxiv.org/abs/2105.10911v1
- Date: Sun, 23 May 2021 11:07:53 GMT
- Title: A Query Language for Summarizing and Analyzing Business Process Data
- Authors: Amin Beheshti, Boualem Benatallah, Hamid Reza Motahari-Nezhad, Samira
Ghodratnama, Farhad Amouzgar
- Abstract summary: We present a framework to model process data as graphs, i.e., Process Graph, and present abstractions to summarize the process graph.
We have implemented a scalable architecture for querying, exploration and analysis of process graphs.
- Score: 6.952242545832663
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In modern enterprises, Business Processes (BPs) are realized over a mix of
workflows, IT systems, Web services and direct collaborations of people.
Accordingly, process data (i.e., BP execution data such as logs containing
events, interaction messages and other process artifacts) is scattered across
several systems and data sources, and increasingly shows all the typical properties
of Big Data. Understanding the execution of process data is challenging as
key business insights remain hidden in the interactions among process entities:
most objects are interconnected, forming complex, heterogeneous but often
semi-structured networks. In the context of business processes, we consider the
Big Data problem as a massive number of interconnected data islands from
personal, shared and business data. We present a framework to model process
data as graphs, i.e., Process Graph, and present abstractions to summarize the
process graph and to discover concept hierarchies for entities based on both
data objects and their interactions in process graphs. We present a language,
namely BP-SPARQL, for the explorative querying and understanding of process
graphs from various user perspectives. We have implemented a scalable
architecture for querying, exploration and analysis of process graphs. We
report on experiments performed on both synthetic and real-world datasets that
show the viability and efficiency of the approach.
Related papers
- Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach [56.55633052479446]
Web-scale visual entity recognition presents significant challenges due to the lack of clean, large-scale training data.
We propose a novel methodology to curate such a dataset, leveraging a multimodal large language model (LLM) for label verification, metadata generation, and rationale explanation.
Experiments demonstrate that models trained on this automatically curated data achieve state-of-the-art performance on web-scale visual entity recognition tasks.
arXiv Detail & Related papers (2024-10-31T06:55:24Z)
- CMDBench: A Benchmark for Coarse-to-fine Multimodal Data Discovery in Compound AI Systems [10.71630696651595]
Compound AI systems (CASs) that employ LLMs as agents to accomplish knowledge-intensive tasks have garnered significant interest within database and AI communities.
Silos of multimodal data sources make it difficult to identify appropriate data sources for accomplishing the task at hand.
We propose CMDBench, a benchmark modeling the complexity of enterprise data platforms.
arXiv Detail & Related papers (2024-06-02T01:10:41Z)
- Federated Neural Graph Databases [53.03085605769093]
We propose Federated Neural Graph Database (FedNGDB), a novel framework that enables reasoning over multi-source graph-based data while preserving privacy.
Unlike existing methods, FedNGDB can handle complex graph structures and relationships, making it suitable for various downstream tasks.
arXiv Detail & Related papers (2024-02-22T14:57:44Z)
- Pathway: a fast and flexible unified stream data processing framework for analytical and Machine Learning applications [7.850979932441607]
Pathway is a new unified data processing framework that can run workloads on both bounded and unbounded data streams.
We describe the system and present benchmarking results which demonstrate its capabilities in both batch and streaming contexts.
arXiv Detail & Related papers (2023-07-12T08:27:37Z)
- Demonstration of InsightPilot: An LLM-Empowered Automated Data Exploration System [48.62158108517576]
We introduce InsightPilot, an automated data exploration system designed to simplify the data exploration process.
InsightPilot automatically selects appropriate analysis intents, such as understanding, summarizing, and explaining.
In brief, an IQuery is an abstraction and automation of data analysis operations, which mimics the approach of data analysts.
arXiv Detail & Related papers (2023-04-02T07:27:49Z)
- Modeling Entities as Semantic Points for Visual Information Extraction in the Wild [55.91783742370978]
We propose an alternative approach to precisely and robustly extract key information from document images.
We explicitly model entities as semantic points, i.e., center points of entities are enriched with semantic information describing the attributes and relationships of different entities.
The proposed method can achieve significantly enhanced performance on entity labeling and linking, compared with previous state-of-the-art models.
arXiv Detail & Related papers (2023-03-23T08:21:16Z)
- Analytical Engines With Context-Rich Processing: Towards Efficient Next-Generation Analytics [12.317930859033149]
We envision an analytical engine co-optimized with components that enable context-rich analysis.
We aim for a holistic pipeline cost- and rule-based optimization across relational and model-based operators.
arXiv Detail & Related papers (2022-12-14T21:46:33Z)
- Accessing and Interpreting OPC UA Event Traces based on Semantic Process Descriptions [69.9674326582747]
This paper proposes an approach to access a production system's event data based on the event data's context.
The approach extracts filtered event logs from a database system by combining: 1) a semantic model of a production system's hierarchical structure, 2) a formalized process description and 3) an OPC UA information model.
arXiv Detail & Related papers (2022-07-25T15:13:44Z)
- Process-BERT: A Framework for Representation Learning on Educational Process Data [68.8204255655161]
We propose a framework for learning representations of educational process data.
Our framework consists of a pre-training step that uses BERT-type objectives to learn representations from sequential process data.
We apply our framework to the 2019 Nation's Report Card Data Mining Competition dataset.
arXiv Detail & Related papers (2022-04-28T16:07:28Z)
- Towards an Integrated Platform for Big Data Analysis [4.5257812998381315]
This paper presents the vision of an integrated platform for big data analysis that combines all these aspects.
Main benefits of this approach are an enhanced scalability of the whole platform, a better parameterization of algorithms, and an improved usability during the end-to-end data analysis process.
arXiv Detail & Related papers (2020-04-27T03:15:23Z)
- A Common Operating Picture Framework Leveraging Data Fusion and Deep Learning [0.7348448478819135]
We present a data fusion framework for accelerating solutions for Processing, Exploitation, and Dissemination.
Our platform is a collection of services that extract information from several data sources by leveraging deep learning and other means of processing.
In our first iteration we have focused on visual data (FMV, WAMI, CCTV/PTZ-Cameras, open source video, etc.) and AIS data streams (satellite and terrestrial sources).
arXiv Detail & Related papers (2020-01-16T18:32:19Z)