PET: A new Dataset for Process Extraction from Natural Language Text
- URL: http://arxiv.org/abs/2203.04860v1
- Date: Wed, 9 Mar 2022 16:33:59 GMT
- Title: PET: A new Dataset for Process Extraction from Natural Language Text
- Authors: Patrizio Bellan, Han van der Aa, Mauro Dragoni, Chiara Ghidini and
Simone Paolo Ponzetto
- Abstract summary: We develop the first corpus of business process descriptions annotated with activities, gateways, actors and flow information.
We present our new resource, including a detailed overview of the annotation schema and guidelines, as well as a variety of baselines to benchmark the difficulty and challenges of business process extraction from text.
- Score: 15.16406344719132
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although there is a long tradition of work in NLP on extracting entities and
relations from text, to date there exists little work on the acquisition of
business processes from unstructured data such as textual corpora of process
descriptions. With this work we aim at filling this gap and establishing the
first steps towards bridging data-driven information extraction methodologies
from Natural Language Processing and the model-based formalization that is
pursued in Business Process Management. For this, we develop the first corpus
of business process descriptions annotated with activities, gateways, actors
and flow information. We present our new resource, including a detailed
overview of the annotation schema and guidelines, as well as a variety of
baselines to benchmark the difficulty and challenges of business process
extraction from text.
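To make the described annotation layer more concrete, the sketch below shows what a token-level record with actors, activities, and a flow relation between activities could look like. It is an illustrative assumption only: the field names, tag labels, and relation types are hypothetical and are not claimed to match the exact PET release format.

```python
# A minimal, hypothetical sketch of a token-level annotation record for a
# process description. Field names, BIO labels, and relation types are
# illustrative assumptions, not the exact PET schema.

example = {
    "tokens": [
        "The", "clerk", "checks", "the", "invoice", "and", "then",
        "forwards", "it", "to", "the", "manager", ".",
    ],
    # BIO-style tags marking actors and activities in the sentence.
    "tags": [
        "O", "B-Actor", "B-Activity", "O", "O", "O", "O",
        "B-Activity", "O", "O", "O", "B-Actor", "O",
    ],
    # Relations over annotated spans: behavioural flow between the two
    # activities and which actor performs each of them.
    "relations": [
        {"head": "checks", "tail": "forwards", "type": "flow"},
        {"head": "clerk",  "tail": "checks",   "type": "performer"},
        {"head": "clerk",  "tail": "forwards", "type": "performer"},
    ],
}


def extract_flow(record):
    """Return the ordered activity pairs connected by a flow relation."""
    return [(r["head"], r["tail"]) for r in record["relations"] if r["type"] == "flow"]


if __name__ == "__main__":
    print(extract_flow(example))  # [('checks', 'forwards')]
```

A record like this supports both entity-level baselines (token classification over the tags) and relation-level baselines (classifying links between annotated spans), which is the kind of benchmarking the abstract refers to.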
Related papers
- Natural Language Processing for Requirements Traceability [47.93107382627423]
Traceability plays a crucial role in requirements and software engineering, particularly for safety-critical systems.
Natural language processing (NLP) and related techniques have made considerable progress in the past decade.
arXiv Detail & Related papers (2024-05-17T15:17:00Z)
- From Dialogue to Diagram: Task and Relationship Extraction from Natural Language for Accelerated Business Process Prototyping [0.0]
This paper introduces a contemporary solution; central to our approach is the use of dependency parsing and Named Entity Recognition (NER).
We utilize Subject-Verb-Object (SVO) constructs for identifying action relationships and integrate semantic analysis tools, including WordNet, for enriched contextual understanding (a minimal SVO-extraction sketch appears after this list).
The system adeptly handles data transformation and visualization, converting verbose extracted information into BPMN (Business Process Model and Notation) diagrams.
arXiv Detail & Related papers (2023-12-16T12:35:28Z)
- Beyond Rule-based Named Entity Recognition and Relation Extraction for Process Model Generation from Natural Language Text [0.0]
We present an extension to an existing pipeline to make it entirely data-driven.
We demonstrate the competitiveness of our improved pipeline, which also eliminates the substantial overhead associated with feature engineering and rule definition.
We propose an extension to the PET dataset that incorporates information about linguistic references and a corresponding method for resolving them.
arXiv Detail & Related papers (2023-05-06T07:06:47Z)
- Improving Keyphrase Extraction with Data Augmentation and Information Filtering [67.43025048639333]
Keyphrase extraction is one of the essential tasks for document understanding in NLP.
We present a novel corpus and method for keyphrase extraction from the videos streamed on the Behance platform.
arXiv Detail & Related papers (2022-09-11T22:38:02Z)
- TAGPRIME: A Unified Framework for Relational Structure Extraction [71.88926365652034]
TAGPRIME is a sequence tagging model that appends priming words about the given condition to the input text (a minimal sketch of this priming step appears after this list).
With the self-attention mechanism in pre-trained language models, the priming words make the output contextualized representations contain more information about the given condition.
Extensive experiments and analyses on three different tasks that cover ten datasets across five different languages demonstrate the generality and effectiveness of TAGPRIME.
arXiv Detail & Related papers (2022-05-25T08:57:46Z)
- Zero-Shot Information Extraction as a Unified Text-to-Triple Translation [56.01830747416606]
We cast a suite of information extraction tasks into a text-to-triple translation framework.
We formalize the task as a translation between task-specific input text and output triples.
We study the zero-shot performance of this framework on open information extraction.
arXiv Detail & Related papers (2021-09-23T06:54:19Z)
- Extracting Semantic Process Information from the Natural Language in Event Logs [0.1827510863075184]
We present an approach that achieves this through so-called semantic role labeling of event data.
In this manner, our approach extracts information about up to eight semantic roles per event.
arXiv Detail & Related papers (2021-03-06T08:39:04Z)
- Knowledge-Aware Procedural Text Understanding with Multi-Stage Training [110.93934567725826]
We focus on the task of procedural text understanding, which aims to comprehend such documents and track entities' states and locations during a process.
Two challenges, the difficulty of commonsense reasoning and data insufficiency, still remain unsolved.
We propose a novel KnOwledge-Aware proceduraL text understAnding (KOALA) model, which effectively leverages multiple forms of external knowledge.
arXiv Detail & Related papers (2020-09-28T10:28:40Z)
- Automatic Business Process Structure Discovery using Ordered Neurons LSTM: A Preliminary Study [6.6599132213053185]
We propose to retrieve latent semantic hierarchical structure present in business process documents by building a neural network.
We tested the proposed approach on a dataset of Process Description Documents (PDD) from our practical Robotic Process Automation (RPA) projects.
arXiv Detail & Related papers (2020-01-05T14:19:11Z)
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer [64.22926988297685]
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP).
In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format.
arXiv Detail & Related papers (2019-10-23T17:37:36Z)
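As referenced in the "From Dialogue to Diagram" entry above, a common way to mine action relationships from process descriptions is Subject-Verb-Object extraction over a dependency parse. The sketch below is a minimal illustration of that idea using spaCy; it is not the authors' pipeline, and the dependency labels and example sentence are assumptions.

```python
# Minimal sketch of Subject-Verb-Object (SVO) extraction via dependency
# parsing, in the spirit of the "From Dialogue to Diagram" entry above.
# Requires: pip install spacy && python -m spacy download en_core_web_sm

import spacy

nlp = spacy.load("en_core_web_sm")


def extract_svo(text):
    """Yield (subject, verb, object) triples found via dependency arcs."""
    doc = nlp(text)
    for token in doc:
        if token.pos_ != "VERB":
            continue
        subjects = [c for c in token.children if c.dep_ in ("nsubj", "nsubjpass")]
        objects = [c for c in token.children if c.dep_ in ("dobj", "obj", "attr")]
        for subj in subjects:
            for obj in objects:
                yield (subj.text, token.lemma_, obj.text)


if __name__ == "__main__":
    sentence = "The clerk checks the invoice and the manager approves the payment."
    print(list(extract_svo(sentence)))
    # Expected (roughly): [('clerk', 'check', 'invoice'), ('manager', 'approve', 'payment')]
```

Passive voice, conjoined objects, and pronoun references would need extra handling; the point here is only the mapping from dependency arcs to candidate (actor, action, object) triples that a downstream BPMN generator could consume.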
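The TAGPRIME entry above describes conditioning a sequence tagger by appending priming words about the given condition to the input text. The sketch below is a hypothetical reconstruction of that input-construction step only; the template, separator, and dummy tagger are assumptions and do not reflect the authors' implementation.

```python
# Hypothetical sketch of the "priming" idea described for TAGPRIME above:
# words describing the given condition are appended to the sentence before
# sequence tagging, so the contextualized representations carry that
# condition. Template and separator are assumptions.

def build_primed_input(sentence, condition_words):
    """Append priming words about the condition to the input sentence."""
    return f"{sentence} [SEP] {' '.join(condition_words)}"


def tag_with_condition(sentence, condition_words, tagger):
    """Run any token-level tagger on the primed input.

    `tagger` stands in for a pre-trained sequence-tagging model; here it is
    any callable mapping a string to a list of (token, label) pairs.
    """
    primed = build_primed_input(sentence, condition_words)
    return tagger(primed)


def dummy_tagger(text):
    # Stand-in tagger so the sketch runs end to end: tags every token "O".
    return [(tok, "O") for tok in text.split()]


if __name__ == "__main__":
    sent = "The clerk forwards the invoice to the manager."
    # Hypothetical condition: find the object of the relation "forwards".
    print(tag_with_condition(sent, ["relation:", "forwards", "object"], dummy_tagger))
```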