Entity Identifier: A Natural Text Parsing-based Framework For Entity
Relation Extraction
- URL: http://arxiv.org/abs/2307.04892v1
- Date: Mon, 10 Jul 2023 20:30:27 GMT
- Title: Entity Identifier: A Natural Text Parsing-based Framework For Entity
Relation Extraction
- Authors: El Mehdi Chouham, Jessica López Espejel, Mahaman Sanoussi Yahaya Alassan, Walid Dahhane, El Hassane Ettifouri
- Abstract summary: We use natural language processing techniques to extract structured information from requirements descriptions.
To facilitate this process, we introduce a pipeline for extracting entity and relation information.
We also create a dataset to evaluate the effectiveness of our approach.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The field of programming encompasses a diversity of paradigms, each used according to the working framework. While current neural code generation methods are able
to learn and generate code directly from text, we believe that this approach is
not optimal for certain code tasks, particularly the generation of classes in
an object-oriented project. Specifically, we use natural language processing
techniques to extract structured information from requirements descriptions, in
order to automate the generation of CRUD (Create, Read, Update, Delete) class
code. To facilitate this process, we introduce a pipeline for extracting entity
and relation information, as well as a representation called an "Entity Tree"
to model this information. We also create a dataset to evaluate the
effectiveness of our approach.
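
As a rough illustration of the idea, the sketch below shows what an entity-and-relation representation and a naive CRUD class generator could look like. The `EntityNode` structure and `generate_crud_class` helper are hypothetical stand-ins, not the paper's actual Entity Tree implementation.

```python
# Hypothetical sketch of an entity-tree node and a naive CRUD stub
# generator; names and structure are illustrative assumptions, not
# the paper's implementation.
from dataclasses import dataclass, field
from typing import List


@dataclass
class EntityNode:
    """An entity with its attributes and related child entities."""
    name: str
    attributes: List[str] = field(default_factory=list)
    relations: List["EntityNode"] = field(default_factory=list)


def generate_crud_class(entity: EntityNode) -> str:
    """Render a minimal CRUD class skeleton for one entity."""
    attrs = ", ".join(entity.attributes)
    return (
        f"class {entity.name}Repository:\n"
        f"    def create(self, {attrs}): ...\n"
        f"    def read(self, id): ...\n"
        f"    def update(self, id, {attrs}): ...\n"
        f"    def delete(self, id): ...\n"
    )


# Example: a requirement like "a library holds books" might parse into
# this tree; each node then yields one CRUD class skeleton.
book = EntityNode("Book", ["title", "author"])
library = EntityNode("Library", ["name"], relations=[book])
print(generate_crud_class(library))
```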
Related papers
- Learning to Extract Structured Entities Using Language Models [52.281701191329]
Recent advances in machine learning have significantly impacted the field of information extraction.
We reformulate the task to be entity-centric, enabling the use of diverse metrics.
We contribute to the field by introducing Structured Entity Extraction and proposing the Approximate Entity Set OverlaP metric.
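
To make the metric idea concrete, here is a minimal sketch of an approximate entity-set overlap score under simplifying assumptions: entities are plain property dictionaries, greedily paired by Jaccard similarity. The paper's AESOP metric defines matching and scoring more carefully.

```python
# Hedged sketch only: greedy entity matching with per-pair Jaccard
# similarity over (property, value) items, normalized by set size.
def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 1.0


def entity_set_overlap(predicted: list[dict], gold: list[dict]) -> float:
    remaining = list(gold)
    total = 0.0
    for pred in predicted:
        if not remaining:
            break
        items = set(pred.items())
        best = max(remaining, key=lambda g: jaccard(items, set(g.items())))
        total += jaccard(items, set(best.items()))
        remaining.remove(best)
    denom = max(len(predicted), len(gold)) or 1
    return total / denom


pred = [{"name": "Ada Lovelace", "field": "mathematics"}]
gold = [{"name": "Ada Lovelace", "field": "mathematics", "born": "1815"}]
print(entity_set_overlap(pred, gold))  # partial credit: missing property
```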
arXiv Detail & Related papers (2024-02-06T22:15:09Z)
- Informed Named Entity Recognition Decoding for Generative Language Models [3.5323691899538128]
We propose Informed Named Entity Recognition Decoding (iNERD), which treats named entity recognition as a generative process.
We coarse-tune our model on a merged named entity corpus to strengthen its performance, evaluate five generative language models on eight named entity recognition datasets, and achieve remarkable results.
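
A toy sketch of NER framed as generation, assuming a hypothetical `generate()` step and a made-up "surface :: type" output format; iNERD's actual informed decoding is more involved than this.

```python
# Illustrative only: prompt a generative model for entities in a fixed
# line format, then parse the generated text back into (span, type) pairs.
import re

PROMPT = (
    "Extract the named entities from the sentence below as "
    "'surface :: type' lines.\nSentence: {sentence}\nEntities:\n"
)


def parse_entities(generated: str) -> list[tuple[str, str]]:
    """Parse 'surface :: type' lines emitted by the model."""
    pairs = []
    for line in generated.splitlines():
        match = re.match(r"\s*(.+?)\s*::\s*(\w+)\s*$", line)
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs


# A model given PROMPT.format(sentence=...) might generate:
output = "Barack Obama :: PER\nHawaii :: LOC"
print(parse_entities(output))  # [('Barack Obama', 'PER'), ('Hawaii', 'LOC')]
```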
arXiv Detail & Related papers (2023-08-15T14:16:29Z)
- Efficient Guided Generation for Large Language Models [0.21485350418225244]
We show how the problem of neural text generation can be constructively reformulated in terms of transitions between the states of a finite-state machine.
This framework leads to an efficient approach to guiding text generation with regular expressions and context-free grammars.
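
The character-level toy below conveys the mechanism under simplifying assumptions (the paper works at the token level of a real vocabulary): at each decoding step, candidates are masked down to symbols with a valid finite-state-machine transition, so every generated prefix matches the target pattern.

```python
# FSM for the regex [0-9]+\.[0-9]+ (a decimal number), states 0-2.
# The symbol choice below stands in for sampling from real model logits.
TRANSITIONS = {
    (0, "digit"): 1, (1, "digit"): 1, (1, "."): 2, (2, "digit"): 2,
}


def allowed_symbols(state: int, vocab: list[str]) -> list[str]:
    def kind(ch: str) -> str:
        return "digit" if ch.isdigit() else ch
    return [ch for ch in vocab if (state, kind(ch)) in TRANSITIONS]


vocab = list("0123456789.ab")
state, out = 0, ""
for _ in range(4):
    candidates = allowed_symbols(state, vocab)
    ch = candidates[0]  # a real decoder would pick by model probability
    out += ch
    state = TRANSITIONS[(state, "digit" if ch.isdigit() else ch)]
print(out)  # e.g. "0000" -- every prefix is valid under the FSM
```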
arXiv Detail & Related papers (2023-07-19T01:14:49Z)
- A Comprehensive Review of State-of-The-Art Methods for Java Code Generation from Natural Language Text [0.0]
This paper provides a comprehensive review of the evolution and progress of deep learning models in the Java code generation task.
We focus on the most important methods and present their merits and limitations, as well as the objective functions used by the community.
arXiv Detail & Related papers (2023-06-10T07:27:51Z)
- CodeKGC: Code Language Model for Generative Knowledge Graph Construction [46.220237225553234]
Large generative language models trained on structured data such as code have demonstrated impressive capability in understanding natural language for structural prediction and reasoning tasks.
We develop schema-aware prompts that effectively utilize the semantic structure within the knowledge graph.
Experimental results indicate that the proposed approach can obtain better performance on benchmark datasets compared with baselines.
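
A hedged sketch of what a schema-aware, code-style prompt might look like: the relation schema is rendered as Python classes so a code language model can complete the triple list. The class and relation names are invented for illustration and do not reproduce the paper's prompts.

```python
# Hypothetical code-style prompt: schema as classes, extraction as code
# completion. All identifiers here are illustrative assumptions.
SCHEMA = '''
class Entity:
    def __init__(self, name: str):
        self.name = name

class WorksFor:  # relation: (person, organization)
    def __init__(self, head: Entity, tail: Entity):
        self.head, self.tail = head, tail
'''


def build_prompt(text: str) -> str:
    return (
        SCHEMA
        + f'\n# Text: "{text}"\n'
        + "# Complete the extracted triples:\ntriples = [\n"
    )


print(build_prompt("Alice works for Acme Corp."))
# A code LM would be expected to continue with something like:
#     WorksFor(Entity("Alice"), Entity("Acme Corp")),
# ]
```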
arXiv Detail & Related papers (2023-04-18T15:12:34Z)
- Using Document Similarity Methods to create Parallel Datasets for Code Translation [60.36392618065203]
Translating source code from one programming language to another is a critical, time-consuming task.
We propose to use document similarity methods to create noisy parallel datasets of code.
We show that these models perform comparably to models trained on ground truth for reasonable levels of noise.
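
A toy rendition of the pairing step, using a simple Jaccard similarity over identifiers as a stand-in for the document-similarity methods the paper actually evaluates: each source file is matched to its most similar candidate in the other language, yielding a noisy "parallel" example.

```python
# Sketch only: pair code across languages by identifier overlap.
import re


def identifiers(code: str) -> set[str]:
    return set(re.findall(r"[A-Za-z_]\w*", code))


def best_match(src: str, candidates: list[str]) -> str:
    s = identifiers(src)
    return max(
        candidates,
        key=lambda c: len(s & identifiers(c)) / (len(s | identifiers(c)) or 1),
    )


java = "public int addNumbers(int first, int second) { return first + second; }"
python_pool = [
    "def add_numbers(first, second): return first + second",
    "def read_file(path): return open(path).read()",
]
print(best_match(java, python_pool))  # pairs with the addition function
```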
arXiv Detail & Related papers (2021-10-11T17:07:58Z)
- Contrastive Learning for Source Code with Structural and Functional Properties [66.10710134948478]
We present BOOST, a novel self-supervised model to focus pre-training based on the characteristics of source code.
We employ automated, structure-guided code transformation algorithms that generate functionally equivalent code that looks drastically different from the original one.
We train our model in a way that brings the functionally equivalent code closer and distinct code further through a contrastive learning objective.
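
The objective can be sketched as a standard InfoNCE-style contrastive loss, with random vectors standing in for encoder embeddings: the functionally equivalent variant is the positive, unrelated programs are negatives. This illustrates the loss only, not BOOST's architecture or code transformations.

```python
# Schematic contrastive objective over stand-in code embeddings.
import numpy as np


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Cross-entropy of the positive among positive+negative similarities."""
    candidates = np.vstack([positive] + list(negatives))
    candidates /= np.linalg.norm(candidates, axis=1, keepdims=True)
    anchor = anchor / np.linalg.norm(anchor)
    logits = candidates @ anchor / temperature  # cosine similarities
    return -logits[0] + np.log(np.exp(logits).sum())


rng = np.random.default_rng(0)
anchor = rng.normal(size=64)                         # program embedding
positive = anchor + rng.normal(scale=0.1, size=64)   # equivalent variant
negatives = [rng.normal(size=64) for _ in range(7)]  # unrelated programs
print(info_nce(anchor, positive, negatives))  # low loss: pair is aligned
```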
arXiv Detail & Related papers (2021-10-08T02:56:43Z)
- GraphCodeBERT: Pre-training Code Representations with Data Flow [97.00641522327699]
We present GraphCodeBERT, a pre-trained model for programming language that considers the inherent structure of code.
We use data flow in the pre-training stage, which is a semantic-level structure of code that encodes the relation of "where-the-value-comes-from" between variables.
We evaluate our model on four tasks, including code search, clone detection, code translation, and code refinement.
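
A rough illustration of such "where-the-value-comes-from" edges: for a tiny straight-line Python snippet, each variable use is linked to the line of its most recent assignment. GraphCodeBERT's actual graph construction covers far more cases than this sketch.

```python
# Build naive data-flow edges (variable, defined-at-line, used-at-line)
# for simple assignments, using Python's ast module.
import ast

code = "a = 1\nb = a + 2\nc = a + b"
last_def: dict[str, int] = {}
edges = []
for node in ast.walk(ast.parse(code)):
    if isinstance(node, ast.Assign):
        for name in ast.walk(node.value):
            if isinstance(name, ast.Name):
                edges.append((name.id, last_def[name.id], node.lineno))
        last_def[node.targets[0].id] = node.lineno
print(edges)  # [('a', 1, 2), ('a', 1, 3), ('b', 2, 3)]
```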
arXiv Detail & Related papers (2020-09-17T15:25:56Z)
- Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning [73.0598186896953]
We present two self-supervised tasks that learn over raw text with guidance from knowledge graphs.
Building upon entity-level masked language models, our first contribution is an entity masking scheme.
In contrast to existing paradigms, our approach uses knowledge graphs implicitly, only during pre-training.
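
A simplified picture of entity-level masking: instead of masking random subword tokens, every token of a whole entity span is masked so the model must predict the entity from context. In this sketch the spans are given directly; in the paper they come from knowledge-graph links.

```python
# Toy entity-masking scheme over whitespace tokens.
def mask_entities(tokens: list[str], entity_spans: list[tuple[int, int]]):
    masked = list(tokens)
    for start, end in entity_spans:
        for i in range(start, end):
            masked[i] = "[MASK]"
    return masked


tokens = "Marie Curie won the Nobel Prize".split()
spans = [(0, 2), (4, 6)]  # "Marie Curie", "Nobel Prize"
print(mask_entities(tokens, spans))
# ['[MASK]', '[MASK]', 'won', 'the', '[MASK]', '[MASK]']
```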
arXiv Detail & Related papers (2020-04-29T14:22:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.