The Use of NLP-Based Text Representation Techniques to Support
Requirement Engineering Tasks: A Systematic Mapping Review
- URL: http://arxiv.org/abs/2206.00421v1
- Date: Tue, 17 May 2022 02:47:26 GMT
- Authors: Riad Sonbol, Ghaida Rebdawi, Nada Ghneim
- Abstract summary: The research direction has changed from the use of lexical and syntactic features to the use of advanced embedding techniques.
We identify four gaps in the existing literature, why they matter, and how future research can begin to address them.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Natural Language Processing (NLP) is widely used to support the automation of
different Requirements Engineering (RE) tasks. Most of the proposed approaches
start with various NLP steps that analyze requirements statements, extract
their linguistic information, and convert them to easy-to-process
representations, such as lists of features or embedding-based vector
representations. These NLP-based representations are usually used at a later
stage as inputs for machine learning techniques or rule-based methods. Thus,
requirements representations play a major role in determining the accuracy of
different approaches. In this paper, we conducted a survey in the form of a
systematic literature mapping (classification) to find out (1) which
representations are used in the RE literature, (2) what the main focus of
these works is, (3) what the main research directions in this domain are, and
(4) what the gaps and potential future directions are. After compiling an initial
pool of 2,227 papers, and applying a set of inclusion/exclusion criteria, we
obtained a final pool containing 104 relevant papers. Our survey shows that the
research direction has changed from the use of lexical and syntactic features
to the use of advanced embedding techniques, especially in the last two years.
Advanced embedding representations have proved effective in most
RE tasks (such as requirement analysis, extracting requirements from reviews
and forums, and semantic-level quality tasks). However, representations that
are based on lexical and syntactic features are still more appropriate for
other RE tasks (such as modeling and syntax-level quality tasks) since they
provide the required information for the rules and regular expressions used
when handling these tasks. In addition, we identify four gaps in the existing
literature, why they matter, and how future research can begin to address them.
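The abstract's contrast between feature-based representations and rule-based processing can be illustrated with a minimal sketch. It is not taken from the paper: the requirement statements, the vague-term list, and the function names `lexical_features` and `vague_terms` are hypothetical, and only the Python standard library is used. It turns a requirement into a small list of lexical features and applies a regular-expression rule of the kind a syntax-level quality check might rely on.

```python
import re
from collections import Counter

# Hypothetical requirement statements, invented for illustration only.
REQUIREMENTS = [
    "The system shall encrypt all stored passwords.",
    "The user interface should be fast and user-friendly.",
]

# Hypothetical rule: a short list of vague terms (an assumption, not from the paper).
VAGUE_TERMS = re.compile(r"\b(fast|user-friendly|easy|flexible|adequate)\b", re.IGNORECASE)
MODAL = re.compile(r"\b(shall|should|must|will|may)\b", re.IGNORECASE)


def tokenize(text):
    """Lowercase the text and split it into word tokens, keeping hyphenated words whole."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def lexical_features(text):
    """Represent a requirement as a small dictionary of lexical features."""
    tokens = tokenize(text)
    modal = MODAL.search(text)
    return {
        "bag_of_words": Counter(tokens),  # token -> frequency
        "token_count": len(tokens),
        "modal_verb": modal.group(1).lower() if modal else None,
    }


def vague_terms(text):
    """Rule-based check: return the vague terms found in the requirement, in order."""
    return [m.group(1).lower() for m in VAGUE_TERMS.finditer(text)]


if __name__ == "__main__":
    for req in REQUIREMENTS:
        feats = lexical_features(req)
        print(feats["modal_verb"], feats["token_count"], vague_terms(req))
```

The feature dictionary could feed a classifier, while the regular-expression rule operates directly on the surface text. An embedding-based representation would replace the feature dictionary with a dense vector from a pretrained model, which this sketch deliberately leaves out.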
Related papers
- Natural Language Processing for Requirements Traceability [47.93107382627423]
Traceability plays a crucial role in requirements and software engineering, particularly for safety-critical systems.
Natural language processing (NLP) and related techniques have made considerable progress in the past decade.
(2024-05-17T15:17:00Z)
- Tasks People Prompt: A Taxonomy of LLM Downstream Tasks in Software Verification and Falsification Approaches [2.687757575672707]
We develop a novel downstream-task taxonomy to perform classification, mapping, and analysis.
The main taxonomy requirement is to highlight commonalities while exhibiting variation points of task types.
(2024-04-14T23:45:23Z)
- Bridging Research and Readers: A Multi-Modal Automated Academic Papers Interpretation System [47.13932723910289]
We introduce an open-source multi-modal automated academic paper interpretation system (MMAPIS) with a three-step process.
It employs the hybrid modality preprocessing and alignment module to extract plain text, and tables or figures from documents separately.
It then aligns this information based on the section names they belong to, ensuring that data with identical section names are categorized under the same section.
It utilizes the extracted section names to divide the article into shorter text segments, facilitating specific summarizations both within and between sections via LLMs.
(2024-01-17T11:50:53Z)
- Practical Guidelines for the Selection and Evaluation of Natural Language Processing Techniques in Requirements Engineering [8.779031107963942]
Natural language (NL) is now a cornerstone of requirements automation.
With so many different NLP solution strategies available, it can be challenging to choose the right strategy for a specific RE task.
In particular, we discuss how to choose among different strategies such as traditional NLP, feature-based machine learning, and language-model-based methods.
(2024-01-03T02:24:35Z)
- Natural Language Processing for Requirements Formalization: How to Derive New Approaches? [0.32885740436059047]
We present and discuss principal ideas and state-of-the-art methodologies from the field of NLP.
We discuss two different approaches in detail and highlight the iterative development of rule sets.
The presented methods are demonstrated on two industrial use cases from the automotive and railway domains.
(2023-09-23T05:45:19Z)
- Requirement Formalisation using Natural Language Processing and Machine Learning: A Systematic Review [11.292853646607888]
We conducted a systematic literature review to outline the current state-of-the-art of NLP and ML techniques in Requirement Engineering.
We found that NLP approaches are the most common NLP techniques used for automatic RF, primarily operating on structured and semi-structured data.
This study also revealed that Deep Learning (DL) techniques are not widely used; instead, classical ML techniques are predominant in the surveyed studies.
(2023-03-18T17:36:21Z)
- An Inclusive Notion of Text [69.36678873492373]
We argue that clarity on the notion of text is crucial for reproducible and generalizable NLP.
We introduce a two-tier taxonomy of linguistic and non-linguistic elements that are available in textual sources and can be used in NLP modeling.
(2022-11-10T14:26:43Z)
- Recitation-Augmented Language Models [85.30591349383849]
We show that RECITE is a powerful paradigm for knowledge-intensive NLP tasks.
Specifically, we show that by utilizing recitation as the intermediate step, a recite-and-answer scheme can achieve new state-of-the-art performance.
(2022-10-04T00:49:20Z)
- QASem Parsing: Text-to-text Modeling of QA-based Semantics [19.42681342441062]
We consider three QA-based semantic tasks, namely, QA-SRL, QANom and QADiscourse.
We release the first unified QASem parsing tool, practical for downstream applications.
(2022-05-23T15:56:07Z)
- Exploring Multi-Modal Representations for Ambiguity Detection & Coreference Resolution in the SIMMC 2.0 Challenge [60.616313552585645]
We present models for effective Ambiguity Detection and Coreference Resolution in Conversational AI.
Specifically, we use TOD-BERT and LXMERT based models, compare them to a number of baselines and provide ablation experiments.
Our results show that (1) language models are able to exploit correlations in the data to detect ambiguity; and (2) unimodal coreference resolution models can avoid the need for a vision component.
(2022-02-25T12:10:02Z)
- Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
(2020-12-10T01:27:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.