Issue Report Validation in an Industrial Context
- URL: http://arxiv.org/abs/2311.17662v1
- Date: Wed, 29 Nov 2023 14:24:13 GMT
- Title: Issue Report Validation in an Industrial Context
- Authors: Ethem Utku Aktas, Ebru Cakmak, Mete Cihad Inan, Cemal Yilmaz
- Abstract summary: We work on 1,200 randomly selected issue reports in the banking domain, written in Turkish.
We manually label these reports for validity, and extract the relevant patterns indicating that they are invalid.
Using the proposed feature extractors, we utilize a machine learning-based approach to predict the issue reports' validity, achieving an F1-score of 0.77.
- Score: 1.993607565985189
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Effective issue triaging is crucial for software development teams to improve
software quality, and thus customer satisfaction. Validating issue reports
manually can be time-consuming, hindering the overall efficiency of the
triaging process. This paper presents an approach to automating the validation
of issue reports to accelerate the issue triaging process in an industrial
set-up. We work on 1,200 randomly selected issue reports in the banking domain,
written in Turkish, an agglutinative language, meaning that new words can be
formed with linear concatenation of suffixes to express entire sentences. We
manually label these reports for validity, and extract the relevant patterns
indicating that they are invalid. Since the issue reports we work on are
written in an agglutinative language, we use morphological analysis to extract
the features. Using the proposed feature extractors, we utilize a machine
learning-based approach to predict the issue reports' validity, achieving an
F1-score of 0.77.
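For readers unfamiliar with the reported metric, the F1-score is the harmonic mean of precision and recall. The following is a minimal sketch of that computation, not the authors' evaluation code. The toy labels are invented for illustration, and treating "invalid" as the positive class is an assumption, since the abstract does not say which class the score refers to.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1-score (harmonic mean of precision and recall)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels, invented for illustration (assumption: 1 = invalid, 0 = valid).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

In this toy run there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75.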
Related papers
- FactCheck Editor: Multilingual Text Editor with End-to-End fact-checking [1.985242455423935]
'FactCheck Editor' is an advanced text editor designed to automate fact-checking and correct factual inaccuracies.
It supports over 90 languages and utilizes transformer models to assist humans in the labor-intensive process of fact verification.
arXiv Detail & Related papers (2024-04-30T11:55:20Z)
- Cross-lingual Contextualized Phrase Retrieval [63.80154430930898]
We propose a new task formulation of dense retrieval, cross-lingual contextualized phrase retrieval.
We train our Cross-lingual Contextualized Phrase Retriever (CCPR) using contrastive learning.
On the phrase retrieval task, CCPR surpasses baselines by a significant margin, achieving a top-1 accuracy that is at least 13 points higher.
arXiv Detail & Related papers (2024-03-25T14:46:51Z)
- MaintainoMATE: A GitHub App for Intelligent Automation of Maintenance Activities [3.2228025627337864]
Software development projects rely on issue tracking systems to track maintenance tasks such as bug reports and enhancement requests.
The handling of issue reports is critical and requires thorough scanning of the text entered in each report, making it a labor-intensive task.
We present a unified framework called MaintainoMATE, which is capable of automatically categorizing issue reports into their respective categories and further assigning them to a developer with relevant expertise.
arXiv Detail & Related papers (2023-08-31T05:15:42Z)
- A Comparative Study of Text Embedding Models for Semantic Text Similarity in Bug Reports [0.0]
Retrieving similar bug reports from an existing database can help reduce the time and effort required to resolve bugs.
We explored several embedding models such as TF-IDF (Baseline), FastText, Gensim, BERT, and ADA.
Our study provides insights into the effectiveness of different embedding methods for retrieving similar bug reports and highlights the impact of selecting the appropriate one for this task.
arXiv Detail & Related papers (2023-08-17T21:36:56Z)
- Automatic Classification of Bug Reports Based on Multiple Text Information and Reports' Intention [37.67372105858311]
This paper proposes a new automatic classification method for bug reports.
The innovation is that when categorizing bug reports, in addition to using the text information of the report, the intention of the report is also considered.
Our proposed method achieves better performance, with an F-measure ranging from 87.3% to 95.5%.
arXiv Detail & Related papers (2022-08-02T06:44:51Z)
- TAGPRIME: A Unified Framework for Relational Structure Extraction [71.88926365652034]
TAGPRIME is a sequence tagging model that appends priming words about the information of the given condition to the input text.
With the self-attention mechanism in pre-trained language models, the priming words make the output contextualized representations contain more information about the given condition.
Extensive experiments and analyses on three different tasks that cover ten datasets across five different languages demonstrate the generality and effectiveness of TAGPRIME.
arXiv Detail & Related papers (2022-05-25T08:57:46Z)
- Automatic Issue Classifier: A Transfer Learning Framework for Classifying Issue Reports [0.0]
This paper presents our approach to classifying issue reports in a multi-label setting.
We use an off-the-shelf neural network called RoBERTa and fine-tune it to classify the issue reports.
arXiv Detail & Related papers (2022-02-12T21:43:08Z)
- Competency Problems: On Finding and Removing Artifacts in Language Data [50.09608320112584]
We argue that for complex language understanding tasks, all simple feature correlations are spurious.
We theoretically analyze the difficulty of creating data for competency problems when human bias is taken into account.
arXiv Detail & Related papers (2021-04-17T21:34:10Z)
- TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing [73.16475763422446]
We propose TextFlint, a multilingual robustness evaluation platform for NLP tasks.
It incorporates universal text transformation, task-specific transformation, adversarial attack, subpopulation, and their combinations to provide comprehensive robustness analysis.
TextFlint generates complete analytical reports as well as targeted augmented data to address the shortcomings of the model's robustness.
arXiv Detail & Related papers (2021-03-21T17:20:38Z)
- Conditioned Text Generation with Transfer for Closed-Domain Dialogue Systems [65.48663492703557]
We show how to optimally train and control the generation of intent-specific sentences using a conditional variational autoencoder.
We introduce a new protocol called query transfer that makes it possible to leverage a large unlabelled dataset.
arXiv Detail & Related papers (2020-11-03T14:06:10Z)
- Automatic Extraction of Rules Governing Morphological Agreement [103.78033184221373]
We develop an automated framework for extracting a first-pass grammatical specification from raw text.
We focus on extracting rules describing agreement, a morphosyntactic phenomenon at the core of the grammars of many of the world's languages.
We apply our framework to all languages included in the Universal Dependencies project, with promising results.
arXiv Detail & Related papers (2020-10-02T18:31:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.