Related papers: Issue Report Validation in an Industrial Context

Issue Report Validation in an Industrial Context

URL: http://arxiv.org/abs/2311.17662v1
Date: Wed, 29 Nov 2023 14:24:13 GMT
Title: Issue Report Validation in an Industrial Context
Authors: Ethem Utku Aktas, Ebru Cakmak, Mete Cihad Inan, Cemal Yilmaz
Abstract summary: We work on 1,200 randomly selected issue reports in banking domain, written in Turkish. We manually label these reports for validity, and extract the relevant patterns indicating that they are invalid. Using the proposed feature extractors, we utilize a machine learning based approach to predict the issue reports' validity, performing a 0.77 F1-score.
Score: 1.993607565985189
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Effective issue triaging is crucial for software development teams to improve software quality, and thus customer satisfaction. Validating issue reports manually can be time-consuming, hindering the overall efficiency of the triaging process. This paper presents an approach on automating the validation of issue reports to accelerate the issue triaging process in an industrial set-up. We work on 1,200 randomly selected issue reports in banking domain, written in Turkish, an agglutinative language, meaning that new words can be formed with linear concatenation of suffixes to express entire sentences. We manually label these reports for validity, and extract the relevant patterns indicating that they are invalid. Since the issue reports we work on are written in an agglutinative language, we use morphological analysis to extract the features. Using the proposed feature extractors, we utilize a machine learning based approach to predict the issue reports' validity, performing a 0.77 F1-score.

Related papers

Identifying Concurrency Bug Reports via Linguistic Patterns [5.794959117360381]
This paper presents a linguistic-pattern-based framework for automatically identifying bug reports.<n>We derive 58 distinct linguistic patterns from 730 manually labeled bug reports.<n>We evaluate four complementary approaches-matching, learning, prompt-based, and fine-tuning-spanning traditional machine learning, large language models, and pre-trained language models.
arXiv Detail & Related papers (2026-01-22T21:54:14Z)
Towards Terminology Management Automation for Arabic [0.0]
This paper presents a method and supporting tools for automation of terminology management for Arabic. The tools extract lists of parallel terminology matching terms in foreign languages to their Arabic counterparts from field specific texts. This has significant implications as it can be used to improve consistent translation and use of terms in specialized Arabic academic books.
arXiv Detail & Related papers (2025-03-24T23:35:00Z)
English Please: Evaluating Machine Translation for Multilingual Bug Reports [0.0]
This study is the first comprehensive evaluation of machine translation (MT) performance on bug reports. We employ multiple machine translation metrics, including BLEU, BERTScore, COMET, METEOR, and ROUGE. DeepL consistently outperforms the other systems, demonstrating strong lexical and semantic alignment.
arXiv Detail & Related papers (2025-02-20T07:47:03Z)
CELA: Cost-Efficient Language Model Alignment for CTR Prediction [71.85120354973073]
Click-Through Rate (CTR) prediction holds a paramount position in recommender systems. Recent efforts have sought to mitigate these challenges by integrating Pre-trained Language Models (PLMs) We propose textbfCost-textbfEfficient textbfLanguage Model textbfAlignment (textbfCELA) for CTR prediction.
arXiv Detail & Related papers (2024-05-17T07:43:25Z)
FactCheck Editor: Multilingual Text Editor with End-to-End fact-checking [1.985242455423935]
'FactCheck Editor' is an advanced text editor designed to automate fact-checking and correct factual inaccuracies. It supports over 90 languages and utilizes transformer models to assist humans in the labor-intensive process of fact verification.
arXiv Detail & Related papers (2024-04-30T11:55:20Z)
Cross-lingual Contextualized Phrase Retrieval [63.80154430930898]
We propose a new task formulation of dense retrieval, cross-lingual contextualized phrase retrieval. We train our Cross-lingual Contextualized Phrase Retriever (CCPR) using contrastive learning. On the phrase retrieval task, CCPR surpasses baselines by a significant margin, achieving a top-1 accuracy that is at least 13 points higher.
arXiv Detail & Related papers (2024-03-25T14:46:51Z)
MaintainoMATE: A GitHub App for Intelligent Automation of Maintenance Activities [3.2228025627337864]
Software development projects rely on issue tracking systems at the core of tracking maintenance tasks such as bug reports, and enhancement requests. The handling of issue-reports is critical and requires thorough scanning of the text entered in an issue-report making it a labor-intensive task. We present a unified framework called MaintainoMATE, which is capable of automatically categorizing the issue-reports in their respective category and further assigning the issue-reports to a developer with relevant expertise.
arXiv Detail & Related papers (2023-08-31T05:15:42Z)
A Comparative Study of Text Embedding Models for Semantic Text Similarity in Bug Reports [0.0]
Retrieving similar bug reports from an existing database can help reduce the time and effort required to resolve bugs. We explored several embedding models such as TF-IDF (Baseline), FastText, Gensim, BERT, and ADA. Our study provides insights into the effectiveness of different embedding methods for retrieving similar bug reports and highlights the impact of selecting the appropriate one for this task.
arXiv Detail & Related papers (2023-08-17T21:36:56Z)
Automatic Classification of Bug Reports Based on Multiple Text Information and Reports' Intention [37.67372105858311]
This paper proposes a new automatic classification method for bug reports. The innovation is that when categorizing bug reports, in addition to using the text information of the report, the intention of the report is also considered. Our proposed method achieves better performance and its F-Measure achieves from 87.3% to 95.5%.
arXiv Detail & Related papers (2022-08-02T06:44:51Z)
Automatic Issue Classifier: A Transfer Learning Framework for Classifying Issue Reports [0.0]
We use an off-the-shelf neural network called RoBERTa and finetune it to classify the issue reports. This paper presents our approach to classify the issue reports in a multi-label setting. We use an off-the-shelf neural network called RoBERTa and finetune it to classify the issue reports.
arXiv Detail & Related papers (2022-02-12T21:43:08Z)
Competency Problems: On Finding and Removing Artifacts in Language Data [50.09608320112584]
We argue that for complex language understanding tasks, all simple feature correlations are spurious. We theoretically analyze the difficulty of creating data for competency problems when human bias is taken into account.
arXiv Detail & Related papers (2021-04-17T21:34:10Z)
TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing [73.16475763422446]
We propose a multilingual robustness evaluation platform for NLP tasks (TextFlint) It incorporates universal text transformation, task-specific transformation, adversarial attack, subpopulation, and their combinations to provide comprehensive robustness analysis. TextFlint generates complete analytical reports as well as targeted augmented data to address the shortcomings of the model's robustness.
arXiv Detail & Related papers (2021-03-21T17:20:38Z)
Conditioned Text Generation with Transfer for Closed-Domain Dialogue Systems [65.48663492703557]
We show how to optimally train and control the generation of intent-specific sentences using a conditional variational autoencoder. We introduce a new protocol called query transfer that allows to leverage a large unlabelled dataset.
arXiv Detail & Related papers (2020-11-03T14:06:10Z)
Automatic Extraction of Rules Governing Morphological Agreement [103.78033184221373]
We develop an automated framework for extracting a first-pass grammatical specification from raw text. We focus on extracting rules describing agreement, a morphosyntactic phenomenon at the core of the grammars of many of the world's languages. We apply our framework to all languages included in the Universal Dependencies project, with promising results.
arXiv Detail & Related papers (2020-10-02T18:31:45Z)

This list is automatically generated from the titles and abstracts of the papers in this site.