Automatic Issue Classifier: A Transfer Learning Framework for Classifying Issue Reports
- URL: http://arxiv.org/abs/2202.06149v1
- Date: Sat, 12 Feb 2022 21:43:08 GMT
- Title: Automatic Issue Classifier: A Transfer Learning Framework for Classifying Issue Reports
- Authors: Anas Nadeem, Muhammad Usman Sarwar and Muhammad Zubair Malik
- Abstract summary: This paper presents our approach to classifying issue reports in a multi-label setting. We use an off-the-shelf neural network called RoBERTa and fine-tune it to classify the issue reports.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Issue tracking systems are used in the software industry for the facilitation
of maintenance activities that keep the software robust and up to date with
ever-changing industry requirements. Usually, users report issues that can be
categorized into different labels such as bug reports, enhancement requests,
and questions related to the software. Most of the issue tracking systems make
the labelling of these issue reports optional for the issue submitter, which
leads to a large number of unlabeled issue reports. In this paper, we present a
state-of-the-art method to classify the issue reports into their respective
categories, i.e., bug, enhancement, and question. This is a challenging task
because of the common use of informal language in the issue reports. Existing
studies use traditional natural language processing approaches adopting
key-word based features, which fail to incorporate the contextual relationship
between words and therefore result in a high rate of false positives and false
negatives. Moreover, previous works utilize a uni-label approach to classify
the issue reports; however, in reality, an issue submitter can tag one issue
report with more than one label at a time. This paper presents our approach to
classify the issue reports in a multi-label setting. We use an off-the-shelf
neural network called RoBERTa and fine-tune it to classify the issue reports.
We validate our approach on issue reports belonging to numerous industrial
projects from GitHub. We were able to achieve promising F-1 scores of 81%, 74%,
and 80% for bug reports, enhancements, and questions, respectively. We also
develop an industry tool called Automatic Issue Classifier (AIC), which
automatically assigns labels to newly reported issues on GitHub repositories
with high accuracy.
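To make the fine-tuning setup concrete, here is a minimal sketch of this kind of multi-label classification, assuming the HuggingFace Transformers library; the base checkpoint, label names, decision threshold, and helper function are illustrative assumptions, not necessarily the authors' exact configuration.

```python
# Minimal sketch: multi-label issue classification with a RoBERTa classification head.
# Assumes the HuggingFace Transformers library; label set and threshold are illustrative.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["bug", "enhancement", "question"]

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base",
    num_labels=len(LABELS),
    problem_type="multi_label_classification",  # per-label sigmoid + BCE loss
)

def classify_issue(title: str, body: str, threshold: float = 0.5) -> list[str]:
    """Return every label whose sigmoid score exceeds the threshold."""
    inputs = tokenizer(f"{title} {body}", truncation=True, max_length=512,
                       return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.sigmoid(logits).squeeze(0)
    return [label for label, p in zip(LABELS, probs) if p >= threshold]

# The head is randomly initialized until fine-tuned on labeled issue reports.
print(classify_issue("App crashes on startup", "Stack trace attached, worked in v1.2"))
```

With `problem_type="multi_label_classification"`, fine-tuning uses an independent sigmoid and binary cross-entropy loss per label, so a single issue report can receive any subset of the three labels rather than exactly one.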
Related papers
- Federated Learning with Only Positive Labels by Exploring Label Correlations [78.59613150221597]
Federated learning aims to collaboratively learn a model by using the data from multiple users under privacy constraints.
In this paper, we study the multi-label classification problem under the federated learning setting.
We propose a novel and generic method termed Federated Averaging by exploring Label Correlations (FedALC)
arXiv Detail & Related papers (2024-04-24T02:22:50Z)
- Issue Report Validation in an Industrial Context [1.993607565985189]
We work on 1,200 randomly selected issue reports in the banking domain, written in Turkish.
We manually label these reports for validity, and extract the relevant patterns indicating that they are invalid.
Using the proposed feature extractors, we utilize a machine learning based approach to predict the issue reports' validity, achieving a 0.77 F1-score.
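As a rough illustration of validity prediction as binary text classification (a sketch under assumptions; the paper's own pattern-based feature extractors are not reproduced here, and TF-IDF features stand in for them):

```python
# Hypothetical sketch: predicting issue-report validity as binary text classification.
# Assumes scikit-learn; the toy data and TF-IDF features are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data: (report text, 1 = valid, 0 = invalid).
reports = [
    "Transfer screen freezes after entering the amount",
    "please close, duplicate of an earlier ticket",
    "Login fails with error code 504 on the mobile app",
    "no details provided",
]
labels = [1, 0, 1, 0]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(reports, labels)

print(clf.predict(["App crashes when exporting the monthly statement"]))
```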
arXiv Detail & Related papers (2023-11-29T14:24:13Z)
- MaintainoMATE: A GitHub App for Intelligent Automation of Maintenance Activities [3.2228025627337864]
Software development projects rely on issue tracking systems to track maintenance tasks such as bug reports and enhancement requests.
Handling issue reports is critical and requires thorough scanning of the text entered in each report, making it a labor-intensive task.
We present a unified framework called MaintainoMATE, which is capable of automatically categorizing issue reports into their respective categories and further assigning them to a developer with relevant expertise.
arXiv Detail & Related papers (2023-08-31T05:15:42Z)
- Description-Enhanced Label Embedding Contrastive Learning for Text Classification [65.01077813330559]
The paper introduces Self-Supervised Learning (SSL) into the model learning process and designs a novel self-supervised Relation of Relation (R2) classification task.
It proposes a Relation of Relation Learning Network (R2-Net) for text classification, in which text classification and R2 classification are treated as optimization targets.
It also exploits external knowledge from WordNet to obtain multi-aspect descriptions for label semantic learning.
arXiv Detail & Related papers (2023-06-15T02:19:34Z)
- Auto-labelling of Bug Report using Natural Language Processing [0.0]
Rule and Query-based solutions recommend a long list of potential similar bug reports with no clear ranking.
In this paper, we have proposed a solution using a combination of NLP techniques.
It uses a custom data transformer, a deep neural network, and a non-generalizing machine learning method to retrieve existing identical bug reports.
arXiv Detail & Related papers (2022-12-13T02:32:42Z)
- Using Developer Discussions to Guide Fixing Bugs in Software [51.00904399653609]
We propose using bug report discussions, which are available before the task is performed and are also naturally occurring, avoiding the need for additional information from developers.
We demonstrate that various forms of natural language context derived from such discussions can aid bug-fixing, even leading to improved performance over using commit messages corresponding to the oracle bug-fixing commits.
arXiv Detail & Related papers (2022-11-11T16:37:33Z)
- Automatic Classification of Bug Reports Based on Multiple Text Information and Reports' Intention [37.67372105858311]
This paper proposes a new automatic classification method for bug reports.
The innovation is that when categorizing bug reports, in addition to using the text information of the report, the intention of the report is also considered.
Our proposed method achieves better performance, with an F-Measure ranging from 87.3% to 95.5%.
arXiv Detail & Related papers (2022-08-02T06:44:51Z)
- Predicting Issue Types on GitHub [8.791809365994682]
Ticket Tagger is a GitHub app analyzing the issue title and description through machine learning techniques.
We empirically evaluated the tool's prediction performance on about 30,000 GitHub issues.
arXiv Detail & Related papers (2021-07-21T08:14:48Z)
- S3M: Siamese Stack (Trace) Similarity Measure [55.58269472099399]
We present S3M -- the first approach to computing stack trace similarity based on deep learning.
It is based on a biLSTM encoder and a fully-connected classifier to compute similarity.
Our experiments demonstrate the superiority of our approach over the state-of-the-art on both open-sourced data and a private JetBrains dataset.
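As a rough illustration of that biLSTM-plus-classifier shape (a sketch under assumptions, not the S3M implementation; the vocabulary size, dimensions, and frame-id embedding step are invented here), a Siamese similarity model could look like:

```python
# Hypothetical sketch of a Siamese stack-trace similarity model: a shared biLSTM
# encodes each trace (as a sequence of frame ids), and a fully-connected head
# scores how likely the two traces stem from the same underlying failure.
import torch
import torch.nn as nn

class SiameseTraceSimilarity(nn.Module):
    def __init__(self, vocab_size=10_000, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.classifier = nn.Sequential(
            nn.Linear(4 * hidden_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 1)
        )

    def encode(self, trace):
        # Mean-pool the biLSTM outputs into a single trace embedding.
        out, _ = self.encoder(self.embed(trace))
        return out.mean(dim=1)

    def forward(self, trace_a, trace_b):
        a, b = self.encode(trace_a), self.encode(trace_b)
        return torch.sigmoid(self.classifier(torch.cat([a, b], dim=-1))).squeeze(-1)

model = SiameseTraceSimilarity()
a = torch.randint(0, 10_000, (1, 30))  # two traces, each 30 frame ids long
b = torch.randint(0, 10_000, (1, 30))
print(model(a, b))  # similarity score in [0, 1]
```

A pairwise score of this kind can then be used to rank existing reports against a newly submitted one when searching for duplicates.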
arXiv Detail & Related papers (2021-03-18T21:10:41Z)
- Few-shot Slot Tagging with Collapsed Dependency Transfer and Label-enhanced Task-adaptive Projection Network [61.94394163309688]
We propose a Label-enhanced Task-Adaptive Projection Network (L-TapNet) based on the state-of-the-art few-shot classification model -- TapNet.
Experimental results show that our model significantly outperforms the strongest few-shot learning baseline by 14.64 F1 points in the one-shot setting.
arXiv Detail & Related papers (2020-06-10T07:50:44Z)
- Interaction Matching for Long-Tail Multi-Label Classification [57.262792333593644]
We present an elegant and effective approach for addressing limitations in existing multi-label classification models.
By performing soft n-gram interaction matching, we match labels with natural language descriptions.
arXiv Detail & Related papers (2020-05-18T15:27:55Z)