Automatic Issue Classifier: A Transfer Learning Framework for Classifying Issue Reports
- URL: http://arxiv.org/abs/2202.06149v1
- Date: Sat, 12 Feb 2022 21:43:08 GMT
- Title: Automatic Issue Classifier: A Transfer Learning Framework for Classifying Issue Reports
- Authors: Anas Nadeem, Muhammad Usman Sarwar and Muhammad Zubair Malik
- Abstract summary: This paper presents our approach to classifying issue reports in a multi-label setting. We use an off-the-shelf neural network called RoBERTa and fine-tune it to classify the issue reports.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Issue tracking systems are used in the software industry for the facilitation
of maintenance activities that keep the software robust and up to date with
ever-changing industry requirements. Usually, users report issues that can be
categorized into different labels such as bug reports, enhancement requests,
and questions related to the software. Most of the issue tracking systems make
the labelling of these issue reports optional for the issue submitter, which
leads to a large number of unlabeled issue reports. In this paper, we present a
state-of-the-art method to classify the issue reports into their respective
categories, i.e., bug, enhancement, and question. This is a challenging task
because of the common use of informal language in the issue reports. Existing
studies use traditional natural language processing approaches adopting
key-word based features, which fail to incorporate the contextual relationship
between words and therefore result in a high rate of false positives and false
negatives. Moreover, previous works utilize a uni-label approach to classify
the issue reports; however, in reality, an issue submitter can tag one issue
report with more than one label at a time. This paper presents our approach to
classify the issue reports in a multi-label setting. We use an off-the-shelf
neural network called RoBERTa and fine-tune it to classify the issue reports.
We validate our approach on issue reports belonging to numerous industrial
projects from GitHub. We were able to achieve promising F-1 scores of 81%, 74%,
and 80% for bug reports, enhancements, and questions, respectively. We also
develop an industry tool called Automatic Issue Classifier (AIC), which
automatically assigns labels to newly reported issues on GitHub repositories
with high accuracy.
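To make the fine-tuning setup concrete, here is a minimal sketch of this kind of multi-label classification, assuming the HuggingFace Transformers library; the base checkpoint, label names, decision threshold, and helper function are illustrative assumptions, not necessarily the authors' exact configuration.

```python
# Minimal sketch: multi-label issue classification with a RoBERTa classification head.
# Assumes the HuggingFace Transformers library; label set and threshold are illustrative.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["bug", "enhancement", "question"]

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base",
    num_labels=len(LABELS),
    problem_type="multi_label_classification",  # per-label sigmoid + BCE loss
)

def classify_issue(title: str, body: str, threshold: float = 0.5) -> list[str]:
    """Return every label whose sigmoid score exceeds the threshold."""
    inputs = tokenizer(f"{title} {body}", truncation=True, max_length=512,
                       return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.sigmoid(logits).squeeze(0)
    return [label for label, p in zip(LABELS, probs) if p >= threshold]

# The head is randomly initialized until fine-tuned on labeled issue reports.
print(classify_issue("App crashes on startup", "Stack trace attached, worked in v1.2"))
```

With `problem_type="multi_label_classification"`, fine-tuning uses an independent sigmoid and binary cross-entropy loss per label, so a single issue report can receive any subset of the three labels rather than exactly one.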
Related papers
- Federated Learning with Only Positive Labels by Exploring Label Correlations [78.59613150221597]
Federated learning aims to collaboratively learn a model by using the data from multiple users under privacy constraints.
In this paper, we study the multi-label classification problem under the federated learning setting.
We propose a novel and generic method termed Federated Averaging by exploring Label Correlations (FedALC)
arXiv Detail & Related papers (2024-04-24T02:22:50Z)
- Issue Report Validation in an Industrial Context [1.993607565985189]
We work on 1,200 randomly selected issue reports in the banking domain, written in Turkish.
We manually label these reports for validity, and extract the relevant patterns indicating that they are invalid.
Using the proposed feature extractors, we utilize a machine learning based approach to predict the issue reports' validity, achieving a 0.77 F1-score.
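As a rough illustration of validity prediction as binary text classification (a sketch under assumptions; the paper's own pattern-based feature extractors are not reproduced here, and TF-IDF features stand in for them):

```python
# Hypothetical sketch: predicting issue-report validity as binary text classification.
# Assumes scikit-learn; the toy data and TF-IDF features are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data: (report text, 1 = valid, 0 = invalid).
reports = [
    "Transfer screen freezes after entering the amount",
    "please close, duplicate of an earlier ticket",
    "Login fails with error code 504 on the mobile app",
    "no details provided",
]
labels = [1, 0, 1, 0]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(reports, labels)

print(clf.predict(["App crashes when exporting the monthly statement"]))
```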
arXiv Detail & Related papers (2023-11-29T14:24:13Z)
- MaintainoMATE: A GitHub App for Intelligent Automation of Maintenance Activities [3.2228025627337864]
Software development projects rely on issue tracking systems to track maintenance tasks such as bug reports and enhancement requests.
Handling issue reports is critical and requires thorough scanning of the text entered in each report, making it a labor-intensive task.
We present a unified framework called MaintainoMATE, which is capable of automatically categorizing issue reports into their respective categories and further assigning them to a developer with relevant expertise.
arXiv Detail & Related papers (2023-08-31T05:15:42Z)
- Description-Enhanced Label Embedding Contrastive Learning for Text Classification [65.01077813330559]
The paper introduces Self-Supervised Learning (SSL) into the model learning process and designs a novel self-supervised Relation of Relation (R2) classification task.
It proposes a Relation of Relation Learning Network (R2-Net) for text classification, in which text classification and R2 classification are treated as optimization targets.
It also exploits external knowledge from WordNet to obtain multi-aspect descriptions for label semantic learning.
arXiv Detail & Related papers (2023-06-15T02:19:34Z)
- Auto-labelling of Bug Report using Natural Language Processing [0.0]
Rule and Query-based solutions recommend a long list of potential similar bug reports with no clear ranking.
In this paper, we have proposed a solution using a combination of NLP techniques.
It uses a custom data transformer, a deep neural network, and a non-generalizing machine learning method to retrieve existing identical bug reports.
arXiv Detail & Related papers (2022-12-13T02:32:42Z)
- Using Developer Discussions to Guide Fixing Bugs in Software [51.00904399653609]
We propose using bug report discussions, which are available before the task is performed and are also naturally occurring, avoiding the need for additional information from developers.
We demonstrate that various forms of natural language context derived from such discussions can aid bug-fixing, even leading to improved performance over using commit messages corresponding to the oracle bug-fixing commits.
arXiv Detail & Related papers (2022-11-11T16:37:33Z)
- Automatic Classification of Bug Reports Based on Multiple Text Information and Reports' Intention [37.67372105858311]
This paper proposes a new automatic classification method for bug reports.
The innovation is that when categorizing bug reports, in addition to using the text information of the report, the intention of the report is also considered.
Our proposed method achieves better performance, with an F-Measure ranging from 87.3% to 95.5%.
arXiv Detail & Related papers (2022-08-02T06:44:51Z)
- Predicting Issue Types on GitHub [8.791809365994682]
Ticket Tagger is a GitHub app analyzing the issue title and description through machine learning techniques.
We empirically evaluated the tool's prediction performance on about 30,000 GitHub issues.
arXiv Detail & Related papers (2021-07-21T08:14:48Z)
- S3M: Siamese Stack (Trace) Similarity Measure [55.58269472099399]
We present S3M -- the first approach to computing stack trace similarity based on deep learning.
It is based on a biLSTM encoder and a fully-connected classifier to compute similarity.
Our experiments demonstrate the superiority of our approach over the state-of-the-art on both open-sourced data and a private JetBrains dataset.
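As a rough illustration of that biLSTM-plus-classifier shape (a sketch under assumptions, not the S3M implementation; the vocabulary size, dimensions, and frame-id embedding step are invented here), a Siamese similarity model could look like:

```python
# Hypothetical sketch of a Siamese stack-trace similarity model: a shared biLSTM
# encodes each trace (as a sequence of frame ids), and a fully-connected head
# scores how likely the two traces stem from the same underlying failure.
import torch
import torch.nn as nn

class SiameseTraceSimilarity(nn.Module):
    def __init__(self, vocab_size=10_000, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.classifier = nn.Sequential(
            nn.Linear(4 * hidden_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 1)
        )

    def encode(self, trace):
        # Mean-pool the biLSTM outputs into a single trace embedding.
        out, _ = self.encoder(self.embed(trace))
        return out.mean(dim=1)

    def forward(self, trace_a, trace_b):
        a, b = self.encode(trace_a), self.encode(trace_b)
        return torch.sigmoid(self.classifier(torch.cat([a, b], dim=-1))).squeeze(-1)

model = SiameseTraceSimilarity()
a = torch.randint(0, 10_000, (1, 30))  # two traces, each 30 frame ids long
b = torch.randint(0, 10_000, (1, 30))
print(model(a, b))  # similarity score in [0, 1]
```

A pairwise score of this kind can then be used to rank existing reports against a newly submitted one when searching for duplicates.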
arXiv Detail & Related papers (2021-03-18T21:10:41Z)
- Few-shot Slot Tagging with Collapsed Dependency Transfer and Label-enhanced Task-adaptive Projection Network [61.94394163309688]
We propose a Label-enhanced Task-Adaptive Projection Network (L-TapNet) based on the state-of-the-art few-shot classification model -- TapNet.
Experimental results show that our model significantly outperforms the strongest few-shot learning baseline by 14.64 F1 points in the one-shot setting.
arXiv Detail & Related papers (2020-06-10T07:50:44Z)
- Interaction Matching for Long-Tail Multi-Label Classification [57.262792333593644]
We present an elegant and effective approach for addressing limitations in existing multi-label classification models.
By performing soft n-gram interaction matching, we match labels with natural language descriptions.
arXiv Detail & Related papers (2020-05-18T15:27:55Z)