Tag that issue: Applying API-domain labels in issue tracking systems
- URL: http://arxiv.org/abs/2304.02877v1
- Date: Thu, 6 Apr 2023 05:49:46 GMT
- Title: Tag that issue: Applying API-domain labels in issue tracking systems
- Authors: Fabio Santos, Joseph Vargovich, Bianca Trinkenreich, Italo Santos,
Jacob Penney, Ricardo Britto, João Felipe Pimentel, Igor Wiese, Igor
Steinmacher, Anita Sarma, Marco A. Gerosa
- Abstract summary: Labeling issues with the skills required to complete them can help contributors to choose tasks in Open Source Software projects.
We investigate the feasibility and relevance of automatically labeling issues with what we call "API-domains," which are high-level categories of APIs.
Our results show that (i) newcomers consider API-domain labels useful in choosing tasks, (ii) labels can be predicted with a precision of 84% and a recall of 78.6% on average, (iii) predictions reached up to 71.3% precision and 52.5% recall when training on one project and testing on another, and (iv) project contributors consider most predictions helpful in identifying needed skills.
- Score: 20.701637107734996
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Labeling issues with the skills required to complete them can help
contributors to choose tasks in Open Source Software projects. However,
manually labeling issues is time-consuming and error-prone, and current
automated approaches are mostly limited to classifying issues as bugs/non-bugs.
We investigate the feasibility and relevance of automatically labeling issues
with what we call "API-domains," which are high-level categories of APIs.
We posit that the APIs used in the source code affected by an issue
can be a proxy for the type of skills (e.g., DB, security, UI) needed to work
on the issue. We ran a user study (n=74) to assess API-domain labels' relevancy
to potential contributors, leveraged the issues' descriptions and the project
history to build prediction models, and validated the predictions with
contributors (n=20) of the projects. Our results show that (i) newcomers to the
project consider API-domain labels useful in choosing tasks, (ii) labels can be
predicted with a precision of 84% and a recall of 78.6% on average, (iii) the
predictions reached up to 71.3% precision and 52.5% recall when training on
one project and testing on another (transfer learning), and
(iv) project contributors consider most of the predictions helpful in
identifying needed skills. These findings suggest our approach can be applied
in practice to automatically label issues, assisting developers in finding
tasks that better match their skills.
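The core idea above, inferring an issue's required skills from the API domains touched by its related source code, and then scoring predictions against contributor-provided ground truth, can be illustrated with a minimal sketch. This is not the authors' actual prediction model (they train classifiers on issue descriptions and project history); the keyword-to-domain mapping, label set, and toy issues below are hypothetical.

```python
# Hypothetical mapping from API package prefixes to high-level API-domain labels.
API_DOMAINS = {
    "java.sql": "DB",
    "javax.crypto": "Security",
    "javax.swing": "UI",
    "java.net": "Network",
}

def predict_labels(api_references):
    """Assign API-domain labels from the APIs referenced by the code an issue affects."""
    labels = set()
    for ref in api_references:
        for prefix, domain in API_DOMAINS.items():
            if ref.startswith(prefix):
                labels.add(domain)
    return labels

def precision_recall(predicted, actual):
    """Micro-averaged precision and recall over a set of multi-labeled issues."""
    tp = sum(len(p & a) for p, a in zip(predicted, actual))  # correctly predicted labels
    fp = sum(len(p - a) for p, a in zip(predicted, actual))  # predicted but wrong
    fn = sum(len(a - p) for p, a in zip(predicted, actual))  # missed labels
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy example: two issues with the APIs their fixes touched, plus ground-truth labels.
issues = [["java.sql.Connection", "javax.swing.JButton"], ["javax.crypto.Cipher"]]
truth = [{"DB", "UI"}, {"Security", "Network"}]
preds = [predict_labels(apis) for apis in issues]
p, r = precision_recall(preds, truth)  # second issue misses "Network", so recall < 1
```

Because each issue can receive several labels, evaluation is multi-label: precision and recall are computed over label assignments rather than whole issues, matching how the paper reports its 84%/78.6% averages.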
Related papers
- Leveraging Large Language Models for Efficient Failure Analysis in Game Development [47.618236610219554]
This paper proposes a new approach to automatically identify which change in the code caused a test to fail.
The method leverages Large Language Models (LLMs) to associate error messages with the corresponding code changes causing the failure.
Our approach reaches an accuracy of 71% in our newly created dataset, which comprises issues reported by developers at EA over a period of one year.
arXiv Detail & Related papers (2024-06-11T09:21:50Z) - MaintainoMATE: A GitHub App for Intelligent Automation of Maintenance
Activities [3.2228025627337864]
Software development projects rely on issue tracking systems to track maintenance tasks such as bug reports and enhancement requests.
Handling issue reports is critical and requires thorough scanning of the text entered in each report, making it a labor-intensive task.
We present a unified framework called MaintainoMATE, which automatically categorizes issue reports into their respective categories and assigns them to developers with relevant expertise.
arXiv Detail & Related papers (2023-08-31T05:15:42Z) - GiveMeLabeledIssues: An Open Source Issue Recommendation System [9.312780130838952]
Developers often struggle to navigate an Open Source Software (OSS) project's issue-tracking system and find a suitable task.
This paper presents a tool that mines project repositories and labels issues based on the skills required to solve them.
GiveMeLabeledIssues facilitates matching developers' skills to tasks, reducing the burden on project maintainers.
arXiv Detail & Related papers (2023-03-23T16:39:31Z) - Supporting the Task-driven Skill Identification in Open Source Project
Issue Tracking Systems [0.0]
We investigate a strategy of automatically labeling open issues to help contributors pick a task to work on.
We claim that by identifying the required skills, contributor candidates can pick more suitable tasks.
We applied quantitative studies to analyze the relevance of the labels in an experiment and to compare the strategies' relative importance.
arXiv Detail & Related papers (2022-11-02T14:17:22Z) - Automatic Issue Classifier: A Transfer Learning Framework for
Classifying Issue Reports [0.0]
This paper presents an approach to classifying issue reports in a multi-label setting.
We use an off-the-shelf neural network called RoBERTa and fine-tune it to classify the issue reports.
arXiv Detail & Related papers (2022-02-12T21:43:08Z) - Competency Problems: On Finding and Removing Artifacts in Language Data [50.09608320112584]
We argue that for complex language understanding tasks, all simple feature correlations are spurious.
We theoretically analyze the difficulty of creating data for competency problems when human bias is taken into account.
arXiv Detail & Related papers (2021-04-17T21:34:10Z) - Just Label What You Need: Fine-Grained Active Selection for Perception
and Prediction through Partially Labeled Scenes [78.23907801786827]
We introduce generalizations that ensure that our approach is both cost-aware and allows for fine-grained selection of examples through partially labeled scenes.
Our experiments on a real-world, large-scale self-driving dataset suggest that fine-grained selection can improve the performance across perception, prediction, and downstream planning tasks.
arXiv Detail & Related papers (2021-04-08T17:57:41Z) - Can I Solve It? Identifying APIs Required to Complete OSS Task [16.13269535068818]
We investigate the feasibility and relevance of labeling issues with the domain of the APIs required to complete the tasks.
We leverage the issues' description and the project history to build prediction models, which resulted in precision up to 82% and recall up to 97.8%.
Our results can inspire the creation of tools to automatically label issues, helping developers to find tasks that better match their skills.
arXiv Detail & Related papers (2021-03-23T16:16:09Z) - Exploiting Context for Robustness to Label Noise in Active Learning [47.341705184013804]
We address the problems of how a system can identify which of the queried labels are wrong and how a multi-class active learning system can be adapted to minimize the negative impact of label noise.
We construct a graphical representation of the unlabeled data to encode these relationships and obtain new beliefs on the graph when noisy labels are available.
This is demonstrated in three different applications: scene classification, activity classification, and document classification.
arXiv Detail & Related papers (2020-10-18T18:59:44Z) - Adaptive Self-training for Few-shot Neural Sequence Labeling [55.43109437200101]
We develop techniques to address the label scarcity challenge for neural sequence labeling models.
Self-training serves as an effective mechanism to learn from large amounts of unlabeled data, and
meta-learning helps in adaptive sample re-weighting to mitigate error propagation from noisy pseudo-labels.
arXiv Detail & Related papers (2020-10-07T22:29:05Z) - Adversarial Knowledge Transfer from Unlabeled Data [62.97253639100014]
We present a novel Adversarial Knowledge Transfer framework for transferring knowledge from internet-scale unlabeled data to improve the performance of a classifier.
An important novel aspect of our method is that the unlabeled source data can be of different classes from those of the labeled target data, and there is no need to define a separate pretext task.
arXiv Detail & Related papers (2020-08-13T08:04:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.