Can I Solve It? Identifying APIs Required to Complete OSS Task
- URL: http://arxiv.org/abs/2103.12653v1
- Date: Tue, 23 Mar 2021 16:16:09 GMT
- Title: Can I Solve It? Identifying APIs Required to Complete OSS Task
- Authors: Fabio Santos, Igor Wiese, Bianca Trinkenreich, Igor Steinmacher, Anita Sarma and Marco Gerosa
- Abstract summary: We investigate the feasibility and relevance of labeling issues with the domain of the APIs required to complete the tasks.
We leverage the issues' description and the project history to build prediction models, which resulted in precision up to 82% and recall up to 97.8%.
Our results can inspire the creation of tools to automatically label issues, helping developers to find tasks that better match their skills.
- Score: 16.13269535068818
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Open Source Software projects add labels to open issues to help contributors
choose tasks. However, manually labeling issues is time-consuming and
error-prone. Current automatic approaches for creating labels are mostly
limited to classifying issues as a bug/non-bug. In this paper, we investigate
the feasibility and relevance of labeling issues with the domain of the APIs
required to complete the tasks. We leverage the issues' description and the
project history to build prediction models, which resulted in precision up to
82% and recall up to 97.8%. We also ran a user study (n=74) to assess these
labels' relevancy to potential contributors. The results show that the labels
were useful to participants in choosing tasks, and the API-domain labels were
selected more often than the existing architecture-based labels. Our results
can inspire the creation of tools to automatically label issues, helping
developers to find tasks that better match their skills.
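The abstract describes the prediction models only at a high level. As a rough illustration of the kind of multi-label pipeline it mentions (issue text in, API-domain labels out), a minimal sketch follows; the toy issues, the label set, and the TF-IDF plus logistic regression model are assumptions made for illustration, not the authors' actual setup.

```python
# Minimal sketch of multi-label API-domain prediction from issue text.
# The toy issues, label set, and TF-IDF + logistic regression pipeline
# are illustrative assumptions, not the paper's actual models.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical issue descriptions and their API-domain labels.
issues = [
    "Window flickers when resizing on Linux",
    "HTTP request to the update server times out",
    "Crash when parsing malformed JSON settings file",
    "Toolbar icons are blurry on high-DPI displays",
]
labels = [["UI"], ["Network"], ["IO", "Error Handling"], ["UI"]]

mlb = MultiLabelBinarizer()
y = mlb.fit_transform(labels)  # multi-hot label matrix

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),  # issue text as features
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),
)
model.fit(issues, y)

pred = model.predict(["Dialog button label renders off-screen"])
print(mlb.inverse_transform(pred))  # e.g. [('UI',)]
```

On a held-out set of issues, per-label precision and recall (e.g. via sklearn.metrics.precision_score and recall_score) would correspond to the kind of figures the paper reports.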
Related papers
- Imprecise Label Learning: A Unified Framework for Learning with Various Imprecise Label Configurations [91.67511167969934]
Imprecise label learning (ILL) is a unified framework for learning with various imprecise label configurations.
We demonstrate that ILL can seamlessly adapt to partial label learning, semi-supervised learning, noisy label learning, and, more importantly, a mixture of these settings.
arXiv Detail & Related papers (2023-05-22T04:50:28Z) - Tag that issue: Applying API-domain labels in issue tracking systems [20.701637107734996]
Labeling issues with the skills required to complete them can help contributors to choose tasks in Open Source Software projects.
We investigate the feasibility and relevance of automatically labeling issues with what we call "API-domains," which are high-level categories of APIs.
Our results show that (i) newcomers consider API-domain labels useful in choosing tasks, (ii) labels can be predicted with a precision of 84% and a recall of 78.6% on average, (iii) the predictions reached up to 71.3% precision and 52.5% recall when training on one project and testing on another, and (iv) project…
arXiv Detail & Related papers (2023-04-06T05:49:46Z) - ScarceNet: Animal Pose Estimation with Scarce Annotations [74.48263583706712]
ScarceNet is a pseudo-label-based approach that generates artificial labels for unlabeled images.
We evaluate our approach on the challenging AP-10K dataset, where our approach outperforms existing semi-supervised approaches by a large margin.
arXiv Detail & Related papers (2023-03-27T09:15:53Z) - GiveMeLabeledIssues: An Open Source Issue Recommendation System [9.312780130838952]
Developers often struggle to navigate an Open Source Software (OSS) project's issue-tracking system and find a suitable task.
This paper presents a tool that mines project repositories and labels issues based on the skills required to solve them.
GiveMeLabeledIssues facilitates matching developers' skills to tasks, reducing the burden on project maintainers.
arXiv Detail & Related papers (2023-03-23T16:39:31Z) - Supporting the Task-driven Skill Identification in Open Source Project Issue Tracking Systems [0.0]
We investigate a strategy of automatically labeling open issues to help contributors pick a task to work on.
By identifying the required skills, we claim that candidate contributors can pick tasks that better suit them.
We applied quantitative studies to analyze the relevance of the labels in an experiment and to compare the strategies' relative importance.
arXiv Detail & Related papers (2022-11-02T14:17:22Z) - Large Loss Matters in Weakly Supervised Multi-Label Classification [50.262533546999045]
We first regard unobserved labels as negative labels, casting the weakly supervised multi-label (WSML) task into noisy multi-label classification.
We propose novel WSML methods which reject or correct large-loss samples to prevent the model from memorizing the noisy labels.
Our methodology works well in practice, validating that handling large loss properly matters in weakly supervised multi-label classification.
arXiv Detail & Related papers (2022-06-08T08:30:24Z) - Automatic Issue Classifier: A Transfer Learning Framework for Classifying Issue Reports [0.0]
This paper presents our approach to classifying issue reports in a multi-label setting. We use an off-the-shelf neural network called RoBERTa and fine-tune it to classify the issue reports (a minimal fine-tuning sketch appears after this list).
arXiv Detail & Related papers (2022-02-12T21:43:08Z) - Predicting Issue Types on GitHub [8.791809365994682]
Ticket Tagger is a GitHub app that analyzes issue titles and descriptions using machine learning techniques.
We empirically evaluated the tool's prediction performance on about 30,000 GitHub issues.
arXiv Detail & Related papers (2021-07-21T08:14:48Z) - A Study on the Autoregressive and non-Autoregressive Multi-label Learning [77.11075863067131]
We propose a self-attention-based variational encoder model to jointly extract label-label and label-feature dependencies.
Our model can therefore be used to predict all labels in parallel while still including both label-label and label-feature dependencies.
arXiv Detail & Related papers (2020-12-03T05:41:44Z) - Exploiting Context for Robustness to Label Noise in Active Learning [47.341705184013804]
We address the problems of how a system can identify which of the queried labels are wrong and how a multi-class active learning system can be adapted to minimize the negative impact of label noise.
We construct a graphical representation of the unlabeled data to encode these relationships and obtain new beliefs on the graph when noisy labels are available.
This is demonstrated in three different applications: scene classification, activity classification, and document classification.
arXiv Detail & Related papers (2020-10-18T18:59:44Z) - Learning to Purify Noisy Labels via Meta Soft Label Corrector [49.92310583232323]
Recent deep neural networks (DNNs) can easily overfit to biased training data with noisy labels.
Label correction strategy is commonly used to alleviate this issue.
We propose a meta-learning model that estimates soft labels through a meta-gradient descent step.
arXiv Detail & Related papers (2020-08-03T03:25:17Z)
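For the Automatic Issue Classifier entry above, which fine-tunes RoBERTa to classify issue reports in a multi-label setting, a minimal sketch using the Hugging Face transformers library might look as follows; the label set, example issue, and decision threshold are assumptions, and the paper's actual training setup may differ.

```python
# Minimal sketch of fine-tuning RoBERTa for multi-label issue
# classification, in the spirit of the Automatic Issue Classifier
# entry above. Label set, example issue, and threshold are assumed.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

LABELS = ["bug", "enhancement", "question"]  # hypothetical label set

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base",
    num_labels=len(LABELS),
    problem_type="multi_label_classification",  # sigmoid + BCE loss
)

# One hypothetical issue report; real training would batch many
# labeled issues and run several epochs (e.g. via transformers.Trainer).
inputs = tokenizer(
    "App crashes on startup after the latest update",
    return_tensors="pt",
    truncation=True,
)
targets = torch.tensor([[1.0, 0.0, 0.0]])  # multi-hot target: "bug"

outputs = model(**inputs, labels=targets)
outputs.loss.backward()  # gradients for one fine-tuning step

# At inference time, independent sigmoids give one score per label.
with torch.no_grad():
    probs = torch.sigmoid(model(**inputs).logits)[0]
predicted = [label for label, p in zip(LABELS, probs) if p > 0.5]
print(predicted)
```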
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.