Multifaceted Hierarchical Report Identification for Non-Functional Bugs
in Deep Learning Frameworks
- URL: http://arxiv.org/abs/2210.01855v1
- Date: Tue, 4 Oct 2022 18:49:37 GMT
- Title: Multifaceted Hierarchical Report Identification for Non-Functional Bugs
in Deep Learning Frameworks
- Authors: Guoming Long, Tao Chen, Georgina Cosma
- Abstract summary: We propose MHNurf - an end-to-end tool for automatically identifying non-functional bug related reports in Deep Learning (DL) frameworks.
The core of MHNurf is a Multifaceted Hierarchical Attention Network (MHAN) that tackles three unaddressed challenges.
MHNurf works the best with a combination of content, comment, and code, which considerably outperforms the classic HAN where only the content is used.
- Score: 5.255197438986675
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Non-functional bugs (e.g., performance- or accuracy-related bugs) in Deep
Learning (DL) frameworks can lead to some of the most devastating consequences.
Reporting those bugs on a repository such as GitHub is a standard route to fix
them. Yet, given the growing number of new GitHub reports for DL frameworks, it
is intrinsically difficult for developers to distinguish the reports that reveal
non-functional bugs from the others and to assign them to the right contributor
for investigation in a timely manner. In this paper, we propose MHNurf - an
end-to-end tool for automatically identifying non-functional bug related
reports in DL frameworks. The core of MHNurf is a Multifaceted Hierarchical
Attention Network (MHAN) that tackles three unaddressed challenges: (1)
learning the semantic knowledge, but doing so by (2) considering the hierarchy
(e.g., words/tokens in sentences/statements) and focusing on the important
parts (i.e., words, tokens, sentences, and statements) of a GitHub report,
while (3) independently extracting information from different types of
features, i.e., content, comment, code, command, and label.
To evaluate MHNurf, we leverage 3,721 GitHub reports from five DL frameworks
for conducting experiments. The results show that MHNurf works the best with a
combination of content, comment, and code, which considerably outperforms the
classic HAN where only the content is used. MHNurf also produces significantly
more accurate results than nine other state-of-the-art classifiers with strong
statistical significance, achieving up to 71% AUC improvement and the best
Scott-Knott rank on four frameworks while ranking 2nd on the remaining one. To
facilitate reproduction and promote future research, we have made our dataset,
code, and detailed supplementary results publicly available at:
https://github.com/ideas-labo/APSEC2022-MHNurf.
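As a rough illustration of the hierarchical-attention idea underlying MHAN, a minimal PyTorch sketch is shown below: a word-level GRU with attention builds sentence vectors, a sentence-level GRU with attention builds a report vector, and one such encoder per facet (e.g., content, comment, code) is fused before classification. This is an independent sketch under assumed layer names and dimensions, not the MHNurf implementation.
```python
# Minimal sketch of a multifaceted hierarchical attention classifier.
# Illustrative only; names, dimensions, and the final classifier are assumptions,
# not the MHNurf implementation.
import torch
import torch.nn as nn


class AttentionPool(nn.Module):
    """Soft attention over a sequence of hidden states (word or sentence level)."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.context = nn.Linear(dim, 1, bias=False)

    def forward(self, h):                      # h: (batch, seq_len, dim)
        scores = self.context(torch.tanh(self.proj(h)))   # (batch, seq_len, 1)
        weights = torch.softmax(scores, dim=1)
        return (weights * h).sum(dim=1)        # (batch, dim)


class HierarchicalEncoder(nn.Module):
    """Words -> sentence vectors -> report vector, with attention at both levels."""
    def __init__(self, vocab_size, emb_dim=100, hid_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.word_gru = nn.GRU(emb_dim, hid_dim, bidirectional=True, batch_first=True)
        self.word_attn = AttentionPool(2 * hid_dim)
        self.sent_gru = nn.GRU(2 * hid_dim, hid_dim, bidirectional=True, batch_first=True)
        self.sent_attn = AttentionPool(2 * hid_dim)

    def forward(self, x):                      # x: (batch, n_sents, n_words) token ids
        b, s, w = x.shape
        words = self.embed(x.view(b * s, w))               # (b*s, w, emb)
        word_h, _ = self.word_gru(words)
        sent_vecs = self.word_attn(word_h).view(b, s, -1)  # (b, s, 2*hid)
        sent_h, _ = self.sent_gru(sent_vecs)
        return self.sent_attn(sent_h)                      # (b, 2*hid)


class MultifacetClassifier(nn.Module):
    """One hierarchical encoder per facet (e.g. content, comment, code), fused by concatenation."""
    def __init__(self, vocab_size, n_facets=3, hid_dim=64):
        super().__init__()
        self.encoders = nn.ModuleList(
            [HierarchicalEncoder(vocab_size, hid_dim=hid_dim) for _ in range(n_facets)]
        )
        self.out = nn.Linear(n_facets * 2 * hid_dim, 2)    # non-functional bug vs. other

    def forward(self, facets):                 # list of (batch, n_sents, n_words) tensors
        fused = torch.cat([enc(x) for enc, x in zip(self.encoders, facets)], dim=-1)
        return self.out(fused)
```
In practice each facet of a report would be tokenized into a (sentences by words) id tensor; padding, packing, and the command/label facets are omitted here for brevity.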
Related papers
- Understanding Code Understandability Improvements in Code Reviews [79.16476505761582]
We analyzed 2,401 code review comments from Java open-source projects on GitHub.
83.9% of suggestions for improvement were accepted and integrated, with fewer than 1% later reverted.
arXiv Detail & Related papers (2024-10-29T12:21:23Z)
- Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs [49.88461345825586]
This paper proposes a new framework to enhance the fine-grained image understanding abilities of MLLMs.
We present a new method for constructing the instruction tuning dataset at a low cost by leveraging annotations in existing datasets.
We show that our model exhibits a 5.2% accuracy improvement over Qwen-VL and surpasses the accuracy of Kosmos-2 by 24.7%.
arXiv Detail & Related papers (2023-10-01T05:53:15Z)
- Auto-labelling of Bug Report using Natural Language Processing [0.0]
Rule- and query-based solutions recommend a long list of potentially similar bug reports with no clear ranking.
In this paper, we have proposed a solution using a combination of NLP techniques.
It uses a custom data transformer, a deep neural network, and a non-generalizing machine learning method to retrieve existing identical bug reports.
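A rough sketch of such a retrieval step, assuming TF-IDF vectorization as a stand-in for the paper's custom data transformer and k-nearest neighbours as the non-generalizing method (both are assumptions for illustration, not the authors' code):
```python
# Rough sketch of duplicate-report retrieval with a non-generalizing (instance-based) learner.
# TF-IDF stands in for the paper's custom data transformer; all parameters are assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

existing_reports = [
    "Training crashes with out-of-memory error on GPU",
    "Model accuracy drops after upgrading the framework",
    "GPU training crashes with OOM when batch size is large",
]

vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(existing_reports)

# Non-generalizing method: k-nearest neighbours over the stored reports.
index = NearestNeighbors(n_neighbors=2, metric="cosine").fit(X)

new_report = "Out of memory crash during GPU training"
distances, indices = index.kneighbors(vectorizer.transform([new_report]))
for dist, idx in zip(distances[0], indices[0]):
    print(f"candidate duplicate (cosine distance {dist:.2f}): {existing_reports[idx]}")
```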
arXiv Detail & Related papers (2022-12-13T02:32:42Z)
- Explaining Software Bugs Leveraging Code Structures in Neural Machine Translation [5.079750706023254]
Bugsplainer generates natural language explanations for software bugs by learning from a large corpus of bug-fix commits.
Our evaluation using three performance metrics shows that Bugsplainer can generate understandable and good explanations according to Google's standard.
We also conduct a developer study involving 20 participants where the explanations from Bugsplainer were found to be more accurate, more precise, more concise and more useful than the baselines.
arXiv Detail & Related papers (2022-12-08T22:19:45Z)
- Using Developer Discussions to Guide Fixing Bugs in Software [51.00904399653609]
We propose using bug report discussions, which are available before the task is performed and are also naturally occurring, avoiding the need for additional information from developers.
We demonstrate that various forms of natural language context derived from such discussions can aid bug-fixing, even leading to improved performance over using commit messages corresponding to the oracle bug-fixing commits.
arXiv Detail & Related papers (2022-11-11T16:37:33Z)
- Automatic Classification of Bug Reports Based on Multiple Text Information and Reports' Intention [37.67372105858311]
This paper proposes a new automatic classification method for bug reports.
The innovation is that, when categorizing bug reports, the intention of the report is considered in addition to its text information.
Our proposed method achieves better performance, with an F-Measure ranging from 87.3% to 95.5%.
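One hypothetical way to combine report text with an intention signal is to concatenate a text representation with a categorical intention feature before classification; the heuristic, features, and model below are illustrative assumptions, not the paper's method.
```python
# Hypothetical sketch: classify bug reports from their text plus a coarse intention feature.
# The intention heuristic, features, and model are illustrative assumptions only.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

def intention_of(text: str) -> str:
    """Very rough proxy for the report's intention."""
    lowered = text.lower()
    if "?" in lowered or lowered.startswith("how"):
        return "question"
    if "slow" in lowered or "memory" in lowered:
        return "performance"
    return "defect"

reports = pd.DataFrame({
    "text": ["Training is 10x slower after the upgrade",
             "How do I export a trained model?",
             "Crash when loading a checkpoint",
             "Accuracy drops silently with mixed precision"],
    "label": ["non-functional", "other", "other", "non-functional"],
})
reports["intention"] = reports["text"].map(intention_of)

features = ColumnTransformer([
    ("text", TfidfVectorizer(), "text"),                                 # report text
    ("intent", OneHotEncoder(handle_unknown="ignore"), ["intention"]),   # intention signal
])
clf = Pipeline([("features", features), ("model", LogisticRegression(max_iter=1000))])
clf.fit(reports[["text", "intention"]], reports["label"])

new = pd.DataFrame({"text": ["Inference latency doubled on GPU"], "intention": ["performance"]})
print(clf.predict(new))
```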
arXiv Detail & Related papers (2022-08-02T06:44:51Z)
- DapStep: Deep Assignee Prediction for Stack Trace Error rePresentation [61.99379022383108]
We propose new deep learning models to solve the bug triage problem.
The models are based on a bidirectional recurrent neural network with attention and on a convolutional neural network.
To improve the quality of ranking, we propose using additional information from version control system annotations.
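A minimal sketch of this kind of architecture, a bidirectional recurrent encoder with attention that scores candidate assignees from a stack trace, is given below; layer names, dimensions, and the scoring head are assumptions, and the version-control features are omitted.
```python
# Illustrative sketch of attention-based assignee ranking over stack-trace frames.
# Not the DapStep implementation; all names and dimensions are assumptions.
import torch
import torch.nn as nn

class TraceEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, hid_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.rnn = nn.GRU(emb_dim, hid_dim, bidirectional=True, batch_first=True)
        self.attn = nn.Linear(2 * hid_dim, 1)

    def forward(self, frames):                 # frames: (batch, n_frames) frame-token ids
        h, _ = self.rnn(self.embed(frames))    # (batch, n_frames, 2*hid)
        w = torch.softmax(self.attn(h), dim=1) # attention weights over frames
        return (w * h).sum(dim=1)              # (batch, 2*hid) trace representation

class AssigneeRanker(nn.Module):
    def __init__(self, vocab_size, n_assignees, hid_dim=64):
        super().__init__()
        self.encoder = TraceEncoder(vocab_size, hid_dim=hid_dim)
        self.score = nn.Linear(2 * hid_dim, n_assignees)  # one score per candidate developer

    def forward(self, frames):
        return self.score(self.encoder(frames))           # rank assignees by score

ranker = AssigneeRanker(vocab_size=5000, n_assignees=40)
scores = ranker(torch.randint(1, 5000, (2, 30)))          # two stack traces, 30 frames each
print(scores.topk(3, dim=-1).indices)                     # top-3 suggested assignees
```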
arXiv Detail & Related papers (2022-01-14T00:16:57Z)
- Neural Code Summarization: How Far Are We? [30.324396716447602]
Deep learning techniques have been exploited to automatically generate summaries for given code snippets.
In this paper, we conduct a systematic and in-depth analysis of five state-of-the-art neural source code summarization models.
arXiv Detail & Related papers (2021-07-15T04:33:59Z)
- Scarecrow: A Framework for Scrutinizing Machine Text [69.26985439191151]
We introduce a new structured, crowdsourced error annotation schema called Scarecrow.
Scarecrow collects 13k annotations of 1.3k human- and machine-generated paragraphs of English-language news text.
These findings demonstrate the value of Scarecrow annotations in the assessment of current and future text generation systems.
arXiv Detail & Related papers (2021-07-02T22:37:03Z)
- S3M: Siamese Stack (Trace) Similarity Measure [55.58269472099399]
We present S3M -- the first approach to computing stack trace similarity based on deep learning.
It is based on a biLSTM encoder and a fully-connected classifier to compute similarity.
Our experiments demonstrate the superiority of our approach over the state-of-the-art on both open-sourced data and a private JetBrains dataset.
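The described design, a shared biLSTM encoder over two stack traces followed by a fully-connected classifier on the pair representation, can be sketched as follows; the pair-feature construction (concatenation with the absolute difference) is a common siamese choice and an assumption here, not necessarily S3M's.
```python
# Minimal siamese sketch for stack-trace similarity with a shared biLSTM encoder.
# Illustrative only; layer sizes and the pair-feature construction are assumptions.
import torch
import torch.nn as nn

class StackTraceEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, hid_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hid_dim, bidirectional=True, batch_first=True)

    def forward(self, frames):                       # (batch, n_frames) frame-token ids
        _, (h_n, _) = self.lstm(self.embed(frames))  # h_n: (2, batch, hid)
        return torch.cat([h_n[0], h_n[1]], dim=-1)   # (batch, 2*hid)

class S3MLike(nn.Module):
    def __init__(self, vocab_size, hid_dim=64):
        super().__init__()
        self.encoder = StackTraceEncoder(vocab_size, hid_dim=hid_dim)
        self.classifier = nn.Sequential(
            nn.Linear(3 * 2 * hid_dim, 128), nn.ReLU(), nn.Linear(128, 1)
        )

    def forward(self, trace_a, trace_b):
        a, b = self.encoder(trace_a), self.encoder(trace_b)      # shared weights (siamese)
        pair = torch.cat([a, b, torch.abs(a - b)], dim=-1)
        return torch.sigmoid(self.classifier(pair)).squeeze(-1)  # similarity in [0, 1]

model = S3MLike(vocab_size=3000)
sim = model(torch.randint(1, 3000, (4, 25)), torch.randint(1, 3000, (4, 25)))
print(sim.shape)   # torch.Size([4])
```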
arXiv Detail & Related papers (2021-03-18T21:10:41Z)
- Advaita: Bug Duplicity Detection System [1.9624064951902522]
The duplicate bug rate (% of duplicate bugs) ranges from single digits (1 to 9%) to double digits (40%), depending on product maturity, code size, and the number of engineers working on the project.
Detecting duplicity deals with identifying whether any two bugs convey the same meaning.
This approach considers multiple sets of features viz. basic text statistical features, semantic features and contextual features.
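A rough illustration of combining the three feature families (basic text statistics, semantic similarity, and context) into a pairwise duplicate classifier; the concrete features and classifier below are assumptions, not Advaita's implementation.
```python
# Rough sketch: pairwise duplicate detection from several feature families.
# The specific features and the classifier are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.linear_model import LogisticRegression
import numpy as np

def pair_features(a: dict, b: dict, vectorizer: TfidfVectorizer) -> list:
    # Basic text statistics: length difference and shared-word overlap.
    len_diff = abs(len(a["text"].split()) - len(b["text"].split()))
    overlap = len(set(a["text"].lower().split()) & set(b["text"].lower().split()))
    # Semantic feature: TF-IDF cosine similarity (a stand-in for richer embeddings).
    vecs = vectorizer.transform([a["text"], b["text"]])
    semantic = cosine_similarity(vecs[0], vecs[1])[0, 0]
    # Contextual features: same component / same reporter.
    same_component = float(a["component"] == b["component"])
    same_reporter = float(a["reporter"] == b["reporter"])
    return [len_diff, overlap, semantic, same_component, same_reporter]

reports = [
    {"text": "GPU training crashes with out of memory", "component": "runtime", "reporter": "u1"},
    {"text": "Out-of-memory crash during GPU training", "component": "runtime", "reporter": "u2"},
    {"text": "Docs missing for the export API", "component": "docs", "reporter": "u3"},
]
vec = TfidfVectorizer().fit([r["text"] for r in reports])

X = np.array([pair_features(reports[0], reports[1], vec),
              pair_features(reports[0], reports[2], vec)])
y = np.array([1, 0])                       # duplicate vs. not duplicate
clf = LogisticRegression().fit(X, y)
print(clf.predict_proba(X)[:, 1])          # duplicate probabilities for the two pairs
```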
arXiv Detail & Related papers (2020-01-24T04:48:39Z)