Employing Deep Learning and Structured Information Retrieval to Answer
Clarification Questions on Bug Reports
- URL: http://arxiv.org/abs/2304.12494v3
- Date: Sat, 8 Jul 2023 12:16:53 GMT
- Title: Employing Deep Learning and Structured Information Retrieval to Answer
Clarification Questions on Bug Reports
- Authors: Usmi Mukherjee and Mohammad Masudur Rahman
- Abstract summary: We propose a novel approach that uses CodeT5 in combination with Lucene to recommend answers to follow-up questions.
We evaluate the recommended answers against manually annotated answers using similarity metrics like Normalized Smooth BLEU Score, METEOR, Word Mover's Distance, and Semantic Similarity.
- Score: 3.462843004438096
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Software bug reports submitted to bug-tracking systems often lack crucial
information that developers need to resolve them promptly, costing companies
billions of dollars. There has been significant research on eliciting
information from bug reporters more effectively, for example through the
templates that reporters must fill in, yet the need for asking follow-up
questions persists. Recent studies propose techniques to suggest these
follow-up questions so that developers can obtain the missing details, but
there has been little research on answering the follow-up questions, which
often remain unanswered. In this paper, we propose a novel approach that
combines CodeT5 with Lucene, an information retrieval engine that leverages
the relevance of different bug reports, their components, and follow-up
questions to recommend answers. These top-ranked answers, along with their
bug reports, serve as additional context (beyond the deficient bug report
itself) for the deep learning model when generating an answer. We evaluate
the recommended answers against manually annotated answers using similarity
metrics such as Normalized Smooth BLEU Score, METEOR, Word Mover's Distance,
and Semantic Similarity. We achieve a BLEU Score of up to 34 and a Semantic
Similarity of up to 64, which indicates that the generated answers are
understandable and good according to Google's standard and that our approach
outperforms multiple baselines.
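
The retrieve-then-generate pipeline described above can be sketched roughly as follows. This is a minimal illustration, assuming rank_bm25 as a stand-in for the Lucene index and the public Salesforce/codet5-base checkpoint; the corpus schema, prompt format, and helper names are hypothetical rather than the authors' implementation, and the model would need to be fine-tuned on (context, answer) pairs before its outputs are meaningful.

```python
# Minimal sketch of the retrieve-then-generate idea: retrieve answered
# follow-up questions from historical bug reports (rank_bm25 stands in for
# the Lucene index used in the paper), then feed the retrieved answers plus
# the deficient bug report to CodeT5 as extra context. The corpus schema,
# prompt format, and function names below are illustrative assumptions.
from rank_bm25 import BM25Okapi
from transformers import AutoTokenizer, T5ForConditionalGeneration

# Historical bug reports with answered follow-up questions (assumed schema).
corpus = [
    {"report": "App crashes on startup after the latest update ...",
     "question": "Which OS version are you running?",
     "answer": "Windows 11 22H2, build 22621."},
    # ... more historical entries ...
]

# Index report text and follow-up question together (stand-in for Lucene).
tokenized = [f"{d['report']} {d['question']}".lower().split() for d in corpus]
bm25 = BM25Okapi(tokenized)

def retrieve_context(bug_report: str, question: str, k: int = 2) -> str:
    """Return the top-k retrieved answers, with their reports, as one string."""
    query = f"{bug_report} {question}".lower().split()
    top = bm25.get_top_n(query, corpus, n=k)
    return " ".join(f"report: {d['report']} answer: {d['answer']}" for d in top)

# Public CodeT5 checkpoint; in practice it would be fine-tuned on
# (deficient report + retrieved context, ground-truth answer) pairs.
tokenizer = AutoTokenizer.from_pretrained("Salesforce/codet5-base")
model = T5ForConditionalGeneration.from_pretrained("Salesforce/codet5-base")

def answer_followup(bug_report: str, question: str) -> str:
    context = retrieve_context(bug_report, question)
    prompt = f"question: {question} report: {bug_report} context: {context}"
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
    output = model.generate(**inputs, max_length=64, num_beams=4)
    return tokenizer.decode(output[0], skip_special_tokens=True)

print(answer_followup("Editor freezes when opening large files.",
                      "How large are the files that trigger the freeze?"))
```

On the evaluation side, two of the reported metrics can be sketched under assumed implementations, namely NLTK's smoothed sentence-level BLEU and a sentence-transformers embedding model for semantic similarity; the paper's exact metric implementations, as well as METEOR and Word Mover's Distance, are not shown.

```python
# Rough sketch of two of the reported metrics, under assumed implementations:
# smoothed sentence-level BLEU via NLTK and semantic similarity via sentence
# embeddings. METEOR and Word Mover's Distance would be computed analogously.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from sentence_transformers import SentenceTransformer, util

reference = "Windows 11 22H2, build 22621."
candidate = "I am on Windows 11, build 22621."

bleu = sentence_bleu([reference.split()], candidate.split(),
                     smoothing_function=SmoothingFunction().method4)

embedder = SentenceTransformer("all-MiniLM-L6-v2")
semantic_similarity = util.cos_sim(embedder.encode(reference),
                                   embedder.encode(candidate)).item()
print(f"smoothed BLEU: {bleu:.2f}, semantic similarity: {semantic_similarity:.2f}")
```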
Related papers
- Open Domain Question Answering with Conflicting Contexts [55.739842087655774]
We find that as much as 25% of unambiguous, open domain questions can lead to conflicting contexts when retrieved using Google Search.
We ask our annotators to provide explanations for their selections of correct answers.
arXiv Detail & Related papers (2024-10-16T07:24:28Z)
- I Could've Asked That: Reformulating Unanswerable Questions [89.93173151422636]
We evaluate open-source and proprietary models for reformulating unanswerable questions.
GPT-4 and Llama2-7B successfully reformulate questions only 26% and 12% of the time, respectively.
We publicly release the benchmark and the code to reproduce the experiments.
arXiv Detail & Related papers (2024-07-24T17:59:07Z)
- Localizing and Mitigating Errors in Long-form Question Answering [79.63372684264921]
Long-form question answering (LFQA) aims to provide thorough and in-depth answers to complex questions, enhancing comprehension.
This work introduces HaluQuestQA, the first hallucination dataset with localized error annotations for human-written and model-generated LFQA answers.
arXiv Detail & Related papers (2024-07-16T17:23:16Z)
- Auto-labelling of Bug Report using Natural Language Processing [0.0]
Rule- and query-based solutions recommend a long list of potentially similar bug reports with no clear ranking.
In this paper, we have proposed a solution using a combination of NLP techniques.
It uses a custom data transformer, a deep neural network, and a non-generalizing machine learning method to retrieve existing identical bug reports.
arXiv Detail & Related papers (2022-12-13T02:32:42Z)
- Explaining Software Bugs Leveraging Code Structures in Neural Machine Translation [5.079750706023254]
Bugsplainer generates natural language explanations for software bugs by learning from a large corpus of bug-fix commits.
Our evaluation using three performance metrics shows that Bugsplainer can generate understandable and good explanations according to Google's standard.
We also conduct a developer study involving 20 participants where the explanations from Bugsplainer were found to be more accurate, more precise, more concise, and more useful than the baselines.
arXiv Detail & Related papers (2022-12-08T22:19:45Z)
- Using Developer Discussions to Guide Fixing Bugs in Software [51.00904399653609]
We propose using bug report discussions, which are available before the task is performed and are also naturally occurring, avoiding the need for additional information from developers.
We demonstrate that various forms of natural language context derived from such discussions can aid bug-fixing, even leading to improved performance over using commit messages corresponding to the oracle bug-fixing commits.
arXiv Detail & Related papers (2022-11-11T16:37:33Z)
- Automatic Classification of Bug Reports Based on Multiple Text Information and Reports' Intention [37.67372105858311]
This paper proposes a new automatic classification method for bug reports.
The innovation is that when categorizing bug reports, in addition to using the text information of the report, the intention of the report is also considered.
Our proposed method achieves better performance, and its F-Measure ranges from 87.3% to 95.5%.
arXiv Detail & Related papers (2022-08-02T06:44:51Z)
- A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers [66.11048565324468]
We present a dataset of 5,049 questions over 1,585 Natural Language Processing papers.
Each question is written by an NLP practitioner who read only the title and abstract of the corresponding paper, and the question seeks information present in the full text.
We find that existing models that do well on other QA tasks do not perform well on answering these questions, underperforming humans by at least 27 F1 points when answering them from entire papers.
arXiv Detail & Related papers (2021-05-07T00:12:34Z)
- Attention-based model for predicting question relatedness on Stack Overflow [0.0]
We propose an Attention-based Sentence pair Interaction Model (ASIM) to predict the relatedness between questions on Stack Overflow automatically.
ASIM has made significant improvement over the baseline approaches in Precision, Recall, and Micro-F1 evaluation metrics.
Our model also performs well in the duplicate question detection task of Ask Ubuntu.
arXiv Detail & Related papers (2021-03-19T12:18:03Z)
- Inquisitive Question Generation for High Level Text Comprehension [60.21497846332531]
We introduce INQUISITIVE, a dataset of 19K questions that are elicited while a person is reading through a document.
We show that readers engage in a series of pragmatic strategies to seek information.
We evaluate question generation models based on GPT-2 and show that our model is able to generate reasonable questions.
arXiv Detail & Related papers (2020-10-04T19:03:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.