Related papers: Features that Predict the Acceptability of Java and JavaScript Answers on Stack Overflow

URL: http://arxiv.org/abs/2101.02830v2
Date: Mon, 19 Jun 2023 09:18:04 GMT
Title: Features that Predict the Acceptability of Java and JavaScript Answers on Stack Overflow
Authors: Osayande P. Omondiagbe, Sherlock A. Licorish and Stephen G. MacDonell
Abstract summary: We studied the Stack Overflow dataset by analyzing questions and answers for the two most popular tags (Java and JavaScript). Our findings reveal that the length of code in answers, reputation of users, similarity of the text between questions and answers, and the time lag between questions and answers have the highest predictive power for differentiating accepted and unaccepted answers.
Score: 5.332217496693262
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Context: Stack Overflow is a popular community question and answer portal used by practitioners to solve problems during software development. Developers can focus their attention on answers that have been accepted or where members have recorded high votes in judging good answers when searching for help. However, the latter mechanism (votes) can be unreliable, and there is currently no way to differentiate between an answer that is likely to be accepted and those that will not be accepted by looking at the answer's characteristics. Objective: In potentially providing a mechanism to identify acceptable answers, this study examines the features that distinguish an accepted answer from an unaccepted answer. Methods: We studied the Stack Overflow dataset by analyzing questions and answers for the two most popular tags (Java and JavaScript). Our dataset comprised 249,588 posts drawn from 2014-2016. We use random forest and neural network models to predict accepted answers, and study the features with the highest predictive power in those two models. Results: Our findings reveal that the length of code in answers, reputation of users, similarity of the text between questions and answers, and the time lag between questions and answers have the highest predictive power for differentiating accepted and unaccepted answers. Conclusion: Tools may leverage these findings in supporting developers and reducing the effort they must dedicate to searching for suitable answers on Stack Overflow.
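
Illustrative sketch (not from the paper): the four features named in the Results could be computed per answer roughly as follows. The record layout (`body`, `owner_reputation`, `created`), the use of `<code>` elements to delimit code, and Jaccard token overlap as the similarity measure are all assumptions made here for illustration; the abstract does not state how the authors measured any of these features.

```python
import re
from datetime import datetime

# Assumption: post bodies are HTML and code is wrapped in <code> elements.
CODE_BLOCK = re.compile(r"<code>(.*?)</code>", re.DOTALL)
TAG = re.compile(r"<[^>]+>")
WORD = re.compile(r"[a-z0-9_]+")


def code_length(body_html):
    """Total number of characters found inside <code> elements."""
    return sum(len(snippet) for snippet in CODE_BLOCK.findall(body_html))


def tokens(body_html):
    """Lower-cased set of word tokens after stripping HTML tags."""
    text = TAG.sub(" ", body_html).lower()
    return set(WORD.findall(text))


def jaccard(a, b):
    """Jaccard overlap of two token sets (a stand-in similarity measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def time_lag_minutes(question_created, answer_created):
    """Minutes elapsed between posting the question and posting the answer."""
    return (answer_created - question_created).total_seconds() / 60.0


def answer_features(question, answer):
    """Build one feature row; all dictionary keys are hypothetical names."""
    return {
        "code_length": code_length(answer["body"]),
        "reputation": answer["owner_reputation"],
        "similarity": jaccard(tokens(question["body"]), tokens(answer["body"])),
        "time_lag_min": time_lag_minutes(question["created"], answer["created"]),
    }


# Hypothetical example records, not taken from the study's dataset.
question = {
    "body": "<p>How do I reverse a list in Java?</p>",
    "created": datetime(2015, 3, 1, 10, 0),
}
answer = {
    "body": "<p>Use Collections.reverse:</p><code>Collections.reverse(list);</code>",
    "owner_reputation": 1520,
    "created": datetime(2015, 3, 1, 10, 12),
}
features = answer_features(question, answer)
```

Rows like `features` could then be fed to a classifier such as the random forest or neural network models the abstract mentions; that modelling step is omitted here.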

Related papers

An exploratory analysis of Community-based Question-Answering Platforms and GPT-3-driven Generative AI: Is it the end of online community-based learning? [0.6749750044497732]
ChatGPT offers software engineers an interactive alternative to community question-answering platforms like Stack Overflow. We analyze 2564 Python and JavaScript questions from Stack Overflow that were asked between January 2022 and December 2022. Our analysis indicates that ChatGPT's responses are 66% shorter and share 35% more words with the questions, showing a 25% increase in positive sentiment compared to human responses.
arXiv Detail & Related papers (2024-09-26T02:17:30Z)
Multimodal Reranking for Knowledge-Intensive Visual Question Answering [77.24401833951096]
We introduce a multi-modal reranker to improve the ranking quality of knowledge candidates for answer generation. Experiments on OK-VQA and A-OKVQA show that a multi-modal reranker from distant supervision provides consistent improvements.
arXiv Detail & Related papers (2024-07-17T02:58:52Z)
Can We Identify Stack Overflow Questions Requiring Code Snippets? Investigating the Cause & Effect of Missing Code Snippets [8.107650447105998]
On the Stack Overflow (SO) Q&A site, users often request solutions to their code-related problems. They often miss required code snippets during their question submission. This study investigates the cause & effect of missing code snippets in SO questions whenever required.
arXiv Detail & Related papers (2024-02-07T04:25:31Z)
Answering Ambiguous Questions with a Database of Questions, Answers, and Revisions [95.92276099234344]
We present a new state-of-the-art for answering ambiguous questions that exploits a database of unambiguous questions generated from Wikipedia. Our method improves performance by 15% on recall measures and 10% on measures which evaluate disambiguating questions from predicted outputs.
arXiv Detail & Related papers (2023-08-16T20:23:16Z)
Answering Ambiguous Questions via Iterative Prompting [84.3426020642704]
In open-domain question answering, due to the ambiguity of questions, multiple plausible answers may exist. One approach is to directly predict all valid answers, but this can struggle with balancing relevance and diversity. We present AmbigPrompt to address the imperfections of existing approaches to answering ambiguous questions.
arXiv Detail & Related papers (2023-07-08T04:32:17Z)
Best-Answer Prediction in Q&A Sites Using User Information [2.982218441172364]
Community Question Answering (CQA) sites have spread and multiplied significantly in recent years. One practical way of finding such answers is automatically predicting the best candidate given existing answers and comments. We address this limitation using a novel method for predicting the best answers using the questioner's background information and other features.
arXiv Detail & Related papers (2022-12-15T02:28:52Z)
Answer ranking in Community Question Answering: a deep learning approach [0.0]
This work tries to advance the state of the art on answer ranking for community Question Answering by proceeding with a deep learning approach. We created a large data set of questions and answers posted to the Stack Overflow website. We leveraged the natural language processing capabilities of dense embeddings and LSTM networks to produce a prediction for the accepted answer attribute.
arXiv Detail & Related papers (2022-10-16T18:47:41Z)
Graph-Based Tri-Attention Network for Answer Ranking in CQA [56.42018099917321]
We propose a novel graph-based tri-attention network, namely GTAN, to generate answer ranking scores. Experiments on three real-world CQA datasets demonstrate GTAN significantly outperforms state-of-the-art answer ranking methods.
arXiv Detail & Related papers (2021-03-05T10:40:38Z)
Brain-inspired Search Engine Assistant based on Knowledge Graph [53.89429854626489]
DeveloperBot is a brain-inspired search engine assistant based on a knowledge graph. It constructs a multi-layer query graph by splitting a complex multi-constraint query into several ordered constraints. It then models the constraint reasoning process as a subgraph search process inspired by the spreading activation model of cognitive science.
arXiv Detail & Related papers (2020-12-25T06:36:11Z)
Open-Domain Question Answering with Pre-Constructed Question Spaces [70.13619499853756]
Open-domain question answering aims at solving the task of locating the answers to user-generated questions in massive collections of documents. There are two families of solutions available: retriever-readers, and knowledge-graph-based approaches. We propose a novel algorithm with a reader-retriever structure that differs from both families.
arXiv Detail & Related papers (2020-06-02T04:31:09Z)
Improving Quality of a Post's Set of Answers in Stack Overflow [2.0625936401496237]
A large number of low-quality posts on Stack Overflow require improvement. We propose an approach to automate the identification process of such posts and boost their set of answers.
arXiv Detail & Related papers (2020-05-30T19:40:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.