Related papers: The Influence of Code Comments on the Perceived Helpfulness of Stack Overflow Posts

The Influence of Code Comments on the Perceived Helpfulness of Stack Overflow Posts

URL: http://arxiv.org/abs/2508.19610v2
Date: Sat, 13 Sep 2025 19:13:44 GMT
Title: The Influence of Code Comments on the Perceived Helpfulness of Stack Overflow Posts
Authors: Kathrin Figl, Maria Kirchner, Sebastian Baltes, Michael Felderer,
Abstract summary: Question-and-answer platforms such as Stack Overflow are an important way for software developers to share and retrieve knowledge.<n>To better understand how code comments affect the perceived helpfulness of Stack Overflow answers, we conducted an online experiment simulating a Stack Overflow environment.<n>Results indicate that both block and inline comments are perceived as significantly more helpful than uncommented source code.
Score: 5.0211473911266085
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Question-and-answer platforms such as Stack Overflow are an important way for software developers to share and retrieve knowledge. However, reusing poorly understood code can lead to serious problems, such as bugs or security vulnerabilities. To better understand how code comments affect the perceived helpfulness of Stack Overflow answers, we conducted an online experiment simulating a Stack Overflow environment (n=91). The results indicate that both block and inline comments are perceived as significantly more helpful than uncommented source code. Moreover, novices rated code snippets with block comments as more helpful than those with inline comments. Interestingly, other surface features, such as the position of an answer and its answer score, were considered less important. Moreover, the content of Stack Overflow has been a major source for training large language models. AI-based coding assistants such as GitHub Copilot, which are based on these models, are changing the way Stack Overflow is used. However, our findings have implications beyond Stack Overflow. First, they may help to improve the relevance also of other community-driven platforms, which provide human advice and explanations of code solutions, complementing AI-based support for software developers. Second, since chat-based AI tools can be prompted to generate code in different ways, knowing which properties influence perceived helpfulness can lead to more targeted prompting strategies to generate readable code snippets.

Related papers

GBM Returns the Best Prediction Performance among Regression Approaches: A Case Study of Stack Overflow Code Quality [2.5515299924109858]
We examined the variables that predict Stack Overflow (Java) code quality, and the regression approach that provides the best predictive power.<n>Longer Stack Overflow code tended to have more code violations, questions that were scored higher also attracted more views and the more answers that are added to questions on Stack Overflow the more errors were typically observed in the code that was provided.
arXiv Detail & Related papers (2025-05-15T07:04:17Z)
Understanding Code Understandability Improvements in Code Reviews [79.16476505761582]
We analyzed 2,401 code review comments from Java open-source projects on GitHub. 83.9% of suggestions for improvement were accepted and integrated, with fewer than 1% later reverted.
arXiv Detail & Related papers (2024-10-29T12:21:23Z)
Towards Better Answers: Automated Stack Overflow Post Updating [11.85319691188159]
We introduce a novel framework, named Soup (Stack Overflow Updator for Post) for this task. Soup addresses two key tasks: Valid Comment-Edit Prediction (VCP) and Automatic Post Updating (APU)
arXiv Detail & Related papers (2024-08-17T04:48:53Z)
Investigating the Transferability of Code Repair for Low-Resource Programming Languages [57.62712191540067]
Large language models (LLMs) have shown remarkable performance on code generation tasks. Recent works augment the code repair process by integrating modern techniques such as chain-of-thought reasoning or distillation. We investigate the benefits of distilling code repair for both high and low resource languages.
arXiv Detail & Related papers (2024-06-21T05:05:39Z)
Comments as Natural Logic Pivots: Improve Code Generation via Comment Perspective [85.48043537327258]
We propose MANGO (comMents As Natural loGic pivOts), including a comment contrastive training strategy and a corresponding logical comment decoding strategy. Results indicate that MANGO significantly improves the code pass rate based on the strong baselines. The robustness of the logical comment decoding strategy is notably higher than the Chain-of-thoughts prompting.
arXiv Detail & Related papers (2024-04-11T08:30:46Z)
CONCORD: Clone-aware Contrastive Learning for Source Code [64.51161487524436]
Self-supervised pre-training has gained traction for learning generic code representations valuable for many downstream SE tasks. We argue that it is also essential to factor in how developers code day-to-day for general-purpose representation learning. In particular, we propose CONCORD, a self-supervised, contrastive learning strategy to place benign clones closer in the representation space while moving deviants further apart.
arXiv Detail & Related papers (2023-06-05T20:39:08Z)
Answer ranking in Community Question Answering: a deep learning approach [0.0]
This work tries to advance the state of the art on answer ranking for community Question Answering by proceeding with a deep learning approach. We created a large data set of questions and answers posted to the Stack Overflow website. We leveraged the natural language processing capabilities of dense embeddings and LSTM networks to produce a prediction for the accepted answer attribute.
arXiv Detail & Related papers (2022-10-16T18:47:41Z)
Features that Predict the Acceptability of Java and JavaScript Answers on Stack Overflow [5.332217496693262]
We studied the Stack Overflow dataset by analyzing questions and answers for the two most popular tags (Java and JavaScript) Our findings reveal that the length of code in answers, reputation of users, similarity of the text between questions and answers, and the time lag between questions and answers have the highest predictive power for differentiating accepted and unaccepted answers.
arXiv Detail & Related papers (2021-01-08T03:09:38Z)
Retrieve, Program, Repeat: Complex Knowledge Base Question Answering via Alternate Meta-learning [56.771557756836906]
We present a novel method that automatically learns a retrieval model alternately with the programmer from weak supervision. Our system leads to state-of-the-art performance on a large-scale task for complex question answering over knowledge bases.
arXiv Detail & Related papers (2020-10-29T18:28:16Z)
Deep Just-In-Time Inconsistency Detection Between Comments and Source Code [51.00904399653609]
In this paper, we aim to detect whether a comment becomes inconsistent as a result of changes to the corresponding body of code. We develop a deep-learning approach that learns to correlate a comment with code changes. We show the usefulness of our approach by combining it with a comment update model to build a more comprehensive automatic comment maintenance system.
arXiv Detail & Related papers (2020-10-04T16:49:28Z)
Improving Quality of a Post's Set of Answers in Stack Overflow [2.0625936401496237]
A large number of low-quality posts on Stack Overflow require improvement. We propose an approach to automate the identification process of such posts and boost their set of answers.
arXiv Detail & Related papers (2020-05-30T19:40:19Z)

This list is automatically generated from the titles and abstracts of the papers in this site.