Related papers: Towards Better Answers: Automated Stack Overflow Post Updating

URL: http://arxiv.org/abs/2408.09095v1
Date: Sat, 17 Aug 2024 04:48:53 GMT
Title: Towards Better Answers: Automated Stack Overflow Post Updating
Authors: Yubo Mai, Zhipeng Gao, Haoye Wang, Tingting Bi, Xing Hu, Xin Xia, Jianling Sun
Abstract summary: We introduce a novel framework, named Soup (Stack Overflow Updator for Post), for automatically updating Stack Overflow posts based on their associated comments. Soup addresses two key tasks: Valid Comment-Edit Prediction (VCP) and Automatic Post Updating (APU).
Score: 11.85319691188159
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Utilizing code snippets on Stack Overflow (SO) is a common practice among developers for problem-solving. Although SO code snippets serve as valuable resources, it is important to acknowledge their imperfections: reusing problematic code snippets can introduce suboptimal or buggy code into software projects. SO comments often point out weaknesses of a post and provide valuable insights to improve the quality of answers, but these comments are usually missed and/or ignored, leaving the problematic code snippets untouched. In this work, we first investigate the task of automatically updating SO posts based on their associated comments. We introduce a novel framework, named Soup (Stack Overflow Updator for Post), for this task. Soup addresses two key tasks: Valid Comment-Edit Prediction (VCP) and Automatic Post Updating (APU). Extensive experimental results show the promising performance of our model over a set of benchmarks. Moreover, we performed an in-the-wild evaluation on Stack Overflow: we submitted 50 edits generated by our approach to Stack Overflow posts, and 21 of them were verified and accepted by SO maintainers, further demonstrating the practical value of Soup.

Related papers

Dense Retrievers Can Fail on Simple Queries: Revealing The Granularity Dilemma of Embeddings [78.05609552686053]
This work focuses on an observed limitation of text encoders: embeddings may not be able to recognize fine-grained entities or events within the semantics. We introduce a new evaluation dataset in Chinese, named CapRetrieval, whose passages are image captions, and queries are phrases inquiring entities or events in various forms. Zero-shot evaluation suggests that encoders may fail on such fine-grained matching, regardless of training sources or model sizes.
arXiv Detail & Related papers (2025-06-10T09:00:33Z)
SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving [90.32201622392137]
We present SwingArena, a competitive evaluation framework for Large Language Models (LLMs). Unlike traditional static benchmarks, SwingArena models the collaborative process of software by pairing LLMs as iterations, who generate patches, and reviewers, who create test cases and verify the patches through continuous integration (CI) pipelines.
arXiv Detail & Related papers (2025-05-29T18:28:02Z)
GBM Returns the Best Prediction Performance among Regression Approaches: A Case Study of Stack Overflow Code Quality [2.5515299924109858]
We examined the variables that predict Stack Overflow (Java) code quality, and the regression approach that provides the best predictive power. Longer Stack Overflow code tended to have more code violations; questions that were scored higher also attracted more views; and the more answers that were added to a question, the more errors were typically observed in the code provided.
arXiv Detail & Related papers (2025-05-15T07:04:17Z)
Code2API: A Tool for Generating Reusable APIs from Stack Overflow Code Snippets [14.130403020877848]
Code2API is a Google Chrome extension that uses Large Language Models (LLMs) to automatically perform APIzation of code snippets on Stack Overflow. The evaluation results show that Code2API outperforms the rule-based approach by a large margin.
arXiv Detail & Related papers (2025-04-19T15:49:03Z)
KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding [49.56049319037421]
KodCode is a synthetic dataset that addresses the persistent challenge of acquiring high-quality, verifiable training data. It comprises question-solution-test triplets that are systematically validated via a self-verification procedure. This pipeline yields a large-scale, robust and diverse coding dataset.
arXiv Detail & Related papers (2025-03-04T19:17:36Z)
AUTOGENICS: Automated Generation of Context-Aware Inline Comments for Code Snippets on Programming Q&A Sites Using LLM [1.971759811837406]
Inline comments in source code facilitate easy comprehension, reusability, and enhanced readability. Code snippets in answers on Q&A sites like Stack Overflow (SO) often lack comments because answerers volunteer their time and often skip comments or explanations due to time constraints. Given these challenges, we introduced AUTOGENICS, a tool designed to integrate with SO to generate effective inline comments for code snippets in SO answers exploiting large language models.
arXiv Detail & Related papers (2024-08-27T21:21:13Z)
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems [124.82815637571413]
We design a procedure to synthesize Haystacks of documents, ensuring that specific insights repeat across documents. The "Summary of a Haystack" (SummHay) task then requires a system to process the Haystack and generate, given a query, a summary that identifies the relevant insights and precisely cites the source documents.
arXiv Detail & Related papers (2024-07-01T15:23:42Z)
Automatic Bi-modal Question Title Generation for Stack Overflow with Prompt Learning [10.76882347665857]
An initial study aimed to automatically generate the titles by only analyzing the code snippets in the question body. We propose an approach SOTitle+ by considering bi-modal information (i.e., the code snippets and the problem descriptions) in the question body. Our corpus includes 179,119 high-quality question posts for six popular programming languages.
arXiv Detail & Related papers (2024-03-06T12:58:25Z)
RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation [96.75695811963242]
RepoCoder is a framework to streamline the repository-level code completion process. It incorporates a similarity-based retriever and a pre-trained code language model. It consistently outperforms the vanilla retrieval-augmented code completion approach.
arXiv Detail & Related papers (2023-03-22T13:54:46Z)
Answer ranking in Community Question Answering: a deep learning approach [0.0]
This work tries to advance the state of the art on answer ranking for community Question Answering by proceeding with a deep learning approach. We created a large data set of questions and answers posted to the Stack Overflow website. We leveraged the natural language processing capabilities of dense embeddings and LSTM networks to produce a prediction for the accepted answer attribute.
arXiv Detail & Related papers (2022-10-16T18:47:41Z)
Interactive Code Generation via Test-Driven User-Intent Formalization [60.90035204567797]
Large language models (LLMs) produce code from informal natural language (NL) intent. It is hard to define a notion of correctness since natural language can be ambiguous and lacks a formal semantics. We describe a language-agnostic abstract algorithm and a concrete implementation TiCoder.
arXiv Detail & Related papers (2022-08-11T17:41:08Z)
Reputation Gaming in Stack Overflow [10.021057473471236]
This paper offers a comprehensive study of the reported types of reputation manipulation scenarios that might be exercised in Stack Overflow. We found four different types of reputation fraud scenarios, such as voting rings where communities form to upvote each other repeatedly on similar posts. We developed algorithms that enable platform managers to automatically identify these suspicious reputation gaming scenarios for review.
arXiv Detail & Related papers (2021-11-13T11:58:59Z)
Improving Conversational Question Answering Systems after Deployment using Feedback-Weighted Learning [69.42679922160684]
We propose feedback-weighted learning based on importance sampling to improve upon an initial supervised system using binary user feedback. Our work opens the prospect to exploit interactions with real users and improve conversational systems after deployment.
arXiv Detail & Related papers (2020-11-01T19:50:34Z)
Retrieve, Program, Repeat: Complex Knowledge Base Question Answering via Alternate Meta-learning [56.771557756836906]
We present a novel method that automatically learns a retrieval model alternately with the programmer from weak supervision. Our system leads to state-of-the-art performance on a large-scale task for complex question answering over knowledge bases.
arXiv Detail & Related papers (2020-10-29T18:28:16Z)
Deep Just-In-Time Inconsistency Detection Between Comments and Source Code [51.00904399653609]
In this paper, we aim to detect whether a comment becomes inconsistent as a result of changes to the corresponding body of code. We develop a deep-learning approach that learns to correlate a comment with code changes. We show the usefulness of our approach by combining it with a comment update model to build a more comprehensive automatic comment maintenance system.
arXiv Detail & Related papers (2020-10-04T16:49:28Z)
Improving Quality of a Post's Set of Answers in Stack Overflow [2.0625936401496237]
A large number of low-quality posts on Stack Overflow require improvement. We propose an approach to automate the identification process of such posts and boost their set of answers.
arXiv Detail & Related papers (2020-05-30T19:40:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.