Related papers: Are We Ready to Embrace Generative AI for Software Q&A?

URL: http://arxiv.org/abs/2307.09765v2
Date: Sat, 12 Aug 2023 13:10:02 GMT
Title: Are We Ready to Embrace Generative AI for Software Q&A?
Authors: Bowen Xu, Thanh-Dat Nguyen, Thanh Le-Cong, Thong Hoang, Jiakun Liu, Kisub Kim, Chen Gong, Changan Niu, Chenyu Wang, Bach Le, David Lo
Abstract summary: Stack Overflow, the world's largest software Q&A (SQA) website, is facing a significant traffic drop due to the emergence of generative AI techniques. ChatGPT was banned by Stack Overflow only 6 days after its release. To verify the stated reason, that ChatGPT-generated answers are of low quality, we conduct a comparative evaluation of human-written and ChatGPT-generated answers.
Score: 25.749110480727765
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Stack Overflow, the world's largest software Q&A (SQA) website, is facing a significant traffic drop due to the emergence of generative AI techniques. ChatGPT was banned by Stack Overflow only 6 days after its release. The main reason officially given by Stack Overflow is that the answers generated by ChatGPT are of low quality. To verify this, we conduct a comparative evaluation of human-written and ChatGPT-generated answers. Our methodology employs both automatic comparison and a manual study. Our results suggest that human-written and ChatGPT-generated answers are semantically similar; however, human-written answers consistently outperform ChatGPT-generated ones across multiple aspects, specifically by 10% on the overall score. We release the data, analysis scripts, and detailed results at https://anonymous.4open.science/r/GAI4SQA-FD5C.

Related papers

An exploratory analysis of Community-based Question-Answering Platforms and GPT-3-driven Generative AI: Is it the end of online community-based learning? [0.6749750044497732]
ChatGPT offers software engineers an interactive alternative to community question-answering platforms like Stack Overflow. We analyze 2564 Python and JavaScript questions from Stack Overflow that were asked between January 2022 and December 2022. Our analysis indicates that ChatGPT's responses are 66% shorter, share 35% more words with the questions, and show a 25% increase in positive sentiment compared to human responses.
arXiv Detail & Related papers (2024-09-26T02:17:30Z)
Primacy Effect of ChatGPT [69.49920102917598]
We study the primacy effect of ChatGPT: the tendency of selecting the labels at earlier positions as the answer. We hope that our experiments and analyses provide additional insights into building more reliable ChatGPT-based solutions.
arXiv Detail & Related papers (2023-10-20T00:37:28Z)
An empirical study of ChatGPT-3.5 on question answering and code maintenance [14.028497274245227]
A rising concern is whether ChatGPT will replace programmers and kill jobs. We conducted an empirical study to systematically compare ChatGPT against programmers in question-answering and software-maintaining.
arXiv Detail & Related papers (2023-10-03T14:48:32Z)
From Mundane to Meaningful: AI's Influence on Work Dynamics -- evidence from ChatGPT and Stack Overflow [0.0]
We explore how ChatGPT changed a fundamental aspect of coding: problem-solving. We exploit the effect of the sudden release of ChatGPT on the 30th of November 2022 on the usage of the largest online community for coders: Stack Overflow.
arXiv Detail & Related papers (2023-08-22T09:30:02Z)
Is Stack Overflow Obsolete? An Empirical Study of the Characteristics of ChatGPT Answers to Stack Overflow Questions [7.065853028825656]
We conducted the first in-depth analysis of ChatGPT answers to programming questions on Stack Overflow. We examined the correctness, consistency, comprehensiveness, and conciseness of ChatGPT answers. Our analysis shows that 52% of ChatGPT answers contain incorrect information and 77% are verbose.
arXiv Detail & Related papers (2023-08-04T13:23:20Z)
Evaluating Privacy Questions From Stack Overflow: Can ChatGPT Compete? [1.231476564107544]
ChatGPT has been used as an alternative to generate code or produce responses to developers' questions. Our results show that most privacy-related questions are related to choice/consent, aggregation, and identification.
arXiv Detail & Related papers (2023-06-19T21:33:04Z)
Chatbots put to the test in math and logic problems: A preliminary comparison and assessment of ChatGPT-3.5, ChatGPT-4, and Google Bard [68.8204255655161]
We use 30 questions that are clear, without any ambiguities, fully described with plain text only, and have a unique, well defined correct answer. The answers are recorded and discussed, highlighting their strengths and weaknesses. It was found that ChatGPT-4 outperforms ChatGPT-3.5 in both sets of questions.
arXiv Detail & Related papers (2023-05-30T11:18:05Z)
One Small Step for Generative AI, One Giant Leap for AGI: A Complete Survey on ChatGPT in AIGC Era [95.2284704286191]
GPT-4 (a.k.a. ChatGPT plus) is one small step for generative AI (GAI) but one giant leap for artificial general intelligence (AGI). Since its official release in November 2022, ChatGPT has quickly attracted numerous users with extensive media coverage. This work is the first to survey ChatGPT with a comprehensive review of its underlying technology, applications, and challenges.
arXiv Detail & Related papers (2023-04-04T06:22:09Z)
To ChatGPT, or not to ChatGPT: That is the question! [78.407861566006]
This study provides a comprehensive and contemporary assessment of the most recent techniques in ChatGPT detection. We have curated a benchmark dataset consisting of prompts from ChatGPT and humans, including diverse questions from medical, open Q&A, and finance domains. Our evaluation results demonstrate that none of the existing methods can effectively detect ChatGPT-generated content.
arXiv Detail & Related papers (2023-04-04T03:04:28Z)
A Complete Survey on Generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 All You Need? [112.12974778019304]
Generative AI (AIGC, a.k.a. AI-generated content) has made headlines everywhere because of its ability to analyze and create text, images, and beyond. In the era of AI transitioning from pure analysis to creation, it is worth noting that ChatGPT, with its most recent language model GPT-4, is just a tool out of numerous AIGC tasks. This work focuses on the technological development of various AIGC tasks based on their output type, including text, images, videos, 3D content, etc.
arXiv Detail & Related papers (2023-03-21T10:09:47Z)
Is ChatGPT a General-Purpose Natural Language Processing Task Solver? [113.22611481694825]
Large language models (LLMs) have demonstrated the ability to perform a variety of natural language processing (NLP) tasks zero-shot. Recently, the debut of ChatGPT has drawn a great deal of attention from the natural language processing (NLP) community. It is not yet known whether ChatGPT can serve as a generalist model that can perform many NLP tasks zero-shot.
arXiv Detail & Related papers (2023-02-08T09:44:51Z)

This list is automatically generated from the titles and abstracts of the papers on this site.