Are We Ready to Embrace Generative AI for Software Q&A?
- URL: http://arxiv.org/abs/2307.09765v2
- Date: Sat, 12 Aug 2023 13:10:02 GMT
- Title: Are We Ready to Embrace Generative AI for Software Q&A?
- Authors: Bowen Xu, Thanh-Dat Nguyen, Thanh Le-Cong, Thong Hoang, Jiakun Liu,
Kisub Kim, Chen Gong, Changan Niu, Chenyu Wang, Bach Le, David Lo
- Abstract summary: Stack Overflow, the world's largest software Q&A (SQA) website, is facing a significant traffic drop due to the emergence of generative AI techniques.
Stack Overflow banned ChatGPT only 6 days after its release.
To verify Stack Overflow's stated reason for the ban, that ChatGPT-generated answers are of low quality, we conduct a comparative evaluation of human-written and ChatGPT-generated answers.
- Score: 25.749110480727765
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stack Overflow, the world's largest software Q&A (SQA) website, is facing a
significant traffic drop due to the emergence of generative AI techniques.
Stack Overflow banned ChatGPT only 6 days after its release. The
main reason officially given by Stack Overflow is that the answers
generated by ChatGPT are of low quality. To verify this, we conduct a
comparative evaluation of human-written and ChatGPT-generated answers. Our
methodology employs both automatic comparison and a manual study. Our results
suggest that human-written and ChatGPT-generated answers are semantically
similar; however, human-written answers outperform ChatGPT-generated ones
consistently across multiple aspects, specifically by 10% on the overall score.
We release the data, analysis scripts, and detailed results at
https://anonymous.4open.science/r/GAI4SQA-FD5C.
Related papers
- An exploratory analysis of Community-based Question-Answering Platforms and GPT-3-driven Generative AI: Is it the end of online community-based learning? [0.6749750044497732]
ChatGPT offers software engineers an interactive alternative to community question-answering platforms like Stack Overflow.
We analyze 2564 Python and JavaScript questions from StackOverflow that were asked between January 2022 and December 2022.
Our analysis indicates that ChatGPT's responses are 66% shorter and share 35% more words with the questions, showing a 25% increase in positive sentiment compared to human responses.
arXiv Detail & Related papers (2024-09-26T02:17:30Z)
- Primacy Effect of ChatGPT [69.49920102917598]
We study the primacy effect of ChatGPT: the tendency of selecting the labels at earlier positions as the answer.
We hope that our experiments and analyses provide additional insights into building more reliable ChatGPT-based solutions.
arXiv Detail & Related papers (2023-10-20T00:37:28Z)
- An empirical study of ChatGPT-3.5 on question answering and code maintenance [14.028497274245227]
A rising concern is whether ChatGPT will replace programmers and kill jobs.
We conducted an empirical study to systematically compare ChatGPT against programmers in question-answering and software-maintaining.
arXiv Detail & Related papers (2023-10-03T14:48:32Z)
- From Mundane to Meaningful: AI's Influence on Work Dynamics -- evidence from ChatGPT and Stack Overflow [0.0]
We explore how ChatGPT changed a fundamental aspect of coding: problem-solving.
We exploit the effect of the sudden release of ChatGPT on the 30th of November 2022 on the usage of the largest online community for coders: Stack Overflow.
arXiv Detail & Related papers (2023-08-22T09:30:02Z)
- Is Stack Overflow Obsolete? An Empirical Study of the Characteristics of ChatGPT Answers to Stack Overflow Questions [7.065853028825656]
We conducted the first in-depth analysis of ChatGPT answers to programming questions on Stack Overflow.
We examined the correctness, consistency, comprehensiveness, and conciseness of ChatGPT answers.
Our analysis shows that 52% of ChatGPT answers contain incorrect information and 77% are verbose.
arXiv Detail & Related papers (2023-08-04T13:23:20Z)
- Evaluating Privacy Questions From Stack Overflow: Can ChatGPT Compete? [1.231476564107544]
ChatGPT has been used as an alternative to generate code or produce responses to developers' questions.
Our results show that most privacy-related questions are related to choice/consent, aggregation, and identification.
arXiv Detail & Related papers (2023-06-19T21:33:04Z)
- Chatbots put to the test in math and logic problems: A preliminary comparison and assessment of ChatGPT-3.5, ChatGPT-4, and Google Bard [68.8204255655161]
We use 30 questions that are clear, without any ambiguities, fully described with plain text only, and have a unique, well defined correct answer.
The answers are recorded and discussed, highlighting their strengths and weaknesses.
It was found that ChatGPT-4 outperforms ChatGPT-3.5 in both sets of questions.
arXiv Detail & Related papers (2023-05-30T11:18:05Z)
- One Small Step for Generative AI, One Giant Leap for AGI: A Complete Survey on ChatGPT in AIGC Era [95.2284704286191]
GPT-4 (a.k.a. ChatGPT plus) is one small step for generative AI (GAI) but one giant leap for artificial general intelligence (AGI).
Since its official release in November 2022, ChatGPT has quickly attracted numerous users with extensive media coverage.
This work is the first to survey ChatGPT with a comprehensive review of its underlying technology, applications, and challenges.
arXiv Detail & Related papers (2023-04-04T06:22:09Z)
- To ChatGPT, or not to ChatGPT: That is the question! [78.407861566006]
This study provides a comprehensive and contemporary assessment of the most recent techniques in ChatGPT detection.
We have curated a benchmark dataset consisting of prompts from ChatGPT and humans, including diverse questions from medical, open Q&A, and finance domains.
Our evaluation results demonstrate that none of the existing methods can effectively detect ChatGPT-generated content.
arXiv Detail & Related papers (2023-04-04T03:04:28Z)
- A Complete Survey on Generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 All You Need? [112.12974778019304]
Generative AI (AIGC, a.k.a. AI-generated content) has made headlines everywhere because of its ability to analyze and create text, images, and beyond.
In the era of AI transitioning from pure analysis to creation, it is worth noting that ChatGPT, with its most recent language model GPT-4, is just a tool out of numerous AIGC tasks.
This work focuses on the technological development of various AIGC tasks based on their output type, including text, images, videos, 3D content, etc.
arXiv Detail & Related papers (2023-03-21T10:09:47Z)
- Is ChatGPT a General-Purpose Natural Language Processing Task Solver? [113.22611481694825]
Large language models (LLMs) have demonstrated the ability to perform a variety of natural language processing (NLP) tasks zero-shot.
Recently, the debut of ChatGPT has drawn a great deal of attention from the natural language processing (NLP) community.
It is not yet known whether ChatGPT can serve as a generalist model that can perform many NLP tasks zero-shot.
arXiv Detail & Related papers (2023-02-08T09:44:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.