An exploratory analysis of Community-based Question-Answering Platforms and GPT-3-driven Generative AI: Is it the end of online community-based learning?
- URL: http://arxiv.org/abs/2409.17473v2
- Date: Mon, 30 Sep 2024 12:38:50 GMT
- Title: An exploratory analysis of Community-based Question-Answering Platforms and GPT-3-driven Generative AI: Is it the end of online community-based learning?
- Authors: Mohammed Mehedi Hasan, Mahady Hasan, Mamun Bin Ibne Reaz, Jannat Un Nayeem Iqra
- Abstract summary: ChatGPT offers software engineers an interactive alternative to community question-answering platforms like Stack Overflow.
We analyze 2564 Python and JavaScript questions from Stack Overflow that were asked between January 2022 and December 2022.
Our analysis indicates that ChatGPT's responses are 66% shorter and share 35% more words with the questions, showing a 25% increase in positive sentiment compared to human responses.
- Score: 0.6749750044497732
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Context: The advent of Large Language Model-driven tools like ChatGPT offers software engineers an interactive alternative to community question-answering (CQA) platforms like Stack Overflow. While Stack Overflow benefits from accumulated crowd-sourced knowledge, it often suffers from unpleasant comments, reactions, and long waiting times. Objective: In this study, we assess the efficacy of ChatGPT in providing solutions to software engineering questions by comparing its performance against human solutions. Method: We empirically analyze 2564 Python and JavaScript questions from Stack Overflow that were asked between January 2022 and December 2022. We parse the questions and answers from Stack Overflow, then collect answers to the same questions from ChatGPT through its API. We employ four textual and four cognitive metrics to compare the answers generated by ChatGPT with the accepted answers provided by human subject matter experts, in order to identify potential reasons why future knowledge seekers may prefer ChatGPT over CQA platforms. We also measure the accuracy of the answers provided by ChatGPT. Finally, we measure user interaction on Stack Overflow over the past two years using three metrics to determine how ChatGPT affects it. Results: Our analysis indicates that ChatGPT's responses are 66% shorter and share 35% more words with the questions, showing a 25% increase in positive sentiment compared to human responses. The accuracy of ChatGPT's answers is between 71% and 75%, with a variation in response characteristics between JavaScript and Python. Additionally, our findings suggest a recent 38% decrease in comment interactions on Stack Overflow, indicating a shift in community engagement patterns. A supplementary survey with 14 Python and JavaScript professionals validated these findings.
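As a rough illustration of the kind of textual comparison the abstract describes (answer length and words shared with the question), the sketch below computes both for a toy example. It is not the authors' implementation: the tokenizer, the definition of "shared words", the function names, and the sample texts are all assumptions made for illustration only.

```python
import re


def tokens(text: str) -> list[str]:
    """Lowercase word tokens. A deliberately simple tokenizer (assumption)."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def shared_word_ratio(question: str, answer: str) -> float:
    """Fraction of distinct answer words that also occur in the question.

    This is one possible reading of "shares words with the question";
    the paper may define its overlap metric differently.
    """
    q, a = set(tokens(question)), set(tokens(answer))
    return len(q & a) / len(a) if a else 0.0


def relative_change(new: float, old: float) -> float:
    """Signed relative change of `new` with respect to `old`.

    For example, -0.66 means `new` is 66% lower than `old`.
    """
    return (new - old) / old


# Hypothetical sample texts, not taken from the study's dataset.
question = "How do I reverse a list in Python?"
human = ("You can call the reverse method on the list, "
         "or use slicing to build a reversed copy of it.")
model = "Use list slicing to reverse a list in Python."

# Length compared in word tokens: negative means the model answer is shorter.
length_change = relative_change(len(tokens(model)), len(tokens(human)))

# Word overlap with the question, for each answer.
overlap_human = shared_word_ratio(question, human)
overlap_model = shared_word_ratio(question, model)

print(f"length change: {length_change:+.0%}")
print(f"overlap (human): {overlap_human:.2f}")
print(f"overlap (model): {overlap_model:.2f}")
```

A figure such as "66% shorter" would correspond to a `relative_change` of about -0.66 between average answer lengths, although the abstract does not say whether length is counted in words or characters.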
Related papers
- Exploring ChatGPT's Capabilities on Vulnerability Management [56.4403395100589]
We explore ChatGPT's capabilities on 6 tasks involving the complete vulnerability management process with a large-scale dataset containing 70,346 samples.
One notable example is ChatGPT's proficiency in tasks like generating titles for software bug reports.
Our findings reveal the difficulties encountered by ChatGPT and shed light on promising future directions.
arXiv Detail & Related papers (2023-11-11T11:01:13Z)
- Primacy Effect of ChatGPT [69.49920102917598]
We study the primacy effect of ChatGPT: the tendency to select labels at earlier positions as the answer.
We hope that our experiments and analyses provide additional insights into building more reliable ChatGPT-based solutions.
arXiv Detail & Related papers (2023-10-20T00:37:28Z)
- An empirical study of ChatGPT-3.5 on question answering and code maintenance [14.028497274245227]
A rising concern is whether ChatGPT will replace programmers and kill jobs.
We conducted an empirical study to systematically compare ChatGPT against programmers in question-answering and software-maintaining.
arXiv Detail & Related papers (2023-10-03T14:48:32Z)
- Is Stack Overflow Obsolete? An Empirical Study of the Characteristics of ChatGPT Answers to Stack Overflow Questions [7.065853028825656]
We conducted the first in-depth analysis of ChatGPT answers to programming questions on Stack Overflow.
We examined the correctness, consistency, comprehensiveness, and conciseness of ChatGPT answers.
Our analysis shows that 52% of ChatGPT answers contain incorrect information and 77% are verbose.
arXiv Detail & Related papers (2023-08-04T13:23:20Z)
- Are We Ready to Embrace Generative AI for Software Q&A? [25.749110480727765]
Stack Overflow, the world's largest software Q&A (SQA) website, is facing a significant traffic drop due to the emergence of generative AI techniques.
ChatGPT was banned by Stack Overflow only 6 days after its release.
To verify this, we conduct a comparative evaluation of human-written and ChatGPT-generated answers.
arXiv Detail & Related papers (2023-07-19T05:54:43Z)
- Evaluating Privacy Questions From Stack Overflow: Can ChatGPT Compete? [1.231476564107544]
ChatGPT has been used as an alternative to generate code or produce responses to developers' questions.
Our results show that most privacy-related questions are related to choice/consent, aggregation, and identification.
arXiv Detail & Related papers (2023-06-19T21:33:04Z)
- Chatbots put to the test in math and logic problems: A preliminary comparison and assessment of ChatGPT-3.5, ChatGPT-4, and Google Bard [68.8204255655161]
We use 30 questions that are clear, without any ambiguities, fully described with plain text only, and have a unique, well defined correct answer.
The answers are recorded and discussed, highlighting their strengths and weaknesses.
It was found that ChatGPT-4 outperforms ChatGPT-3.5 in both sets of questions.
arXiv Detail & Related papers (2023-05-30T11:18:05Z)
- ChatLog: Carefully Evaluating the Evolution of ChatGPT Across Time [54.18651663847874]
ChatGPT has achieved great success and can be considered to have acquired an infrastructural status.
Existing benchmarks encounter two challenges: (1) Disregard for periodical evaluation and (2) Lack of fine-grained features.
We construct ChatLog, an ever-updating dataset with large-scale records of diverse long-form ChatGPT responses for 21 NLP benchmarks from March, 2023 to now.
arXiv Detail & Related papers (2023-04-27T11:33:48Z)
- To ChatGPT, or not to ChatGPT: That is the question! [78.407861566006]
This study provides a comprehensive and contemporary assessment of the most recent techniques in ChatGPT detection.
We have curated a benchmark dataset consisting of prompts from ChatGPT and humans, including diverse questions from medical, open Q&A, and finance domains.
Our evaluation results demonstrate that none of the existing methods can effectively detect ChatGPT-generated content.
arXiv Detail & Related papers (2023-04-04T03:04:28Z)
- Is ChatGPT a General-Purpose Natural Language Processing Task Solver? [113.22611481694825]
Large language models (LLMs) have demonstrated the ability to perform a variety of natural language processing (NLP) tasks zero-shot.
Recently, the debut of ChatGPT has drawn a great deal of attention from the natural language processing (NLP) community.
It is not yet known whether ChatGPT can serve as a generalist model that can perform many NLP tasks zero-shot.
arXiv Detail & Related papers (2023-02-08T09:44:51Z)
- Features that Predict the Acceptability of Java and JavaScript Answers on Stack Overflow [5.332217496693262]
We studied the Stack Overflow dataset by analyzing questions and answers for the two most popular tags (Java and JavaScript).
Our findings reveal that the length of code in answers, reputation of users, similarity of the text between questions and answers, and the time lag between questions and answers have the highest predictive power for differentiating accepted and unaccepted answers.
arXiv Detail & Related papers (2021-01-08T03:09:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.