ChatGPT, Can You Generate Solutions for my Coding Exercises? An
Evaluation on its Effectiveness in an undergraduate Java Programming Course
- URL: http://arxiv.org/abs/2305.13680v1
- Date: Tue, 23 May 2023 04:38:37 GMT
- Authors: Eng Lieh Ouh, Benjamin Kok Siew Gan, Kyong Jin Shim, Swavek Wlodkowski
- Abstract summary: ChatGPT is a large-scale, deep learning-driven natural language processing model.
Our evaluation involves analyzing ChatGPT-generated solutions for 80 diverse programming exercises.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this study, we assess the efficacy of employing the ChatGPT language model
to generate solutions for coding exercises within an undergraduate Java
programming course. ChatGPT, a large-scale, deep learning-driven natural
language processing model, is capable of producing programming code based on
textual input. Our evaluation involves analyzing ChatGPT-generated solutions
for 80 diverse programming exercises and comparing them to the correct
solutions. Our findings indicate that ChatGPT accurately generates Java
programming solutions, which are characterized by high readability and
well-structured organization. Additionally, the model can produce alternative,
memory-efficient solutions. However, as a natural language processing model,
ChatGPT struggles with coding exercises containing non-textual descriptions or
class files, leading to invalid solutions. In conclusion, ChatGPT holds
potential as a valuable tool for students seeking to overcome programming
challenges and explore alternative approaches to solving coding problems. By
understanding its limitations, educators can design coding exercises that
minimize the potential for misuse as a cheating aid while maintaining their
validity as assessment tools.
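The evaluation described above, generating a solution and checking it against the correct one, can be sketched as follows. This is a minimal illustration only; the two solution functions and the test inputs are hypothetical stand-ins, not the authors' actual grading harness for the Java exercises.

```python
def reference_solution(n):
    # Instructor's correct solution: sum of the first n positive integers.
    return n * (n + 1) // 2

def generated_solution(n):
    # Hypothetical model-generated candidate (an iterative variant).
    return sum(range(1, n + 1))

def agreement_rate(candidate, reference, test_inputs):
    """Fraction of test inputs on which the candidate matches the reference."""
    matches = sum(candidate(x) == reference(x) for x in test_inputs)
    return matches / len(test_inputs)

rate = agreement_rate(generated_solution, reference_solution, range(10))
```

A candidate is counted correct here only when it agrees with the reference on every probed input, mirroring the paper's comparison of generated solutions against the correct ones.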
Related papers
- Beyond the Hype: A Cautionary Tale of ChatGPT in the Programming Classroom [0.0]
The paper offers insights for academics who teach programming on creating more challenging exercises and on engaging responsibly with ChatGPT to promote classroom integrity.
We analyzed various practical programming examples from past IS exercises and compared them with memos created by tutors and lecturers in a university setting.
arXiv Detail & Related papers (2024-06-16T23:52:37Z)
- CodeGRAG: Bridging the Gap between Natural Language and Programming Language via Graphical Retrieval Augmented Generation [58.84212778960507]
We propose CodeGRAG, a Graphical Retrieval Augmented Code Generation framework to enhance the performance of LLMs.
CodeGRAG builds a graphical view of code blocks based on their control flow and data flow to fill the gap between programming languages and natural language.
Various experiments and ablations on four datasets, covering both C++ and Python, validate the hard meta-graph prompt, the soft prompting technique, and the training objectives for the pretrained GNN expert.
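A toy illustration of a "graphical view" of a code block: nodes are top-level statements and edges follow sequential control flow. This is a hedged sketch using Python's standard `ast` module; CodeGRAG's actual graphs combine much richer control- and data-flow edges.

```python
import ast

def statement_flow_graph(source):
    """Return (node labels, sequential control-flow edges) for a code block."""
    tree = ast.parse(source)
    stmts = [type(node).__name__ for node in tree.body]
    # Sequential fall-through edges between consecutive statements.
    edges = [(i, i + 1) for i in range(len(stmts) - 1)]
    return stmts, edges

nodes, edges = statement_flow_graph("x = 1\ny = x + 1\nprint(y)")
```

Even this crude graph exposes structure (statement order and types) that a flat token sequence does not, which is the gap the graphical view is meant to bridge.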
arXiv Detail & Related papers (2024-05-03T02:48:55Z)
- Kattis vs. ChatGPT: Assessment and Evaluation of Programming Tasks in the Age of Artificial Intelligence [0.0]
The effectiveness of using large language models for solving programming tasks has been underexplored.
The present study examines ChatGPT's ability to generate code solutions at different difficulty levels for introductory programming courses.
Results contribute to the ongoing debate on the utility of AI-powered tools in programming education.
arXiv Detail & Related papers (2023-12-02T11:09:17Z)
- Large Language Models as Analogical Reasoners [155.9617224350088]
Chain-of-thought (CoT) prompting for language models demonstrates impressive performance across reasoning tasks.
We introduce a new prompting approach, analogical prompting, designed to automatically guide the reasoning process of large language models.
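The idea of analogical prompting, asking the model to recall relevant exemplars itself rather than supplying hand-written few-shot examples, can be sketched as simple prompt construction. The wording below is an illustrative assumption, not the paper's exact template.

```python
def analogical_prompt(problem):
    """Build a prompt that asks the model to self-generate related
    exemplars before solving the target problem."""
    return (
        f"Problem: {problem}\n\n"
        "Instructions:\n"
        "1. Recall three relevant problems and briefly describe their solutions.\n"
        "2. Then solve the problem above, reusing insights from those examples.\n"
    )

prompt = analogical_prompt("Sort an array of integers in O(n log n) time.")
```

The self-generated exemplars play the role that curated demonstrations play in standard few-shot chain-of-thought prompting.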
arXiv Detail & Related papers (2023-10-03T00:57:26Z)
- Refining ChatGPT-Generated Code: Characterizing and Mitigating Code Quality Issues [17.7880460531813]
We systematically study the quality of 4,066 ChatGPT-generated code snippets implemented in two popular programming languages.
We identify and characterize potential issues with the quality of ChatGPT-generated code.
We find that ChatGPT can partially address these challenges, improving code quality by more than 20%, but there are still limitations and opportunities for improvement.
arXiv Detail & Related papers (2023-07-24T08:14:22Z)
- Unmasking the giant: A comprehensive evaluation of ChatGPT's proficiency in coding algorithms and data structures [0.6990493129893112]
We evaluate ChatGPT's ability to generate correct solutions to the problems fed to it, its code quality, and nature of run-time errors thrown by its code.
We examine patterns in the test cases passed to gain insight into how far off ChatGPT's code is in these situations.
arXiv Detail & Related papers (2023-07-10T08:20:34Z)
- Is ChatGPT the Ultimate Programming Assistant -- How far is it? [11.943927095071105]
ChatGPT has received great attention: it can be used as a bot for discussing source code.
We present an empirical study of ChatGPT's potential as a fully automated programming assistant.
arXiv Detail & Related papers (2023-04-24T09:20:13Z)
- ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large Language Models in Multilingual Learning [70.57126720079971]
Large language models (LLMs) have emerged as among the most important breakthroughs in natural language processing (NLP).
This paper evaluates ChatGPT on 7 different tasks, covering 37 diverse languages with high, medium, low, and extremely low resources.
Our extensive experimental results show that ChatGPT performs worse than previous models on various NLP tasks and languages.
arXiv Detail & Related papers (2023-04-12T05:08:52Z)
- PanGu-Coder: Program Synthesis with Function-Level Language Modeling [47.63943623661298]
PanGu-Coder is a pretrained decoder-only language model adopting the PanGu-Alpha architecture for text-to-code generation.
We train PanGu-Coder using a two-stage strategy: the first stage employs Causal Language Modelling to pre-train on raw programming language data.
The second stage uses a combination of Causal Language Modelling and Masked Language Modelling to train on loosely curated pairs of natural language program definitions and code functions.
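The two-stage data setup described above can be sketched as follows. The formatting and tags are illustrative assumptions about how raw code versus (description, function) pairs might be serialized for a causal language model; they are not PanGu-Coder's actual preprocessing.

```python
def stage1_example(code):
    # Stage 1: causal language modelling over raw programming-language data,
    # so the training text is just the code itself.
    return code

def stage2_example(description, code):
    # Stage 2: loosely curated pairs of natural-language program definitions
    # and code functions; the model learns to continue from the description.
    # The <descr>/<code> tags are hypothetical separators.
    return f"<descr>{description}</descr>\n<code>{code}</code>"

sample = stage2_example("Return the maximum of two ints.",
                        "int max(int a, int b) { return a > b ? a : b; }")
```

Structuring stage-2 text as description-then-code is what orients the model toward text-to-code generation rather than generic code completion.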
arXiv Detail & Related papers (2022-07-22T18:08:16Z)
- AVATAR: A Parallel Corpus for Java-Python Program Translation [77.86173793901139]
Program translation refers to migrating source code from one language to another.
We present AVATAR, a collection of 9,515 programming problems and their solutions written in two popular languages, Java and Python.
arXiv Detail & Related papers (2021-08-26T05:44:20Z)
- Measuring Coding Challenge Competence With APPS [54.22600767666257]
We introduce APPS, a benchmark for code generation.
Our benchmark includes 10,000 problems, which range from having simple one-line solutions to being substantial algorithmic challenges.
Recent models such as GPT-Neo can pass approximately 15% of the test cases of introductory problems.
arXiv Detail & Related papers (2021-05-20T17:58:42Z)
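The test-case pass metric behind results like "GPT-Neo passes approximately 15% of the test cases of introductory problems" can be sketched as a simple fraction. The candidate function and test cases below are hypothetical, not drawn from the APPS benchmark itself.

```python
def passed_fraction(candidate, test_cases):
    """Fraction of (input, expected_output) pairs the candidate gets right."""
    passed = sum(candidate(inp) == expected for inp, expected in test_cases)
    return passed / len(test_cases)

# Hypothetical problem: given a pair of ints, return their sum.
cases = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]
frac = passed_fraction(lambda args: args[0] + args[1], cases)
```

Aggregating this fraction over a benchmark's problems gives the kind of headline pass rate the listed papers report.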
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.