A Comparative Study of Code Generation using ChatGPT 3.5 across 10
Programming Languages
- URL: http://arxiv.org/abs/2308.04477v1
- Date: Tue, 8 Aug 2023 15:02:32 GMT
- Title: A Comparative Study of Code Generation using ChatGPT 3.5 across 10
Programming Languages
- Authors: Alessio Buscemi
- Abstract summary: Large Language Models (LLMs) are advanced Artificial Intelligence (AI) systems that have undergone extensive training.
This research investigates the coding proficiency of ChatGPT 3.5, an LLM released by OpenAI in November 2022.
The model's skill in creating code snippets is evaluated across 10 different programming languages and 4 software domains.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) are advanced Artificial Intelligence (AI)
systems that have undergone extensive training using large datasets in order to
understand and produce language that closely resembles that of humans. These
models have reached a level of proficiency where they are capable of
successfully completing university exams across several disciplines and
generating functional code to handle novel problems. This research investigates
the coding proficiency of ChatGPT 3.5, an LLM released by OpenAI in November
2022 that has gained significant recognition for its impressive text-generation
and code-creation capabilities. The model's skill in creating code snippets is
evaluated across 10 different programming languages and 4 software domains.
Based on the findings of this research,
major unexpected behaviors and limitations of the model have been identified.
This study aims to identify potential areas for development and examine the
ramifications of automated code generation on the evolution of programming
languages and on the tech industry.
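To make the evaluation setup concrete, below is a minimal sketch of how such a cross-language benchmark could be driven. It is not the paper's actual harness: the prompt wording, the task, and the generate_snippet helper are illustrative assumptions, the language list is a stand-in (the paper's exact ten languages may differ), and it presumes access to the OpenAI chat completions API.

```python
# Hypothetical sketch of a cross-language code-generation benchmark.
# NOT the paper's actual harness: prompts, language list, and helper
# names are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Stand-in language list; the paper's exact set of 10 may differ.
LANGUAGES = ["Python", "C", "C++", "Java", "Go",
             "JavaScript", "R", "Ruby", "Swift", "Julia"]

TASK = "write a function that returns the n-th Fibonacci number."

def generate_snippet(language: str, task: str) -> str:
    """Ask the model for a code snippet in the given language."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user",
                   "content": f"In {language}, {task} Return only code."}],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    for lang in LANGUAGES:
        snippet = generate_snippet(lang, TASK)
        # A real harness would now compile/run each snippet against tests,
        # per language and per software domain; here we only print it.
        print(f"--- {lang} ---\n{snippet}\n")
```

A real harness would then compile or execute each returned snippet against test cases, which is where pass/fail findings of the kind reported in the paper would come from.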
Related papers
- Code Generation and Algorithmic Problem Solving Using Llama 3.1 405B [0.0]
Llama-driven code generation can translate natural language prompts into executable code across multiple programming languages.
Llama can serve as a versatile tool for developers of all skill levels, improving productivity and efficiency in software development.
The potential implications for education, industry, and the future of coding practices are also discussed.
arXiv Detail & Related papers (2024-09-26T13:29:20Z)
- SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models [54.78329741186446]
We propose a novel paradigm that uses a code-based critic model to guide steps including question-code data construction, quality control, and complementary evaluation.
Experiments across both in-domain and out-of-domain benchmarks in English and Chinese demonstrate the effectiveness of the proposed paradigm.
arXiv Detail & Related papers (2024-08-28T06:33:03Z)
- CodeGRAG: Bridging the Gap between Natural Language and Programming Language via Graphical Retrieval Augmented Generation [58.84212778960507]
We propose CodeGRAG, a Graphical Retrieval Augmented Code Generation framework to enhance the performance of LLMs.
CodeGRAG builds a graphical view of code blocks from their control flow and data flow to bridge the gap between programming languages and natural language (see the sketch after this list).
Various experiments and ablations on four datasets covering both C++ and Python validate the hard meta-graph prompt, the soft prompting technique, and the effectiveness of the objectives for the pretrained GNN expert.
arXiv Detail & Related papers (2024-05-03T02:48:55Z)
- L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models [102.00201523306986]
We present L2CEval, a systematic evaluation of the language-to-code generation capabilities of large language models (LLMs).
We analyze the factors that potentially affect their performance, such as model size, pretraining data, instruction tuning, and different prompting methods.
In addition to assessing model performance, we measure confidence calibration for the models and conduct human evaluations of the output programs.
arXiv Detail & Related papers (2023-09-29T17:57:00Z)
- Automatic Generation of Programming Exercises and Code Explanations with Large Language Models [4.947560475228859]
OpenAI Codex is a recent large language model from the GPT-3 family that can translate code into natural language.
We explore the natural language generation capabilities of Codex in two different phases of the life of a programming exercise.
We find the majority of this automatically generated content both novel and sensible, and in many cases ready to use as is.
arXiv Detail & Related papers (2022-06-03T11:00:43Z)
- MCoNaLa: A Benchmark for Code Generation from Multiple Natural Languages [76.93265104421559]
We benchmark code generation from natural language commands extending beyond English.
We annotated a total of 896 NL-code pairs in three languages: Spanish, Japanese, and Russian.
While the difficulties vary across these three languages, all systems lag significantly behind their English counterparts.
arXiv Detail & Related papers (2022-03-16T04:21:50Z)
- Competition-Level Code Generation with AlphaCode [74.87216298566942]
We introduce AlphaCode, a system for code generation that can create novel solutions to problems that require deeper reasoning.
In simulated evaluations on recent programming competitions on the Codeforces platform, AlphaCode achieved an average ranking in the top 54.3%.
arXiv Detail & Related papers (2022-02-08T23:16:31Z)
- Project CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks [11.10732802304274]
Project CodeNet consists of 14M code samples and about 500M lines of code in 55 different programming languages.
Project CodeNet is not only unique in its scale, but also in the diversity of coding tasks it can help benchmark.
arXiv Detail & Related papers (2021-05-25T00:13:29Z)
- Measuring Coding Challenge Competence With APPS [54.22600767666257]
We introduce APPS, a benchmark for code generation.
Our benchmark includes 10,000 problems, which range from having simple one-line solutions to being substantial algorithmic challenges.
Recent models such as GPT-Neo can pass approximately 15% of the test cases of introductory problems.
arXiv Detail & Related papers (2021-05-20T17:58:42Z) - Automated Source Code Generation and Auto-completion Using Deep
Learning: Comparing and Discussing Current Language-Model-Related Approaches [0.0]
This paper compares different deep learning architectures to create and use language models based on programming code.
We discuss each approach's strengths and weaknesses and the gaps we find when evaluating the language models or applying them in a real programming context.
arXiv Detail & Related papers (2020-09-16T15:17:04Z)
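As a small illustration of the graph idea referenced in the CodeGRAG entry above, the sketch below extracts a crude control-flow/data-flow edge list from a Python snippet using the standard ast module. The toy_flow_graph function and its edge labels are hypothetical; CodeGRAG's actual graph construction is more sophisticated and language-aware.

```python
# Toy illustration (referenced from the CodeGRAG entry above) of building
# a crude graph view of a code block. This is NOT CodeGRAG's pipeline;
# it only shows the kind of structure such methods extract.
import ast

def toy_flow_graph(source: str) -> list[tuple[str, str, str]]:
    """Return (from_node, edge_kind, to_node) triples for a Python snippet."""
    tree = ast.parse(source)
    edges = []
    for node in ast.walk(tree):
        # Control flow: statement order inside each block body.
        body = getattr(node, "body", None)
        if isinstance(body, list):
            for a, b in zip(body, body[1:]):
                edges.append((type(a).__name__, "next", type(b).__name__))
        # Data flow: assignments link the value expression to its targets.
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    edges.append((ast.dump(node.value)[:20], "defines", target.id))
    return edges

snippet = """
x = 1
if x > 0:
    y = x + 1
"""
for edge in toy_flow_graph(snippet):
    print(edge)
```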
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.