An Empirical Study of AI-based Smart Contract Creation
- URL: http://arxiv.org/abs/2308.02955v2
- Date: Sat, 19 Aug 2023 19:29:42 GMT
- Title: An Empirical Study of AI-based Smart Contract Creation
- Authors: Rabimba Karanjai, Edward Li, Lei Xu, Weidong Shi
- Abstract summary: Large language models (LLMs) like ChatGPT and Google Palm2 for smart contract generation seem to be the first well-established instance of an AI pair programmer.
The main objective of this study is to assess the quality of generated code provided by LLMs for smart contracts.
- Score: 4.801455786801489
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The introduction of large language models (LLMs) like ChatGPT and Google
Palm2 for smart contract generation seems to be the first well-established
instance of an AI pair programmer. LLMs have access to a large number of
open-source smart contracts, enabling them to utilize more extensive code in
Solidity than other code generation tools. Although the initial and informal
assessments of LLMs for smart contract generation are promising, a systematic
evaluation is needed to explore the limits and benefits of these models. The
main objective of this study is to assess the quality of generated code
provided by LLMs for smart contracts. We also aim to evaluate the impact of the
quality and variety of input parameters fed to LLMs. To achieve this aim, we
created an experimental setup for evaluating the generated code in terms of
validity, correctness, and efficiency. Our study finds crucial evidence of
security bugs getting introduced in the generated smart contracts as well as
the overall quality and correctness of the code getting impacted. However, we
also identified the areas where it can be improved. The paper also proposes
several potential research directions to improve the process, quality and
safety of generated smart contract codes.
Related papers
- Leveraging Large Language Models and Machine Learning for Smart Contract Vulnerability Detection [0.0]
We train and test machine learning algorithms to classify smart contract codes according to type in order to compare model performance.
Our research combines machine learning and large language models to provide a rich and interpretable framework for detecting different smart contract vulnerabilities.
arXiv Detail & Related papers (2025-01-04T08:32:53Z) - Helping LLMs Improve Code Generation Using Feedback from Testing and Static Analysis [3.892345568697058]
Large Language Models (LLMs) are one of the most promising developments in the field of artificial intelligence.
Developers routinely ask LLMs to generate code snippets, increasing productivity but also introducing ownership, privacy, correctness, and security issues.
Previous work highlighted how code generated by commercial LLMs is often not safe, containing vulnerabilities, bugs, and code smells.
arXiv Detail & Related papers (2024-12-19T13:34:14Z) - OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models [70.72097493954067]
Large language models (LLMs) for code have become indispensable in various domains, including code generation, reasoning tasks and agent systems.
While open-access code LLMs are increasingly approaching the performance levels of proprietary models, high-quality code LLMs remain limited.
We introduce OpenCoder, a top-tier code LLM that not only achieves performance comparable to leading models but also serves as an "open cookbook" for the research community.
arXiv Detail & Related papers (2024-11-07T17:47:25Z) - Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification [52.095460362197336]
Large language models (LLMs) struggle with consistent and accurate reasoning.
LLMs are trained primarily on correct solutions, reducing their ability to detect and learn from errors.
We propose a novel collaborative method integrating Chain-of-Thought (CoT) and Program-of-Thought (PoT) solutions for verification.
arXiv Detail & Related papers (2024-10-05T05:21:48Z) - What's Wrong with Your Code Generated by Large Language Models? An Extensive Study [80.18342600996601]
Large language models (LLMs) produce code that is shorter yet more complicated as compared to canonical solutions.
We develop a taxonomy of bugs for incorrect codes that includes three categories and 12 sub-categories, and analyze the root cause for common bug types.
We propose a novel training-free iterative method that introduces self-critique, enabling LLMs to critique and correct their generated code based on bug types and compiler feedback.
arXiv Detail & Related papers (2024-07-08T17:27:17Z) - Efficacy of Various Large Language Models in Generating Smart Contracts [0.0]
This study analyzes the application of code-generating Large Language Models in the creation of Solidity smart contracts on the immutable.
We also discovered a novel way of generating smart contracts through prompting new strategies.
arXiv Detail & Related papers (2024-06-28T17:31:47Z) - Software Vulnerability and Functionality Assessment using LLMs [0.8057006406834466]
We investigate whether Large Language Models (LLMs) can aid with code reviews.
Our investigation focuses on two tasks that we argue are fundamental to good reviews.
arXiv Detail & Related papers (2024-03-13T11:29:13Z) - FAC$^2$E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition [56.76951887823882]
Large language models (LLMs) are primarily evaluated by overall performance on various text understanding and generation tasks.
We present FAC$2$E, a framework for Fine-grAined and Cognition-grounded LLMs' Capability Evaluation.
arXiv Detail & Related papers (2024-02-29T21:05:37Z) - If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code
Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code)
Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z) - PrAIoritize: Automated Early Prediction and Prioritization of Vulnerabilities in Smart Contracts [1.081463830315253]
Smart contracts are prone to numerous security threats due to undisclosed vulnerabilities and code weaknesses.
Efficient prioritization is crucial for smart contract security.
Our research aims to provide an automated approach, PrAIoritize, for prioritizing and predicting critical code weaknesses.
arXiv Detail & Related papers (2023-08-21T23:30:39Z) - CodeLMSec Benchmark: Systematically Evaluating and Finding Security
Vulnerabilities in Black-Box Code Language Models [58.27254444280376]
Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks.
Training data for these models is usually collected from the Internet (e.g., from open-source repositories) and is likely to contain faults and security vulnerabilities.
This unsanitized training data can cause the language models to learn these vulnerabilities and propagate them during the code generation procedure.
arXiv Detail & Related papers (2023-02-08T11:54:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.