Solsmith: Solidity Random Program Generator for Compiler Testing
- URL: http://arxiv.org/abs/2506.03909v1
- Date: Wed, 04 Jun 2025 13:04:17 GMT
- Title: Solsmith: Solidity Random Program Generator for Compiler Testing
- Authors: Lantian Li, Zhihao Liu, Zhongxing Yu
- Abstract summary: This paper designs and implements Solsmith, a test program generator aimed at uncovering defects in Solidity compilers. It tests compiler correctness by generating valid and diverse Solidity programs. Preliminary results show that Solsmith can generate the expected test programs and uncover four confirmed defects in Solidity compilers.
- Score: 8.14179966625145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Smart contracts are computer programs that run on blockchain platforms, with Solidity being the most widely used language for their development. As blockchain technology advances, smart contracts have become increasingly important across various fields. For smart contracts to operate correctly, the correctness of the compiler is crucial. Although some research efforts have been devoted to testing Solidity compilers, they primarily focus on testing methods and do not address the core issue of generating test programs. To fill this gap, this paper designs and implements Solsmith, a test program generator specifically aimed at uncovering defects in Solidity compilers. It tests compiler correctness by generating valid and diverse Solidity programs. We have designed a series of program generation strategies tailored to Solidity, including enabling optimizations more frequently, avoiding undefined behaviour, and mitigating behavioural differences caused by intermediate representations. To validate Solsmith, we assess the test programs it generates using differential testing. The preliminary results show that Solsmith can generate the expected test programs and uncover four confirmed defects in Solidity compilers, demonstrating its effectiveness and potential.
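To make the differential-testing setup concrete, below is a minimal sketch (in Python, not from the paper) of the general technique: compile the same generated program under every combination of optimizer and code-generation pipeline via solc's standard-JSON interface, then look for divergence. The contract, file names, and settings shown are illustrative placeholders, and the harness assumes a recent solc 0.8.x binary (with via-IR support) on PATH; Solsmith's actual implementation may differ.

```python
# Minimal sketch of differential testing across solc configurations.
# Illustrative only, not Solsmith's implementation; assumes solc 0.8.x on
# PATH. A full harness would additionally execute each compiled variant on
# identical inputs and compare the observable results.
import itertools
import json
import subprocess

# Stand-in for a randomly generated, valid Solidity test program.
CONTRACT = """
pragma solidity ^0.8.0;
contract Test {
    function f(uint256 x) public pure returns (uint256) {
        unchecked { return x * 3 + 7; }  // 'unchecked' avoids overflow reverts
    }
}
"""

def compile_variant(optimize: bool, via_ir: bool) -> dict:
    """Compile CONTRACT under one configuration via solc's standard-JSON
    interface and return the parsed compiler output."""
    request = {
        "language": "Solidity",
        "sources": {"test.sol": {"content": CONTRACT}},
        "settings": {
            "optimizer": {"enabled": optimize, "runs": 200},
            "viaIR": via_ir,  # legacy codegen vs. the IR-based pipeline
            "outputSelection": {"*": {"*": ["evm.deployedBytecode.object"]}},
        },
    }
    proc = subprocess.run(["solc", "--standard-json"],
                          input=json.dumps(request),
                          capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)

# Compile every (optimizer, pipeline) combination. A program that compiles
# under one configuration but fails (or crashes the compiler) under another
# is already a defect signal; behavioural divergence at run time would be
# caught by executing the variants on the same inputs and comparing results.
for optimize, via_ir in itertools.product([False, True], repeat=2):
    out = compile_variant(optimize, via_ir)
    errors = [e for e in out.get("errors", []) if e.get("severity") == "error"]
    status = "FAILED" if errors else "ok"
    print(f"optimizer={optimize!s:5} viaIR={via_ir!s:5} -> {status}")
```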
Related papers
- Compiler Optimization Testing Based on Optimization-Guided Equivalence Transformations [3.2987550056134873]
We propose a metamorphic testing approach inspired by compiler optimizations.
Our approach first employs tailored code construction strategies to generate input programs that satisfy optimization conditions.
By comparing the outputs of pre- and post-transformation programs, this approach effectively identifies incorrect-optimization bugs (a minimal sketch of this comparison appears after this list).
arXiv Detail & Related papers (2025-04-06T01:37:57Z)
- SolBench: A Dataset and Benchmark for Evaluating Functional Correctness in Solidity Code Completion and Repair [51.0686873716938]
We introduce SolBench, a benchmark for evaluating the functional correctness of Solidity smart contracts generated by code completion models.
We propose a Retrieval-Augmented Code Repair framework to verify functional correctness of smart contracts.
Results show that code repair and retrieval techniques effectively enhance the correctness of smart contract completion while reducing computational costs.
arXiv Detail & Related papers (2025-03-03T01:55:20Z)
- Learning to Solve and Verify: A Self-Play Framework for Code and Test Generation [69.62857948698436]
Recent advances in large language models (LLMs) have improved their performance on coding benchmarks.
However, improvement is plateauing due to the exhaustion of readily available high-quality data.
We propose Sol-Ver, a self-play solver-verifier framework that jointly improves a single model's code and test generation capacity.
arXiv Detail & Related papers (2025-02-20T18:32:19Z)
- Finding Missed Code Size Optimizations in Compilers using LLMs [1.90019787465083]
We develop a novel testing approach which combines large language models with a series of differential testing strategies.
Our approach requires fewer than 150 lines of code to implement.
To date we have reported 24 confirmed bugs in production compilers.
arXiv Detail & Related papers (2024-12-31T21:47:46Z)
- Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification [52.095460362197336]
Large language models (LLMs) struggle with consistent and accurate reasoning.
LLMs are trained primarily on correct solutions, reducing their ability to detect and learn from errors.
We propose a novel collaborative method integrating Chain-of-Thought (CoT) and Program-of-Thought (PoT) solutions for verification.
arXiv Detail & Related papers (2024-10-05T05:21:48Z)
- Towards Understanding the Bugs in Solidity Compiler [11.193701473232851]
This paper presents the first systematic study on 533 Solidity compiler bugs.
We examine their characteristics (including symptoms, root causes, and distribution) and their triggering test cases.
To study their limitations, we evaluate three Solidity compiler fuzzers.
arXiv Detail & Related papers (2024-07-08T14:22:50Z)
- Evolutionary Generative Fuzzing for Differential Testing of the Kotlin Compiler [14.259471945857431]
We investigate the effectiveness of differential testing in finding bugs within the Kotlin compilers developed at JetBrains.
We propose a black-box generative approach that creates input programs for the K1 and K2 compilers.
Our case study shows that the proposed approach effectively detects bugs in K1 and K2; these bugs have been confirmed and (some) fixed by JetBrains developers.
arXiv Detail & Related papers (2024-01-12T16:01:12Z)
- Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual re-writing and engineering efforts.
Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness.
Guess & Sketch extracts alignment and confidence information from features of the LM, then passes it to a symbolic solver to resolve semantic equivalence.
arXiv Detail & Related papers (2023-09-25T15:42:18Z)
- Teaching Large Language Models to Self-Debug [62.424077000154945]
Large language models (LLMs) have achieved impressive performance on code generation.
We propose Self-Debugging, which teaches a large language model to debug its predicted program via few-shot demonstrations.
arXiv Detail & Related papers (2023-04-11T10:43:43Z)
- Effective Random Test Generation for Deep Learning Compilers [16.065653480978092]
Isra is a domain-specific constraint solver that resolves the constraints from the semantic specifications without backtracking (see the constraint-resolution sketch after this list).
We implement and apply our approach to three popular real-world deep learning compilers, including TVM, Glow, and a commercial compiler named SophGo.
Isra is more effective than the state-of-the-art approaches and the baseline approaches at constructing valid test inputs for compiler-bug detection.
arXiv Detail & Related papers (2023-02-02T03:00:36Z)
- Measuring Coding Challenge Competence With APPS [54.22600767666257]
We introduce APPS, a benchmark for code generation.
Our benchmark includes 10,000 problems, ranging from simple one-line solutions to substantial algorithmic challenges.
Recent models such as GPT-Neo can pass approximately 15% of the test cases of introductory problems.
arXiv Detail & Related papers (2021-05-20T17:58:42Z)
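As referenced in the metamorphic-testing entry above, here is a minimal, self-contained sketch of the pre/post-transformation comparison, using Python's own bytecode compiler as a stand-in system under test; the expression and transformation are illustrative, not taken from that paper.

```python
# Self-contained sketch of metamorphic compiler testing: semantically
# equivalent sources must yield identical outputs. Python's own compiler
# stands in for the compiler under test; the transformation is illustrative.

def compile_and_run(source: str, x: int) -> int:
    """Compile an expression in x and evaluate it (the system under test)."""
    code = compile(source, "<metamorphic-test>", "eval")
    return eval(code, {"x": x})

original = "x * 2 + 1"
transformed = "(x + x) + 1"  # equivalence transformation: x*2 -> x+x

for x in [0, 1, -5, 10**9]:
    a = compile_and_run(original, x)
    b = compile_and_run(transformed, x)
    # Any divergence between equivalent programs indicates a miscompilation.
    assert a == b, f"divergence on x={x}: {a} != {b}"
print("no divergence observed on the sampled inputs")
```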
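Finally, a minimal sketch of the "constraint resolution without backtracking" idea from the deep-learning-compiler entry above: rather than generating inputs and rejecting invalid ones, each random choice is made only after the constraints it must satisfy are known, so generation never has to undo a decision. The matmul example and names are illustrative, not Isra's actual specification language.

```python
# Sketch of constraint-directed generation without backtracking: derive
# constrained values from earlier choices instead of guess-and-reject.
# The matmul constraint (inner dimensions must match) is illustrative.
import random

def gen_matmul_case(max_dim: int = 8) -> tuple[tuple[int, int], tuple[int, int]]:
    """Generate shapes for a valid matrix multiplication A @ B."""
    m = random.randint(1, max_dim)  # free choice
    k = random.randint(1, max_dim)  # free choice
    n = random.randint(1, max_dim)  # free choice
    # The shared inner dimension k is chosen once and reused, so the pair
    # (m, k) x (k, n) is valid by construction; no case is ever rejected.
    return (m, k), (k, n)

for _ in range(3):
    a, b = gen_matmul_case()
    assert a[1] == b[0]  # the semantic constraint holds by construction
    print(f"A{a} @ B{b} -> C({a[0]}, {b[1]})")
```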