Related papers: CoCoST: Automatic Complex Code Generation with Online Searching and Correctness Testing

URL: http://arxiv.org/abs/2403.13583v2
Date: Mon, 1 Jul 2024 09:59:47 GMT
Title: CoCoST: Automatic Complex Code Generation with Online Searching and Correctness Testing
Authors: Xinyi He, Jiaru Zou, Yun Lin, Mengyu Zhou, Shi Han, Zejian Yuan, Dongmei Zhang
Abstract summary: Large Language Models have revolutionized code generation by converting natural language descriptions into executable code. The CoCoST framework enhances complex code generation by searching online for more information with planned queries and by testing correctness to refine the code. CoCoST is validated through rigorous experiments on the DS-1000 and ClassEval datasets.
Score: 51.00909683314142
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Models have revolutionized code generation ability by converting natural language descriptions into executable code. However, generating complex code within real-world scenarios remains challenging due to intricate structures, subtle bugs, understanding of advanced data types, and lack of supplementary contents. To address these challenges, we introduce the CoCoST framework, which enhances complex code generation by online searching for more information with planned queries and correctness testing for code refinement. Moreover, CoCoST serializes the complex inputs and outputs to improve comprehension and generates test cases to ensure the adaptability for real-world applications. CoCoST is validated through rigorous experiments on the DS-1000 and ClassEval datasets. Experimental results show that CoCoST substantially improves the quality of complex code generation, highlighting its potential to enhance the practicality of LLMs in generating complex code.
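The abstract describes a loop of searching for extra information, generating code, and refining it against test cases. A minimal sketch of such a loop follows. It is not the CoCoST implementation: `run_tests`, `refine_with_tests`, `fake_search`, and `fake_generate` are hypothetical names, and the search and generation steps are stubbed out where a real system would call a search engine and an LLM.

```python
from typing import Callable, List, Optional, Tuple

TestCase = Tuple[tuple, object]  # (positional arguments, expected result)


def run_tests(source: str, entry_point: str, tests: List[TestCase]) -> Optional[str]:
    """Execute candidate source and return an error message, or None if all tests pass."""
    namespace: dict = {}
    try:
        # NOTE: exec on generated code is unsafe; a real system would sandbox this.
        exec(source, namespace)
        func = namespace[entry_point]
        for args, expected in tests:
            actual = func(*args)
            if actual != expected:
                return f"{entry_point}{args!r} returned {actual!r}, expected {expected!r}"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def refine_with_tests(
    generate: Callable[[str, List[str]], str],
    search: Callable[[str], List[str]],
    tests: List[TestCase],
    task: str,
    entry_point: str,
    max_rounds: int = 3,
) -> Optional[str]:
    """Return the first candidate that passes every test, or None after max_rounds."""
    context = list(search(task))  # retrieved snippets for the query
    for _ in range(max_rounds):
        source = generate(task, context)
        error = run_tests(source, entry_point, tests)
        if error is None:
            return source
        # Feed the failure back so the next generation can correct it.
        context.append(f"previous attempt failed: {error}")
    return None


# Stubs standing in for a search engine and an LLM (assumptions for illustration).
def fake_search(query: str) -> List[str]:
    return ["hint: sort the values before picking the middle"]


def fake_generate(task: str, context: List[str]) -> str:
    if any(c.startswith("previous attempt failed") for c in context):
        return (
            "def median(values):\n"
            "    s = sorted(values)\n"
            "    n = len(s)\n"
            "    mid = n // 2\n"
            "    return s[mid] if n % 2 else (s[mid - 1] + s[mid]) / 2\n"
        )
    # First attempt forgets to sort, so the tests catch it.
    return "def median(values):\n    return values[len(values) // 2]\n"


median_tests: List[TestCase] = [(([3, 1, 2],), 2), (([4, 1, 3, 2],), 2.5)]
final_source = refine_with_tests(
    fake_generate, fake_search, median_tests, "compute the median", "median"
)
```

In this toy run the first candidate fails the tests, the failure message is appended to the context, and the second candidate passes. How CoCoST plans its queries, serializes inputs and outputs, and generates its test cases is not specified in the abstract and is not modeled here.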

Related papers

IFEvalCode: Controlled Code Generation [69.28317223249358]
The paper introduces forward and backward constraints generation to improve the instruction-following capabilities of Code LLMs. The authors present IFEvalCode, a multilingual benchmark comprising 1.6K test samples across seven programming languages.
arXiv Detail & Related papers (2025-07-30T08:08:48Z)
CodeEvo: Interaction-Driven Synthesis of Code-centric Data through Hybrid and Iterative Feedback [21.627909324788597]
Acquiring high-quality instruction-code pairs is essential for training Large Language Models. We propose CodeEvo, a framework that synthesizes code data through iterative interactions between two LLM agents.
arXiv Detail & Related papers (2025-07-25T16:12:51Z)
ComplexVCoder: An LLM-Driven Framework for Systematic Generation of Complex Verilog Code [9.68747119462712]
We present ComplexVCoder, an open-source framework that enhances the generation quality and efficiency of complex Verilog code. Specifically, we introduce a two-stage generation mechanism, which leverages an intermediate representation to enable a more structured transition from natural language descriptions to intricate Verilog designs. In addition, we introduce a rule-based alignment method and a domain-specific retrieval-augmented generation (RAG) to further improve the correctness of the synthesized code.
arXiv Detail & Related papers (2025-04-29T11:22:06Z)
KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding [49.56049319037421]
KodCode is a synthetic dataset that addresses the persistent challenge of acquiring high-quality, verifiable training data. It comprises question-solution-test triplets that are systematically validated via a self-verification procedure. This pipeline yields a large-scale, robust and diverse coding dataset.
arXiv Detail & Related papers (2025-03-04T19:17:36Z)
EpiCoder: Encompassing Diversity and Complexity in Code Generation [49.170195362149386]
We introduce a novel feature tree-based synthesis framework inspired by Abstract Syntax Trees (AST). Unlike AST, which captures the syntactic structure of code, our framework models semantic relationships between code elements. We fine-tuned widely-used base models to create the EpiCoder series, achieving state-of-the-art performance at both the function and file levels.
arXiv Detail & Related papers (2025-01-08T18:58:15Z)
Constraint Back-translation Improves Complex Instruction Following of Large Language Models [55.60192044049083]
Large language models (LLMs) struggle to follow instructions with complex constraints in format, length, etc. Previous works conduct post-training on complex instruction-response pairs generated by feeding complex instructions to advanced LLMs. We propose a novel data generation technique, constraint back-translation.
arXiv Detail & Related papers (2024-10-31T17:42:26Z)
An Empirical Study on Self-correcting Large Language Models for Data Science Code Generation [1.335664823620186]
Large Language Models (LLMs) have recently advanced many applications on software engineering tasks. CoT-SelfEvolve iteratively and automatically refines code through a self-correcting process.
arXiv Detail & Related papers (2024-08-28T09:19:09Z)
Genetic Instruct: Scaling up Synthetic Generation of Coding Instructions for Large Language Models [54.51932175059004]
We introduce a scalable method for generating synthetic instructions to enhance the code generation capability of Large Language Models. The proposed algorithm, Genetic-Instruct, mimics evolutionary processes, utilizing self-instruction to create numerous synthetic samples from a limited number of seeds.
arXiv Detail & Related papers (2024-07-29T20:42:59Z)
Case2Code: Scalable Synthetic Data for Code Generation [105.89741089673575]
Large Language Models (LLMs) have shown outstanding breakthroughs in code generation. Recent work improves code LLMs by training on synthetic data generated by some powerful LLMs. We propose a Case2Code task by exploiting the expressiveness and correctness of programs.
arXiv Detail & Related papers (2024-07-17T11:35:00Z)
NoviCode: Generating Programs from Natural Language Utterances by Novices [59.71218039095155]
We present NoviCode, a novel NL Programming task which takes as input an API and a natural language description by a novice non-programmer. We show that NoviCode is indeed a challenging task in the code synthesis domain, and that generating complex code from non-technical instructions goes beyond the current Text-to-Code paradigm.
arXiv Detail & Related papers (2024-07-15T11:26:03Z)
MapCoder: Multi-Agent Code Generation for Competitive Problem Solving [3.3856216159724983]
We introduce a new approach to code generation tasks leveraging multi-agent prompting. Our framework, MapCoder, consists of four LLM agents specifically designed to emulate the stages of program synthesis. Our method consistently delivers superior performance across various programming languages.
arXiv Detail & Related papers (2024-05-18T22:10:15Z)
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback [58.20547418182074]
We introduce StepCoder, a novel framework for code generation, consisting of two main components. CCCS addresses the exploration challenge by breaking the long-sequence code generation task into a Curriculum of Code Completion Subtasks. FGO only optimizes the model by masking the unexecuted code segments to provide Fine-Grained Optimization. Our method improves the ability to explore the output space and outperforms state-of-the-art approaches in corresponding benchmarks.
arXiv Detail & Related papers (2024-02-02T13:14:31Z)
CodeComplex: A Time-Complexity Dataset for Bilingual Source Codes [6.169110187130671]
We introduce CodeComplex, a novel source code dataset where each code is manually annotated with a corresponding worst-case time complexity. To the best of our knowledge, CodeComplex stands as the most extensive code dataset tailored for predicting complexity. We present the outcomes of our experiments employing various baseline models, leveraging state-of-the-art neural models in code comprehension.
arXiv Detail & Related papers (2024-01-16T06:54:44Z)
When Do Program-of-Thoughts Work for Reasoning? [51.2699797837818]
We propose complexity-impacted reasoning score (CIRS) to measure correlation between code and reasoning abilities. Specifically, we use the abstract syntax tree to encode the structural information and calculate logical complexity. Code will be integrated into the EasyInstruct framework at https://github.com/zjunlp/EasyInstruct.
arXiv Detail & Related papers (2023-08-29T17:22:39Z)
COCO: Testing Code Generation Systems via Concretized Instructions [33.13427092832396]
COCO is a technique to test the robustness of code generation systems. It exploits the usage scenario of code generation systems to make the original programming instruction more concrete. We evaluated COCO on eight advanced code generation systems, including commercial tools such as Copilot and ChatGPT.
arXiv Detail & Related papers (2023-08-25T11:49:27Z)

This list is automatically generated from the titles and abstracts of the papers on this site.