Planning with Large Language Models for Code Generation
- URL: http://arxiv.org/abs/2303.05510v1
- Date: Thu, 9 Mar 2023 18:59:47 GMT
- Title: Planning with Large Language Models for Code Generation
- Authors: Shun Zhang, Zhenfang Chen, Yikang Shen, Mingyu Ding, Joshua B.
Tenenbaum, Chuang Gan
- Abstract summary: Planning-Guided Transformer Decoding (PG-TD) uses a planning algorithm to do lookahead search and guide the Transformer to generate better programs.
We empirically evaluate our framework with several large language models as backbones on public coding challenge benchmarks.
- Score: 100.07232672883897
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing large language model-based code generation pipelines typically use
beam search or sampling algorithms during the decoding process. Although the
programs they generate achieve high token-matching-based scores, they often
fail to compile or generate incorrect outputs. The main reason is that
conventional Transformer decoding algorithms may not be the best choice for
code generation. In this work, we propose a novel Transformer decoding
algorithm, Planning-Guided Transformer Decoding (PG-TD), that uses a planning
algorithm to do lookahead search and guide the Transformer to generate better
programs. Specifically, instead of simply optimizing the likelihood of the
generated sequences, the Transformer makes use of a planner to generate
candidate programs and test them on public test cases. The Transformer can
therefore make more informed decisions and generate tokens that will eventually
lead to higher-quality programs. We also design a mechanism that shares
information between the Transformer and the planner to make our algorithm
computationally efficient. We empirically evaluate our framework with several
large language models as backbones on public coding challenge benchmarks,
showing that 1) it can generate programs that consistently achieve higher
performance compared with competing baseline methods; 2) it enables
controllable code generation, such as concise code and highly commented code, by
optimizing a modified objective.
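The abstract describes the decoding procedure only at a high level. Below is a minimal Python sketch of the general lookahead idea, assuming a hypothetical `lm` object (with `next_token_logprobs`, `complete`, and `eos_token`) and a `run_public_tests` scorer; it is not the authors' implementation, since PG-TD itself uses a Monte Carlo tree-search planner and a mechanism that shares information between the planner and the Transformer to stay efficient.

```python
# Minimal sketch of planning-guided decoding (not the authors' code).
# Assumed interfaces: lm.next_token_logprobs(prefix) -> {token: logprob},
# lm.complete(prefix) -> full program string, lm.eos_token, and
# run_public_tests(program) -> fraction of public test cases passed.

def plan_guided_decode(lm, run_public_tests, prompt, max_len=512, top_k=5):
    prefix = prompt
    for _ in range(max_len):
        candidates = sorted(lm.next_token_logprobs(prefix).items(),
                            key=lambda kv: kv[1], reverse=True)[:top_k]
        best_token, best_score = None, -1.0
        for token, _ in candidates:
            # Lookahead: roll the prefix out to a complete program with the
            # Transformer's own generation, then score it on public tests.
            program = lm.complete(prefix + token)
            score = run_public_tests(program)
            if score > best_score:
                best_token, best_score = token, score
        prefix += best_token
        if best_token == lm.eos_token:
            break
    return prefix
```

The point of the sketch is only that token choices are driven by the downstream test-pass rate rather than by likelihood alone.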
Related papers
- Algorithmic Capabilities of Random Transformers [49.73113518329544]
We investigate what functions can be learned by randomly initialized transformers in which only the embedding layers are optimized.
We find that these random transformers can perform a wide range of meaningful algorithmic tasks.
Our results indicate that some algorithmic capabilities are present in transformers even before these models are trained.
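As a rough illustration of that setup (a frozen, randomly initialized Transformer with only the embeddings trained), here is a hedged PyTorch sketch; the module sizes and hyperparameters are placeholders, not the paper's configuration.

```python
import torch
import torch.nn as nn

# Sketch only: internal Transformer weights stay at random initialization and
# are frozen; gradients flow only through the input and output embeddings.
vocab, d_model = 1000, 128
embed = nn.Embedding(vocab, d_model)
unembed = nn.Linear(d_model, vocab)
body = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
    num_layers=2,
)
for p in body.parameters():          # freeze the random Transformer body
    p.requires_grad = False

opt = torch.optim.Adam(list(embed.parameters()) + list(unembed.parameters()), lr=1e-3)

tokens = torch.randint(0, vocab, (8, 16))     # dummy batch of token ids
targets = torch.randint(0, vocab, (8, 16))
opt.zero_grad()
logits = unembed(body(embed(tokens)))
loss = nn.functional.cross_entropy(logits.view(-1, vocab), targets.view(-1))
loss.backward()
opt.step()                                    # updates embeddings only
```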
arXiv Detail & Related papers (2024-10-06T06:04:23Z)
- Make Every Move Count: LLM-based High-Quality RTL Code Generation Using MCTS [20.135906487081453]
We present an automated transformer decoding algorithm that integrates Monte Carlo tree-search for lookahead.
For the largest design generated by the state-of-the-art LLM (16-bit adder), our technique can achieve a 31.8% improvement in the area-delay product.
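The summary mentions Monte Carlo tree-search lookahead; as generic background, the snippet below shows the standard UCT rule such planners commonly use to decide which partial sequence to expand. It is a textbook formulation, not necessarily the exact variant used in that paper.

```python
import math

def uct_select(children, c=1.4):
    """Pick the child node (partial code sequence) with the highest UCT score.

    Each child is a dict with 'visits' and 'value' (sum of rollout rewards,
    e.g. a test-pass rate or a synthesis metric turned into a reward).
    Generic UCT selection, not the paper's code.
    """
    total = sum(ch["visits"] for ch in children) or 1

    def score(ch):
        if ch["visits"] == 0:
            return float("inf")          # always try unvisited children first
        exploit = ch["value"] / ch["visits"]
        explore = c * math.sqrt(math.log(total) / ch["visits"])
        return exploit + explore

    return max(children, key=score)
```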
arXiv Detail & Related papers (2024-02-05T18:47:04Z)
- Converting Epics/Stories into Pseudocode using Transformers [0.0]
Pseudocode is a programming language representation of the steps involved in a computer program.
We present a methodology to convert a problem described in the English language into pseudocode.
We find that the CodeT5 model gives the best results in terms of BLEU score when trained separately on the two subtasks mentioned above.
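For readers who want to try this kind of text-to-pseudocode generation and BLEU scoring, the sketch below shows one plausible setup with a public CodeT5 checkpoint; the checkpoint name, the example story, and sacrebleu as the metric are assumptions, not details taken from the paper.

```python
# Sketch: generate pseudocode from an English description with a CodeT5
# checkpoint and score it with BLEU. Checkpoint, prompt, and reference are
# illustrative assumptions, not the paper's setup.
from transformers import AutoTokenizer, T5ForConditionalGeneration
import sacrebleu

tok = AutoTokenizer.from_pretrained("Salesforce/codet5-base")
model = T5ForConditionalGeneration.from_pretrained("Salesforce/codet5-base")

story = "As a user, I want to sort my tasks by due date."
inputs = tok(story, return_tensors="pt")
out = model.generate(**inputs, max_length=64, num_beams=4)
hypothesis = tok.decode(out[0], skip_special_tokens=True)

reference = "sort task_list by due_date in ascending order"
bleu = sacrebleu.corpus_bleu([hypothesis], [[reference]])
print(hypothesis, bleu.score)
```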
arXiv Detail & Related papers (2023-12-08T14:01:09Z)
- Learning Transformer Programs [78.9509560355733]
We introduce a procedure for training Transformers that are mechanistically interpretable by design.
Instead of compiling human-written programs into Transformers, we design a modified Transformer that can be trained using gradient-based optimization.
The Transformer Programs can automatically find reasonable solutions, performing on par with standard Transformers of comparable size.
arXiv Detail & Related papers (2023-06-01T20:27:01Z)
- SLaDe: A Portable Small Language Model Decompiler for Optimized Assembly [6.080751346188323]
This paper presents SLaDe, a Small Language model Decompiler based on a sequence-to-sequence transformer trained over real-world code.
We utilize type-inference to generate programs that are more readable and accurate than standard analytic and recent neural approaches.
arXiv Detail & Related papers (2023-05-21T17:31:39Z)
- Fault-Aware Neural Code Rankers [64.41888054066861]
We propose fault-aware neural code rankers that can predict the correctness of a sampled program without executing it.
Our fault-aware rankers can significantly increase the pass@1 accuracy of various code generation models.
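Since the result is stated in terms of pass@1, here is the standard unbiased pass@k estimator commonly used in code-generation evaluation (the metric the summary refers to); it is general background, not the paper's ranking code.

```python
import math

def pass_at_k(n, c, k):
    """Unbiased pass@k estimate: n samples drawn, c of them correct.

    Returns the probability that at least one of k samples chosen at random
    from the n is correct. pass@1 reduces to c / n.
    """
    if n - c < k:
        return 1.0
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)

# Example: 200 sampled programs for a problem, 13 pass the tests.
print(pass_at_k(200, 13, 1))   # 0.065
print(pass_at_k(200, 13, 10))  # chance that a set of 10 contains a correct one
```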
arXiv Detail & Related papers (2022-06-04T22:01:05Z)
- Transformer with Tree-order Encoding for Neural Program Generation [8.173517923612426]
We introduce a tree-based positional encoding and a shared natural-language subword vocabulary for Transformers.
Our findings suggest that employing a tree-based positional encoding in combination with a shared natural-language subword vocabulary improves generation performance over sequential positional encodings.
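Tree-based positional encodings can be constructed in several ways; one common formulation, shown below purely as an illustrative assumption and not necessarily the paper's exact scheme, describes each tree node by the child indices along its root-to-node path, padded to a fixed depth.

```python
# Illustrative tree positional encoding: each node is described by the child
# indices along the path from the root, padded/truncated to max_depth.
# One common formulation, not necessarily the paper's exact scheme.

def tree_positions(tree, max_depth=8, path=()):
    """tree is (label, [children]); yields (label, fixed-length path code)."""
    label, children = tree
    code = (list(path) + [0] * max_depth)[:max_depth]
    yield label, code
    for i, child in enumerate(children, start=1):
        yield from tree_positions(child, max_depth, path + (i,))

ast = ("FunctionDef", [("arguments", []),
                       ("Return", [("BinOp", [("Name", []), ("Name", [])])])])
for label, code in tree_positions(ast):
    print(label, code)
```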
arXiv Detail & Related papers (2022-05-30T12:27:48Z)
- Thinking Like Transformers [64.96770952820691]
We propose a computational model for the transformer-encoder in the form of a programming language.
We show how RASP can be used to program solutions to tasks that could conceivably be learned by a Transformer.
We provide RASP programs for histograms, sorting, and Dyck-languages.
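RASP is a small domain-specific language; to keep the illustration in Python, the snippet below computes the same histogram function that a short RASP program expresses (for each position, how many times that position's token occurs in the sequence). It is a plain reference implementation, not RASP code.

```python
# Reference implementation of the "histogram" task discussed for RASP:
# output[i] = number of positions j with tokens[j] == tokens[i].
# In RASP this is roughly selector_width(select(tokens, tokens, ==));
# here it is written as ordinary Python for illustration.

def histogram(tokens):
    return [sum(t == u for u in tokens) for t in tokens]

print(histogram(list("hello")))  # [1, 1, 2, 2, 1]
```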
arXiv Detail & Related papers (2021-06-13T13:04:46Z)
- Instantaneous Grammatical Error Correction with Shallow Aggressive Decoding [57.08875260900373]
We propose Shallow Aggressive Decoding (SAD) to improve the online inference efficiency of the Transformer for instantaneous Grammatical Error Correction (GEC).
SAD aggressively decodes as many tokens as possible in parallel instead of always decoding only one token in each step to improve computational parallelism.
Experiments in both English and Chinese GEC benchmarks show that aggressive decoding could yield the same predictions but with a significant speedup for online inference.
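To make the aggressive-decoding idea concrete, here is a hedged sketch of the verify-then-fallback loop it relies on for GEC, where the output is drafted as a copy of the input and only re-decoded from the first position where the model disagrees. The `lm.predict_next_tokens(...)` interface is a placeholder, and the handling of disagreements is a simplification of the actual SAD algorithm.

```python
# Simplified sketch of aggressive decoding for GEC (not the authors' code).
# Assumed placeholder interface: lm.predict_next_tokens(src, seq) runs ONE
# parallel decoder pass and returns the model's predicted token at every
# position of `seq` plus one more token after it (length len(seq) + 1).

def aggressive_decode(lm, src, max_len=128, eos="</s>"):
    output = []
    draft = list(src)                 # GEC outputs mostly copy the input
    while len(output) < max_len:
        preds = lm.predict_next_tokens(src, output + draft)
        # Accept the longest prefix where the draft matches the model.
        k = 0
        while k < len(draft) and preds[len(output) + k] == draft[k]:
            k += 1
        output += draft[:k]
        next_tok = preds[len(output)]  # first disagreeing (or next) token
        if next_tok == eos:
            break
        output.append(next_tok)
        # Simplification: treat the disagreement as a substitution and keep
        # drafting from the rest of the input; SAD re-aligns more carefully.
        draft = draft[k + 1:]
    return output
```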
arXiv Detail & Related papers (2021-06-09T10:30:59Z)
- Relevance Transformer: Generating Concise Code Snippets with Relevance Feedback [6.230751621285322]
We introduce and study modern Transformer architectures for explicit code generation.
We propose a new model called the Relevance Transformer that incorporates external knowledge using pseudo-relevance feedback.
The results show improvements over state-of-the-art methods based on BLEU evaluation.
arXiv Detail & Related papers (2020-07-06T09:54:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.