Planning with Large Language Models for Code Generation
- URL: http://arxiv.org/abs/2303.05510v1
- Date: Thu, 9 Mar 2023 18:59:47 GMT
- Title: Planning with Large Language Models for Code Generation
- Authors: Shun Zhang, Zhenfang Chen, Yikang Shen, Mingyu Ding, Joshua B.
Tenenbaum, Chuang Gan
- Abstract summary: Planning-Guided Transformer Decoding (PG-TD) uses a planning algorithm to do lookahead search and guide the Transformer to generate better programs.
We empirically evaluate our framework with several large language models as backbones on public coding challenge benchmarks.
- Score: 100.07232672883897
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing large language model-based code generation pipelines typically use
beam search or sampling algorithms during the decoding process. Although the
programs they generate achieve high token-matching-based scores, they often
fail to compile or generate incorrect outputs. The main reason is that
conventional Transformer decoding algorithms may not be the best choice for
code generation. In this work, we propose a novel Transformer decoding
algorithm, Planning-Guided Transformer Decoding (PG-TD), that uses a planning
algorithm to do lookahead search and guide the Transformer to generate better
programs. Specifically, instead of simply optimizing the likelihood of the
generated sequences, the Transformer makes use of a planner to generate
candidate programs and test them on public test cases. The Transformer can
therefore make more informed decisions and generate tokens that will eventually
lead to higher-quality programs. We also design a mechanism that shares
information between the Transformer and the planner to make our algorithm
computationally efficient. We empirically evaluate our framework with several
large language models as backbones on public coding challenge benchmarks,
showing that 1) it can generate programs that consistently achieve higher
performance compared with competing baseline methods; 2) it enables
controllable code generation, such as concise code and highly commented code, by
optimizing a modified objective.
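The abstract describes the decoding procedure only at a high level. Below is a minimal Python sketch of the general lookahead idea, assuming a hypothetical `lm` object (with `next_token_logprobs`, `complete`, and `eos_token`) and a `run_public_tests` scorer; it is not the authors' implementation, since PG-TD itself uses a Monte Carlo tree-search planner and a mechanism that shares information between the planner and the Transformer to stay efficient.

```python
# Minimal sketch of planning-guided decoding (not the authors' code).
# Assumed interfaces: lm.next_token_logprobs(prefix) -> {token: logprob},
# lm.complete(prefix) -> full program string, lm.eos_token, and
# run_public_tests(program) -> fraction of public test cases passed.

def plan_guided_decode(lm, run_public_tests, prompt, max_len=512, top_k=5):
    prefix = prompt
    for _ in range(max_len):
        candidates = sorted(lm.next_token_logprobs(prefix).items(),
                            key=lambda kv: kv[1], reverse=True)[:top_k]
        best_token, best_score = None, -1.0
        for token, _ in candidates:
            # Lookahead: roll the prefix out to a complete program with the
            # Transformer's own generation, then score it on public tests.
            program = lm.complete(prefix + token)
            score = run_public_tests(program)
            if score > best_score:
                best_token, best_score = token, score
        prefix += best_token
        if best_token == lm.eos_token:
            break
    return prefix
```

The point of the sketch is only that token choices are driven by the downstream test-pass rate rather than by likelihood alone.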
Related papers
- Algorithmic Capabilities of Random Transformers [49.73113518329544]
We investigate what functions can be learned by randomly initialized transformers in which only the embedding layers are optimized.
We find that these random transformers can perform a wide range of meaningful algorithmic tasks.
Our results indicate that some algorithmic capabilities are present in transformers even before these models are trained.
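As a rough illustration of that setup (a frozen, randomly initialized Transformer with only the embeddings trained), here is a hedged PyTorch sketch; the module sizes and hyperparameters are placeholders, not the paper's configuration.

```python
import torch
import torch.nn as nn

# Sketch only: internal Transformer weights stay at random initialization and
# are frozen; gradients flow only through the input and output embeddings.
vocab, d_model = 1000, 128
embed = nn.Embedding(vocab, d_model)
unembed = nn.Linear(d_model, vocab)
body = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
    num_layers=2,
)
for p in body.parameters():          # freeze the random Transformer body
    p.requires_grad = False

opt = torch.optim.Adam(list(embed.parameters()) + list(unembed.parameters()), lr=1e-3)

tokens = torch.randint(0, vocab, (8, 16))     # dummy batch of token ids
targets = torch.randint(0, vocab, (8, 16))
opt.zero_grad()
logits = unembed(body(embed(tokens)))
loss = nn.functional.cross_entropy(logits.view(-1, vocab), targets.view(-1))
loss.backward()
opt.step()                                    # updates embeddings only
```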
arXiv Detail & Related papers (2024-10-06T06:04:23Z)
- Make Every Move Count: LLM-based High-Quality RTL Code Generation Using MCTS [20.135906487081453]
We present an automated transformer decoding algorithm that integrates Monte Carlo tree-search for lookahead.
For the largest design generated by the state-of-the-art LLM (16-bit adder), our technique can achieve a 31.8% improvement in the area-delay product.
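The summary mentions Monte Carlo tree-search lookahead; as generic background, the snippet below shows the standard UCT rule such planners commonly use to decide which partial sequence to expand. It is a textbook formulation, not necessarily the exact variant used in that paper.

```python
import math

def uct_select(children, c=1.4):
    """Pick the child node (partial code sequence) with the highest UCT score.

    Each child is a dict with 'visits' and 'value' (sum of rollout rewards,
    e.g. a test-pass rate or a synthesis metric turned into a reward).
    Generic UCT selection, not the paper's code.
    """
    total = sum(ch["visits"] for ch in children) or 1

    def score(ch):
        if ch["visits"] == 0:
            return float("inf")          # always try unvisited children first
        exploit = ch["value"] / ch["visits"]
        explore = c * math.sqrt(math.log(total) / ch["visits"])
        return exploit + explore

    return max(children, key=score)
```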
arXiv Detail & Related papers (2024-02-05T18:47:04Z)
- Converting Epics/Stories into Pseudocode using Transformers [0.0]
Pseudocode is a programming language representation of the steps involved in a computer program.
We present a methodology to convert a problem described in the English language into pseudocode.
We find that the CodeT5 model gives the best results in terms of BLEU score when trained separately on the two subtasks mentioned above.
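For readers who want to try this kind of text-to-pseudocode generation and BLEU scoring, the sketch below shows one plausible setup with a public CodeT5 checkpoint; the checkpoint name, the example story, and sacrebleu as the metric are assumptions, not details taken from the paper.

```python
# Sketch: generate pseudocode from an English description with a CodeT5
# checkpoint and score it with BLEU. Checkpoint, prompt, and reference are
# illustrative assumptions, not the paper's setup.
from transformers import AutoTokenizer, T5ForConditionalGeneration
import sacrebleu

tok = AutoTokenizer.from_pretrained("Salesforce/codet5-base")
model = T5ForConditionalGeneration.from_pretrained("Salesforce/codet5-base")

story = "As a user, I want to sort my tasks by due date."
inputs = tok(story, return_tensors="pt")
out = model.generate(**inputs, max_length=64, num_beams=4)
hypothesis = tok.decode(out[0], skip_special_tokens=True)

reference = "sort task_list by due_date in ascending order"
bleu = sacrebleu.corpus_bleu([hypothesis], [[reference]])
print(hypothesis, bleu.score)
```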
arXiv Detail & Related papers (2023-12-08T14:01:09Z)
- Learning Transformer Programs [78.9509560355733]
We introduce a procedure for training Transformers that are mechanistically interpretable by design.
Instead of compiling human-written programs into Transformers, we design a modified Transformer that can be trained using gradient-based optimization.
The Transformer Programs can automatically find reasonable solutions, performing on par with standard Transformers of comparable size.
arXiv Detail & Related papers (2023-06-01T20:27:01Z)
- SLaDe: A Portable Small Language Model Decompiler for Optimized Assembly [6.080751346188323]
This paper presents SLaDe, a Small Language model Decompiler based on a sequence-to-sequence transformer trained over real-world code.
We utilize type-inference to generate programs that are more readable and accurate than standard analytic and recent neural approaches.
arXiv Detail & Related papers (2023-05-21T17:31:39Z)
- Fault-Aware Neural Code Rankers [64.41888054066861]
We propose fault-aware neural code rankers that can predict the correctness of a sampled program without executing it.
Our fault-aware rankers can significantly increase the pass@1 accuracy of various code generation models.
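Since the result is stated in terms of pass@1, here is the standard unbiased pass@k estimator commonly used in code-generation evaluation (the metric the summary refers to); it is general background, not the paper's ranking code.

```python
import math

def pass_at_k(n, c, k):
    """Unbiased pass@k estimate: n samples drawn, c of them correct.

    Returns the probability that at least one of k samples chosen at random
    from the n is correct. pass@1 reduces to c / n.
    """
    if n - c < k:
        return 1.0
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)

# Example: 200 sampled programs for a problem, 13 pass the tests.
print(pass_at_k(200, 13, 1))   # 0.065
print(pass_at_k(200, 13, 10))  # chance that a set of 10 contains a correct one
```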
arXiv Detail & Related papers (2022-06-04T22:01:05Z)
- Transformer with Tree-order Encoding for Neural Program Generation [8.173517923612426]
We introduce a tree-based positional encoding and a shared natural-language subword vocabulary for Transformers.
Our findings suggest that employing a tree-based positional encoding in combination with a shared natural-language subword vocabulary improves generation performance over sequential positional encodings.
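Tree-based positional encodings can be constructed in several ways; one common formulation, shown below purely as an illustrative assumption and not necessarily the paper's exact scheme, describes each tree node by the child indices along its root-to-node path, padded to a fixed depth.

```python
# Illustrative tree positional encoding: each node is described by the child
# indices along the path from the root, padded/truncated to max_depth.
# One common formulation, not necessarily the paper's exact scheme.

def tree_positions(tree, max_depth=8, path=()):
    """tree is (label, [children]); yields (label, fixed-length path code)."""
    label, children = tree
    code = (list(path) + [0] * max_depth)[:max_depth]
    yield label, code
    for i, child in enumerate(children, start=1):
        yield from tree_positions(child, max_depth, path + (i,))

ast = ("FunctionDef", [("arguments", []),
                       ("Return", [("BinOp", [("Name", []), ("Name", [])])])])
for label, code in tree_positions(ast):
    print(label, code)
```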
arXiv Detail & Related papers (2022-05-30T12:27:48Z)
- Thinking Like Transformers [64.96770952820691]
We propose a computational model for the transformer-encoder in the form of a programming language.
We show how RASP can be used to program solutions to tasks that could conceivably be learned by a Transformer.
We provide RASP programs for histograms, sorting, and Dyck-languages.
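RASP is a small domain-specific language; to keep the illustration in Python, the snippet below computes the same histogram function that a short RASP program expresses (for each position, how many times that position's token occurs in the sequence). It is a plain reference implementation, not RASP code.

```python
# Reference implementation of the "histogram" task discussed for RASP:
# output[i] = number of positions j with tokens[j] == tokens[i].
# In RASP this is roughly selector_width(select(tokens, tokens, ==));
# here it is written as ordinary Python for illustration.

def histogram(tokens):
    return [sum(t == u for u in tokens) for t in tokens]

print(histogram(list("hello")))  # [1, 1, 2, 2, 1]
```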
arXiv Detail & Related papers (2021-06-13T13:04:46Z)
- Instantaneous Grammatical Error Correction with Shallow Aggressive Decoding [57.08875260900373]
We propose Shallow Aggressive Decoding (SAD) to improve the online inference efficiency of the Transformer for instantaneous Grammatical Error Correction (GEC).
SAD aggressively decodes as many tokens as possible in parallel instead of always decoding only one token in each step to improve computational parallelism.
Experiments in both English and Chinese GEC benchmarks show that aggressive decoding could yield the same predictions but with a significant speedup for online inference.
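To make the aggressive-decoding idea concrete, here is a hedged sketch of the verify-then-fallback loop it relies on for GEC, where the output is drafted as a copy of the input and only re-decoded from the first position where the model disagrees. The `lm.predict_next_tokens(...)` interface is a placeholder, and the handling of disagreements is a simplification of the actual SAD algorithm.

```python
# Simplified sketch of aggressive decoding for GEC (not the authors' code).
# Assumed placeholder interface: lm.predict_next_tokens(src, seq) runs ONE
# parallel decoder pass and returns the model's predicted token at every
# position of `seq` plus one more token after it (length len(seq) + 1).

def aggressive_decode(lm, src, max_len=128, eos="</s>"):
    output = []
    draft = list(src)                 # GEC outputs mostly copy the input
    while len(output) < max_len:
        preds = lm.predict_next_tokens(src, output + draft)
        # Accept the longest prefix where the draft matches the model.
        k = 0
        while k < len(draft) and preds[len(output) + k] == draft[k]:
            k += 1
        output += draft[:k]
        next_tok = preds[len(output)]  # first disagreeing (or next) token
        if next_tok == eos:
            break
        output.append(next_tok)
        # Simplification: treat the disagreement as a substitution and keep
        # drafting from the rest of the input; SAD re-aligns more carefully.
        draft = draft[k + 1:]
    return output
```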
arXiv Detail & Related papers (2021-06-09T10:30:59Z)
- Relevance Transformer: Generating Concise Code Snippets with Relevance Feedback [6.230751621285322]
We introduce and study modern Transformer architectures for explicit code generation.
We propose a new model called the Relevance Transformer that incorporates external knowledge using pseudo-relevance feedback.
The results show improvements over state-of-the-art methods based on BLEU evaluation.
arXiv Detail & Related papers (2020-07-06T09:54:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.