Related papers: Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation

Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation

URL: http://arxiv.org/abs/2310.02304v2
Date: Fri, 1 Mar 2024 17:11:21 GMT
Title: Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation
Authors: Eric Zelikman, Eliana Lorch, Lester Mackey, Adam Tauman Kalai
Abstract summary: We use a language-model-infused scaffolding program to improve itself. A variety of self-improvement strategies are proposed by the language model. It demonstrates that a modern language model, GPT-4, is capable of writing code that can call itself to improve itself.
Score: 25.474639218436916
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Several recent advances in AI systems (e.g., Tree-of-Thoughts and Program-Aided Language Models) solve problems by providing a "scaffolding" program that structures multiple calls to language models to generate better outputs. A scaffolding program is written in a programming language such as Python. In this work, we use a language-model-infused scaffolding program to improve itself. We start with a seed "improver" that improves an input program according to a given utility function by querying a language model several times and returning the best solution. We then run this seed improver to improve itself. Across a small set of downstream tasks, the resulting improved improver generates programs with significantly better performance than its seed improver. A variety of self-improvement strategies are proposed by the language model, including beam search, genetic algorithms, and simulated annealing. Since the language models themselves are not altered, this is not full recursive self-improvement. Nonetheless, it demonstrates that a modern language model, GPT-4 in our experiments, is capable of writing code that can call itself to improve itself. We consider concerns around the development of self-improving technologies and evaluate the frequency with which the generated code bypasses a sandbox.

Related papers

Learning to Reason via Program Generation, Emulation, and Search [33.11955431589091]
Program synthesis with language models (LMs) has unlocked a large set of reasoning abilities. Not all reasoning tasks are easily expressible as code, e.g. tasks involving commonsense reasoning, moral decision-making, and sarcasm understanding. We propose Code Generation and Emulated EXecution (CoGEX) to extend an LM's program synthesis skills to such tasks.
arXiv Detail & Related papers (2024-05-25T19:40:50Z)
A Novel Approach for Automatic Program Repair using Round-Trip Translation with Large Language Models [50.86686630756207]
Research shows that grammatical mistakes in a sentence can be corrected by translating it to another language and back. Current generative models for Automatic Program Repair (APR) are pre-trained on source code and fine-tuned for repair. This paper proposes bypassing the fine-tuning step and using Round-Trip Translation (RTT): translation of code from one programming language to another programming or natural language, and back.
arXiv Detail & Related papers (2024-01-15T22:36:31Z)
Teaching Large Language Models to Self-Debug [62.424077000154945]
Large language models (LLMs) have achieved impressive performance on code generation. We propose Self- Debugging, which teaches a large language model to debug its predicted program via few-shot demonstrations.
arXiv Detail & Related papers (2023-04-11T10:43:43Z)
The Wisdom of Hindsight Makes Language Models Better Instruction Followers [84.9120606803906]
Reinforcement learning has seen wide success in finetuning large language models to better align with instructions via human feedback. In this paper, we consider an alternative approach: converting feedback to instruction by relabeling the original one and training the model for better alignment in a supervised manner. We propose Hindsight Instruction Relabeling (HIR), a novel algorithm for aligning language models with instructions.
arXiv Detail & Related papers (2023-02-10T12:16:38Z)
Piloting Copilot and Codex: Hot Temperature, Cold Prompts, or Black Magic? [5.714553194279462]
We investigate the various input parameters of two language models, and conduct a study to understand if variations of these input parameters can have a significant impact on the quality of the generated programs. Our results showed that varying the input parameters can significantly improve the performance of language models.
arXiv Detail & Related papers (2022-10-26T13:28:14Z)
Language Models Can Teach Themselves to Program Better [4.627023679353507]
Recent Language Models (LMs) achieve breakthrough performance in code generation when trained on human-authored problems. We show that it is possible for an LM to synthesize programming problems and solutions, which are filtered for correctness by a Python interpreter. The LM's performance is then seen to improve when it is fine-tuned on its own synthetic problems and verified solutions.
arXiv Detail & Related papers (2022-07-29T06:43:28Z)
Natural Language to Code Translation with Execution [82.52142893010563]
Execution result--minimum Bayes risk decoding for program selection. We show that it improves the few-shot performance of pretrained code models on natural-language-to-code tasks.
arXiv Detail & Related papers (2022-04-25T06:06:08Z)
Searching for More Efficient Dynamic Programs [61.79535031840558]
We describe a set of program transformations, a simple metric for assessing the efficiency of a transformed program, and a search procedure to improve this metric. We show that in practice, automated search can find substantial improvements to the initial program.
arXiv Detail & Related papers (2021-09-14T20:52:55Z)
AVATAR: A Parallel Corpus for Java-Python Program Translation [77.86173793901139]
Program translation refers to migrating source code from one language to another. We present AVATAR, a collection of 9,515 programming problems and their solutions written in two popular languages, Java and Python.
arXiv Detail & Related papers (2021-08-26T05:44:20Z)
Generating Adversarial Computer Programs using Optimized Obfuscations [43.95037234252815]
We investigate principled ways to adversarially perturb a computer program to fool such learned models. We use program obfuscations, which have conventionally been used to avoid attempts at reverse engineering programs. We show that our best attack proposal achieves a $52%$ improvement over a state-of-the-art attack generation approach.
arXiv Detail & Related papers (2021-03-18T10:47:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.