Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation
- URL: http://arxiv.org/abs/2310.02304v3
- Date: Fri, 16 Aug 2024 17:28:08 GMT
- Title: Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation
- Authors: Eric Zelikman, Eliana Lorch, Lester Mackey, Adam Tauman Kalai
- Abstract summary: We use a language-model-infused scaffolding program to improve itself.
A variety of self-improvement strategies are proposed by the language model.
It demonstrates that a modern language model, GPT-4, is capable of writing code that can call itself to improve itself.
- Score: 23.31928097405939
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Several recent advances in AI systems solve problems by providing a "scaffolding" program that structures multiple calls to language models (LMs) to generate better outputs. A scaffolding program is written in a programming language such as Python. In this work, we use a language-model-infused scaffolding program to improve itself. We start with a seed "improver" that improves an input program according to a given utility function by querying an LM several times and returning the best solution. We then run this seed improver to improve itself. Across a small set of downstream tasks, the resulting improved improver generates programs with significantly better performance than its seed improver. A variety of self-improvement strategies are proposed by the language model, including beam search, genetic algorithms, and simulated annealing. Since the language models themselves are not altered, this is not full recursive self-improvement. Nonetheless, it demonstrates that a modern language model, GPT-4 in our experiments, is capable of writing code that can call itself to improve itself. We consider concerns around the development of self-improving technologies and evaluate the frequency with which the generated code bypasses a sandbox.
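The seed improver described in the abstract is small enough to sketch. The following is a minimal illustration, not the paper's actual code: `seed_improver`, `self_improve`, the prompt text, and the `utility`/`meta_utility`/`language_model` callables are all hypothetical stand-ins, with `language_model` assumed to map a prompt string to a code string.

```python
import inspect

def seed_improver(program: str, utility, language_model, n_candidates: int = 4) -> str:
    """Query the LM several times for rewrites of `program`; keep the best by `utility`."""
    prompt = ("Improve the following program so it scores higher on its "
              "utility function. Return only code.\n\n" + program)
    candidates = [language_model(prompt) for _ in range(n_candidates)]
    candidates.append(program)  # never return something worse than the input
    return max(candidates, key=utility)

def self_improve(language_model, meta_utility) -> str:
    # The improver is itself a program, so it can be handed its own source
    # together with a meta-utility that scores how well a candidate improver
    # improves programs on downstream tasks.
    return seed_improver(inspect.getsource(seed_improver), meta_utility, language_model)
```

Because the improver accepts an arbitrary program and utility function, the recursive step needs nothing extra beyond handing it its own source and a meta-utility; per the abstract, improvers generated this way proposed strategies such as beam search, genetic algorithms, and simulated annealing.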
Related papers
- Learning to Reason via Program Generation, Emulation, and Search [33.11955431589091]
Program synthesis with language models (LMs) has unlocked a large set of reasoning abilities.
Not all reasoning tasks are easily expressible as code, e.g. tasks involving commonsense reasoning, moral decision-making, and sarcasm understanding.
We propose Code Generation and Emulated EXecution (CoGEX) to extend an LM's program synthesis skills to such tasks.
arXiv Detail & Related papers (2024-05-25T19:40:50Z)
- A Novel Approach for Automatic Program Repair using Round-Trip Translation with Large Language Models [50.86686630756207]
Research shows that grammatical mistakes in a sentence can be corrected by translating it to another language and back.
Current generative models for Automatic Program Repair (APR) are pre-trained on source code and fine-tuned for repair.
This paper proposes bypassing the fine-tuning step and using Round-Trip Translation (RTT): translating code from one programming language to another programming or natural language, and back (sketched after this list).
arXiv Detail & Related papers (2024-01-15T22:36:31Z)
- Teaching Large Language Models to Self-Debug [62.424077000154945]
Large language models (LLMs) have achieved impressive performance on code generation.
We propose Self-Debugging, which teaches a large language model to debug its predicted program via few-shot demonstrations.
arXiv Detail & Related papers (2023-04-11T10:43:43Z)
- The Wisdom of Hindsight Makes Language Models Better Instruction Followers [84.9120606803906]
Reinforcement learning has seen wide success in finetuning large language models to better align with instructions via human feedback.
In this paper, we consider an alternative approach: converting feedback into instructions by relabeling the original instruction, and training the model for better alignment in a supervised manner.
We propose Hindsight Instruction Relabeling (HIR), a novel algorithm for aligning language models with instructions.
arXiv Detail & Related papers (2023-02-10T12:16:38Z)
- Piloting Copilot and Codex: Hot Temperature, Cold Prompts, or Black Magic? [5.714553194279462]
We investigate the various input parameters of two language models and study whether varying them significantly affects the quality of the generated programs.
Our results show that varying the input parameters can significantly improve the models' performance.
arXiv Detail & Related papers (2022-10-26T13:28:14Z)
- Language Models Can Teach Themselves to Program Better [4.627023679353507]
Recent Language Models (LMs) achieve breakthrough performance in code generation when trained on human-authored problems.
We show that it is possible for an LM to synthesize programming problems and solutions, which are filtered for correctness by a Python interpreter.
The LM's performance then improves when it is fine-tuned on its own synthetic problems and verified solutions.
arXiv Detail & Related papers (2022-07-29T06:43:28Z)
- Natural Language to Code Translation with Execution [82.52142893010563]
We propose execution result-based minimum Bayes risk decoding for program selection: sampled candidate programs are executed, and the candidate whose results agree most with the other candidates' is chosen (also sketched after this list).
We show that this improves the few-shot performance of pretrained code models on natural-language-to-code tasks.
arXiv Detail & Related papers (2022-04-25T06:06:08Z)
- Searching for More Efficient Dynamic Programs [61.79535031840558]
We describe a set of program transformations, a simple metric for assessing the efficiency of a transformed program, and a search procedure to improve this metric.
We show that in practice, automated search can find substantial improvements to the initial program.
arXiv Detail & Related papers (2021-09-14T20:52:55Z)
- AVATAR: A Parallel Corpus for Java-Python Program Translation [77.86173793901139]
Program translation refers to migrating source code from one language to another.
We present AVATAR, a collection of 9,515 programming problems and their solutions written in two popular languages, Java and Python.
arXiv Detail & Related papers (2021-08-26T05:44:20Z)
- Generating Adversarial Computer Programs using Optimized Obfuscations [43.95037234252815]
We investigate principled ways to adversarially perturb a computer program to fool such learned models.
We use program obfuscations, which have conventionally been used to avoid attempts at reverse engineering programs.
We show that our best attack proposal achieves a 52% improvement over a state-of-the-art attack generation approach.
arXiv Detail & Related papers (2021-03-18T10:47:15Z)
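Two of the entries above describe mechanisms concrete enough to sketch. First, the round-trip translation idea from the automatic program repair entry: a minimal, hypothetical version using English as the intermediate language (one of the options that paper considers), where `language_model` is again an assumed prompt-to-text completion function and the prompts are illustrative.

```python
def round_trip_repair(buggy_code: str, language_model) -> str:
    """Repair code by translating it away from its language and back."""
    # Forward pass: code -> natural-language description.
    description = language_model(
        "Describe in English, precisely, what this function is intended to do:\n\n"
        + buggy_code
    )
    # Backward pass: description -> fresh code. Regenerating from the
    # description tends to drop the original bug, the same way back-translation
    # irons out grammatical mistakes in a sentence.
    return language_model(
        "Write a function implementing this description. Return only code:\n\n"
        + description
    )
```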
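Second, the execution-based selection from the natural-language-to-code entry: under an exact-match loss, minimum Bayes risk decoding reduces to majority voting over the candidates' execution results. A minimal sketch, assuming a hypothetical sandboxed harness `execute(program, x)` that runs a candidate on one input and returns a hashable output:

```python
from collections import Counter

def mbr_select(programs, test_inputs, execute):
    """Return the candidate whose execution results agree most with the rest."""
    # A candidate's behavior is the tuple of its outputs on the test inputs.
    behaviors = [tuple(execute(p, x) for x in test_inputs) for p in programs]
    counts = Counter(behaviors)
    # Under an exact-match loss, minimum Bayes risk picks the candidate whose
    # behavior occurs most often among the samples (i.e., majority voting).
    best = max(range(len(programs)), key=lambda i: counts[behaviors[i]])
    return programs[best]
```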