Learning from Self-Sampled Correct and Partially-Correct Programs
- URL: http://arxiv.org/abs/2205.14318v1
- Date: Sat, 28 May 2022 03:31:07 GMT
- Title: Learning from Self-Sampled Correct and Partially-Correct Programs
- Authors: Ansong Ni, Jeevana Priya Inala, Chenglong Wang, Oleksandr Polozov,
Christopher Meek, Dragomir Radev, Jianfeng Gao
- Abstract summary: We propose to let the model perform sampling during training and learn from both self-sampled fully-correct programs and partially-correct programs.
We show that our use of self-sampled correct and partially-correct programs can benefit learning and help guide the sampling process.
Our proposed method improves the pass@k performance by 3.1% to 12.3% compared to learning from a single reference program with MLE.
- Score: 96.66452896657991
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Program synthesis aims to generate executable programs that are consistent
with the user specification. While there are often multiple programs that
satisfy the same user specification, existing neural program synthesis models
are often only learned from one reference program by maximizing its
log-likelihood. This makes the model overly confident in its predictions, since
it sees the single solution repeatedly during training, and it therefore
generalizes poorly to unseen examples, even when multiple attempts are allowed. To
mitigate this issue, we propose to let the model perform sampling during
training and learn from both self-sampled fully-correct programs, which yield
the gold execution results, as well as partially-correct programs, whose
intermediate execution state matches another correct program. We show that our
use of self-sampled correct and partially-correct programs can benefit learning
and help guide the sampling process, leading to more efficient exploration of
the program space. Additionally, we explore various training objectives to
support learning from multiple programs per example and find they greatly
affect the performance. Experiments on the MathQA and GSM8K datasets show that
our proposed method improves the pass@k performance by 3.1% to 12.3% compared
to learning from a single reference program with MLE.
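To make the recipe concrete, here is a minimal sketch, in plain Python, of the correctness check at the heart of the method: sampled programs are executed, those that reproduce the gold result count as fully correct, and those whose intermediate execution state matches a state reached by a known-correct program count as partially correct. The toy line-by-line executor, the straight-line program encoding, and the `answer` variable convention are our assumptions, not the authors' implementation.

```python
def execute(program):
    """Run a list of assignment statements one at a time, recording an
    immutable snapshot of the variable state after each line."""
    env, states = {}, []
    for stmt in program:
        try:
            exec(stmt, {}, env)   # toy executor; a real system sandboxes this
        except Exception:
            break                 # treat runtime errors as truncation
        states.append(frozenset(env.items()))
    return states

def classify(sampled, reference, gold_answer):
    """'full' if the sampled program yields the gold result, 'partial' if any
    prefix reaches a state that the known-correct program also reaches."""
    ref_states = set(execute(reference))
    states = execute(sampled)
    if states and dict(states[-1]).get("answer") == gold_answer:
        return "full"
    if any(s in ref_states for s in states):
        return "partial"
    return "wrong"

# Toy check: the sample reaches the reference's intermediate state {x: 12}
# but computes a wrong final answer, so it is partially correct.
reference = ["x = 3 * 4", "answer = x + 2"]
sampled = ["x = 12", "answer = x + 5"]
print(classify(sampled, reference, 14))  # -> partial
```

In the paper, programs labeled full or partial are kept as extra training targets, and the model is updated on them with one of the multi-program training objectives the abstract mentions.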
Related papers
- Learning to Reason via Program Generation, Emulation, and Search [33.11955431589091] (arXiv 2024-05-25)
Program synthesis with language models (LMs) has unlocked a large set of reasoning abilities.
Not all reasoning tasks are easily expressible as code, e.g., tasks involving commonsense reasoning, moral decision-making, and sarcasm understanding.
We propose Code Generation and Emulated EXecution (CoGEX) to extend an LM's program synthesis skills to such tasks.
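The blurb is terse, so here is a hedged sketch of the generate-then-emulate pattern it describes: the model writes a program whose helper calls need not be executable, then plays the role of the interpreter itself. `call_lm` is a hypothetical stand-in for any text-completion API; nothing below is CoGEX's released code.

```python
def call_lm(prompt: str) -> str:
    """Hypothetical stand-in for a language model completion call."""
    raise NotImplementedError("plug in your LM client here")

def cogex_style_query(task: str) -> str:
    # Step 1: ask for a program whose leaf calls may be un-executable
    # (e.g., is_sarcastic(text)), which ordinary synthesis would reject.
    program = call_lm(
        f"Write a Python function solve() for this task, using helper calls "
        f"even if they are not implementable:\n{task}\n"
    )
    # Step 2: instead of executing the program, ask the model to emulate the
    # run and report the value solve() would return.
    return call_lm(
        f"{program}\n# Emulate running solve() line by line and state the "
        f"final return value.\n"
    )
```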
- NExT: Teaching Large Language Models to Reason about Code Execution [50.93581376646064] (arXiv 2024-04-23)
Large language models (LLMs) of code are typically trained on the surface textual form of programs.
We propose NExT, a method to teach LLMs to inspect the execution traces of programs and reason about their run-time behavior.
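As a concrete illustration of what inspecting execution traces can mean, the runnable sketch below collects a line-level trace with Python's `sys.settrace` and renders it as text a model could be trained to reason over; the trace format and the buggy example are our inventions, not NExT's pipeline.

```python
import sys

def trace_program(fn, *args):
    """Run fn(*args) under a line-level tracer; return a textual trace of
    line numbers and local variable values, plus the final return value."""
    lines = []
    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is fn.__code__:
            lines.append(f"line {frame.f_lineno}: locals={dict(frame.f_locals)}")
        return tracer
    sys.settrace(tracer)
    try:
        result = fn(*args)
    finally:
        sys.settrace(None)
    lines.append(f"return -> {result!r}")
    return "\n".join(lines)

def buggy_mean(xs):
    total = 0
    for x in xs:
        total += x
    return total / (len(xs) - 1)   # off-by-one bug the trace makes visible

print(trace_program(buggy_mean, [2, 4, 6]))  # returns 6.0 instead of 4.0
```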
- Turaco: Complexity-Guided Data Sampling for Training Neural Surrogates of Programs [14.940174578659603] (arXiv 2023-09-21)
We present a methodology for sampling datasets to train neural-network-based surrogates of programs.
We first characterize the proportion of data to sample from each region of a program's input space based on the complexity of learning a surrogate of the corresponding execution path.
We evaluate these results on a range of real-world programs, demonstrating that complexity-guided sampling results in empirical improvements in accuracy.
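A toy version of complexity-guided sampling, under the assumption that per-path complexity scores are already available (deriving them is the paper's actual contribution): split inputs by execution path and allocate the sampling budget proportionally. The two-branch program and the scores below are made up.

```python
import random

def branch_of(x):
    # Toy program with two execution paths.
    return "poly" if x < 0.5 else "trig"   # pretend the second path is harder

# Hypothetical complexity scores per path; Turaco derives these from the
# difficulty of learning a surrogate of each path.
complexity = {"poly": 1.0, "trig": 3.0}

def allocate(budget):
    """Split the sampling budget across paths in proportion to complexity."""
    total = sum(complexity.values())
    return {p: round(budget * c / total) for p, c in complexity.items()}

def sample_dataset(budget):
    """Rejection-sample inputs until every path's quota is filled."""
    quotas, data = allocate(budget), []
    while any(q > 0 for q in quotas.values()):
        x = random.random()
        p = branch_of(x)
        if quotas[p] > 0:
            quotas[p] -= 1
            data.append((x, p))
    return data

print(allocate(1000))   # -> {'poly': 250, 'trig': 750}
```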
- CodeGen2: Lessons for Training LLMs on Programming and Natural Languages [116.74407069443895] (arXiv 2023-05-03)
We unify encoder and decoder-based models into a single prefix-LM.
For learning methods, we examine the "free lunch" hypothesis.
For data distributions, we study how a mixture of programming and natural languages and multi-epoch training affect model performance.
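For readers unfamiliar with the prefix-LM that unifies encoder- and decoder-style training, the sketch below builds the corresponding attention mask: bidirectional over the prefix (e.g., the specification), causal over the completion (the code). The boolean convention is ours, not CodeGen2's released code.

```python
import numpy as np

def prefix_lm_mask(prefix_len: int, total_len: int) -> np.ndarray:
    """True = query token (row) may attend to key token (column)."""
    mask = np.tril(np.ones((total_len, total_len), dtype=bool))  # causal base
    mask[:, :prefix_len] = True  # every token sees the whole prefix (bidirectional)
    return mask

# 2 spec tokens + 3 code tokens: spec rows see each other; code rows are causal.
print(prefix_lm_mask(2, 5).astype(int))
```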
- Hierarchical Programmatic Reinforcement Learning via Learning to Compose Programs [58.94569213396991] (arXiv 2023-01-30)
We propose a hierarchical programmatic reinforcement learning framework to produce program policies.
By learning to compose programs, our proposed framework can produce program policies that describe behaviors more complex than those in the training distribution.
The experimental results in the Karel domain show that our proposed framework outperforms baselines.
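As a loose sketch only: the composition idea can be pictured as a high-level policy emitting indices into a library of short sub-programs, whose concatenation is the agent's program policy. The Karel-style primitives and the random stand-in for the learned composer are placeholders.

```python
import random

library = [                      # learned sub-programs in the real framework
    ["move", "move"],
    ["turnLeft", "move"],
    ["pickMarker"],
]

def meta_policy(horizon):
    """Stand-in for the learned composer: pick sub-program indices."""
    return [random.randrange(len(library)) for _ in range(horizon)]

def compose(indices):
    """Concatenate the chosen sub-programs into one program policy."""
    program = []
    for i in indices:
        program.extend(library[i])
    return program

print(compose(meta_policy(3)))   # e.g. ['move', 'move', 'pickMarker', ...]
```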
- Fault-Aware Neural Code Rankers [64.41888054066861] (arXiv 2022-06-04)
We propose fault-aware neural code rankers that can predict the correctness of a sampled program without executing it.
Our fault-aware rankers can significantly increase the pass@1 accuracy of various code generation models.
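Schematically, such a ranker lifts pass@1 by reranking k samples and submitting only the top-scoring one. `sample_k` and `ranker_score` below are hypothetical stand-ins (a dummy generator and heuristic), not the paper's models.

```python
def sample_k(spec: str, k: int) -> list[str]:
    """Stand-in: in practice, draw k candidates from a code generation model."""
    return [f"def f(x): return x + {i}" for i in range(k)]

def ranker_score(spec: str, program: str) -> float:
    """Stand-in: the paper trains a neural model to predict, without running
    the program, whether it is correct and fault-free for the spec."""
    return -abs(len(program) - len(spec))   # dummy heuristic for the demo

def best_of_k(spec: str, k: int = 10) -> str:
    candidates = sample_k(spec, k)
    # pass@1 is then measured on this single reranked candidate.
    return max(candidates, key=lambda p: ranker_score(spec, p))

print(best_of_k("add one to x"))
```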
- Enforcing Consistency in Weakly Supervised Semantic Parsing [68.2211621631765] (arXiv 2021-07-13)
We explore the use of consistency between the output programs for related inputs to reduce the impact of spurious programs.
We find that a more consistent formalism leads to improved model performance even without consistency-based training.
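One way to picture the consistency signal, using a made-up template heuristic rather than the paper's formalism: a candidate program is more trustworthy when its anonymized form also appears among candidates for a related input, since spurious programs rarely recur across inputs.

```python
import re

def abstract(program: str) -> str:
    """Anonymize entities and numbers so related inputs can share a template."""
    program = re.sub(r'"[^"]*"', "<ent>", program)
    return re.sub(r"\d+", "<num>", program)

def consistency_score(prog: str, related_candidates: list[str]) -> int:
    """1 if the program's template also appears for a related input."""
    templates = {abstract(p) for p in related_candidates}
    return int(abstract(prog) in templates)

# Both candidates reach the right answer for input A, but only the first
# shares a template with the related input B, so the second looks spurious.
cands_a = ['count(rows, "red") > 2', 'max(col_5) - 3']
cands_b = ['count(rows, "blue") > 1']
for p in cands_a:
    print(p, "->", consistency_score(p, cands_b))
```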
- Learning to Combine Per-Example Solutions for Neural Program Synthesis [35.0204840620086] (arXiv 2021-06-14)
Most learning-based approaches try to find a program that satisfies all examples at once.
Our work considers an approach that breaks the problem into two stages: (a) find programs that satisfy only one example, and (b) leverage these per-example solutions to yield a program that satisfies all examples.
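A stripped-down illustration of the two-stage recipe on a toy integer task; in the paper the combination is learned, whereas the guards below are hand-written.

```python
examples = [(1, 2), (10, 20), (-3, 0)]     # (input, expected output) pairs

# Stage (a): programs that each satisfy only one example.
def solve_a(x): return x + 1               # satisfies (1, 2)
def solve_b(x): return x * 2               # satisfies (10, 20)
def solve_c(x): return 0                   # satisfies (-3, 0)

# Stage (b): merge the per-example programs behind input guards.
def combined(x):
    if x < 0:
        return solve_c(x)
    if x >= 10:
        return solve_b(x)
    return solve_a(x)

assert all(combined(i) == o for i, o in examples)
print("combined program satisfies all examples")
```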
- Neural Program Synthesis with a Differentiable Fixer [44.48509453344902] (arXiv 2020-06-19)
We present a new program synthesis approach that combines an encoder-decoder based synthesis architecture with a differentiable program fixer.
We train our architecture end-to-end on the RobustFill domain, and show that the addition of the fixer module leads to a significant improvement on synthesis accuracy.
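The wiring that makes the fixer differentiable can be pictured as follows: the fixer consumes the synthesizer's soft token distributions rather than hard decodes, so a single loss backpropagates through both modules. The PyTorch sketch below uses placeholder dimensions and linear layers, not the RobustFill architecture.

```python
import torch
import torch.nn as nn

VOCAB, HID, LEN = 32, 64, 8

class Synthesizer(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(HID, LEN * VOCAB)
    def forward(self, spec_embedding):
        # Token logits for a fixed-length program.
        return self.net(spec_embedding).view(-1, LEN, VOCAB)

class Fixer(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(VOCAB + HID, VOCAB)
    def forward(self, logits, spec_embedding):
        # A soft (differentiable) program representation, not a hard argmax,
        # is what lets gradients flow back through the fixer.
        soft = logits.softmax(-1)
        spec = spec_embedding.unsqueeze(1).expand(-1, LEN, -1)
        return self.net(torch.cat([soft, spec], dim=-1))  # revised logits

synth, fixer = Synthesizer(), Fixer()
spec = torch.randn(4, HID)                  # fake I/O spec embeddings
target = torch.randint(0, VOCAB, (4, LEN))  # fake gold programs
logits = fixer(synth(spec), spec)
loss = nn.functional.cross_entropy(logits.reshape(-1, VOCAB), target.reshape(-1))
loss.backward()                             # end-to-end: both modules get grads
print(float(loss))
```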