Creating Synthetic Datasets via Evolution for Neural Program Synthesis
- URL: http://arxiv.org/abs/2003.10485v2
- Date: Sat, 25 Jul 2020 01:04:06 GMT
- Title: Creating Synthetic Datasets via Evolution for Neural Program Synthesis
- Authors: Alexander Suh and Yuval Timen
- Abstract summary: We show that some program synthesis approaches generalize poorly to data distributions different from that of the randomly generated examples.
We propose a new, adversarial approach to control the bias of synthetic data distributions and show that it outperforms current approaches.
- Score: 77.34726150561087
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Program synthesis is the task of automatically generating a program
consistent with a given specification. A natural way to specify programs is to
provide examples of desired input-output behavior, and many current program
synthesis approaches have achieved impressive results after training on
randomly generated input-output examples. However, recent work has discovered
that some of these approaches generalize poorly to data distributions different
from that of the randomly generated examples. We show that this problem applies
to other state-of-the-art approaches as well and that current methods to
counteract this problem are insufficient. We then propose a new, adversarial
approach to control the bias of synthetic data distributions and show that it
outperforms current approaches.
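The adversarial idea described in the abstract can be illustrated with a minimal evolutionary loop that mutates input-output examples and keeps those the current synthesizer handles worst. This is a hypothetical sketch, not the paper's actual algorithm: the `mutate` scheme, `fitness`, and `synthesizer_error` are illustrative assumptions.

```python
import random

def mutate(example, low=0, high=100):
    """Randomly perturb one input field of an (inputs, output) example."""
    inputs, output = example
    inputs = list(inputs)
    i = random.randrange(len(inputs))
    inputs[i] = random.randint(low, high)
    return (tuple(inputs), output)

def fitness(example, synthesizer_error):
    """Score an example by how badly the synthesizer handles it
    (higher error = more adversarial, so more valuable for training)."""
    return synthesizer_error(example)

def evolve_dataset(seed_examples, synthesizer_error, generations=10, pop_size=20):
    """Evolve a population of examples toward regions where the
    synthesizer currently fails, yielding a harder training set."""
    population = list(seed_examples)
    for _ in range(generations):
        # Mutate every member, then keep the pop_size most adversarial
        # examples from parents and children combined.
        children = [mutate(ex) for ex in population]
        population = sorted(
            population + children,
            key=lambda ex: fitness(ex, synthesizer_error),
            reverse=True,
        )[:pop_size]
    return population
```

Because selection keeps the top of the combined parent-and-child pool, the best (most adversarial) fitness in the population never decreases across generations.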
Related papers
- Turaco: Complexity-Guided Data Sampling for Training Neural Surrogates of Programs [14.940174578659603]
We present a methodology for sampling datasets to train neural-network-based surrogates of programs.
We first characterize the proportion of data to sample from each region of a program's input space based on the complexity of learning a surrogate of the corresponding execution path.
We evaluate these results on a range of real-world programs, demonstrating that complexity-guided sampling results in empirical improvements in accuracy.
arXiv Detail & Related papers (2023-09-21T01:59:20Z)
- Learning minimal representations of stochastic processes with variational autoencoders [52.99137594502433]
We introduce an unsupervised machine learning approach to determine the minimal set of parameters required to describe a process.
Our approach enables the autonomous discovery of unknown parameters describing processes.
arXiv Detail & Related papers (2023-07-21T14:25:06Z)
- Recent Developments in Program Synthesis with Evolutionary Algorithms [1.8047694351309207]
We identify the relevant evolutionary program synthesis approaches and provide an in-depth analysis of their performance.
The most influential approaches we identify are stack-based, grammar-guided, as well as linear genetic programming.
For future work, we encourage researchers to assess solution quality not only by a program's output but also by the path it takes towards a solution.
arXiv Detail & Related papers (2021-08-27T11:38:27Z)
- BOSS: Bidirectional One-Shot Synthesis of Adversarial Examples [8.359029046999233]
A one-shot synthesis of adversarial examples is proposed in this paper.
The inputs are synthesized from scratch to induce arbitrary soft predictions at the output of pre-trained models.
We demonstrate the generality and versatility of the framework and approach proposed through applications to the design of targeted adversarial attacks.
arXiv Detail & Related papers (2021-08-05T17:43:36Z)
- Latent Execution for Neural Program Synthesis Beyond Domain-Specific Languages [97.58968222942173]
We take the first step to synthesize C programs from input-output examples.
In particular, we propose LaSynth, which learns a latent representation to approximate the execution of partially generated programs.
We show that training on these synthesized programs further improves the prediction performance for both Karel and C program synthesis.
arXiv Detail & Related papers (2021-06-29T02:21:32Z)
- Process Discovery for Structured Program Synthesis [70.29027202357385]
A core task in process mining is process discovery which aims to learn an accurate process model from event log data.
In this paper, we propose to use (block-) structured programs directly as target process models.
We develop a novel bottom-up agglomerative approach to the discovery of such structured program process models.
arXiv Detail & Related papers (2020-08-13T10:33:10Z)
- Partially Conditioned Generative Adversarial Networks [75.08725392017698]
Generative Adversarial Networks (GANs) let one synthesise artificial datasets by implicitly modelling the underlying probability distribution of a real-world training dataset.
With the introduction of Conditional GANs and their variants, these methods were extended to generating samples conditioned on ancillary information available for each sample within the dataset.
In this work, we argue that standard Conditional GANs are not suitable when only part of that ancillary information is available, and propose a new Adversarial Network architecture and training strategy to address this.
arXiv Detail & Related papers (2020-07-06T15:59:28Z)
- Synthetic Datasets for Neural Program Synthesis [66.20924952964117]
We propose a new methodology for controlling and evaluating the bias of synthetic data distributions over both programs and specifications.
We demonstrate, using the Karel DSL and a small Calculator DSL, that training deep networks on these distributions leads to improved cross-distribution generalization performance.
arXiv Detail & Related papers (2019-12-27T21:28:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.