Can humans teach machines to code?
- URL: http://arxiv.org/abs/2404.19397v1
- Date: Tue, 30 Apr 2024 09:42:40 GMT
- Title: Can humans teach machines to code?
- Authors: Céline Hocquette, Johannes Langer, Andrew Cropper, Ute Schmid
- Abstract summary: A key underlying assumption is that humans can provide examples of sufficient quality to teach a concept to a machine.
We ask humans to generate examples for six programming tasks, such as finding the maximum element of a list.
We compare the performance of a program synthesis system trained on (i) human-provided examples, (ii) randomly sampled examples, and (iii) expert-provided examples.
- Score: 24.32052793811087
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The goal of inductive program synthesis is for a machine to automatically generate a program from user-supplied examples of the desired behaviour of the program. A key underlying assumption is that humans can provide examples of sufficient quality to teach a concept to a machine. However, as far as we are aware, this assumption lacks both empirical and theoretical support. To address this limitation, we explore the question 'Can humans teach machines to code?'. To answer this question, we conduct a study where we ask humans to generate examples for six programming tasks, such as finding the maximum element of a list. We compare the performance of a program synthesis system trained on (i) human-provided examples, (ii) randomly sampled examples, and (iii) expert-provided examples. Our results show that, on most of the tasks, non-expert participants did not provide sufficient examples for a program synthesis system to learn an accurate program. Our results also show that non-experts need to provide more examples than are needed when examples are randomly sampled or expert-provided.
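To make the setup concrete, here is a minimal sketch (hypothetical examples and a toy candidate pool, not the synthesis system evaluated in the paper) of how input/output examples specify a task such as "maximum of a list", and how weak examples can leave the concept underdetermined:

```python
# Hypothetical user-provided examples for "maximum of a list".
weak_examples = [([3, 1, 2], 3), ([5], 5), ([-2, -7], -2)]
strong_examples = weak_examples + [([1, 4, 2], 4)]  # one discriminating example

# A hand-picked candidate pool standing in for a real program search space.
candidates = {
    "max": lambda xs: max(xs),
    "first": lambda xs: xs[0],
    "last": lambda xs: xs[-1],
    "sum": lambda xs: sum(xs),
}

def consistent(prog, examples):
    """A program is consistent if it reproduces every example output."""
    return all(prog(i) == o for i, o in examples)

for name, exs in [("weak", weak_examples), ("strong", strong_examples)]:
    survivors = [n for n, p in candidates.items() if consistent(p, exs)]
    print(name, "->", survivors)
# weak   -> ['max', 'first']   (the examples underdetermine the concept)
# strong -> ['max']            (the extra example resolves the ambiguity)
```

With the weak set, "first" survives alongside "max" because every example list happens to begin with its maximum; a single discriminating example removes the ambiguity, and this kind of example quality is exactly what the study measures.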
Related papers
- Generating Pragmatic Examples to Train Neural Program Synthesizers [20.819451354452085]
A good synthesizer must choose the intended program from the many that are consistent with the given set of examples.
We propose a novel way to amortize this search with neural networks.
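A classic non-neural baseline for this selection problem is a simplicity prior over the consistent programs; the sketch below (toy candidates, not the paper's amortized neural method) shows the baseline, and also why it can fail:

```python
# Among programs consistent with the examples, prefer the shortest
# (a description-length prior). Program text stands in for real ASTs.
examples = [([2, 9, 4], 9)]  # hypothetical; underdetermines the target

candidates = {
    "lambda xs: max(xs)": lambda xs: max(xs),
    "lambda xs: sorted(xs)[-1]": lambda xs: sorted(xs)[-1],
    "lambda xs: xs[1]": lambda xs: xs[1],
}

consistent = {src: f for src, f in candidates.items()
              if all(f(i) == o for i, o in examples)}
best = min(consistent, key=len)  # shortest consistent program wins
print(best)                      # -> "lambda xs: xs[1]"
```

Here the shortest consistent program is the degenerate `xs[1]`, which illustrates why reasoning pragmatically about why the user chose these particular examples can beat a plain Occam's-razor choice.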
arXiv Detail & Related papers (2023-11-09T20:53:00Z)
- Toward Trustworthy Neural Program Synthesis [6.3557174349423455]
We develop an approach to estimate the probability that a program sampled from a large language model is correct.
Given a natural language description of a programming problem, our method samples both candidate programs as well as candidate predicates specifying how the program should behave.
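A minimal sketch of this scoring idea (hypothetical task and hand-written candidates, not the paper's LLM-based pipeline): rank each sampled program by the fraction of sampled behavioural predicates it satisfies, as a proxy for its probability of correctness:

```python
# Hypothetical task: "return the absolute value of x".
candidate_programs = [
    lambda x: abs(x),
    lambda x: x,    # wrong on negatives
    lambda x: -x,   # wrong on positives
]

# Hypothetical sampled predicates describing the intended behaviour.
candidate_predicates = [
    lambda f: f(3) == 3,
    lambda f: f(-3) == 3,
    lambda f: f(0) == 0,
]

def agreement(program, predicates):
    """Fraction of predicates the program satisfies."""
    return sum(p(program) for p in predicates) / len(predicates)

scores = [agreement(f, candidate_predicates) for f in candidate_programs]
print(scores)  # [1.0, 0.666..., 0.666...] -> rank abs(x) first
```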
arXiv Detail & Related papers (2022-09-29T20:32:07Z)
- Learning from Self-Sampled Correct and Partially-Correct Programs [96.66452896657991]
We propose to let the model perform sampling during training and learn from both self-sampled fully-correct programs and partially-correct programs.
We show that our use of self-sampled correct and partially-correct programs can benefit learning and help guide the sampling process.
Our proposed method improves the pass@k performance by 3.1% to 12.3% compared to learning from a single reference program with MLE.
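pass@k here is the standard metric for sampled program generations; assuming the usual unbiased estimator from the Codex evaluation (Chen et al., 2021), it can be computed as follows:

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    samples drawn from n generations (of which c are correct) passes.
    This is the metric behind the 3.1%-12.3% improvement quoted above,
    not the paper's training code."""
    if n - c < k:
        return 1.0
    return 1.0 - math.prod(1.0 - k / i for i in range(n - c + 1, n + 1))

print(pass_at_k(n=100, c=10, k=10))  # ~0.67
```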
arXiv Detail & Related papers (2022-05-28T03:31:07Z)
- Efficient Pragmatic Program Synthesis with Informative Specifications [13.234975857626752]
We show that it is possible to build a program synthesizer that is both pragmatic and efficient by approximating the joint distribution of programs with a product of independent factors.
We find that the synthesizer assuming a factored approximation performs better than a synthesizer assuming an exact joint distribution when evaluated on natural human inputs.
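A toy illustration of a factored score (hypothetical factors, not the paper's pragmatic model): instead of evaluating a program against the example set jointly, multiply independent per-example factors:

```python
from math import prod

def per_example_factor(program, example):
    """Soft consistency factor for a single example."""
    inp, out = example
    return 1.0 if program(inp) == out else 1e-6

def factored_score(program, examples):
    """Product of independent per-example factors."""
    return prod(per_example_factor(program, e) for e in examples)

examples = [([1, 2], 2), ([4, 3], 4)]
programs = {"max": lambda xs: max(xs), "first": lambda xs: xs[0]}
print({name: factored_score(p, examples) for name, p in programs.items()})
# max -> 1.0, first -> 1e-06: the factored score still separates them
```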
arXiv Detail & Related papers (2022-04-05T21:25:58Z)
- Human-Algorithm Collaboration: Achieving Complementarity and Avoiding Unfairness [92.26039686430204]
We show that even in carefully-designed systems, complementary performance can be elusive.
First, we provide a theoretical framework for modeling simple human-algorithm systems.
Next, we use this model to prove conditions where complementarity is impossible.
arXiv Detail & Related papers (2022-02-17T18:44:41Z)
- Searching for More Efficient Dynamic Programs [61.79535031840558]
We describe a set of program transformations, a simple metric for assessing the efficiency of a transformed program, and a search procedure to improve this metric.
We show that in practice, automated search can find substantial improvements to the initial program.
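As a minimal illustration of these ingredients (a single hand-applied transformation rather than the paper's automated search), the sketch below pairs one transformation, memoisation, with one efficiency metric, the number of recursive calls:

```python
from functools import lru_cache

calls = 0

def fib(n):                       # initial program
    global calls
    calls += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):                  # program after the memoisation transform
    global calls
    calls += 1
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)

calls = 0
fib(20)
print("initial program:    ", calls, "calls")   # 21891

calls = 0
fib_memo(20)
print("transformed program:", calls, "calls")   # 21
```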
arXiv Detail & Related papers (2021-09-14T20:52:55Z)
- Latent Execution for Neural Program Synthesis Beyond Domain-Specific Languages [97.58968222942173]
We take the first step to synthesize C programs from input-output examples.
In particular, we propose LaSynth, which learns a latent representation to approximate the execution of partially generated programs.
We show that training on these synthesized programs further improves the prediction performance for both Karel and C program synthesis.
arXiv Detail & Related papers (2021-06-29T02:21:32Z)
- Latent Programmer: Discrete Latent Codes for Program Synthesis [56.37993487589351]
In many sequence learning tasks, such as program synthesis and document summarization, a key problem is searching over a large space of possible output sequences.
We propose to learn representations of the outputs that are specifically meant for search: rich enough to specify the desired output but compact enough to make search more efficient.
We introduce the Latent Programmer, a program synthesis method that first predicts a discrete latent code from input/output examples, and then generates the program in the target language.
arXiv Detail & Related papers (2020-12-01T10:11:35Z)
- Program Synthesis with Pragmatic Communication [28.24612900419843]
This work introduces a new inductive bias derived by modeling the program synthesis task as rational communication.
A user study finds that end-user participants communicate more effectively with the pragmatic program synthesizer over a non-pragmatic one.
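The rational-communication bias can be illustrated with a small rational-speech-acts calculation (toy consistency matrix, not the paper's model):

```python
import numpy as np

programs = ["max", "first"]
utterances = ["([3,1,2], 3)", "([1,4,2], 4)"]  # examples a user might give

# literal[p][e] = 1 if program p is consistent with example e.
literal = np.array([[1.0, 1.0],   # "max" is consistent with both examples
                    [1.0, 0.0]])  # "first" only fits the first example

# Literal listener: P(program | example); normalise columns.
L0 = literal / literal.sum(axis=0, keepdims=True)
# Pragmatic speaker: P(example | program); normalise rows of L0.
S1 = L0 / L0.sum(axis=1, keepdims=True)
# Pragmatic listener: P(program | example); normalise columns of S1.
L1 = S1 / S1.sum(axis=0, keepdims=True)

print(f"P(program | {utterances[0]}) =", dict(zip(programs, L1[:, 0])))
# {'max': 0.25, 'first': 0.75}
```

Given only the first example, the pragmatic listener shifts belief toward "first": a user who meant "max" had a more informative example available, so choosing the ambiguous one is evidence for the other program.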
arXiv Detail & Related papers (2020-07-09T20:55:44Z)
- Learning large logic programs by going beyond entailment [18.27510863075184]
We implement our idea in Brute, a new ILP system which uses best-first search, guided by an example-dependent loss function, to incrementally build programs.
Our experiments show that Brute can substantially outperform existing ILP systems in terms of predictive accuracies and learning times.
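A generic best-first-search skeleton in this spirit (toy primitives and loss, not Brute's actual implementation, which learns Prolog programs): grow programs incrementally and order the frontier by an example-dependent loss, here the number of misclassified examples:

```python
import heapq
import itertools

PRIMITIVES = {"inc": lambda x: x + 1, "dbl": lambda x: x * 2}
examples = [(1, 4), (3, 8), (0, 2)]          # target: f(x) = 2 * (x + 1)

def run(ops, x):
    """Execute a program given as a sequence of primitive names."""
    for op in ops:
        x = PRIMITIVES[op](x)
    return x

def loss(ops):
    """Example-dependent loss: number of misclassified examples."""
    return sum(run(ops, x) != y for x, y in examples)

tie = itertools.count()                      # tie-breaker for the heap
frontier = [(loss(()), next(tie), ())]       # start from the empty program
while frontier:
    l, _, ops = heapq.heappop(frontier)
    if l == 0:
        print("found:", ops)                 # -> ('inc', 'dbl')
        break
    if len(ops) >= 4:                        # depth bound keeps search finite
        continue
    for name in PRIMITIVES:                  # incrementally extend the program
        child = ops + (name,)
        heapq.heappush(frontier, (loss(child), next(tie), child))
```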
arXiv Detail & Related papers (2020-03-23T18:34:15Z)
- Creating Synthetic Datasets via Evolution for Neural Program Synthesis [77.34726150561087]
We show that some program synthesis approaches generalize poorly to data distributions different from that of the randomly generated examples.
We propose a new, adversarial approach to control the bias of synthetic data distributions and show that it outperforms current approaches.
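A toy evolutionary loop in this spirit (hypothetical generator, fitness, and stand-in synthesizer): mutate the parameters of an example generator and keep those that make a fixed, weak synthesizer fail most often:

```python
import random

random.seed(0)

def make_examples(max_len):
    """Synthetic-example generator parameterised by list length."""
    xs = [random.randint(-9, 9) for _ in range(random.randint(1, max_len))]
    return xs, max(xs)

def weak_synthesizer(xs, y):
    """Degenerate stand-in: 'succeeds' iff the first element is the max."""
    return xs[0] == y

def fitness(max_len, trials=200):
    """Higher fitness = the generator is harder for the synthesizer."""
    fails = sum(not weak_synthesizer(*make_examples(max_len))
                for _ in range(trials))
    return fails / trials

population = [1, 2, 3]
for _ in range(5):                           # generations of mutate + select
    population += [max(1, p + random.choice([-1, 1])) for p in population]
    population = sorted(set(population), key=fitness, reverse=True)[:3]

print(population)  # drifts toward larger max_len: longer lists break xs[0]
```

Selection drives the generator toward longer lists, since length-one lists let the degenerate synthesizer succeed trivially; this is the adversarial pressure on the data distribution in miniature.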
arXiv Detail & Related papers (2020-03-23T18:34:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.