Information-theoretic User Interaction: Significant Inputs for Program
Synthesis
- URL: http://arxiv.org/abs/2006.12638v1
- Date: Mon, 22 Jun 2020 21:46:40 GMT
- Title: Information-theoretic User Interaction: Significant Inputs for Program
Synthesis
- Authors: Ashish Tiwari, Arjun Radhakrishna, Sumit Gulwani, and Daniel Perelman
- Abstract summary: We introduce the *significant questions problem* and show that it is hard in general.
We develop an information-theoretic greedy approach for solving the problem.
In the context of interactive program synthesis, we use the above result to develop an *active program learner*.
Our active learner is able to trade off false negatives for false positives and converges in a small number of iterations on a real-world dataset.
- Score: 11.473616777800318
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Programming-by-example technologies are being deployed in industrial products
for real-time synthesis of various kinds of data transformations. These
technologies rely on the user to provide a few representative examples of the
transformation task. Motivated by the need to find the most pertinent question
to ask the user, in this paper, we introduce the *significant questions
problem*, and show that it is hard in general. We then develop an
information-theoretic greedy approach for solving the problem. We justify the
greedy algorithm using the conditional entropy result, which informally says
that the question that achieves the maximum information gain is the one whose
answer we know least about.
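As a hedged sketch of why the greedy rule takes this form (the notation below
is ours, not the paper's): write P for the random variable ranging over the
programs consistent with the examples seen so far, and A_q for the answer a
program produces on question q. Because the answer is a deterministic function
of the program, H(A_q | P) = 0, and the information gain of asking q collapses
to the entropy of its answer distribution:

```latex
% Illustrative notation, not taken from the paper:
%   P   = random variable over consistent programs
%   A_q = output of P on question (input) q
IG(q) = I(P; A_q) = H(A_q) - H(A_q \mid P) = H(A_q)
% since A_q is a deterministic function of P, H(A_q | P) = 0.
% Greedy rule: ask the q maximizing H(A_q), i.e., the question
% whose answer is currently most uncertain.
```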
In the context of interactive program synthesis, we use the above result to
develop an *active program learner* that generates the significant inputs
to pose as queries to the user in each iteration. The procedure requires
extending a *passive program learner* to a *sampling program learner*
that is able to sample candidate programs from the set of all consistent
programs, enabling estimation of information gain. It also uses clustering of
inputs, based on features of the inputs and the corresponding outputs, to
sample a small set of candidate significant inputs. Our active learner is able
to trade off false negatives for false positives and converges in a small
number of iterations on a real-world dataset of around 800 string
transformation tasks.
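A minimal sketch of such an active-learning loop, assuming a hypothetical
sample_consistent_programs callable in the role of the sampling program
learner and a pre-clustered list of candidate inputs (the paper's
feature-based clustering is omitted here); none of these names come from the
paper's implementation:

```python
import math
from collections import Counter

def information_gain(programs, candidate_input):
    """Entropy of the outputs of sampled programs on one input.

    Each program determines its output, so H(output | program) = 0 and
    the information gain of asking the user about candidate_input
    reduces to the entropy of the predicted outputs.
    """
    counts = Counter(p(candidate_input) for p in programs)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def active_learn(sample_consistent_programs, candidate_inputs, ask_user,
                 examples, max_rounds=10, n_samples=50):
    """Greedily query the user on the most informative remaining input."""
    candidate_inputs = list(candidate_inputs)
    for _ in range(max_rounds):
        programs = sample_consistent_programs(examples, n_samples)
        if not programs or not candidate_inputs:
            break
        gains = {x: information_gain(programs, x) for x in candidate_inputs}
        best = max(gains, key=gains.get)
        if gains[best] == 0.0:
            break  # sampled programs agree on every candidate: converged
        examples.append((best, ask_user(best)))  # ground truth from the user
        candidate_inputs.remove(best)
    return examples
```

Because every sampled program is consistent with the examples seen so far,
the output entropy on a candidate input is exactly the empirical information
gain of asking about it, which is what the greedy rule above maximizes.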
Related papers
- Incrementally-Computable Neural Networks: Efficient Inference for
Dynamic Inputs [75.40636935415601]
Deep learning often faces the challenge of efficiently processing dynamic inputs, such as sensor data or user inputs.
We take an incremental computing approach, looking to reuse calculations as the inputs change.
We apply this approach to the transformer architecture, creating an efficient incremental inference algorithm with complexity proportional to the fraction of modified inputs.
arXiv Detail & Related papers (2023-07-27T16:30:27Z)
- Multi-Intent Detection in User Provided Annotations for Programming by Examples Systems [3.265146857386153]
Programming by Example (PBE) is a technique that targets automatic inference of a computer program to accomplish a format or string conversion task from user-provided input and output samples.
In this paper, we propose a deep neural network based ambiguity prediction model, which analyzes the input-output strings and maps them to a set of properties responsible for multiple intents.
arXiv Detail & Related papers (2023-07-08T12:35:10Z)
- Neural Program Synthesis with Query [27.212984312375166]
We propose a query-based framework that trains a neural network to generate informative input-output examples automatically.
We evaluate the effectiveness and generalization of the proposed query-based framework on the Karel task and the list processing task.
Experimental results show that the query-based framework can generate informative input-output examples that match and even outperform well-designed input-output examples.
arXiv Detail & Related papers (2022-05-08T13:53:18Z)
- Less is More: Summary of Long Instructions is Better for Program Synthesis [20.66688303609522]
We show that pre-trained language models (LMs) benefit from the summarized version of complicated questions.
Our findings show that superfluous information often present in problem descriptions does not help models in understanding the task.
Experimental results on Codex show that our proposed approach outperforms the baseline by 8.13% on average in terms of strict accuracy.
arXiv Detail & Related papers (2022-03-16T13:04:12Z)
- Recent Developments in Program Synthesis with Evolutionary Algorithms [1.8047694351309207]
We identify the relevant evolutionary program synthesis approaches and provide an in-depth analysis of their performance.
The most influential approaches we identify are stack-based, grammar-guided, and linear genetic programming.
For future work, we encourage researchers to assess the quality of a solution not only by a program's output but also by the way the solution is reached.
arXiv Detail & Related papers (2021-08-27T11:38:27Z)
- ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback [54.142719510638614]
In this paper, we frame the problem of providing feedback as few-shot classification.
A meta-learner adapts to give feedback to student code on a new programming question from just a few examples by instructors.
Our approach was successfully deployed to deliver feedback on 16,000 student exam solutions in a programming course offered by a tier 1 university.
arXiv Detail & Related papers (2021-07-23T22:41:28Z)
- When is Memorization of Irrelevant Training Data Necessary for High-Accuracy Learning? [53.523017945443115]
We describe natural prediction problems in which every sufficiently accurate training algorithm must encode, in the prediction model, essentially all the information about a large subset of its training examples.
Our results do not depend on the training algorithm or the class of models used for learning.
arXiv Detail & Related papers (2020-12-11T15:25:14Z)
- BUSTLE: Bottom-Up Program Synthesis Through Learning-Guided Exploration [72.88493072196094]
We present a new synthesis approach that leverages learning to guide a bottom-up search over programs.
In particular, we train a model to prioritize compositions of intermediate values during search conditioned on a set of input-output examples.
We show that the combination of learning and bottom-up search is remarkably effective, even with simple supervised learning approaches (a toy illustration of this idea appears after this list).
arXiv Detail & Related papers (2020-07-28T17:46:18Z)
- The Variational Bandwidth Bottleneck: Stochastic Evaluation on an Information Budget [164.65771897804404]
In many applications, it is desirable to extract only the relevant information from complex input data.
The information bottleneck method formalizes this as an information-theoretic optimization problem.
We propose the variational bandwidth bottleneck, which decides, for each example, whether to access the privileged information based on its estimated value.
arXiv Detail & Related papers (2020-04-24T18:29:31Z)
- Creating Synthetic Datasets via Evolution for Neural Program Synthesis [77.34726150561087]
We show that some program synthesis approaches generalize poorly to data distributions different from that of the randomly generated examples.
We propose a new, adversarial approach to control the bias of synthetic data distributions and show that it outperforms current approaches.
arXiv Detail & Related papers (2020-03-23T18:34:15Z)
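As referenced in the BUSTLE entry above, here is a minimal, hedged sketch of
learning-guided bottom-up search: a toy one-operator string DSL in which a
hand-written score() heuristic stands in for BUSTLE's trained model. The DSL
and every name here are illustrative assumptions, not the paper's
implementation.

```python
def score(values, targets):
    # Toy stand-in for BUSTLE's learned model: reward intermediate
    # values that appear as substrings of the target outputs.
    return sum(v in t for v, t in zip(values, targets))

def bottom_up_search(inputs, targets, rounds=3, beam=50):
    """Enumerate expressions bottom-up, keeping only promising values."""
    pool = {tuple(inputs): "x"}  # value tuple -> expression string
    for _ in range(rounds):
        if tuple(targets) in pool:
            return pool[tuple(targets)]
        candidates = {}
        for va, ea in pool.items():
            for vb, eb in pool.items():
                vals = tuple(a + b for a, b in zip(va, vb))  # concat op
                if vals not in pool:
                    candidates[vals] = f"concat({ea}, {eb})"
        # The learning-guided step: keep only the top-scoring values
        # (BUSTLE uses a trained model here, not this heuristic).
        ranked = sorted(candidates.items(),
                        key=lambda kv: score(kv[0], targets), reverse=True)
        pool.update(dict(ranked[:beam]))
    return pool.get(tuple(targets))

# Example: synthesize "duplicate the string" from two examples.
print(bottom_up_search(["ab", "xy"], ["abab", "xyxy"]))  # concat(x, x)
```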