Multi-modal Program Inference: a Marriage of Pre-trained Language Models and Component-based Synthesis
- URL: http://arxiv.org/abs/2109.02445v1
- Date: Fri, 3 Sep 2021 16:12:04 GMT
- Title: Multi-modal Program Inference: a Marriage of Pre-trained Language Models and Component-based Synthesis
- Authors: Kia Rahmani, Mohammad Raza, Sumit Gulwani, Vu Le, Daniel Morris, Arjun
Radhakrishna, Gustavo Soares, Ashish Tiwari
- Abstract summary: Multi-modal program synthesis refers to the task of synthesizing programs (code) from their specification given in different forms.
Examples provide a precise but incomplete specification, and natural language provides an ambiguous but more "complete" task description.
We use our combination approach to instantiate multi-modal synthesis systems for two programming domains.
- Score: 15.427687814482724
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-modal program synthesis refers to the task of synthesizing programs
(code) from their specification given in different forms, such as a combination
of natural language and examples. Examples provide a precise but incomplete
specification, and natural language provides an ambiguous but more "complete"
task description. Machine-learned pre-trained models (PTMs) are adept at
handling ambiguous natural language, but struggle with generating syntactically
and semantically precise code. Program synthesis techniques can generate
correct code, often even from incomplete but precise specifications, such as
examples, but they are unable to work with the ambiguity of natural language.
We present an approach that combines PTMs with component-based synthesis (CBS):
PTMs are used to generate candidate programs from the natural language
description of the task, which are then used to guide the CBS procedure to find
the program that matches the precise example-based specification. We use our
combination approach to instantiate multi-modal synthesis systems for two
programming domains: the domain of regular expressions and the domain of CSS
selectors. Our evaluation demonstrates the effectiveness of our domain-agnostic
approach in comparison to a state-of-the-art specialized system, and the
generality of our approach in providing multi-modal program synthesis from
natural language and examples in different programming domains.
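The combination is easy to sketch. Below is a minimal, self-contained Python illustration of the idea for the regular-expression domain: a pre-trained model (stubbed out here with hard-coded guesses) proposes candidate regexes from the natural language description, the candidates are broken into components, and a small component-based enumeration searches for a program consistent with the examples. All function names and candidates are illustrative assumptions, not the paper's actual implementation.
```python
# Minimal sketch of PTM + component-based synthesis (CBS) for regexes.
# The PTM call is stubbed with fixed candidates; names are illustrative.
import itertools
import re

def ptm_candidates(nl_description):
    """Stand-in for a pre-trained model: plausible (possibly wrong) regex
    guesses for the description 'lines that start with a digit'."""
    return [r"^\d", r"^[0-9]+.*", r"\d.*"]

def extract_components(candidates):
    """Break PTM guesses into syntactically valid fragments that seed CBS."""
    comps = set(candidates)
    for c in candidates:
        for i in range(len(c)):
            for j in range(i + 1, len(c) + 1):
                try:
                    re.compile(c[i:j])
                    comps.add(c[i:j])
                except re.error:
                    pass  # skip fragments that are not valid regexes
    return comps

def satisfies(prog, positives, negatives):
    """The precise example-based check: match all positives, no negatives."""
    try:
        rx = re.compile(prog)
    except re.error:
        return False
    return (all(rx.search(p) for p in positives)
            and not any(rx.search(n) for n in negatives))

def synthesize(nl, positives, negatives, max_parts=2):
    comps = extract_components(ptm_candidates(nl))
    # CBS-style enumeration: concatenate components, smallest programs first,
    # and return the first one consistent with every example.
    for k in range(1, max_parts + 1):
        for parts in itertools.product(comps, repeat=k):
            prog = "".join(parts)
            if satisfies(prog, positives, negatives):
                return prog
    return None

print(synthesize("lines that start with a digit",
                 positives=["1abc", "42"], negatives=["abc1", "x42"]))
# prints a regex consistent with the examples, e.g. '^\d'
```
The sketch preserves the division of labor the abstract describes: the PTM absorbs the ambiguity of the natural language, while the example check keeps the final program precise.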
Related papers
- Large Language Models are Interpretable Learners [53.56735770834617]
In this paper, we show a combination of Large Language Models (LLMs) and symbolic programs can bridge the gap between expressiveness and interpretability.
The pretrained LLM with natural language prompts provides a massive set of interpretable modules that can transform raw input into natural language concepts.
Because the knowledge learned by the resulting LLM-based Symbolic Programs (LSPs) is a combination of natural language descriptions and symbolic rules, it is easily transferable to humans (interpretable) and to other LLMs.
arXiv Detail & Related papers (2024-06-25T02:18:15Z)
- Programming by Example and Text-to-Code Translation for Conversational Code Generation [1.8447697408534178]
We propose a method for integrating Programming by Example and text-to-code systems.
The resulting system, MPaTHS, offers an accessible natural language interface for synthesizing general programs.
We present a program representation that allows our method to be applied to the problem of task-oriented dialogue.
arXiv Detail & Related papers (2022-11-21T15:20:45Z)
- Benchmarking Language Models for Code Syntax Understanding [79.11525961219591]
Pre-trained language models have demonstrated impressive performance in both natural language processing and program understanding.
In this work, we perform the first thorough benchmarking of the state-of-the-art pre-trained models for identifying the syntactic structures of programs.
Our findings point out key limitations of existing pre-training methods for programming languages, and suggest the importance of modeling code syntactic structures.
arXiv Detail & Related papers (2022-10-26T04:47:18Z)
- A Conversational Paradigm for Program Synthesis [110.94409515865867]
We propose a conversational program synthesis approach via large language models.
We train a family of large language models, called CodeGen, on natural language and programming language data.
Our findings show the emergence of conversational capabilities and the effectiveness of the proposed conversational program synthesis paradigm.
arXiv Detail & Related papers (2022-03-25T06:55:15Z)
- Synchromesh: Reliable code generation from pre-trained language models [38.15391794443022]
We propose Synchromesh: a framework for substantially improving the reliability of pre-trained models for code generation.
First, it retrieves few-shot examples from a training bank using Target Similarity Tuning (TST), a novel method for semantic example selection.
Then, Synchromesh feeds the examples to a pre-trained language model and samples programs using Constrained Semantic Decoding (CSD), a general framework for constraining the output to a set of valid programs in the target language.
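The constrained-decoding step can be sketched as follows: at each position, the model's token scores are filtered so that only tokens extending to a syntactically valid program survive. The toy model, token set, and validity check below are illustrative stand-ins, not Synchromesh's actual CSD implementation.
```python
# Toy sketch of grammar-constrained decoding for a regex DSL.
import re

# Each bracket class is a single token, so prefixes never contain
# unclosed brackets.
TOKENS = ["+", "*", "\\d", "[0-9]", "^", "$", "a", "b"]

def is_valid_prefix(prefix: str) -> bool:
    """Crude completion test: keep a prefix if it (or prefix + 'a')
    compiles as a regex."""
    for completion in ("", "a"):
        try:
            re.compile(prefix + completion)
            return True
        except re.error:
            pass
    return False

def toy_model_scores(prefix: str) -> dict:
    """Stand-in for a pre-trained LM. It knows nothing about syntax and
    always prefers '+' -- an invalid start -- so the mask visibly kicks in."""
    return {tok: len(TOKENS) - i for i, tok in enumerate(TOKENS)}

def constrained_decode(max_len: int = 4) -> str:
    prefix = ""
    for _ in range(max_len):
        scores = toy_model_scores(prefix)
        # CSD-style step: discard tokens that cannot be completed into a
        # valid program, then greedily take the best surviving token.
        valid = {t: s for t, s in scores.items() if is_valid_prefix(prefix + t)}
        if not valid:
            break
        prefix += max(valid, key=valid.get)
    return prefix

decoded = constrained_decode()
print(decoded, "compiles:", bool(re.compile(decoded)))  # always compiles
```
The real CSD tracks a full grammar (and semantic constraints) incrementally; the crude completion test here only hints at the mechanism.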
arXiv Detail & Related papers (2022-01-26T22:57:44Z)
- Latent Execution for Neural Program Synthesis Beyond Domain-Specific Languages [97.58968222942173]
We take a first step toward synthesizing C programs from input-output examples.
In particular, we propose LaSynth, which learns a latent representation that approximates the execution of partially generated programs.
We show that training on these synthesized programs further improves the prediction performance for both Karel and C program synthesis.
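The latent-execution idea can be sketched as a small learned module that advances an approximate "execution state" as each program token is emitted; a decoder would then condition on that state to pick the next token. The PyTorch module below is an assumed illustration of shapes and architecture, not LaSynth's actual model.
```python
# Hedged sketch of a learned latent executor (assumed architecture).
import torch
import torch.nn as nn

class LatentExecutor(nn.Module):
    """Maps (latent execution state, next program token) -> next latent state."""
    def __init__(self, vocab_size: int, hidden: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.step = nn.GRUCell(hidden, hidden)

    def forward(self, token_ids: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        # Roll the latent state forward over a partial program, one token at
        # a time, standing in for actually executing the incomplete program.
        for tok in token_ids:
            state = self.step(self.embed(tok).unsqueeze(0), state)
        return state

executor = LatentExecutor(vocab_size=100)
state = torch.zeros(1, 64)                   # initial latent state
partial_program = torch.tensor([3, 17, 42])  # token ids of a partial program
state = executor(partial_program, state)
print(state.shape)  # torch.Size([1, 64])
```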
arXiv Detail & Related papers (2021-06-29T02:21:32Z)
- Leveraging Language to Learn Program Abstractions and Search Heuristics [66.28391181268645]
We introduce LAPS (Language for Abstraction and Program Search), a technique for using natural language annotations to guide joint learning of libraries and neurally-guided search models for synthesis.
When integrated into a state-of-the-art library learning system (DreamCoder), LAPS produces higher-quality libraries and improves search efficiency and generalization.
arXiv Detail & Related papers (2021-06-18T15:08:47Z)
- Representing Partial Programs with Blended Abstract Semantics [62.20775388513027]
We introduce a technique for representing partially written programs in a program synthesis engine.
We learn an approximate execution model implemented as a modular neural network.
We show that these hybrid neuro-symbolic representations enable execution-guided synthesizers to use more powerful language constructs.
arXiv Detail & Related papers (2020-12-23T20:40:18Z)
- Optimal Neural Program Synthesis from Multimodal Specifications [45.35689345004124]
Multimodal program synthesis is an attractive way to scale program synthesis to challenging settings.
This paper proposes an optimal neural synthesis approach where the goal is to find a program that satisfies user-provided constraints.
arXiv Detail & Related papers (2020-10-04T20:51:21Z)
- Program Synthesis with Pragmatic Communication [28.24612900419843]
This work introduces a new inductive bias derived by modeling the program synthesis task as rational communication.
A user study finds that end-user participants communicate more effectively with the pragmatic program synthesizer over a non-pragmatic one.
arXiv Detail & Related papers (2020-07-09T20:55:44Z)