Energy-Based Models for Code Generation under Compilability Constraints
- URL: http://arxiv.org/abs/2106.04985v1
- Date: Wed, 9 Jun 2021 11:06:32 GMT
- Title: Energy-Based Models for Code Generation under Compilability Constraints
- Authors: Tomasz Korbak, Hady Elsahar, Marc Dymetman, and Germán Kruszewski
- Abstract summary: In this work, we pose the problem of learning to generate compilable code as constraint satisfaction.
We define an Energy-Based Model (EBM) representing a pre-trained generative model with an imposed constraint of generating only compilable sequences.
We then use the KL-Adaptive Distributional Policy Gradient algorithm to train a generative model approximating the EBM.
- Score: 2.9176992922046923
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural language models can be successfully trained on source code, leading to
applications such as code completion. However, their versatile autoregressive
self-supervision objective overlooks important global sequence-level features
that are present in the data such as syntactic correctness or compilability. In
this work, we pose the problem of learning to generate compilable code as
constraint satisfaction. We define an Energy-Based Model (EBM) representing a
pre-trained generative model with an imposed constraint of generating only
compilable sequences. We then use the KL-Adaptive Distributional Policy
Gradient algorithm (Khalifa et al., 2021) to train a generative model
approximating the EBM. We conduct experiments showing that our proposed
approach is able to improve compilability rates without sacrificing diversity
and complexity of the generated samples.
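The construction lends itself to a compact sketch. The snippet below is an illustration under stated assumptions rather than the authors' implementation: the EBM is the unnormalized distribution P(x) = a(x) * b(x), with a(x) the frozen pretrained model's sequence probability and b(x) a binary compilability filter (here Python's built-in compile()), and the generator is approximated by a toy categorical policy updated with importance-weighted (DPG-style) gradients. The candidate pool, toy pretrained log-probability, and training constants are hypothetical stand-ins.

```python
import math
import random

# Toy candidate pool standing in for the generator's sample space (illustrative assumption).
CANDIDATES = [
    "def add(a, b):\n    return a + b\n",   # compiles
    "def add(a, b)\n    return a + b\n",    # missing colon: does not compile
    "for i in range(3):\n    print(i)\n",   # compiles
    "while True print('x')\n",              # does not compile
]

def compiles(src: str) -> bool:
    """Binary constraint b(x): 1 iff the sample parses as Python."""
    try:
        compile(src, "<generated>", "exec")
        return True
    except SyntaxError:
        return False

def pretrained_logprob(src: str) -> float:
    """Stand-in for log a(x), the frozen pretrained model's sequence log-probability."""
    return -0.05 * len(src)  # toy assumption: longer sequences are less likely

def ebm_score(src: str) -> float:
    """Unnormalized EBM: P(x) = a(x) * b(x)."""
    return math.exp(pretrained_logprob(src)) * (1.0 if compiles(src) else 0.0)

class ToyPolicy:
    """Categorical policy over CANDIDATES; a stand-in for the fine-tuned autoregressive LM."""
    def __init__(self):
        self.logits = [0.0] * len(CANDIDATES)

    def probs(self):
        z = [math.exp(l) for l in self.logits]
        s = sum(z)
        return [v / s for v in z]

def dpg_step(policy: ToyPolicy, batch_size: int = 64, lr: float = 0.1) -> None:
    """One DPG-style update: sample x ~ q (the current policy), weight each sample by
    P(x)/q(x), and follow the importance-weighted gradient of log q(x)."""
    probs = policy.probs()
    grad = [0.0] * len(CANDIDATES)
    for _ in range(batch_size):
        i = random.choices(range(len(CANDIDATES)), weights=probs)[0]
        w = ebm_score(CANDIDATES[i]) / probs[i]      # importance weight P(x)/q(x)
        for j in range(len(CANDIDATES)):             # d log q(i) / d logit_j = 1[i == j] - q(j)
            grad[j] += w * ((1.0 if j == i else 0.0) - probs[j]) / batch_size
    for j in range(len(CANDIDATES)):
        policy.logits[j] += lr * grad[j]

policy = ToyPolicy()
for _ in range(500):
    dpg_step(policy)
print([round(p, 3) for p in policy.probs()])  # mass should concentrate on the compilable candidates
```

The "KL-adaptive" part of the full algorithm, which updates the proposal to the current policy only when it moves closer (in KL) to the target distribution, is omitted from this sketch.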
Related papers
- LLM as a code generator in Agile Model Driven Development [1.12646803578849]
This research champions Model Driven Development (MDD) as a viable strategy to overcome these challenges.
We propose an Agile Model Driven Development (AMDD) approach that employs GPT-4 as a code generator.
Applying GPT-4's auto-generation capabilities yields Java and Python code compatible with the JADE and PADE frameworks.
arXiv Detail & Related papers (2024-10-24T07:24:11Z)
- COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement [80.18490952057125]
Iterative refinement has emerged as an effective paradigm for enhancing the capabilities of large language models (LLMs) on complex tasks.
We propose Context-Wise Order-Agnostic Language Modeling (COrAL) to overcome these challenges.
Our approach models multiple token dependencies within manageable context windows, enabling the model to perform iterative refinement internally.
arXiv Detail & Related papers (2024-10-12T23:56:19Z)
- Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines [74.42485647685272]
We focus on Generative Masked Language Models (GMLMs).
We train a model to fit conditional probabilities of the data distribution via masking, which are subsequently used as inputs to a Markov chain to draw samples from the model.
We adapt the T5 model for iteratively refined parallel decoding, achieving a 2-3x speedup in machine translation with minimal loss in quality.
arXiv Detail & Related papers (2024-07-22T18:00:00Z)
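The masking-plus-Markov-chain recipe described in that entry can be illustrated with a small Gibbs-style sampler. This is a toy sketch, not the paper's T5 setup: masked_conditional is a hypothetical stand-in for a trained masked language model, and the chain repeatedly masks one position and resamples it from that conditional.

```python
import random

VOCAB = ["a", "b", "c"]

def masked_conditional(tokens, pos):
    """Hypothetical stand-in for a trained masked LM: p(token at pos | all other tokens).
    Toy rule: prefer copying the left neighbour, otherwise uniform."""
    left = tokens[pos - 1] if pos > 0 else None
    if left is None:
        return [1.0 / len(VOCAB)] * len(VOCAB)
    return [0.6 if v == left else 0.4 / (len(VOCAB) - 1) for v in VOCAB]

def gibbs_sample(length: int = 8, steps: int = 200) -> str:
    """Markov chain over sequences: mask one position at a time and resample it
    from the masked conditional, as in Gibbs sampling."""
    tokens = [random.choice(VOCAB) for _ in range(length)]
    for _ in range(steps):
        pos = random.randrange(length)
        probs = masked_conditional(tokens, pos)
        tokens[pos] = random.choices(VOCAB, weights=probs)[0]
    return "".join(tokens)

print(gibbs_sample())  # runs of repeated symbols emerge from the toy conditional
```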
- Instructed Language Models with Retrievers Are Powerful Entity Linkers [87.16283281290053]
Instructed Generative Entity Linker (INSGENEL) is the first approach that enables causal language models to perform entity linking over knowledge bases.
INSGENEL outperforms previous generative alternatives with +6.8 F1 points gain on average.
arXiv Detail & Related papers (2023-11-06T16:38:51Z)
- A Hybrid of Generative and Discriminative Models Based on the Gaussian-coupled Softmax Layer [5.33024001730262]
We propose a method to train a hybrid of discriminative and generative models in a single neural network.
We demonstrate that the proposed hybrid model can be applied to semi-supervised learning and confidence calibration.
arXiv Detail & Related papers (2023-05-10T05:48:22Z)
- Distributional Learning of Variational AutoEncoder: Application to Synthetic Data Generation [0.7614628596146602]
We propose a new approach that expands the model capacity without sacrificing the computational advantages of the VAE framework.
Our VAE model's decoder is composed of an infinite mixture of asymmetric Laplace distributions.
We apply the proposed model to synthetic data generation; in particular, it demonstrates an advantage in easily adjusting the level of data privacy.
arXiv Detail & Related papers (2023-02-22T11:26:50Z)
- Is Conditional Generative Modeling all you need for Decision-Making? [19.39663779573325]
We show that conditional generative modeling is a powerful tool for decision-making.
arXiv Detail & Related papers (2022-11-28T18:59:02Z)
- Controllable and Compositional Generation with Latent-Space Energy-Based Models [60.87740144816278]
Controllable generation is one of the key requirements for successful adoption of deep generative models in real-world applications.
In this work, we use energy-based models (EBMs) to handle compositional generation over a set of attributes.
By composing energy functions with logical operators, this work is the first to achieve such compositionality in generating photo-realistic images of resolution 1024x1024.
arXiv Detail & Related papers (2021-10-21T03:31:45Z)
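Composing energy functions with logical operators, as in that entry, can be sketched generically (this is not the paper's latent-space sampler; the attribute energies and the 2-D latent below are hypothetical): conjunction of attributes corresponds to summing their energies, and a latent satisfying the composed constraint can be drawn with Langevin-style updates before being decoded by a pretrained generator.

```python
import torch

def energy_attr_a(z: torch.Tensor) -> torch.Tensor:
    """Hypothetical attribute energy: low when z[0] is near +1."""
    return (z[0] - 1.0) ** 2

def energy_attr_b(z: torch.Tensor) -> torch.Tensor:
    """Hypothetical attribute energy: low when z[1] is near -1."""
    return (z[1] + 1.0) ** 2

def energy_a_and_b(z: torch.Tensor) -> torch.Tensor:
    """Logical AND of two attributes = sum of their energies."""
    return energy_attr_a(z) + energy_attr_b(z)

def langevin_sample(energy, dim: int = 2, steps: int = 200,
                    step_size: float = 0.05, noise_scale: float = 0.05) -> torch.Tensor:
    """Draw a latent from p(z) proportional to exp(-E(z)) with (unadjusted) Langevin-style updates."""
    z = torch.randn(dim, requires_grad=True)
    for _ in range(steps):
        (grad,) = torch.autograd.grad(energy(z), z)
        with torch.no_grad():
            z -= step_size * grad
            z += noise_scale * torch.randn_like(z)
    return z.detach()

z = langevin_sample(energy_a_and_b)
print(z)  # close to (+1, -1); a pretrained generator would decode this latent into an image
```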
- Continual Learning with Fully Probabilistic Models [70.3497683558609]
We present an approach to continual learning based on fully probabilistic (or generative) machine learning models.
We propose a pseudo-rehearsal approach using a Gaussian Mixture Model (GMM) instance for both generator and classifier functionalities.
We show that GMR achieves state-of-the-art performance on common class-incremental learning problems at very competitive time and memory complexity.
arXiv Detail & Related papers (2021-04-19T12:26:26Z)
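Pseudo-rehearsal with a Gaussian Mixture Model, as used in that entry for both generation and classification, can be sketched roughly as follows (a minimal sketch using scikit-learn with made-up toy data; the classifier side, which the paper derives from the same mixture, is omitted): fit a GMM on old-task data, sample synthetic "old" data from it when a new task arrives, and refit on the union.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Hypothetical class-incremental toy data: task 1 and task 2 occupy different regions.
task1_x = rng.normal(loc=-2.0, scale=0.5, size=(500, 2))
task2_x = rng.normal(loc=+2.0, scale=0.5, size=(500, 2))

# Fit a GMM on task 1; it doubles as the generator for pseudo-rehearsal.
gmm_old = GaussianMixture(n_components=2, random_state=0).fit(task1_x)

# When task 2 arrives, sample synthetic task-1 data instead of storing the real data ...
pseudo_old_x, _ = gmm_old.sample(500)

# ... and refit a mixture on generated old data plus real new data.
gmm_new = GaussianMixture(n_components=4, random_state=0).fit(
    np.vstack([pseudo_old_x, task2_x])
)
print(gmm_new.means_.round(2))  # components cover both the old and the new region
```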
- Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.