Type-Constrained Code Generation with Language Models
- URL: http://arxiv.org/abs/2504.09246v1
- Date: Sat, 12 Apr 2025 15:03:00 GMT
- Title: Type-Constrained Code Generation with Language Models
- Authors: Niels Mündler, Jingxuan He, Hao Wang, Koushik Sen, Dawn Song, Martin Vechev
- Abstract summary: Large language models (LLMs) produce uncompilable output because their next-token inference procedure does not model formal aspects of code. We introduce a type-constrained decoding approach that leverages type systems to guide code generation. Our approach reduces compilation errors by more than half and increases functional correctness in code synthesis, translation, and repair tasks.
- Score: 51.03439021895432
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have achieved notable success in code generation. However, they still frequently produce uncompilable output because their next-token inference procedure does not model formal aspects of code. Although constrained decoding is a promising approach to alleviate this issue, it has only been applied to handle either domain-specific languages or syntactic language features. This leaves typing errors, which are beyond the domain of syntax and generally hard to adequately constrain. To address this challenge, we introduce a type-constrained decoding approach that leverages type systems to guide code generation. We develop novel prefix automata for this purpose and introduce a sound approach to enforce well-typedness based on type inference and a search over inhabitable types. We formalize our approach on a simply-typed language and extend it to TypeScript to demonstrate practicality. Our evaluation on HumanEval shows that our approach reduces compilation errors by more than half and increases functional correctness in code synthesis, translation, and repair tasks across LLMs of various sizes and model families, including SOTA open-weight models with more than 30B parameters.
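The core idea lends itself to a short illustration. Below is a minimal Python sketch of type-constrained decoding, assuming a HuggingFace-style model/tokenizer interface; `is_typable_prefix` is a hypothetical stand-in for the paper's prefix automata and search over inhabitable types, not the actual implementation.

```python
import torch

def is_typable_prefix(source: str) -> bool:
    # Stand-in for the paper's check: prefix automata over the type system
    # plus a search over inhabitable types. Trivially True in this sketch.
    return True

def constrained_next_token(model, tokenizer, prefix_ids: list[int]) -> int:
    """Greedily pick the highest-scoring token that keeps the prefix typable."""
    logits = model(torch.tensor([prefix_ids])).logits[0, -1]
    for tok in torch.argsort(logits, descending=True).tolist():
        if is_typable_prefix(tokenizer.decode(prefix_ids + [tok])):
            return tok  # first candidate that preserves well-typedness wins
    raise RuntimeError("no typable continuation found")
```

Greedy selection keeps the sketch short; the same per-token filter composes with any sampling strategy.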
Related papers
- Automata-based constraints for language model decoding [9.137697105669142]
Language models (LMs) are often expected to generate strings in some formal language.
Fine-tuning requires significant resources, making it impractical for uncommon or task-specific formats.
We solve these issues through the application of automata theory.
Our system compiles constraints 7,000x faster, is provably correct, and can be extended in a modular fashion.
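A rough sketch of the core step, under the assumption that constraints are given as a character-level DFA: a token is permitted from a DFA state iff consuming its characters never leaves the automaton. The encoding below is illustrative, not the system's actual data structures.

```python
def allowed_tokens(dfa_next: dict, state, vocab: dict) -> list[int]:
    """A token is allowed iff running its characters from `state` never
    leaves the DFA. `dfa_next` maps (state, char) -> state; `vocab` maps
    token id -> token text."""
    allowed = []
    for tok_id, text in vocab.items():
        s = state
        for ch in text:
            s = dfa_next.get((s, ch))
            if s is None:
                break
        else:  # every character had a valid transition
            allowed.append(tok_id)
    return allowed
```

Precomputing this set per state is what lifts a character-level automaton into a token-level decoding mask.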
arXiv Detail & Related papers (2024-07-11T00:25:01Z)
- ConCodeEval: Evaluating Large Language Models for Code Constraints in Domain-Specific Languages [35.170835339618414]
Large Language Models (LLMs) struggle to understand natural language constraints for various text generation tasks.
Code languages that perform excellently for normal code tasks do not perform well when the same languages represent fine-grained constraints.
We introduce ConCodeEval, a first-of-its-kind benchmark having two novel tasks for code constraints across five representations.
arXiv Detail & Related papers (2024-07-03T08:36:13Z)
- Supporting Cross-language Cross-project Bug Localization Using Pre-trained Language Models [2.5121668584771837]
Existing techniques often struggle with generalizability and deployment due to their reliance on application-specific data.
This paper proposes a novel pre-trained language model (PLM) based technique for bug localization that transcends project and language boundaries.
arXiv Detail & Related papers (2024-07-03T01:09:36Z)
- Decoding at the Speed of Thought: Harnessing Parallel Decoding of Lexical Units for LLMs [57.27982780697922]
Large language models have demonstrated exceptional capability in natural language understanding and generation.
However, their generation speed is limited by the inherently sequential nature of their decoding process.
This paper introduces Lexical Unit Decoding, a novel decoding methodology implemented in a data-driven manner.
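A hedged sketch of the general idea in Python: accept a multi-token unit in one step when the model is confident about each drafted token. This is a confidence-gated approximation for illustration, not the paper's data-driven training procedure; `threshold` is an assumed hyperparameter.

```python
import torch

def accept_lexical_unit(model, prefix_ids: list[int], draft_ids: list[int],
                        threshold: float = 0.9) -> list[int]:
    """Accept the longest prefix of `draft_ids` whose tokens all clear the
    model's confidence threshold; always accept at least one token."""
    ids = torch.tensor([prefix_ids + draft_ids])
    probs = model(ids).logits.softmax(dim=-1)[0]
    accepted = [draft_ids[0]]  # guarantee progress
    for i in range(1, len(draft_ids)):
        pos = len(prefix_ids) + i - 1  # logits at pos predict the token at pos + 1
        if probs[pos, draft_ids[i]] >= threshold:
            accepted.append(draft_ids[i])
        else:
            break
    return accepted
```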
arXiv Detail & Related papers (2024-05-24T04:35:13Z)
- Understanding How CodeLLMs (Mis)Predict Types with Activation Steering [1.7252995245478464]
We investigate what happens when a model mispredicts a type.
We show that by applying semantics-preserving edits to code, CodeLLMs are eventually misled into mispredicting type annotations.
We show that steering achieves comparable performance to fine-tuning directly on the type prediction task.
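As an illustration, a minimal PyTorch sketch of activation steering: add a precomputed steering vector to one layer's output during the forward pass. The layer choice, vector, and scale `alpha` are assumptions for demonstration, not the paper's configuration.

```python
import torch

def add_steering_hook(layer: torch.nn.Module, steer: torch.Tensor,
                      alpha: float = 1.0):
    """Shift a layer's hidden states by `alpha * steer` on every forward
    pass. Returns the hook handle; call .remove() on it to stop steering."""
    def hook(module, inputs, output):
        if isinstance(output, tuple):  # transformer blocks often return tuples
            return (output[0] + alpha * steer,) + output[1:]
        return output + alpha * steer
    return layer.register_forward_hook(hook)
```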
arXiv Detail & Related papers (2024-04-02T12:44:44Z)
- Coding by Design: GPT-4 empowers Agile Model Driven Development [0.03683202928838613]
This research offers an Agile Model-Driven Development (MDD) approach that enhances code auto-generation using OpenAI's GPT-4.
Our work emphasizes "Agility" as a significant contribution to the current MDD method, particularly when the model undergoes changes or needs deployment in a different programming language.
Ultimately, leveraging GPT-4, our last layer auto-generates code in both Java and Python.
arXiv Detail & Related papers (2023-10-06T15:05:05Z)
- Generative Type Inference for Python [62.01560866916557]
This paper introduces TypeGen, a few-shot generative type inference approach that incorporates static domain knowledge from static analysis.
TypeGen creates chain-of-thought (COT) prompts by translating the type inference steps of static analysis into prompts based on type dependency graphs (TDGs).
Experiments show that TypeGen outperforms the best baseline Type4Py by 10.0% for argument type prediction and 22.5% in return value type prediction in terms of top-1 Exact Match.
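A toy sketch of assembling such a prompt from TDG edges; the graph encoding and wording below are assumptions, not TypeGen's actual prompt format.

```python
def cot_prompt(target_var: str, tdg_edges: list[tuple[str, str, str]]) -> str:
    """Render TDG edges (source, type, target) as step-by-step reasoning."""
    steps = "\n".join(
        f"- `{src}` has type `{typ}` and flows into `{dst}`."
        for src, typ, dst in tdg_edges
    )
    return (f"Infer the type of `{target_var}` step by step:\n"
            f"{steps}\n"
            f"Therefore, the type of `{target_var}` is:")

# Example: cot_prompt("result", [("x", "int", "result")])
```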
arXiv Detail & Related papers (2023-07-18T11:40:31Z)
- Language Model Pre-Training with Sparse Latent Typing [66.75786739499604]
We propose a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types.
Experimental results show that our model is able to learn interpretable latent type categories in a self-supervised manner without using any external knowledge.
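A loose sketch of what such an objective could look like: each token gets a distribution over latent types, with index 0 read as "untyped", and a penalty encourages most tokens to stay untyped. The loss form is an assumption for illustration, not the paper's exact objective.

```python
import torch

def sparse_typing_penalty(type_logits: torch.Tensor,
                          lam: float = 0.1) -> torch.Tensor:
    """type_logits: (batch, seq, num_types), with type index 0 read as
    'untyped'. Penalizes probability mass placed on non-null types, so
    only a sparse set of keyword tokens receives a latent type."""
    probs = type_logits.softmax(dim=-1)
    return lam * (1.0 - probs[..., 0]).mean()  # added to the usual LM loss
```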
arXiv Detail & Related papers (2022-10-23T00:37:08Z)
- Constrained Language Models Yield Few-Shot Semantic Parsers [73.50960967598654]
We explore the use of large pretrained language models as few-shot semantic parsers.
The goal in semantic parsing is to generate a structured meaning representation given a natural language input.
We use language models to paraphrase inputs into a controlled sublanguage resembling English that can be automatically mapped to a target meaning representation.
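The deterministic second stage can be illustrated with a toy mapping from the controlled sublanguage to a meaning representation; the patterns and output format below are invented for illustration and are not the paper's grammar.

```python
import re

# Toy controlled-sublanguage patterns and their meaning representations.
CANONICAL_PATTERNS = [
    (re.compile(r"list all flights from (\w+) to (\w+)"),
     'flights(src="{0}", dst="{1}")'),
]

def canonical_to_mr(canonical: str) -> str:
    """Deterministically map a canonical paraphrase to its meaning
    representation; the LM's only job is producing `canonical`."""
    for pattern, template in CANONICAL_PATTERNS:
        match = pattern.fullmatch(canonical)
        if match:
            return template.format(*match.groups())
    raise ValueError("utterance is outside the controlled sublanguage")
```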
arXiv Detail & Related papers (2021-04-18T08:13:06Z)
- Limits of Detecting Text Generated by Large-Scale Language Models [65.46403462928319]
Some consider large-scale language models that can generate long and coherent pieces of text as dangerous, since they may be used in misinformation campaigns.
Here we formulate large-scale language model output detection as a hypothesis testing problem to classify text as genuine or generated.
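A minimal sketch of one such test statistic: the average per-token log-likelihood under the candidate generator, thresholded. The threshold `tau` is an assumed parameter; the paper's contribution is analyzing the fundamental limits of such tests, not this specific rule.

```python
def looks_generated(token_logprobs: list[float], tau: float = -2.5) -> bool:
    """Flag text whose average per-token log-likelihood under the candidate
    generator exceeds `tau`: model-generated text tends to score higher
    under the model's own distribution than human-written text does."""
    return sum(token_logprobs) / len(token_logprobs) > tau
```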
arXiv Detail & Related papers (2020-02-09T19:53:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.