PAC Prediction Sets for Large Language Models of Code
- URL: http://arxiv.org/abs/2302.08703v2
- Date: Wed, 21 Jun 2023 02:35:39 GMT
- Title: PAC Prediction Sets for Large Language Models of Code
- Authors: Adam Khakhar, Stephen Mell, Osbert Bastani
- Abstract summary: We propose a solution that considers a restricted set of prediction sets that can be compactly represented as partial programs.
This is the first research contribution that generates PAC prediction sets for generative code models.
- Score: 19.071829387911276
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prediction sets have recently been shown to be a promising strategy for
quantifying the uncertainty of deep neural networks in a way that provides
theoretical guarantees. However, existing techniques have largely targeted
settings where the space of labels is simple, so prediction sets can be
arbitrary subsets of labels. For structured prediction problems where the space
of labels is exponential in size, even prediction sets containing a small
fraction of all labels can be exponentially large. In the context of code
generation, we propose a solution that considers a restricted set of prediction
sets that can be compactly represented as partial programs, which are programs
with portions replaced with holes. Given a trained code generation model, our
algorithm leverages a programming language's abstract syntax tree to generate a
set of programs such that the correct program is in the set with high
confidence. A valuable application of our algorithm is a Codex-style code
generator that leaves holes in uncertain parts of the generated code, yielding
a partial program with theoretical guarantees. We evaluate our
approach on PICARD (a T5 model for SQL semantic parsing) and Codex (a GPT model
for over a dozen programming languages, including Python), demonstrating that
our approach generates compact PAC prediction sets. This is the first research
contribution that generates PAC prediction sets for generative code models.
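To make this concrete, the construction has two ingredients: a threshold calibrated on held-out data via a binomial tail bound (the PAC part), and an AST pass that abstracts low-confidence subtrees into holes (the partial-program part). The sketch below is a minimal illustration under simplifying assumptions, not the paper's exact algorithm; in particular, the hole marker `__HOLE__` and the per-node confidence function `node_conf` are hypothetical stand-ins.

```python
import ast
import math

def pac_threshold(err_scores, epsilon, delta):
    """Calibrate tau so that abstracting every node with confidence <= tau
    keeps the true program in the prediction set w.p. >= 1 - epsilon, with
    probability >= 1 - delta over the calibration draw.

    err_scores[i] is the highest confidence the model assigned to an
    *incorrect* subtree on held-out example i (-inf if fully correct).
    Standard binomial-tail construction; the paper's procedure differs
    in its details."""
    n = len(err_scores)
    # Largest number of tolerated calibration misses k* such that the
    # Binomial(n, epsilon) CDF at k* is still <= delta.
    cdf, k_star = 0.0, -1
    for k in range(n + 1):
        cdf += math.comb(n, k) * epsilon**k * (1 - epsilon) ** (n - k)
        if cdf > delta:
            break
        k_star = k
    if k_star < 0:
        return math.inf  # too few calibration examples: abstract everything
    # All but at most k* calibration error scores fall at or below tau.
    return sorted(err_scores)[n - 1 - k_star]

def to_partial_program(code, node_conf, tau):
    """Replace expression subtrees whose confidence is <= tau with a hole."""
    class Abstractor(ast.NodeTransformer):
        def visit(self, node):
            if isinstance(node, ast.expr) and node_conf(node) <= tau:
                hole = ast.Name(id="__HOLE__", ctx=ast.Load())
                return ast.copy_location(hole, node)
            return self.generic_visit(node)
    return ast.unparse(Abstractor().visit(ast.parse(code)))
```

The prediction set represented by the resulting partial program is the set of all programs obtainable by filling the holes; the paper's algorithm additionally chooses which subtrees to abstract so that the set is as compact as possible.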
Related papers
- Conformal Structured Prediction [32.23920437534215]
We propose a general framework for conformal prediction in the structured prediction setting.
We show how our algorithm can be used to construct prediction sets that satisfy a desired coverage guarantee in several domains (the generic split-conformal quantile construction is sketched after this entry).
arXiv Detail & Related papers (2024-10-08T18:56:15Z) - Divide-and-Conquer Predictive Coding: a structured Bayesian inference algorithm [11.722226132995978]
- Divide-and-Conquer Predictive Coding: a structured Bayesian inference algorithm [11.722226132995978]
We introduce a novel predictive coding algorithm for structured generative models, which we call divide-and-conquer predictive coding (DCPC).
DCPC performs maximum-likelihood updates of model parameters without sacrificing biological plausibility.
Empirically, DCPC achieves better numerical performance than competing algorithms and provides accurate inference in a number of problems not previously addressed with predictive coding.
arXiv Detail & Related papers (2024-08-11T17:29:03Z) - Uncertainty Quantification for Neurosymbolic Programs via Compositional Conformal Prediction [36.88661670156255]
Conformal prediction has emerged as a promising strategy for quantifying uncertainty in machine learning.
We propose a novel framework for adapting conformal prediction to neurosymbolic programs.
We evaluate our approach on programs that take MNIST and MS-COCO images as input.
arXiv Detail & Related papers (2024-05-24T20:15:53Z) - CodeIP: A Grammar-Guided Multi-Bit Watermark for Large Language Models of Code [56.019447113206006]
Large Language Models (LLMs) have achieved remarkable progress in code generation.
CodeIP is a novel multi-bit watermarking technique that embeds additional information to preserve provenance details (a simpler single-bit logit-bias scheme is sketched after this entry for intuition).
Experiments conducted on a real-world dataset across five programming languages demonstrate the effectiveness of CodeIP.
arXiv Detail & Related papers (2024-04-24T04:25:04Z) - Multi-Candidate Speculative Decoding [82.05519287513444]
- Multi-Candidate Speculative Decoding [82.05519287513444]
Large language models have shown impressive capabilities across a variety of NLP tasks, yet generating text autoregressively is time-consuming.
One way to speed them up is speculative decoding, which uses a fast draft model to generate candidate segments that are then verified in parallel by the target model.
This paper proposes sampling multiple candidates from a draft model and then organising them in batches for verification.
We design algorithms for efficient multi-candidate verification while maintaining the distribution of the target model (a simplified single-candidate greedy variant is sketched after this entry).
arXiv Detail & Related papers (2024-01-12T17:15:23Z) - PAC Prediction Sets Under Label Shift [52.30074177997787]
- PAC Prediction Sets Under Label Shift [52.30074177997787]
Prediction sets capture uncertainty by predicting sets of labels rather than individual labels.
We propose a novel algorithm for constructing prediction sets with PAC guarantees in the label shift setting.
We evaluate our approach on five datasets.
arXiv Detail & Related papers (2023-10-19T17:57:57Z) - Fault-Aware Neural Code Rankers [64.41888054066861]
We propose fault-aware neural code rankers that can predict the correctness of a sampled program without executing it.
Our fault-aware rankers can significantly increase the pass@1 accuracy of various code generation models (the reranking loop is sketched after this entry).
arXiv Detail & Related papers (2022-06-04T22:01:05Z) - Representing Partial Programs with Blended Abstract Semantics [62.20775388513027]
- Representing Partial Programs with Blended Abstract Semantics [62.20775388513027]
We introduce a technique for representing partially written programs in a program synthesis engine.
We learn an approximate execution model implemented as a modular neural network.
We show that these hybrid neuro-symbolic representations enable execution-guided synthesizers to use more powerful language constructs.
arXiv Detail & Related papers (2020-12-23T20:40:18Z) - Latent Programmer: Discrete Latent Codes for Program Synthesis [56.37993487589351]
In many sequence learning tasks, such as program synthesis and document summarization, a key problem is searching over a large space of possible output sequences.
We propose to learn representations of the outputs that are specifically meant for search: rich enough to specify the desired output but compact enough to make search more efficient.
We introduce the Latent Programmer, a program synthesis method that first predicts a discrete latent code from input/output examples, and then generates the program in the target language (the two-stage decoding is sketched after this entry).
arXiv Detail & Related papers (2020-12-01T10:11:35Z)