PAC Prediction Sets for Large Language Models of Code
- URL: http://arxiv.org/abs/2302.08703v2
- Date: Wed, 21 Jun 2023 02:35:39 GMT
- Title: PAC Prediction Sets for Large Language Models of Code
- Authors: Adam Khakhar, Stephen Mell, Osbert Bastani
- Abstract summary: We propose a solution that considers a restricted set of prediction sets that can be compactly represented as partial programs.
This is the first research contribution that generates PAC prediction sets for generative code models.
- Score: 19.071829387911276
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prediction sets have recently been shown to be a promising strategy for
quantifying the uncertainty of deep neural networks in a way that provides
theoretical guarantees. However, existing techniques have largely targeted
settings where the space of labels is simple, so prediction sets can be
arbitrary subsets of labels. For structured prediction problems where the space
of labels is exponential in size, even prediction sets containing a small
fraction of all labels can be exponentially large. In the context of code
generation, we propose a solution that considers a restricted set of prediction
sets that can be compactly represented as partial programs, which are programs
with portions replaced with holes. Given a trained code generation model, our
algorithm leverages a programming language's abstract syntax tree to generate a
set of programs such that the correct program is in the set with high
confidence. A valuable application of our algorithm is a Codex-style code
generator that leaves holes in uncertain parts of the generated code, yielding
a partial program with theoretical guarantees. We evaluate our
approach on PICARD (a T5 model for SQL semantic parsing) and Codex (a GPT model
for over a dozen programming languages, including Python), demonstrating that
our approach generates compact PAC prediction sets. This is the first research
contribution that generates PAC prediction sets for generative code models.
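To make this concrete, the construction has two ingredients: a threshold calibrated on held-out data via a binomial tail bound (the PAC part), and an AST pass that abstracts low-confidence subtrees into holes (the partial-program part). The sketch below is a minimal illustration under simplifying assumptions, not the paper's exact algorithm; in particular, the hole marker `__HOLE__` and the per-node confidence function `node_conf` are hypothetical stand-ins.

```python
import ast
import math

def pac_threshold(err_scores, epsilon, delta):
    """Calibrate tau so that abstracting every node with confidence <= tau
    keeps the true program in the prediction set w.p. >= 1 - epsilon, with
    probability >= 1 - delta over the calibration draw.

    err_scores[i] is the highest confidence the model assigned to an
    *incorrect* subtree on held-out example i (-inf if fully correct).
    Standard binomial-tail construction; the paper's procedure differs
    in its details."""
    n = len(err_scores)
    # Largest number of tolerated calibration misses k* such that the
    # Binomial(n, epsilon) CDF at k* is still <= delta.
    cdf, k_star = 0.0, -1
    for k in range(n + 1):
        cdf += math.comb(n, k) * epsilon**k * (1 - epsilon) ** (n - k)
        if cdf > delta:
            break
        k_star = k
    if k_star < 0:
        return math.inf  # too few calibration examples: abstract everything
    # All but at most k* calibration error scores fall at or below tau.
    return sorted(err_scores)[n - 1 - k_star]

def to_partial_program(code, node_conf, tau):
    """Replace expression subtrees whose confidence is <= tau with a hole."""
    class Abstractor(ast.NodeTransformer):
        def visit(self, node):
            if isinstance(node, ast.expr) and node_conf(node) <= tau:
                hole = ast.Name(id="__HOLE__", ctx=ast.Load())
                return ast.copy_location(hole, node)
            return self.generic_visit(node)
    return ast.unparse(Abstractor().visit(ast.parse(code)))
```

The prediction set represented by the resulting partial program is the set of all programs obtainable by filling the holes; the paper's algorithm additionally chooses which subtrees to abstract so that the set is as compact as possible.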
Related papers
- Conformal Structured Prediction [32.23920437534215]
We propose a general framework for conformal prediction in the structured prediction setting.
We show how our algorithm can be used to construct prediction sets that satisfy a desired coverage guarantee in several domains (the generic split-conformal quantile construction is sketched after this entry).
arXiv Detail & Related papers (2024-10-08T18:56:15Z) - Divide-and-Conquer Predictive Coding: a structured Bayesian inference algorithm [11.722226132995978]
- Divide-and-Conquer Predictive Coding: a structured Bayesian inference algorithm [11.722226132995978]
We introduce a novel predictive coding algorithm for structured generative models, which we call divide-and-conquer predictive coding (DCPC).
DCPC performs maximum-likelihood updates of model parameters without sacrificing biological plausibility.
Empirically, DCPC achieves better numerical performance than competing algorithms and provides accurate inference in a number of problems not previously addressed with predictive coding.
arXiv Detail & Related papers (2024-08-11T17:29:03Z) - Uncertainty Quantification for Neurosymbolic Programs via Compositional Conformal Prediction [36.88661670156255]
Conformal prediction has emerged as a promising strategy for quantifying uncertainty in machine learning.
We propose a novel framework for adapting conformal prediction to neurosymbolic programs.
We evaluate our approach on programs that take MNIST and MS-COCO images as input.
arXiv Detail & Related papers (2024-05-24T20:15:53Z) - CodeIP: A Grammar-Guided Multi-Bit Watermark for Large Language Models of Code [56.019447113206006]
Large Language Models (LLMs) have achieved remarkable progress in code generation.
CodeIP is a novel multi-bit watermarking technique that embeds additional information to preserve provenance details (a simpler single-bit logit-bias scheme is sketched after this entry for intuition).
Experiments conducted on a real-world dataset across five programming languages demonstrate the effectiveness of CodeIP.
arXiv Detail & Related papers (2024-04-24T04:25:04Z) - Multi-Candidate Speculative Decoding [82.05519287513444]
- Multi-Candidate Speculative Decoding [82.05519287513444]
Large language models have shown impressive capabilities across a variety of NLP tasks, yet generating text autoregressively is time-consuming.
One way to speed them up is speculative decoding, which uses a fast draft model to generate candidate segments that are then verified in parallel by the target model.
This paper proposes sampling multiple candidates from a draft model and then organising them in batches for verification.
We design algorithms for efficient multi-candidate verification while maintaining the distribution of the target model (a simplified single-candidate greedy variant is sketched after this entry).
arXiv Detail & Related papers (2024-01-12T17:15:23Z) - PAC Prediction Sets Under Label Shift [52.30074177997787]
- PAC Prediction Sets Under Label Shift [52.30074177997787]
Prediction sets capture uncertainty by predicting sets of labels rather than individual labels.
We propose a novel algorithm for constructing prediction sets with PAC guarantees in the label shift setting.
We evaluate our approach on five datasets.
arXiv Detail & Related papers (2023-10-19T17:57:57Z) - Fault-Aware Neural Code Rankers [64.41888054066861]
We propose fault-aware neural code rankers that can predict the correctness of a sampled program without executing it.
Our fault-aware rankers can significantly increase the pass@1 accuracy of various code generation models (the reranking loop is sketched after this entry).
arXiv Detail & Related papers (2022-06-04T22:01:05Z) - Representing Partial Programs with Blended Abstract Semantics [62.20775388513027]
- Representing Partial Programs with Blended Abstract Semantics [62.20775388513027]
We introduce a technique for representing partially written programs in a program synthesis engine.
We learn an approximate execution model implemented as a modular neural network.
We show that these hybrid neuro-symbolic representations enable execution-guided synthesizers to use more powerful language constructs.
arXiv Detail & Related papers (2020-12-23T20:40:18Z) - Latent Programmer: Discrete Latent Codes for Program Synthesis [56.37993487589351]
In many sequence learning tasks, such as program synthesis and document summarization, a key problem is searching over a large space of possible output sequences.
We propose to learn representations of the outputs that are specifically meant for search: rich enough to specify the desired output but compact enough to make search more efficient.
We introduce the Latent Programmer, a program synthesis method that first predicts a discrete latent code from input/output examples, and then generates the program in the target language (the two-stage decoding is sketched after this entry).
arXiv Detail & Related papers (2020-12-01T10:11:35Z)