Program Structure Aware Precondition Generation
- URL: http://arxiv.org/abs/2310.02154v2
- Date: Fri, 16 Aug 2024 19:22:29 GMT
- Title: Program Structure Aware Precondition Generation
- Authors: Elizabeth Dinella, Shuvendu Lahiri, Mayur Naik
- Abstract summary: We introduce a novel approach for inferring natural preconditions from code.
Our innovation lies in leveraging the structure of a target method as a seed to infer a precondition through program transformations.
We present a dataset of 18k Java (method, precondition) pairs obtained by applying our framework to 87 real-world projects.
- Score: 8.797622429151861
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We introduce a novel approach for inferring natural preconditions from code. Our technique produces preconditions of high quality in terms of both correctness (modulo a test generator) and naturalness. Prior works generate preconditions from scratch through combinations of boolean predicates, but fall short in readability and ease of comprehension. Our innovation lies in, instead, leveraging the structure of a target method as a seed to infer a precondition through program transformations. Our evaluation shows that humans can more easily reason over preconditions inferred using our approach. Lastly, we instantiate our technique into a framework which can be applied at scale. We present a dataset of ~18k Java (method, precondition) pairs obtained by applying our framework to 87 real-world projects. We use this dataset to both evaluate our approach and draw useful insights for future research in precondition inference.
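To make the idea in the abstract concrete, here is a minimal Java sketch of what a precondition seeded from a method's own structure might look like. It is an illustration under assumptions, not the authors' implementation: the class `PreconditionSketch`, the target method `scaledInverse`, and the hand-written `precondition` are all hypothetical, and the transformation (turning exceptional exits into `return false`, normal exits into `return true`, and dropping statements that do not affect whether the call succeeds) is one plausible reading of "program transformations". The small input sweep in `main` stands in for the test generator the abstract mentions.

```java
// Hypothetical illustration; none of these names come from the paper.
final class PreconditionSketch {

    // Target method: throws on a null array, an out-of-range index, or a zero element.
    static int scaledInverse(int[] values, int index) {
        int v = values[index];      // NullPointerException or ArrayIndexOutOfBoundsException
        int result = 1000 / v;      // ArithmeticException when v == 0
        return result * 2;          // does not affect whether the call succeeds
    }

    // Assumed precondition that keeps the target's shape: same parameters, same
    // order of checks, with each exceptional exit rewritten as "return false"
    // and the normal exit rewritten as "return true".
    static boolean precondition(int[] values, int index) {
        if (values == null) return false;
        if (index < 0 || index >= values.length) return false;
        int v = values[index];
        if (v == 0) return false;
        return true;
    }

    // Observed behavior of the target: true when the call completes normally.
    static boolean runsWithoutException(int[] values, int index) {
        try {
            scaledInverse(values, index);
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A tiny hand-picked input sweep standing in for a real test generator.
        int[][] arrays = { null, {}, {0}, {4, 0, -2} };
        for (int[] a : arrays) {
            for (int i = -1; i <= 3; i++) {
                if (precondition(a, i) != runsWithoutException(a, i)) {
                    throw new AssertionError("mismatch at index " + i);
                }
            }
        }
        System.out.println("precondition agrees with observed behavior on all sampled inputs");
    }
}
```

Because the check mirrors the control flow of the method it guards, a reader can map each `return false` back to a specific failure point in the target, which is the kind of readability the abstract contrasts with preconditions assembled from scratch out of boolean predicates. Agreement on sampled inputs only shows correctness relative to those inputs, which matches the abstract's "modulo a test generator" qualifier.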
Related papers
- SpecMind: Cognitively Inspired, Interactive Multi-Turn Framework for Postcondition Inference [7.324314351910779]
SpecMind is a novel framework for postcondition generation that treats LLMs as interactive and exploratory reasoners. Our empirical evaluation shows that SpecMind significantly outperforms state-of-the-art approaches in both accuracy and completeness of generated postconditions.
arXiv Detail & Related papers (2026-02-24T07:01:17Z) - Looking beyond the next token [75.00751370502168]
We argue that rearranging and processing the training data sequences can allow models to more accurately imitate the true data-generating process.
Our method naturally enables the generation of long-term goals at no additional cost.
arXiv Detail & Related papers (2025-04-15T16:09:06Z) - Bayesian Test-Time Adaptation for Vision-Language Models [51.93247610195295]
Test-time adaptation with pre-trained vision-language models, such as CLIP, aims to adapt the model to new, potentially out-of-distribution test data.
We propose a novel approach, Bayesian Class Adaptation (BCA), which in addition to continuously updating class embeddings to adapt likelihood, also uses the posterior of incoming samples to continuously update the prior for each class embedding.
arXiv Detail & Related papers (2025-03-12T10:42:11Z) - ClassInvGen: Class Invariant Synthesis using Large Language Models [11.374431160444676]
ClassInvGen is a method for co-generating executable class invariants and test inputs.
We show that ClassInvGen outperforms a pure LLM-based technique to generate specifications (from code).
We also demonstrate its applicability to real-world code by performing a case study on several classes within a widely used and high-integrity C++.
arXiv Detail & Related papers (2025-02-26T08:10:57Z) - Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z) - SAGA: Summarization-Guided Assert Statement Generation [34.51502565985728]
This paper presents a novel summarization-guided approach for automatically generating assert statements.
We leverage a pre-trained language model as the reference architecture and fine-tune it on the task of assert statement generation.
arXiv Detail & Related papers (2023-05-24T07:03:21Z) - Toward Trustworthy Neural Program Synthesis [6.3557174349423455]
We develop an approach to estimate the probability that a program sampled from a large language model is correct.
Given a natural language description of a programming problem, our method samples both candidate programs as well as candidate predicates specifying how the program should behave.
arXiv Detail & Related papers (2022-09-29T20:32:07Z) - HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z) - LOGEN: Few-shot Logical Knowledge-Conditioned Text Generation with Self-training [76.90793623822866]
We propose a unified framework for logical knowledge-conditioned text generation in the few-shot setting.
Our approach leverages self-training and samples pseudo logical forms based on content and structure consistency.
arXiv Detail & Related papers (2021-12-02T16:49:41Z) - A Framework and Benchmarking Study for Counterfactual Generating Methods on Tabular Data [0.0]
Counterfactual explanations are viewed as an effective way to explain machine learning predictions.
There are already dozens of algorithms aiming to generate such explanations.
The benchmarking study and framework can help practitioners determine which technique and building blocks best suit their context.
arXiv Detail & Related papers (2021-07-09T21:06:03Z) - Exploring Software Naturalness through Neural Language Models [56.1315223210742]
The Software Naturalness hypothesis argues that programming languages can be understood through the same techniques used in natural language processing.
We explore this hypothesis through the use of a pre-trained transformer-based language model to perform code analysis tasks.
arXiv Detail & Related papers (2020-06-22T21:56:14Z) - Semantic Scaffolds for Pseudocode-to-Code Generation [47.09844589656143]
We propose a method for program generation based on semantic scaffolds, lightweight structures representing the high-level semantic and syntactic composition of a program.
By using semantic scaffolds during inference, we achieve a 10% absolute improvement in top-100 accuracy over the previous state-of-the-art.
arXiv Detail & Related papers (2020-05-12T17:10:13Z) - Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning [61.32992639292889]
Fine-tuning of pre-trained transformer models has become the standard approach for solving common NLP tasks.
We introduce a new scoring method that casts a plausibility ranking task in a full-text format.
We show that our method provides a much more stable training phase across random restarts.
arXiv Detail & Related papers (2020-04-29T10:54:40Z) - Learning the Relation between Code Features and Code Transforms with Structured Prediction [13.62633524166298]
We present the first approach for structurally predicting code transforms at the level of AST nodes using conditional random fields (CRFs).
Our approach first learns offline a probabilistic model that captures how certain code transforms are applied to certain AST nodes, and then uses the learned model to predict transforms for arbitrary new, unseen code snippets.
arXiv Detail & Related papers (2019-07-22T12:42:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.