Related papers: 3DGen: AI-Assisted Generation of Provably Correct Binary Format Parsers

3DGen: AI-Assisted Generation of Provably Correct Binary Format Parsers

URL: http://arxiv.org/abs/2404.10362v2
Date: Mon, 6 May 2024 23:21:57 GMT
Title: 3DGen: AI-Assisted Generation of Provably Correct Binary Format Parsers
Authors: Sarah Fakhoury, Markus Kuppe, Shuvendu K. Lahiri, Tahina Ramananandro, Nikhil Swamy,
Abstract summary: 3DGen is a framework that makes use of AI agents to transform mixed informal input into format specifications in a language called 3D. 3DGen produces a 3D specification that conforms to a test suite, and which yields safe, efficient, provably correct, parsing code in C.
Score: 5.102523342662388
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Improper parsing of attacker-controlled input is a leading source of software security vulnerabilities, especially when programmers transcribe informal format descriptions in RFCs into efficient parsing logic in low-level, memory unsafe languages. Several researchers have proposed formal specification languages for data formats from which efficient code can be extracted. However, distilling informal requirements into formal specifications is challenging and, despite their benefits, new, formal languages are hard for people to learn and use. In this work, we present 3DGen, a framework that makes use of AI agents to transform mixed informal input, including natural language documents (i.e., RFCs) and example inputs into format specifications in a language called 3D. To support humans in understanding and trusting the generated specifications, 3DGen uses symbolic methods to also synthesize test inputs that can be validated against an external oracle. Symbolic test generation also helps in distinguishing multiple plausible solutions. Through a process of repeated refinement, 3DGen produces a 3D specification that conforms to a test suite, and which yields safe, efficient, provably correct, parsing code in C. We have evaluated 3DGen on 20 Internet standard formats, demonstrating the potential for AI-agents to produce formally verified C code at a non-trivial scale. A key enabler is the use of a domain-specific language to limit AI outputs to a class for which automated, symbolic analysis is tractable.

Related papers

Type-Constrained Code Generation with Language Models [51.03439021895432]
Large language models (LLMs) produce uncompilable output because their next-token inference procedure does not model formal aspects of code. We introduce a type-constrained decoding approach that leverages type systems to guide code generation. Our approach reduces compilation errors by more than half and increases functional correctness in code synthesis, translation, and repair tasks.
arXiv Detail & Related papers (2025-04-12T15:03:00Z)
ClassInvGen: Class Invariant Synthesis using Large Language Models [11.374431160444676]
ClassInvGen is a method for co-generating executable class invariants and test inputs. We show that ClassInvGen outperforms a pure LLM-based technique to generate specifications (from code) We also demonstrate its applicability to real-world code by performing a case study on several classes within a widely used and high-integrity C++.
arXiv Detail & Related papers (2025-02-26T08:10:57Z)
Training Neural Networks as Recognizers of Formal Languages [87.06906286950438]
Formal language theory pertains specifically to recognizers. It is common to instead use proxy tasks that are similar in only an informal sense. We correct this mismatch by training and evaluating neural networks directly as binary classifiers of strings.
arXiv Detail & Related papers (2024-11-11T16:33:25Z)
Automatic deductive coding in discourse analysis: an application of large language models in learning analytics [5.606202114848633]
The emergence of large language models such as GPT has opened a new avenue for automatic deductive coding. We employed three different classification methods driven by different artificial intelligence technologies. We found that GPT with prompt engineering outperformed the other two methods on both datasets with limited number of training samples.
arXiv Detail & Related papers (2024-10-02T05:04:06Z)
Evaluating LLM-driven User-Intent Formalization for Verification-Aware Languages [6.0608817611709735]
We propose a metric for evaluating the quality of specifications for verification-aware languages. We show that our metric agrees closely on a human-labeled dataset of Dafny specifications for the popular MBPP code-generation benchmark. We also outline formal verification challenges that need to be addressed to apply the technique more widely.
arXiv Detail & Related papers (2024-06-14T06:52:08Z)
Decoding at the Speed of Thought: Harnessing Parallel Decoding of Lexical Units for LLMs [57.27982780697922]
Large language models have demonstrated exceptional capability in natural language understanding and generation. However, their generation speed is limited by the inherently sequential nature of their decoding process. This paper introduces Lexical Unit Decoding, a novel decoding methodology implemented in a data-driven manner.
arXiv Detail & Related papers (2024-05-24T04:35:13Z)
FoC: Figure out the Cryptographic Functions in Stripped Binaries with LLMs [54.27040631527217]
We propose a novel framework called FoC to Figure out the Cryptographic functions in stripped binaries. We first build a binary large language model (FoC-BinLLM) to summarize the semantics of cryptographic functions in natural language. We then build a binary code similarity model (FoC-Sim) upon the FoC-BinLLM to create change-sensitive representations and use it to retrieve similar implementations of unknown cryptographic functions in a database.
arXiv Detail & Related papers (2024-03-27T09:45:33Z)
Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual re-writing and engineering efforts. Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness. Guess & Sketch extracts alignment and confidence information from features of the LM then passes it to a symbolic solver to resolve semantic equivalence.
arXiv Detail & Related papers (2023-09-25T15:42:18Z)
From Ambiguity to Explicitness: NLP-Assisted 5G Specification Abstraction for Formal Analysis [5.526122280732959]
We introduce a two-step pipeline that uses NLP tools to construct data and then uses constructed data to extract identifiers and formal properties. Our results of the optimal model reach valid accuracy of 39% for extraction and 42% for formal properties identifier predictions.
arXiv Detail & Related papers (2023-08-07T03:37:31Z)
Grammar-Constrained Decoding for Structured NLP Tasks without Finetuning [27.59524153097858]
grammar-constrained decoding (GCD) can be used to control the generation of large language models (LMs) GCD can serve as a unified framework for structured NLP tasks in general. We show that grammar-constrained LMs substantially outperform unconstrained LMs or even beat task-specific finetuned models.
arXiv Detail & Related papers (2023-05-23T11:54:37Z)
Detecting Text Formality: A Study of Text Classification Approaches [78.11745751651708]
This work proposes the first to our knowledge systematic study of formality detection methods based on statistical, neural-based, and Transformer-based machine learning methods. We conducted three types of experiments -- monolingual, multilingual, and cross-lingual. The study shows the overcome of Char BiLSTM model over Transformer-based ones for the monolingual and multilingual formality classification task.
arXiv Detail & Related papers (2022-04-19T16:23:07Z)
CGEMs: A Metric Model for Automatic Code Generation using GPT-3 [0.0]
This work aims to validate AI-generated content using theoretical proofs or by using Monte-Carlo simulation methods. In this case, we use the latter approach to test/validate a statistically significant number of samples. The various metrics that are garnered in this work to support the evaluation of generated code are as follows: Compilation, NL description to logic conversion, number of edits needed, some of the commonly used static-code metrics and NLP metrics.
arXiv Detail & Related papers (2021-08-23T13:28:57Z)
Zero-Shot Cross-lingual Semantic Parsing [56.95036511882921]
We study cross-lingual semantic parsing as a zero-shot problem without parallel data for 7 test languages. We propose a multi-task encoder-decoder model to transfer parsing knowledge to additional languages using only English-Logical form paired data. Our system frames zero-shot parsing as a latent-space alignment problem and finds that pre-trained models can be improved to generate logical forms with minimal cross-lingual transfer penalty.
arXiv Detail & Related papers (2021-04-15T16:08:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.