3DGen: AI-Assisted Generation of Provably Correct Binary Format Parsers
- URL: http://arxiv.org/abs/2404.10362v2
- Date: Mon, 6 May 2024 23:21:57 GMT
- Title: 3DGen: AI-Assisted Generation of Provably Correct Binary Format Parsers
- Authors: Sarah Fakhoury, Markus Kuppe, Shuvendu K. Lahiri, Tahina Ramananandro, Nikhil Swamy,
- Abstract summary: 3DGen is a framework that makes use of AI agents to transform mixed informal input into format specifications in a language called 3D.
3DGen produces a 3D specification that conforms to a test suite, and which yields safe, efficient, provably correct, parsing code in C.
- Score: 5.102523342662388
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Improper parsing of attacker-controlled input is a leading source of software security vulnerabilities, especially when programmers transcribe informal format descriptions in RFCs into efficient parsing logic in low-level, memory unsafe languages. Several researchers have proposed formal specification languages for data formats from which efficient code can be extracted. However, distilling informal requirements into formal specifications is challenging and, despite their benefits, new, formal languages are hard for people to learn and use. In this work, we present 3DGen, a framework that makes use of AI agents to transform mixed informal input, including natural language documents (i.e., RFCs) and example inputs into format specifications in a language called 3D. To support humans in understanding and trusting the generated specifications, 3DGen uses symbolic methods to also synthesize test inputs that can be validated against an external oracle. Symbolic test generation also helps in distinguishing multiple plausible solutions. Through a process of repeated refinement, 3DGen produces a 3D specification that conforms to a test suite, and which yields safe, efficient, provably correct, parsing code in C. We have evaluated 3DGen on 20 Internet standard formats, demonstrating the potential for AI-agents to produce formally verified C code at a non-trivial scale. A key enabler is the use of a domain-specific language to limit AI outputs to a class for which automated, symbolic analysis is tractable.
Related papers
- Evaluating LLM-driven User-Intent Formalization for Verification-Aware Languages [6.0608817611709735]
Dafny and F* provide means to formally specify and prove properties of programs.
There is no algorithmic way of ensuring the correctness of the user-intent formalization for programs.
arXiv Detail & Related papers (2024-06-14T06:52:08Z) - Decoding at the Speed of Thought: Harnessing Parallel Decoding of Lexical Units for LLMs [57.27982780697922]
Large language models have demonstrated exceptional capability in natural language understanding and generation.
However, their generation speed is limited by the inherently sequential nature of their decoding process.
This paper introduces Lexical Unit Decoding, a novel decoding methodology implemented in a data-driven manner.
arXiv Detail & Related papers (2024-05-24T04:35:13Z) - CodeIP: A Grammar-Guided Multi-Bit Watermark for Large Language Models of Code [59.32609948217718]
We present CodeIP, a new watermarking technique for Large Language Models (LLMs)-based code generation.
CodeIP enables the insertion of multi-bit information while preserving the semantics of the generated code.
arXiv Detail & Related papers (2024-04-24T04:25:04Z) - FoC: Figure out the Cryptographic Functions in Stripped Binaries with LLMs [54.27040631527217]
We propose a novel framework called FoC to Figure out the Cryptographic functions in stripped binaries.
FoC-BinLLM outperforms ChatGPT by 14.61% on the ROUGE-L score.
FoC-Sim outperforms the previous best methods with a 52% higher Recall@1.
arXiv Detail & Related papers (2024-03-27T09:45:33Z) - Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual re-writing and engineering efforts.
Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness.
Guess & Sketch extracts alignment and confidence information from features of the LM then passes it to a symbolic solver to resolve semantic equivalence.
arXiv Detail & Related papers (2023-09-25T15:42:18Z) - Trusta: Reasoning about Assurance Cases with Formal Methods and Large
Language Models [4.005483185111992]
Trustworthiness Derivation Tree Analyzer (Trusta) is a desktop application designed to automatically construct and verify TDTs.
It has a built-in Prolog interpreter in its backend, and is supported by the constraint solvers Z3 and MONA.
Trusta can extract formal constraints from text in natural languages, facilitating an easier interpretation and validation process.
arXiv Detail & Related papers (2023-09-22T15:42:43Z) - From Ambiguity to Explicitness: NLP-Assisted 5G Specification
Abstraction for Formal Analysis [5.526122280732959]
We introduce a two-step pipeline that uses NLP tools to construct data and then uses constructed data to extract identifiers and formal properties.
Our results of the optimal model reach valid accuracy of 39% for extraction and 42% for formal properties identifier predictions.
arXiv Detail & Related papers (2023-08-07T03:37:31Z) - Grammar-Constrained Decoding for Structured NLP Tasks without Finetuning [27.59524153097858]
grammar-constrained decoding (GCD) can be used to control the generation of large language models (LMs)
GCD can serve as a unified framework for structured NLP tasks in general.
We show that grammar-constrained LMs substantially outperform unconstrained LMs or even beat task-specific finetuned models.
arXiv Detail & Related papers (2023-05-23T11:54:37Z) - Detecting Text Formality: A Study of Text Classification Approaches [78.11745751651708]
This work proposes the first to our knowledge systematic study of formality detection methods based on statistical, neural-based, and Transformer-based machine learning methods.
We conducted three types of experiments -- monolingual, multilingual, and cross-lingual.
The study shows the overcome of Char BiLSTM model over Transformer-based ones for the monolingual and multilingual formality classification task.
arXiv Detail & Related papers (2022-04-19T16:23:07Z) - CGEMs: A Metric Model for Automatic Code Generation using GPT-3 [0.0]
This work aims to validate AI-generated content using theoretical proofs or by using Monte-Carlo simulation methods.
In this case, we use the latter approach to test/validate a statistically significant number of samples.
The various metrics that are garnered in this work to support the evaluation of generated code are as follows: Compilation, NL description to logic conversion, number of edits needed, some of the commonly used static-code metrics and NLP metrics.
arXiv Detail & Related papers (2021-08-23T13:28:57Z) - Zero-Shot Cross-lingual Semantic Parsing [56.95036511882921]
We study cross-lingual semantic parsing as a zero-shot problem without parallel data for 7 test languages.
We propose a multi-task encoder-decoder model to transfer parsing knowledge to additional languages using only English-Logical form paired data.
Our system frames zero-shot parsing as a latent-space alignment problem and finds that pre-trained models can be improved to generate logical forms with minimal cross-lingual transfer penalty.
arXiv Detail & Related papers (2021-04-15T16:08:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.