Nanopass Back-Translation of Call-Return Trees for Mechanized Secure Compilation Proofs
- URL: http://arxiv.org/abs/2503.19609v2
- Date: Fri, 11 Apr 2025 13:02:37 GMT
- Title: Nanopass Back-Translation of Call-Return Trees for Mechanized Secure Compilation Proofs
- Authors: Jérémy Thibault, Joseph Lenormand, Catalin Hritcu,
- Abstract summary: Researchers aim to build secure compilation chains enforcing that if there is no attack a source context can mount against a source program.<n>For any attack a target context mounts against the compiled program one has to exhibit a source context mounting the same attack against the source program.<n>We describe a novel back-translation technique, which results in simpler proofs that can be more easily mechanized in a proof assistant.
- Score: 0.07499722271664144
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Researchers aim to build secure compilation chains enforcing that if there is no attack a source context can mount against a source program then there is also no attack an adversarial target context can mount against the compiled program. Proving that these compilation chains are secure is, however, challenging, and involves a non-trivial back-translation step: for any attack a target context mounts against the compiled program one has to exhibit a source context mounting the same attack against the source program. We describe a novel back-translation technique, which results in simpler proofs that can be more easily mechanized in a proof assistant. Given a finite set of finite trace prefixes, capturing the interaction recorded during an attack between a target context and the compiled program, we build a call-return tree that we back-translate into a source context producing the same trace prefixes. We use state in the generated source context to record the current location in the call-return tree. The back-translation is done in several small steps, each adding to the tree new information describing how the location should change depending on how the context regains control. To prove this back-translation correct we give semantics to every intermediate call-return tree language, using ghost state to store information and explicitly enforce execution invariants. We prove several small forward simulations, basically seeing the back-translation as a verified nanopass compiler. Thanks to this modular structure, we are able to mechanize this complex back-translation and its correctness proof in the Rocq prover without too much effort.
Related papers
- ReF Decompile: Relabeling and Function Call Enhanced Decompile [50.86228893636785]
The goal of decompilation is to convert compiled low-level code (e.g., assembly code) back into high-level programming languages.<n>This task supports various reverse engineering applications, such as vulnerability identification, malware analysis, and legacy software migration.
arXiv Detail & Related papers (2025-02-17T12:38:57Z) - Syzygy: Dual Code-Test C to (safe) Rust Translation using LLMs and Dynamic Analysis [8.361424157571468]
Syzygy is an automated approach to translate C to safe Rust.
This is the largest automated and test-validated C to safe Rust code translation achieved so far.
arXiv Detail & Related papers (2024-12-18T18:55:46Z) - SNIP: Speculative Execution and Non-Interference Preservation for Compiler Transformations [0.15800607910450126]
We address the problem of preserving non-interference across compiler transformations under speculative semantics.
We develop a proof method that ensures the preservation uniformly across all source programs.
arXiv Detail & Related papers (2024-07-21T07:30:30Z) - Unsupervised Sign Language Translation and Generation [72.01216288379072]
We introduce an unsupervised sign language translation and generation network (USLNet)
USLNet learns from abundant single-modality (text and video) data without parallel sign language data.
We propose a sliding window method to address the issues of aligning variable-length text with video sequences.
arXiv Detail & Related papers (2024-02-12T15:39:05Z) - CRIL: A Concurrent Reversible Intermediate Language [0.0]
We present a reversible intermediate language with composing for translating a high-level concurrent programming language to another lower-level concurrent programming language, keeping reversibility.
We propose CRIL as an extension of RIL used by Mogensen for a functional reversible language, incorporating a multi-thread process invocation and the synchronization primitives based on the P-V operations.
arXiv Detail & Related papers (2023-09-13T20:52:54Z) - Outline, Then Details: Syntactically Guided Coarse-To-Fine Code
Generation [61.50286000143233]
ChainCoder is a program synthesis language model that generates Python code progressively.
A tailored transformer architecture is leveraged to jointly encode the natural language descriptions and syntactically aligned I/O data samples.
arXiv Detail & Related papers (2023-04-28T01:47:09Z) - Summarize and Generate to Back-translate: Unsupervised Translation of
Programming Languages [86.08359401867577]
Back-translation is widely known for its effectiveness for neural machine translation when little to no parallel data is available.
We propose performing back-translation via code summarization and generation.
We show that our proposed approach performs competitively with state-of-the-art methods.
arXiv Detail & Related papers (2022-05-23T08:20:41Z) - Reflective Decoding: Beyond Unidirectional Generation with Off-the-Shelf
Language Models [63.808843089941405]
Large pretrained LanguageModels (LMs) generate text with remarkable quality, but only sequentially from left to right.
We present Reflective Decoding, a novel unsupervised algorithm that allows for direct application of unidirectional LMs to non-sequential tasks.
Our 2-step approach requires no supervision or even parallel corpora, only two off-the-shelf pretrained LMs in opposite directions.
arXiv Detail & Related papers (2020-10-16T18:02:07Z) - Contrastive Code Representation Learning [95.86686147053958]
We show that the popular reconstruction-based BERT model is sensitive to source code edits, even when the edits preserve semantics.
We propose ContraCode: a contrastive pre-training task that learns code functionality, not form.
arXiv Detail & Related papers (2020-07-09T17:59:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.