Securing Cryptographic Software via Typed Assembly Language (Extended Version)
- URL: http://arxiv.org/abs/2509.08727v1
- Date: Wed, 10 Sep 2025 16:17:31 GMT
- Title: Securing Cryptographic Software via Typed Assembly Language (Extended Version)
- Authors: Shixin Song, Tingzhen Dong, Kosi Nwabueze, Julian Zanders, Andres Erbsen, Adam Chlipala, Mengjia Yan,
- Abstract summary: This paper introduces SecSep, a framework that rewrites assembly programs so that they partition secret and public data on the stack.<n>By moving from the source-code level to assembly rewriting, SecSep is able to address limitations of prior work.<n>Key innovation of our methodology is a new variant of typed assembly language (TAL), Octal, which allows us to address this challenge.
- Score: 3.394433117563263
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Authors of cryptographic software are well aware that their code should not leak secrets through its timing behavior, and, until 2018, they believed that following industry-standard constant-time coding guidelines was sufficient. However, the revelation of the Spectre family of speculative execution attacks injected new complexities. To block speculative attacks, prior work has proposed annotating the program's source code to mark secret data, with hardware using this information to decide when to speculate (i.e., when only public values are involved) or not (when secrets are in play). While these solutions are able to track secret information stored on the heap, they suffer from limitations that prevent them from correctly tracking secrets on the stack, at a cost in performance. This paper introduces SecSep, a transformation framework that rewrites assembly programs so that they partition secret and public data on the stack. By moving from the source-code level to assembly rewriting, SecSep is able to address limitations of prior work. The key challenge in performing this assembly rewriting stems from the loss of semantic information through the lengthy compilation process. The key innovation of our methodology is a new variant of typed assembly language (TAL), Octal, which allows us to address this challenge. Assembly rewriting is driven by compile-time inference within Octal. We apply our technique to cryptographic programs and demonstrate that it enables secure speculation efficiently, incurring a low average overhead of $1.2\%$.
Related papers
- Context-Guided Decompilation: A Step Towards Re-executability [50.71992919223209]
Binary decompilation plays an important role in software security analysis, reverse engineering and malware understanding.<n>Recent advances in large language models (LLMs) have enabled neural decompilation, but the generated code is typically only semantically plausible.<n>We propose ICL4Decomp, a hybrid decompilation framework that leverages in-context learning (ICL) to guide LLMs toward generating re-executable source code.
arXiv Detail & Related papers (2025-11-03T17:21:39Z) - Decompiling Smart Contracts with a Large Language Model [51.49197239479266]
Despite Etherscan's 78,047,845 smart contracts deployed on (as of May 26, 2025), a mere 767,520 ( 1%) are open source.<n>This opacity necessitates the automated semantic analysis of on-chain smart contract bytecode.<n>We introduce a pioneering decompilation pipeline that transforms bytecode into human-readable and semantically faithful Solidity code.
arXiv Detail & Related papers (2025-06-24T13:42:59Z) - TFHE-Coder: Evaluating LLM-agentic Fully Homomorphic Encryption Code Generation [10.597643264309415]
Homomorphic Encryption over the torus (TFHE) enables encrypted computation on data without decryption.<n>Despite its potential in privacy preserving machine learning, secure multi party computation, private blockchain transactions, and secure medical diagnostics, its adoption remains limited due to cryptographic complexity and usability challenges.<n>This work establishes the first benchmark for TFHE code generation, demonstrating how LLMs, when augmented with domain-specific feedback, can bridge the expertise gap in FHE code generation.
arXiv Detail & Related papers (2025-03-15T17:57:44Z) - ReF Decompile: Relabeling and Function Call Enhanced Decompile [50.86228893636785]
The goal of decompilation is to convert compiled low-level code (e.g., assembly code) back into high-level programming languages.<n>This task supports various reverse engineering applications, such as vulnerability identification, malware analysis, and legacy software migration.
arXiv Detail & Related papers (2025-02-17T12:38:57Z) - Breaking Bad: How Compilers Break Constant-Time Implementations [8.771587132463535]
We investigate how compilers break protections introduced by defensive programming techniques.<n>We run a large-scale experiment to see if such compiler-induced issues manifest in state-of-the-art cryptographic libraries.<n>Our study reveals that several compiler-induced secret-dependent operations occur within some of the most highly regarded cryptographic libraries.
arXiv Detail & Related papers (2024-10-17T12:34:02Z) - Decoding at the Speed of Thought: Harnessing Parallel Decoding of Lexical Units for LLMs [57.27982780697922]
Large language models have demonstrated exceptional capability in natural language understanding and generation.
However, their generation speed is limited by the inherently sequential nature of their decoding process.
This paper introduces Lexical Unit Decoding, a novel decoding methodology implemented in a data-driven manner.
arXiv Detail & Related papers (2024-05-24T04:35:13Z) - CodeIP: A Grammar-Guided Multi-Bit Watermark for Large Language Models of Code [56.019447113206006]
Large Language Models (LLMs) have achieved remarkable progress in code generation.<n>CodeIP is a novel multi-bit watermarking technique that inserts additional information to preserve provenance details.<n>Experiments conducted on a real-world dataset across five programming languages demonstrate the effectiveness of CodeIP.
arXiv Detail & Related papers (2024-04-24T04:25:04Z) - Robust Constant-Time Cryptography [11.064951083714883]
Constant-time is considered the security standard for cryptographic code.
Constant-time relies on the entirety of the code base being constant-time.
Constant-time requires memory safety of all the running code.
arXiv Detail & Related papers (2023-11-10T02:35:46Z) - Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual re-writing and engineering efforts.
Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness.
Guess & Sketch extracts alignment and confidence information from features of the LM then passes it to a symbolic solver to resolve semantic equivalence.
arXiv Detail & Related papers (2023-09-25T15:42:18Z) - CryptOpt: Verified Compilation with Randomized Program Search for
Cryptographic Primitives (full version) [12.790826917588575]
cryptography has been an exception, where many performance-critical routines have been written directly in assembly.
We present CryptOpt, the first compilation pipeline that specializes high-level cryptographic functional programs into assembly code significantly faster than what GCC or Clang produce.
On the formal-verification side, we connect to the FiatOpt framework (which translates functional programs into C-like IR code) and extend it with a new formally verified program-equivalence checker.
arXiv Detail & Related papers (2022-11-19T11:07:39Z) - Securing Optimized Code Against Power Side Channels [1.589424114251205]
Security engineers often sacrifice code efficiency by turning off compiler optimization and/or performing local, post-compilation transformations.
This paper proposes SecConCG, a constraint-based compiler approach that generates optimized yet secure code.
arXiv Detail & Related papers (2022-07-06T12:06:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.