Related papers: Cassandra: Efficient Enforcement of Sequential Execution for Cryptographic Programs (Extended Version)

URL: http://arxiv.org/abs/2406.04290v2
Date: Tue, 20 May 2025 15:22:28 GMT
Title: Cassandra: Efficient Enforcement of Sequential Execution for Cryptographic Programs (Extended Version)
Authors: Ali Hajiabadi, Trevor E. Carlson
Abstract summary: Constant-time programming is a widely deployed approach to harden cryptographic programs against side channel attacks. Modern processors often violate the underlying assumptions of standard constant-time policies by transiently executing unintended paths of the program. We propose Cassandra, a novel hardware/software mechanism to enforce sequential execution for constant-time cryptographic code.
Score: 3.34371579019566
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Constant-time programming is a widely deployed approach to harden cryptographic programs against side channel attacks. However, modern processors often violate the underlying assumptions of standard constant-time policies by transiently executing unintended paths of the program. Despite many solutions proposed, addressing control flow misspeculations in an efficient way without losing performance is an open problem. In this work, we propose Cassandra, a novel hardware/software mechanism to enforce sequential execution for constant-time cryptographic code in a highly efficient manner. Cassandra explores the radical design point of disabling the branch predictor and recording-and-replaying sequential control flow of the program. Two key insights that enable our design are that (1) the sequential control flow of a constant-time program is mostly static over different runs, and (2) cryptographic programs are loop-intensive and their control flow patterns repeat in a highly compressible way. These insights allow us to perform an upfront branch analysis that significantly compresses control flow traces. We add a small component to a typical processor design, the Branch Trace Unit, to store compressed traces and determine fetch redirections according to the sequential model of the program. Despite providing a strong security guarantee, Cassandra counterintuitively provides an average 1.85% speedup compared to an unsafe baseline processor, mainly due to enforcing near-perfect fetch redirections.
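The record-and-replay idea in the abstract can be illustrated with a toy sketch. The snippet below is not Cassandra's trace format or its Branch Trace Unit; it assumes, purely for illustration, that a branch trace is a list of taken/not-taken outcomes and uses plain run-length encoding as a stand-in for the paper's compression. The names `compress_trace` and `replay` are hypothetical.

```python
from itertools import groupby


def compress_trace(outcomes):
    """Run-length encode a list of branch outcomes (True = taken)."""
    return [(taken, sum(1 for _ in run)) for taken, run in groupby(outcomes)]


def replay(compressed):
    """Yield branch outcomes in their original order from the encoded trace."""
    for taken, count in compressed:
        for _ in range(count):
            yield taken


# Hypothetical loop with 16 iterations: the back-edge branch is taken
# 15 times and then falls through once when the loop exits.
trace = [True] * 15 + [False]
compressed = compress_trace(trace)

# The 16 recorded outcomes collapse into two (outcome, count) pairs,
# and replaying them reproduces the recorded control flow exactly.
assert list(replay(compressed)) == trace
```

Because the loop's branch outcomes repeat, the encoded trace stays small no matter how many iterations the loop runs, which is the intuition behind the abstract's claim that loop-intensive control flow is highly compressible.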

Related papers

Multipole Attention for Efficient Long Context Reasoning [64.94673641704289]
Large Reasoning Models (LRMs) have shown promising accuracy improvements on complex problem-solving tasks. LRMs need to generate long chain-of-thought reasoning in order to think before answering. We introduce Multipole Attention, which accelerates autoregressive reasoning by only computing exact attention for the most important tokens.
arXiv Detail & Related papers (2025-06-16T03:00:40Z)
AdaDecode: Accelerating LLM Decoding with Adaptive Layer Parallelism [17.858104076062897]
Large language models (LLMs) are increasingly used for long-content generation. We propose AdaDecode, which accelerates decoding without requiring auxiliary models or changes to the original model parameters. AdaDecode consistently achieves superior decoding throughput with up to 1.73x speedup.
arXiv Detail & Related papers (2025-06-04T08:32:30Z)
Fast correlated decoding of transversal logical algorithms [67.01652927671279]
Quantum error correction (QEC) is required for large-scale computation, but incurs a significant resource overhead. Recent advances have shown that by jointly decoding logical qubits in algorithms composed of logical gates, the number of syndrome extraction rounds can be reduced. Here, we reform the problem of decoding circuits by directly decoding relevant logical operator products as they propagate through the circuit.
arXiv Detail & Related papers (2025-05-19T18:00:00Z)
DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting [59.57151419673759]
Speculative decoding presents a draft-then-verify framework that reduces generation latency while maintaining output distribution fidelity. We propose DuoDecoding, a novel approach that strategically deploys the draft and target models on the CPU and GPU respectively. Our method incorporates a hardware-aware optimal draft budget to minimize idle times and employs dynamic multi-sequence drafting to enhance draft quality.
arXiv Detail & Related papers (2025-03-02T08:27:48Z)
ReF Decompile: Relabeling and Function Call Enhanced Decompile [50.86228893636785]
The goal of decompilation is to convert compiled low-level code (e.g., assembly code) back into high-level programming languages. This task supports various reverse engineering applications, such as vulnerability identification, malware analysis, and legacy software migration.
arXiv Detail & Related papers (2025-02-17T12:38:57Z)
Thetacrypt: A Distributed Service for Threshold Cryptography [0.0]
Thetacrypt is a versatile library for integrating many threshold schemes into one language. It offers a way to easily build distributed systems using threshold cryptography and is agnostic to their implementation. The library currently includes six cryptographic schemes that span ciphers, signatures, and randomness generation.
arXiv Detail & Related papers (2025-02-05T15:03:59Z)
Enhanced Min-Sum Decoding of Quantum Codes Using Previous Iteration Dynamics [3.6048794343841766]
We propose a novel message-passing decoding approach that leverages the degeneracy of quantum low-density parity-check codes. Our focus is on two-block Calderbank-Shor-Steane (CSS) codes, which are composed of symmetric stabilizers.
arXiv Detail & Related papers (2025-01-09T07:28:26Z)
Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection [56.66677293607114]
We propose Code-as-Monitor (CaM) for both open-set reactive and proactive failure detection. To enhance the accuracy and efficiency of monitoring, we introduce constraint elements that abstract constraint-related entities. Experiments show that CaM achieves a 28.7% higher success rate and reduces execution time by 31.8% under severe disturbances.
arXiv Detail & Related papers (2024-12-05T18:58:27Z)
Libra: Architectural Support For Principled, Secure And Efficient Balanced Execution On High-End Processors (Extended Version) [9.404954747748523]
Control-flow leakage (CFL) attacks enable an attacker to expose control-flow decisions of a victim program via side-channel observations. Linearization has been widely believed to be the only effective countermeasure against CFL attacks. We propose Libra, a generic and principled hardware-software codesign to efficiently address CFL on high-end processors.
arXiv Detail & Related papers (2024-09-05T17:56:19Z)
Speculative Diffusion Decoding: Accelerating Language Generation through Diffusion [59.17158389902231]
Speculative decoding has emerged as a widely adopted method to accelerate large language model inference. This paper proposes an adaptation of speculative decoding which uses discrete diffusion models to generate draft sequences.
arXiv Detail & Related papers (2024-08-10T21:24:25Z)
The Latency Price of Threshold Cryptosystem in Blockchains [52.359230560289745]
We study the interplay between threshold cryptography and a class of blockchains that use Byzantine-fault tolerant (BFT) consensus protocols. Existing approaches for threshold cryptosystems introduce a latency overhead of at least one message delay for running the threshold cryptographic protocol. We propose a mechanism to eliminate this overhead for blockchain-native threshold cryptosystems with tight thresholds.
arXiv Detail & Related papers (2024-07-16T20:53:04Z)
Parsimonious Optimal Dynamic Partial Order Reduction [1.5029560229270196]
We present Parsimonious-OPtimal DPOR (POP), an optimal DPOR algorithm for analyzing multi-threaded programs under sequential consistency. POP combines several novel algorithmic techniques, including (i) a parsimonious race reversal strategy, which avoids multiple reversals of the same race. Our implementation in Nidhugg shows that these techniques can significantly speed up the analysis of concurrent programs, and do so with low memory consumption.
arXiv Detail & Related papers (2024-05-18T00:07:26Z)
Parallel Decoding via Hidden Transfer for Lossless Large Language Model Acceleration [54.897493351694195]
We propose a novel parallel decoding approach, namely hidden transfer, which decodes multiple successive tokens simultaneously in a single forward pass. In terms of acceleration metrics, we outperform all the single-model acceleration techniques, including Medusa and Self-Speculative decoding.
arXiv Detail & Related papers (2024-04-18T09:17:06Z)
Non-autoregressive Sequence-to-Sequence Vision-Language Models [63.77614880533488]
We propose a parallel decoding sequence-to-sequence vision-language model that marginalizes over multiple inference paths in the decoder. The model achieves performance on-par with its state-of-the-art autoregressive counterpart, but is faster at inference time.
arXiv Detail & Related papers (2024-03-04T17:34:59Z)
Secure Synthesis of Distributed Cryptographic Applications (Technical Report) [1.9707603524984119]
We advocate using secure program partitioning to synthesize cryptographic applications. This approach is promising, but formal results for the security of such compilers are limited in scope. We develop a compiler security proof that handles subtleties essential for robust, efficient applications.
arXiv Detail & Related papers (2024-01-06T02:57:44Z)
Blockchain Smart Contract Threat Detection Technology Based on Symbolic Execution [0.0]
Reentrancy vulnerability, which is hidden and complex, poses a great threat to smart contracts. In this paper, we propose a smart contract threat detection technology based on symbolic execution. The experimental results show that this method significantly increases both detection efficiency and accuracy.
arXiv Detail & Related papers (2023-12-24T03:27:03Z)
Code Polymorphism Meets Code Encryption: Confidentiality and Side-Channel Protection of Software Components [0.0]
PolEn is a toolchain and a processor architecture that combine countermeasures in order to provide an effective mitigation of side-channel attacks. Code encryption is supported by a processor extension such that machine instructions are only decrypted inside the CPU. Code polymorphism is implemented by software means. It regularly changes the observable behaviour of the program, making it unpredictable for an attacker.
arXiv Detail & Related papers (2023-10-11T09:16:10Z)
Citadel: Real-World Hardware-Software Contracts for Secure Enclaves Through Microarchitectural Isolation and Controlled Speculation [8.414722884952525]
Hardware isolation primitives such as secure enclaves aim to protect programs, but remain vulnerable to transient execution attacks. This paper advocates for processors to incorporate microarchitectural isolation primitives and mechanisms for controlled speculation. We introduce two mechanisms to securely share memory between an enclave and an untrusted OS in an out-of-order processor.
arXiv Detail & Related papers (2023-06-26T17:51:23Z)
Modular decoding: parallelizable real-time decoding for quantum computers [55.41644538483948]
Real-time quantum computation will require decoding algorithms capable of extracting logical outcomes from a stream of data generated by noisy quantum hardware. We propose modular decoding, an approach capable of addressing this challenge with minimal additional communication and without sacrificing decoding accuracy. We introduce the edge-vertex decomposition, a concrete instance of modular decoding for lattice-surgery style fault-tolerant blocks.
arXiv Detail & Related papers (2023-03-08T19:26:10Z)
NAPG: Non-Autoregressive Program Generation for Hybrid Tabular-Textual Question Answering [52.10214317661547]
Current numerical reasoning methods autoregressively decode program sequences. The accuracy of program generation drops sharply as the decoding steps unfold due to error propagation. In this paper, we propose a non-autoregressive program generation framework.
arXiv Detail & Related papers (2022-11-07T11:25:21Z)
Securing Optimized Code Against Power Side Channels [1.589424114251205]
Security engineers often sacrifice code efficiency by turning off compiler optimization and/or performing local, post-compilation transformations. This paper proposes SecConCG, a constraint-based compiler approach that generates optimized yet secure code.
arXiv Detail & Related papers (2022-07-06T12:06:28Z)
Software Vulnerability Detection via Deep Learning over Disaggregated Code Graph Representation [57.92972327649165]
This work explores a deep learning approach to automatically learn the insecure patterns from code corpora. Because code naturally admits graph structures with parsing, we develop a novel graph neural network (GNN) to exploit both the semantic context and structural regularity of a program.
arXiv Detail & Related papers (2021-09-07T21:24:36Z)

This list is automatically generated from the titles and abstracts of the papers on this site.