Leveraging FPGAs for Homomorphic Matrix-Vector Multiplication in Oblivious Message Retrieval
- URL: http://arxiv.org/abs/2512.11690v1
- Date: Fri, 12 Dec 2025 16:12:02 GMT
- Title: Leveraging FPGAs for Homomorphic Matrix-Vector Multiplication in Oblivious Message Retrieval
- Authors: Grant Bosworth, Keewoo Lee, Sunwoong Kim
- Abstract summary: A plausible approach to protecting metadata is to have senders post messages on a public bulletin board and receivers scan it for relevant messages. Oblivious message retrieval (OMR) leverages homomorphic encryption (HE) to improve user experience in this solution by delegating the scan to a resource-rich server. This paper proposes a hardware architecture to accelerate the matrix-vector multiplication algorithm.
- Score: 2.8190885435355857
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While end-to-end encryption protects the content of messages, it does not secure metadata, which exposes sender and receiver information through traffic analysis. A plausible approach to protecting this metadata is to have senders post encrypted messages on a public bulletin board and receivers scan it for relevant messages. Oblivious message retrieval (OMR) leverages homomorphic encryption (HE) to improve user experience in this solution by delegating the scan to a resource-rich server while preserving privacy. A key process in OMR is the homomorphic detection of pertinent messages for the receiver from the bulletin board. It relies on a specialized matrix-vector multiplication algorithm, which involves extensive multiplications between ciphertext vectors and plaintext matrices, as well as homomorphic rotations. The computationally intensive nature of this process limits the practicality of OMR. To address this challenge, this paper proposes a hardware architecture to accelerate the matrix-vector multiplication algorithm. The building-block homomorphic operators in this algorithm are implemented using high-level synthesis, with design parameters for different parallelism levels. These operators are then deployed on a field-programmable gate array platform using an efficient design space exploration strategy to accelerate homomorphic matrix-vector multiplication. Compared to a software implementation, the proposed hardware accelerator achieves a 13.86x speedup.
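A common way to combine ciphertext rotations with plaintext-matrix diagonals, as the abstract describes, is the diagonal method (Halevi-Shoup style): each generalized diagonal of the plaintext matrix is multiplied element-wise with a rotated copy of the ciphertext vector and the products are accumulated. The paper's exact algorithm may differ; the following is a minimal plaintext sketch of that data flow, with `rotate` standing in for a homomorphic rotation (no actual encryption is performed):

```python
def rotate(v, k):
    """Cyclic left rotation; stands in for a homomorphic rotation of a ciphertext slot vector."""
    k %= len(v)
    return v[k:] + v[:k]

def diag_matvec(M, v):
    """Matrix-vector product via the diagonal method:
    result = sum over d of diag_d(M) * rotate(v, d), element-wise,
    where diag_d(M)[i] = M[i][(i + d) mod n]."""
    n = len(v)
    acc = [0] * n
    for d in range(n):
        diag = [M[i][(i + d) % n] for i in range(n)]  # d-th generalized diagonal
        rot = rotate(v, d)                             # one homomorphic rotation per diagonal
        acc = [a + p * c for a, p, c in zip(acc, diag, rot)]
    return acc
```

Under HE, the element-wise products become plaintext-ciphertext multiplications and the accumulation becomes ciphertext additions, which is why the rotation and multiply-accumulate operators are the natural targets for hardware acceleration.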
Related papers
- Block encoding of sparse matrices with a periodic diagonal structure [67.45502291821956]
We provide an explicit quantum circuit for block encoding a sparse matrix with a periodic diagonal structure. Various applications of the presented methodology are discussed in the context of solving differential problems.
arXiv Detail & Related papers (2026-02-11T07:24:33Z) - Decryption Through Polynomial Ambiguity: Noise-Enhanced High-Memory Convolutional Codes for Post-Quantum Cryptography [0.0]
We present a novel approach to post-quantum cryptography that employs directed decryption of noise-enhanced high-memory convolutional codes. The proposed scheme generates random-like generator matrices that effectively conceal the code structure and resist structural attacks.
arXiv Detail & Related papers (2025-12-02T14:30:03Z) - Fast correlated decoding of transversal logical algorithms [67.01652927671279]
Quantum error correction (QEC) is required for large-scale computation, but incurs a significant resource overhead. Recent advances have shown that by jointly decoding logical qubits in algorithms composed of logical gates, the number of syndrome extraction rounds can be reduced. Here, we reformulate the problem of decoding circuits by directly decoding relevant logical operator products as they propagate through the circuit.
arXiv Detail & Related papers (2025-05-19T18:00:00Z) - Compile-Time Fully Homomorphic Encryption of Vectors: Eliminating Online Encryption via Algebraic Basis Synthesis [1.3824176915623292]
Ciphertexts are constructed from precomputed encrypted basis vectors combined with a runtime-scaled encryption of zero. We formalize the method as a randomized $\mathbb{Z}_t$-module morphism and prove that it satisfies IND-CPA security under standard assumptions. Unlike prior designs that require a pool of random encryptions of zero, our construction achieves equivalent security using a single zero ciphertext multiplied by a fresh scalar at runtime.
arXiv Detail & Related papers (2025-05-19T00:05:18Z) - Fast Plaintext-Ciphertext Matrix Multiplication from Additively Homomorphic Encryption [0.0]
Plaintext-ciphertext matrix multiplication (PC-MM) is an indispensable tool in privacy-preserving computations. We propose an efficient PC-MM from unpacked additively homomorphic encryption schemes. Our proposed approach achieves up to an order of magnitude speedup compared to the state of the art for large matrices with relatively small element bit-widths.
arXiv Detail & Related papers (2025-04-20T05:28:46Z) - Code Generation for Cryptographic Kernels using Multi-word Modular Arithmetic on GPU [0.5831737970661138]
Fully homomorphic encryption (FHE) and zero-knowledge proofs (ZKPs) are emerging as solutions for data security in distributed environments. This paper presents a formalization of multi-word modular arithmetic (MoMA), which breaks down large-bit-width integer arithmetic into operations on machine words.
arXiv Detail & Related papers (2025-01-13T18:15:44Z) - New Permutation Decomposition Techniques for Efficient Homomorphic Permutation [33.30918602217202]
Homomorphic permutation is fundamental to privacy-preserving computations based on batch-encoding homomorphic encryption. We propose novel decomposition techniques to optimize homomorphic permutations, advancing homomorphic encryption-based privacy-preserving computations. We design a network structure that deviates from the conventional scope of decomposition and outperforms the state-of-the-art technique with a speedup of up to $1.69\times$ under a minimal rotation-key requirement.
arXiv Detail & Related papers (2024-10-29T08:07:22Z) - Parallel Decoding via Hidden Transfer for Lossless Large Language Model Acceleration [54.897493351694195]
We propose a novel parallel decoding approach, termed hidden transfer, which decodes multiple successive tokens simultaneously in a single forward pass.
In terms of acceleration metrics, we outperform all the single-model acceleration techniques, including Medusa and Self-Speculative decoding.
arXiv Detail & Related papers (2024-04-18T09:17:06Z) - Encrypted Dynamic Control exploiting Limited Number of Multiplications and a Method using RLWE-based Cryptosystem [0.3749861135832073]
We present a method to encrypt dynamic controllers that can be implemented through most homomorphic encryption schemes.
As a result, the encrypted controller involves only a limited number of homomorphic multiplications on every encrypted data.
We propose a customization of the method for Ring Learning With Errors (RLWE)-based cryptosystems, where a vector of messages can be encrypted into a single ciphertext.
arXiv Detail & Related papers (2023-07-07T08:24:48Z) - SegViTv2: Exploring Efficient and Continual Semantic Segmentation with Plain Vision Transformers [76.13755422671822]
This paper investigates the capability of plain Vision Transformers (ViTs) for semantic segmentation using the encoder-decoder framework.
We introduce a novel Attention-to-Mask (ATM) module to design a lightweight decoder effective for plain ViTs.
Our decoder outperforms the popular UPerNet decoder using various ViT backbones while consuming only about $5\%$ of the computational cost.
arXiv Detail & Related papers (2023-06-09T22:29:56Z) - Factorizers for Distributed Sparse Block Codes [45.29870215671697]
We propose a fast and highly accurate method for factorizing distributed sparse block codes (SBCs).
Our iterative factorizer introduces a threshold-based nonlinear activation, conditional random sampling, and an $\ell_\infty$-based similarity metric.
We demonstrate the feasibility of our method on four deep CNN architectures over CIFAR-100, ImageNet-1K, and RAVEN datasets.
arXiv Detail & Related papers (2023-03-24T12:31:48Z) - ConTextual Mask Auto-Encoder for Dense Passage Retrieval [49.49460769701308]
CoT-MAE is a simple yet effective generative pre-training method for dense passage retrieval.
It learns to compress the sentence semantics into a dense vector through self-supervised and context-supervised masked auto-encoding.
We conduct experiments on large-scale passage retrieval benchmarks and show considerable improvements over strong baselines.
arXiv Detail & Related papers (2022-08-16T11:17:22Z)
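The multi-word modular arithmetic (MoMA) idea mentioned in the related papers, breaking large-bit-width integer arithmetic into machine-word operations, can be illustrated for one primitive: 128-bit modular addition built from 64-bit limbs. This is an illustrative sketch under assumed two-limb little-endian representation, not code from any of the papers above:

```python
MASK = (1 << 64) - 1  # one 64-bit machine word

def to_words(x):
    """Split a 128-bit integer into (lo, hi) 64-bit limbs."""
    return (x & MASK, x >> 64)

def from_words(lo, hi):
    """Recombine (lo, hi) limbs into a Python integer (for checking)."""
    return lo | (hi << 64)

def add_mod128(a, b, m):
    """(a + b) mod m on two-limb values, assuming a, b < m < 2**128.
    Only word-sized adds, shifts, masks, and compares are used."""
    lo = a[0] + b[0]
    carry = lo >> 64          # carry out of the low word
    lo &= MASK
    hi = a[1] + b[1] + carry
    carry_out = hi >> 64      # carry out of the high word (129th bit)
    hi &= MASK
    # Conditional subtraction of m: needed if the sum overflowed or is >= m.
    if carry_out or (hi, lo) >= (m[1], m[0]):
        lo -= m[0]
        borrow = 1 if lo < 0 else 0
        lo &= MASK
        hi = (hi - m[1] - borrow) & MASK
    return (lo, hi)
```

Modular multiplication follows the same pattern but additionally needs word-by-word partial products and a multi-word reduction, which is where the bulk of the formalization effort in such schemes goes.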
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.