Related papers: Cheddar: A Swift Fully Homomorphic Encryption Library for CUDA GPUs

URL: http://arxiv.org/abs/2407.13055v1
Date: Wed, 17 Jul 2024 23:49:18 GMT
Title: Cheddar: A Swift Fully Homomorphic Encryption Library for CUDA GPUs
Authors: Jongmin Kim, Wonseok Choi, Jung Ho Ahn
Abstract summary: Fully homomorphic encryption (FHE) is a cryptographic technology capable of resolving security and privacy problems in cloud computing by encrypting data in use. FHE introduces tremendous computational overhead for processing encrypted data, causing FHE workloads to become 2-6 orders of magnitude slower than their unencrypted counterparts. We propose Cheddar, an FHE library for GPU, which demonstrates significantly faster performance compared to prior GPU implementations.
Score: 2.613335121517245
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Fully homomorphic encryption (FHE) is a cryptographic technology capable of resolving security and privacy problems in cloud computing by encrypting data in use. However, FHE introduces tremendous computational overhead for processing encrypted data, causing FHE workloads to become 2-6 orders of magnitude slower than their unencrypted counterparts. To mitigate the overhead, we propose Cheddar, an FHE library for CUDA GPUs, which demonstrates significantly faster performance compared to prior GPU implementations. We develop optimized functionalities at various implementation levels ranging from efficient low-level primitives to streamlined high-level operational sequences. In particular, we improve major FHE operations, including number-theoretic transform and base conversion, based on efficient kernel designs using a small word size of 32 bits. By these means, Cheddar demonstrates 2.9 to 25.6 times higher performance for representative FHE workloads compared to prior GPU implementations.

Related papers

70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float [71.43026659686679]
Large Language Models (LLMs) have grown rapidly in size, creating challenges for efficient deployment on resource-constrained hardware. We introduce Dynamic-Length Float (DFloat11), a compression framework that reduces LLM size by 30% while preserving outputs that are bit-for-bit identical to the original model.
arXiv Detail & Related papers (2025-04-15T22:38:38Z)
Ramp Up NTT in Record Time using GPU-Accelerated Algorithms and LLM-based Code Generation [11.120838175165986]
Homomorphic encryption (HE) is a core building block in privacy-preserving machine learning (PPML). Many GPU-accelerated cryptographic schemes have been proposed to improve the performance of HE. Given the powerful code generation capabilities of large language models (LLMs), we aim to explore their potential to automatically generate practical GPU-friendly algorithm code.
arXiv Detail & Related papers (2025-02-16T12:53:23Z)
Chameleon: An Efficient FHE Scheme Switching Acceleration on GPUs [17.536473118470774]
Fully homomorphic encryption (FHE) enables direct computation on encrypted data. Existing efforts primarily focus on single-class FHE schemes, which fail to meet the diverse requirements of data types and functions. We present an efficient GPU-based FHE switching acceleration scheme named Chameleon.
arXiv Detail & Related papers (2024-10-08T11:37:49Z)
NTTSuite: Number Theoretic Transform Benchmarks for Accelerating Encrypted Computation [2.704681057324485]
Homomorphic encryption (HE) is a cryptographic system that enables computation directly on encrypted data. HE has seen little adoption due to extremely high computational overheads, rendering it impractical. We develop a benchmark suite, named NTTSuite, to enable researchers to better address these overheads. We find our implementation outperforms the state-of-the-art by 30%.
arXiv Detail & Related papers (2024-05-18T17:44:17Z)
FHEmem: A Processing In-Memory Accelerator for Fully Homomorphic Encryption [9.884698447131374]
Fully Homomorphic Encryption (FHE) is a technique that allows arbitrary computations to be performed on encrypted data without the need for decryption. FHE is significantly slower than computation on plain data due to the increase in data size after encryption. We propose a PIM-based FHE accelerator, FHEmem, which exploits a novel processing in-memory architecture.
arXiv Detail & Related papers (2023-11-27T20:11:38Z)
GME: GPU-based Microarchitectural Extensions to Accelerate Homomorphic Encryption [33.87964584665433]
Fully Homomorphic Encryption (FHE) enables the processing of encrypted data without decrypting it. FHE introduces a slowdown of up to five orders of magnitude as compared to the same computation using plaintext data. We propose GME, which combines three key microarchitectural extensions along with a compile-time optimization to the current AMD CDNA GPU architecture.
arXiv Detail & Related papers (2023-09-20T01:50:43Z)
INR-Arch: A Dataflow Architecture and Compiler for Arbitrary-Order Gradient Computations in Implicit Neural Representation Processing [66.00729477511219]
Given a function represented as a computation graph, traditional architectures face challenges in efficiently computing its nth-order gradient. We introduce INR-Arch, a framework that transforms the computation graph of an nth-order gradient into a hardware-optimized dataflow architecture. We present results that demonstrate 1.8-4.8x and 1.5-3.6x speedup compared to CPU and GPU baselines respectively.
arXiv Detail & Related papers (2023-08-11T04:24:39Z)
ArctyrEX : Accelerated Encrypted Execution of General-Purpose Applications [6.19586646316608]
Fully Homomorphic Encryption (FHE) is a cryptographic method that guarantees the privacy and security of user data during computation. We develop new techniques for accelerated encrypted execution and demonstrate the significant performance advantages of our approach.
arXiv Detail & Related papers (2023-06-19T15:15:41Z)
Harnessing Deep Learning and HPC Kernels via High-Level Loop and Tensor Abstractions on CPU Architectures [67.47328776279204]
This work introduces a framework to develop efficient, portable Deep Learning and High Performance Computing kernels. We decompose the kernel development into two steps: 1) Expressing the computational core using Tensor Processing Primitives (TPPs) and 2) Expressing the logical loops around TPPs in a high-level, declarative fashion. We demonstrate the efficacy of our approach using standalone kernels and end-to-end workloads that outperform state-of-the-art implementations on diverse CPU platforms.
arXiv Detail & Related papers (2023-04-25T05:04:44Z)
HDCC: A Hyperdimensional Computing compiler for classification on embedded systems and high-performance computing [58.720142291102135]
This work introduces the HDCC compiler, the first open-source compiler that translates high-level descriptions of HDC classification methods into optimized C code. HDCC is designed like a modern compiler, featuring an intuitive and descriptive input language, an intermediate representation (IR), and a retargetable backend. To substantiate these claims, we conducted experiments with HDCC on several of the most popular datasets in the HDC literature.
arXiv Detail & Related papers (2023-04-24T19:16:03Z)
ASH: A Modern Framework for Parallel Spatial Hashing in 3D Perception [91.24236600199542]
ASH is a modern and high-performance framework for parallel spatial hashing on GPU. ASH achieves higher performance, supports richer functionality, and requires fewer lines of code. ASH and its example applications are open sourced in Open3D.
arXiv Detail & Related papers (2021-10-01T16:25:40Z)
Efficient and Generic 1D Dilated Convolution Layer for Deep Learning [52.899995651639436]
We introduce our efficient implementation of a generic 1D convolution layer covering a wide range of parameters. It is optimized for x86 CPU architectures, in particular, for architectures containing Intel AVX-512 and AVX-512 BFloat16 instructions. We demonstrate the performance of our optimized 1D convolution layer by utilizing it in the end-to-end neural network training with real genomics datasets.
arXiv Detail & Related papers (2021-04-16T09:54:30Z)
Faster Secure Data Mining via Distributed Homomorphic Encryption [108.77460689459247]
Homomorphic Encryption (HE) is receiving more and more attention recently for its capability to do computations over the encrypted field. We propose a novel general distributed HE-based data mining framework towards one step of solving the scaling problem. We verify the efficiency and effectiveness of our new framework by testing over various data mining algorithms and benchmark data-sets.
arXiv Detail & Related papers (2020-06-17T18:14:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.