if-ZKP: Intel FPGA-Based Acceleration of Zero Knowledge Proofs
- URL: http://arxiv.org/abs/2412.12481v1
- Date: Tue, 17 Dec 2024 02:35:32 GMT
- Title: if-ZKP: Intel FPGA-Based Acceleration of Zero Knowledge Proofs
- Authors: Shahzad Ahmad Butt, Benjamin Reynolds, Veeraraghavan Ramamurthy, Xiao Xiao, Pohrong Chu, Setareh Sharifian, Sergey Gribok, Bogdan Pasca,
- Abstract summary: We present a novel scalable architecture that is suitable for accelerating the zk-SNARK prover compute on FPGAs.
We focus on the multi-scalar multiplication (MSM) that accounts for the majority of time spent in zk-SNARK systems.
Our implementation runs 110x-150x faster than a reference software library.
- Score: 3.0009885036586725
- Abstract: Zero-Knowledge Proofs (ZKPs) have emerged as an important cryptographic technique that allows one party (the prover) to prove the correctness of a statement to another party (the verifier) while revealing nothing else. ZKPs enable user privacy in many applications such as blockchains, digital voting, and machine learning. Traditionally, ZKPs suffered from poor scalability, but recently a sub-class of ZKPs known as Zero-knowledge Succinct Non-interactive ARguments of Knowledge (zk-SNARKs) has addressed this challenge. zk-SNARKs are receiving significant attention and are being implemented by many public libraries. In this paper, we present a novel scalable architecture that is suitable for accelerating the zk-SNARK prover compute on FPGAs. We focus on the multi-scalar multiplication (MSM) that accounts for the majority of computation time spent in zk-SNARK systems. The MSM calculations rely extensively on modular arithmetic, so highly optimized Intel IP libraries for modular arithmetic are used. The proposed architecture exploits the parallelism inherent to MSM and is implemented using the Intel OneAPI framework for FPGAs. Our implementation runs 110x-150x faster than a reference software library, uses a generic curve form in Jacobian coordinates, and is the first to report FPGA hardware acceleration results for the BLS12-381 and BN128 families of elliptic curves.
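For readers unfamiliar with MSM, the operation is the sum k_1*P_1 + ... + k_n*P_n of elliptic-curve points P_i scaled by integers k_i. The sketch below is only an illustration and is not taken from the paper: it uses a tiny toy curve with made-up parameters (P, A, B), affine rather than Jacobian coordinates, and the bucket (Pippenger-style) method, which is one common way to organize MSM. The abstract does not state which MSM algorithm the architecture uses. All function names here are hypothetical.

```python
# Toy curve y^2 = x^3 + A*x + B over F_P. These parameters are illustrative
# assumptions; they are NOT BLS12-381 or BN128.
P, A, B = 97, 2, 3
INF = None  # point at infinity (group identity)

def add(p1, p2):
    """Affine point addition on the toy curve."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return INF  # p2 is the negation of p1
    if p1 == p2:
        s = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P  # tangent slope
    else:
        s = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P  # chord slope
    x3 = (s * s - x1 - x2) % P
    return (x3, (s * (x1 - x3) - y1) % P)

def mul(k, pt):
    """Double-and-add scalar multiplication."""
    acc = INF
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc

def msm_naive(scalars, points):
    """Reference MSM: one full scalar multiplication per point."""
    acc = INF
    for k, pt in zip(scalars, points):
        acc = add(acc, mul(k, pt))
    return acc

def msm_bucket(scalars, points, c=4, bits=16):
    """Bucket-method MSM with c-bit windows; scalars must be < 2**bits."""
    result = INF
    for w in reversed(range(0, bits, c)):
        for _ in range(c):  # shift the accumulated result left by c bits
            result = add(result, result)
        buckets = [INF] * (1 << c)
        for k, pt in zip(scalars, points):
            idx = (k >> w) & ((1 << c) - 1)
            if idx:
                buckets[idx] = add(buckets[idx], pt)
        # Running-sum trick: computes sum(j * buckets[j]) using additions only.
        running = window_sum = INF
        for b in reversed(buckets[1:]):
            running = add(running, b)
            window_sum = add(window_sum, running)
        result = add(result, window_sum)
    return result

# Find a few curve points by brute force instead of hard-coding them.
points = [(x, y) for x in range(P) for y in range(P)
          if (y * y - x**3 - A * x - B) % P == 0][:8]
scalars = [3, 65535, 1234, 0, 7, 40000, 9, 511]
print(msm_bucket(scalars, points) == msm_naive(scalars, points))
```

In this formulation each window's bucket accumulation depends only on the inputs, not on the other windows, which is one place where MSM exposes parallelism. How the paper's architecture actually partitions the work is not described in the abstract.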
Related papers
- HW/SW Implementation of MiRitH on Embedded Platforms [2.3099144596725574]
We present, to the best of our knowledge, the first design space exploration of MiRitH, a promising MPCitH algorithm, for embedded devices.
We develop a library of mixed HW/SW blocks on the Xilinx ZYNQ 7000, and, based on this library, we explore optimal solutions under runtime or FPGA resource constraints.
Our results show that MiRitH is a viable algorithm for embedded devices in terms of runtime and FPGA resource requirements.
arXiv Detail & Related papers (2024-11-19T08:30:08Z) - HAPM -- Hardware Aware Pruning Method for CNN hardware accelerators in resource constrained devices [44.99833362998488]
The present work proposes a generic hardware architecture ready to be implemented on FPGA devices.
The inference speed of the design is evaluated over different resource constrained FPGA devices.
We demonstrate that our hardware-aware pruning algorithm achieves a 45% improvement in inference time compared to a network pruned using the standard algorithm.
arXiv Detail & Related papers (2024-08-26T07:27:12Z) - Fast Algorithms and Implementations for Computing the Minimum Distance of Quantum Codes [43.96687298077534]
The distance of a stabilizer quantum code determines the number of errors that can be detected and corrected.
We present three new fast algorithms and implementations for computing the symplectic distance of the associated classical code.
arXiv Detail & Related papers (2024-08-20T11:24:30Z) - SZKP: A Scalable Accelerator Architecture for Zero-Knowledge Proofs [10.603449308259496]
ZKPs are an emergent paradigm in verifiable computing.
Two key primitives in proof generation are the Number Theoretic Transform (NTT) and Multi-scalar multiplication (MSM).
We present SZKP, a scalable accelerator framework that is the first ASIC to accelerate an entire proof on-chip.
arXiv Detail & Related papers (2024-08-12T01:53:58Z) - Highly Versatile FPGA-Implemented Cyber Coherent Ising Machine [0.7950056272504447]
We have developed an FPGA implemented cyber coherent Ising machine (cyber CIM) that is much more versatile than previous implementations using FPGAs.
Our architecture is versatile since it can be applied both to the open-loop CIM, which was proposed when CIM research began, and to the closed-loop CIM.
The cyber CIM enables applications such as CDMA multi-user detector and L0 compressed sensing which were not possible with earlier FPGA systems.
arXiv Detail & Related papers (2024-06-08T07:09:27Z) - Many-body computing on Field Programmable Gate Arrays [5.3808713424582395]
We leverage the capabilities of Field Programmable Gate Arrays (FPGAs) for conducting quantum many-body calculations.
This has resulted in a tenfold speedup compared to CPU-based computation for a Monte Carlo algorithm.
We show, for the first time, the use of FPGAs to accelerate a typical tensor network algorithm for many-body ground state calculations.
arXiv Detail & Related papers (2024-02-09T14:01:02Z) - Trainable Fixed-Point Quantization for Deep Learning Acceleration on FPGAs [30.325651150798915]
Quantization is a crucial technique for deploying deep learning models on resource-constrained devices, such as embedded FPGAs.
We present QFX, a trainable fixed-point quantization approach that automatically learns the binary-point position during model training.
QFX is implemented as a PyTorch-based library that efficiently emulates fixed-point arithmetic, supported by FPGA HLS.
arXiv Detail & Related papers (2024-01-31T02:18:27Z) - Extreme Compression of Large Language Models via Additive Quantization [59.3122859349777]
Our algorithm, called AQLM, generalizes the classic Additive Quantization (AQ) approach for information retrieval.
We provide fast GPU and CPU implementations of AQLM for token generation, which enable us to match or outperform optimized FP16 implementations for speed.
arXiv Detail & Related papers (2024-01-11T18:54:44Z) - An FPGA-based Solution for Convolution Operation Acceleration [0.0]
This paper proposes an FPGA-based architecture to accelerate the convolution operation.
The project's purpose is to produce an FPGA IP core that can process a convolutional layer at a time.
arXiv Detail & Related papers (2022-06-09T14:12:30Z) - Providing Meaningful Data Summarizations Using Examplar-based Clustering in Industry 4.0 [67.80123919697971]
We show that our GPU implementation provides speedups of up to 72x using single precision and up to 452x using half precision compared to conventional CPU algorithms.
We apply our algorithm to real-world data from injection molding manufacturing processes and discuss how the resulting summaries help steer this specific process to cut costs and reduce the manufacturing of bad parts.
arXiv Detail & Related papers (2021-05-25T15:55:14Z) - Predictive Coding Approximates Backprop along Arbitrary Computation Graphs [68.8204255655161]
We develop a strategy to translate core machine learning architectures into their predictive coding equivalents.
Our models perform equivalently to backprop on challenging machine learning benchmarks.
Our method raises the potential that standard machine learning algorithms could in principle be directly implemented in neural circuitry.
arXiv Detail & Related papers (2020-06-07T15:35:47Z)