TPU as Cryptographic Accelerator
- URL: http://arxiv.org/abs/2307.06554v3
- Date: Wed, 02 Oct 2024 22:57:37 GMT
- Title: TPU as Cryptographic Accelerator
- Authors: Rabimba Karanjai, Sangwon Shin, Wujie Xiong, Xinxin Fan, Lin Chen, Tianwei Zhang, Taeweon Suh, Weidong Shi, Veronika Kuchta, Francesco Sica, and Lei Xu
- Abstract summary: Cryptographic schemes like Fully Homomorphic Encryption (FHE) and Zero-Knowledge Proofs (ZKPs) are often hindered by their computational complexity.
This paper explores the potential of leveraging TPUs/NPUs to accelerate polynomial multiplication, thereby enhancing the performance of FHE and ZKP schemes.
- Score: 13.44836928672667
- Abstract: Cryptographic schemes like Fully Homomorphic Encryption (FHE) and Zero-Knowledge Proofs (ZKPs), while offering powerful privacy-preserving capabilities, are often hindered by their computational complexity. Polynomial multiplication, a core operation in these schemes, is a major performance bottleneck. While algorithmic advancements and specialized hardware like GPUs and FPGAs have shown promise in accelerating these computations, the recent surge in AI accelerators (TPUs/NPUs) presents a new opportunity. This paper explores the potential of leveraging TPUs/NPUs to accelerate polynomial multiplication, thereby enhancing the performance of FHE and ZKP schemes. We present techniques to adapt polynomial multiplication to these AI-centric architectures and provide a preliminary evaluation of their effectiveness. We also discuss current limitations and outline future directions for further performance improvements, paving the way for wider adoption of advanced cryptographic tools.
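The central observation is that polynomial multiplication is a discrete convolution, and a convolution can be recast as a matrix-vector product, exactly the operation that TPU/NPU matrix units execute natively. As a minimal sketch of that mapping (illustrative only, not the paper's implementation; the function names and float coefficients are assumptions), JAX code that compiles to TPUs might look like:

```python
# Sketch: polynomial multiplication as one matrix-vector product,
# the formulation that maps onto TPU/NPU matrix units.
# Illustrative only; names and float32 coefficients are assumptions.
import jax.numpy as jnp

def toeplitz_from_poly(a, n):
    """Build the (2n-1) x n convolution (Toeplitz) matrix of polynomial a."""
    # Entry (i, j) holds a[i - j] when 0 <= i - j < n, else 0,
    # so that T @ b computes the coefficient convolution of a and b.
    i = jnp.arange(2 * n - 1)[:, None]
    j = jnp.arange(n)[None, :]
    idx = i - j
    mask = (idx >= 0) & (idx < n)
    return jnp.where(mask, a[jnp.clip(idx, 0, n - 1)], 0.0)

def poly_mul(a, b):
    """Multiply two degree-(n-1) polynomials via a single matmul."""
    n = a.shape[0]                        # assumes len(a) == len(b)
    return toeplitz_from_poly(a, n) @ b   # lowered to the matrix unit

a = jnp.array([1.0, 2.0, 3.0])   # 1 + 2x + 3x^2
b = jnp.array([4.0, 5.0, 6.0])   # 4 + 5x + 6x^2
print(poly_mul(a, b))            # [ 4. 13. 28. 27. 18.]
```

In real FHE/ZKP workloads the coefficients live in large modular rings rather than floats, so a deployment would first decompose each coefficient into low-precision limbs that fit the accelerator's native datatypes; the sketch above only illustrates the structural mapping onto matrix hardware.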
Related papers
- Benchmarking End-To-End Performance of AI-Based Chip Placement Algorithms [77.71341200638416]
ChiPBench is a benchmark designed to evaluate the effectiveness of AI-based chip placement algorithms.
We have gathered 20 circuits from various domains (e.g., CPU, GPU, and microcontrollers) for evaluation.
Results show that even when the intermediate metric of a single-point algorithm is dominant, the final PPA (power, performance, area) results can be unsatisfactory.
arXiv Detail & Related papers (2024-07-03T03:29:23Z)
- Enhancing Dropout-based Bayesian Neural Networks with Multi-Exit on FPGA [20.629635991749808]
This paper proposes an algorithm and hardware co-design framework that can generate field-programmable gate array (FPGA)-based accelerators for efficient BayesNNs.
At the algorithm level, we propose novel multi-exit dropout-based BayesNNs with reduced computational and memory overheads.
At the hardware level, this paper introduces a transformation framework that can generate FPGA-based accelerators for the proposed efficient BayesNNs.
arXiv Detail & Related papers (2024-06-20T17:08:42Z)
- Many-body computing on Field Programmable Gate Arrays [5.612626580467746]
We leverage the capabilities of Field Programmable Gate Arrays (FPGAs) for conducting quantum many-body calculations.
This has resulted in a remarkable tenfold speedup compared to CPU-based computation.
arXiv Detail & Related papers (2024-02-09T14:01:02Z)
- Exploration of TPUs for AI Applications [0.0]
Tensor Processing Units (TPUs) are specialized hardware accelerators for deep learning developed by Google.
This paper explores TPUs in cloud and edge computing, focusing on their applications in AI.
arXiv Detail & Related papers (2023-09-16T07:58:05Z)
- Higher-order topological kernels via quantum computation [68.8204255655161]
Topological data analysis (TDA) has emerged as a powerful tool for extracting meaningful insights from complex data.
We propose a quantum approach to defining Betti kernels, which is based on constructing Betti curves with increasing order.
arXiv Detail & Related papers (2023-07-14T14:48:52Z)
- GloptiNets: Scalable Non-Convex Optimization with Certificates [61.50835040805378]
We present a novel approach to non-convex optimization with certificates, which handles smooth functions on the hypercube or on the torus.
By exploiting the regularity of the target function, intrinsic in the decay of its spectrum, we obtain precise certificates while leveraging advanced and powerful neural networks.
arXiv Detail & Related papers (2023-06-26T09:42:59Z)
- Hardware Acceleration of Explainable Artificial Intelligence [5.076419064097733]
We propose a simple yet efficient framework to accelerate various XAI algorithms with existing hardware accelerators.
Our proposed approach can lead to real-time outcome interpretation.
arXiv Detail & Related papers (2023-05-04T19:07:29Z)
- Decomposition of Matrix Product States into Shallow Quantum Circuits [62.5210028594015]
Tensor network (TN) algorithms can be mapped to parametrized quantum circuits (PQCs).
We propose a new protocol for approximating TN states using realistic quantum circuits.
Our results reveal one particular protocol, involving sequential growth and optimization of the quantum circuit, to outperform all other methods.
arXiv Detail & Related papers (2022-09-01T17:08:41Z)
- Polynomial unconstrained binary optimisation inspired by optical simulation [52.11703556419582]
We propose an algorithm inspired by optical coherent Ising machines to solve the problem of unconstrained binary optimization.
We benchmark the proposed algorithm against existing PUBO algorithms, and observe its superior performance.
The application of our algorithm to protein folding and quantum chemistry problems sheds light on the shortcomings of approximating the electronic structure problem by a PUBO problem.
arXiv Detail & Related papers (2021-06-24T16:39:31Z)
- Demystifying BERT: Implications for Accelerator Design [4.80595971865854]
We focus on BERT, one of the most popular NLP transfer learning algorithms, to identify how its algorithmic behavior can guide future accelerator design.
We characterize compute-intensive BERT computations and discuss software and possible hardware mechanisms to further optimize these computations.
Overall, our analysis identifies holistic solutions to optimize systems for BERT-like models.
arXiv Detail & Related papers (2021-04-14T01:06:49Z)
- Predictive Coding Approximates Backprop along Arbitrary Computation Graphs [68.8204255655161]
We develop a strategy to translate core machine learning architectures into their predictive coding equivalents.
Our models perform equivalently to backprop on challenging machine learning benchmarks.
Our method raises the possibility that standard machine learning algorithms could in principle be implemented directly in neural circuitry.
arXiv Detail & Related papers (2020-06-07T15:35:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.