Related papers: M-ary Precomputation-Based Accelerated Scalar Multiplication Algorithms for Enhanced Elliptic Curve Cryptography

M-ary Precomputation-Based Accelerated Scalar Multiplication Algorithms for Enhanced Elliptic Curve Cryptography

URL: http://arxiv.org/abs/2505.01845v1
Date: Sat, 03 May 2025 15:18:54 GMT
Title: M-ary Precomputation-Based Accelerated Scalar Multiplication Algorithms for Enhanced Elliptic Curve Cryptography
Authors: Tongxi Wu, Xufeng Liu, Jin Yang, Yijie Zhu, Shunyang Zeng, Mingming Zhan,
Abstract summary: This paper proposes an M-ary precomputation-based scalar multiplication algorithm, aiming to optimize both computational efficiency and memory usage.<n> Experiments on ElGamal encryption and NS3-based communication simulations validate its effectiveness.<n>The binary-optimized variant reduces communication time by 22.1% on secp384r1 and simulation time by 25.4% on secp521r1.
Score: 2.8614337550669324
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Efficient scalar multiplication is critical for enhancing the performance of elliptic curve cryptography (ECC), especially in applications requiring large-scale or real-time cryptographic operations. This paper proposes an M-ary precomputation-based scalar multiplication algorithm, aiming to optimize both computational efficiency and memory usage. The method reduces the time complexity from $\Theta(Q \log p)$ to $\Theta\left(\frac{Q \log p}{\log Q}\right)$ and achieves a memory complexity of $\Theta\left(\frac{Q \log p}{\log^2 Q}\right)$. Experiments on ElGamal encryption and NS3-based communication simulations validate its effectiveness. On secp256k1, the proposed method achieves up to a 59\% reduction in encryption time and 30\% memory savings. In network simulations, the binary-optimized variant reduces communication time by 22.1\% on secp384r1 and simulation time by 25.4\% on secp521r1. The results demonstrate the scalability, efficiency, and practical applicability of the proposed algorithm. The source code will be publicly released upon acceptance.

Related papers

Fast correlated decoding of transversal logical algorithms [67.01652927671279]
Quantum error correction (QEC) is required for large-scale computation, but incurs a significant resource overhead.<n>Recent advances have shown that by jointly decoding logical qubits in algorithms composed of logical gates, the number of syndrome extraction rounds can be reduced.<n>Here, we reform the problem of decoding circuits by directly decoding relevant logical operator products as they propagate through the circuit.
arXiv Detail & Related papers (2025-05-19T18:00:00Z)
An Efficient Matrix Multiplication Algorithm for Accelerating Inference in Binary and Ternary Neural Networks [8.779871128906787]
We propose algorithms to improve the inference time and memory efficiency of Deep Neural Networks (DNNs)<n>We focus on matrix multiplication as the bottleneck operation of inference.<n>Our experiments show up to a 5.24x speedup in the inference time.
arXiv Detail & Related papers (2024-11-10T04:56:14Z)
An Efficient Algorithm for Modulus Operation and Its Hardware Implementation in Prime Number Calculation [0.0]
The proposed algorithm use only addition, subtraction, logical, and bit shift operations.<n>It addresses scalability challenges in cryptographic applications.<n>The application of this algorithm in prime number calculation up to 500,000 shows its practical utility and performance advantages.
arXiv Detail & Related papers (2024-07-17T13:24:52Z)
Hardware-Aware Parallel Prompt Decoding for Memory-Efficient Acceleration of LLM Inference [19.167604927651073]
Auto-regressive decoding of Large Language Models (LLMs) results in significant overheads in their hardware performance. We propose a novel parallel prompt decoding that requires only $0.0002$% trainable parameters, enabling efficient training on a single A100-40GB GPU in just 16 hours. Our approach demonstrates up to 2.49$times$ speedup and maintains a minimal memory overhead of just $0.0004$%.
arXiv Detail & Related papers (2024-05-28T22:19:30Z)
Fast, Scalable, Warm-Start Semidefinite Programming with Spectral Bundling and Sketching [53.91395791840179]
We present Unified Spectral Bundling with Sketching (USBS), a provably correct, fast and scalable algorithm for solving massive SDPs. USBS provides a 500x speed-up over the state-of-the-art scalable SDP solver on an instance with over 2 billion decision variables.
arXiv Detail & Related papers (2023-12-19T02:27:22Z)
Improved Rate of First Order Algorithms for Entropic Optimal Transport [2.1485350418225244]
This paper improves the state-of-the-art rate of a first-order algorithm for solving entropy regularized optimal transport. We propose an accelerated primal-dual mirror descent algorithm with variance reduction. Our algorithm may inspire more research to develop accelerated primal-dual algorithms that have rate $widetildeO(n2/epsilon)$ for solving OT.
arXiv Detail & Related papers (2023-01-23T19:13:25Z)
RSC: Accelerating Graph Neural Networks Training via Randomized Sparse Computations [56.59168541623729]
Training graph neural networks (GNNs) is time consuming because sparse graph-based operations are hard to be accelerated by hardware. We explore trading off the computational precision to reduce the time complexity via sampling-based approximation. We propose Randomized Sparse Computation, which for the first time demonstrate the potential of training GNNs with approximated operations.
arXiv Detail & Related papers (2022-10-19T17:25:33Z)
Communication-Efficient Adam-Type Algorithms for Distributed Data Mining [93.50424502011626]
We propose a class of novel distributed Adam-type algorithms (emphi.e., SketchedAMSGrad) utilizing sketching. Our new algorithm achieves a fast convergence rate of $O(frac1sqrtnT + frac1(k/d)2 T)$ with the communication cost of $O(k log(d))$ at each iteration.
arXiv Detail & Related papers (2022-10-14T01:42:05Z)
Rapid Person Re-Identification via Sub-space Consistency Regularization [51.76876061721556]
Person Re-Identification (ReID) matches pedestrians across disjoint cameras. Existing ReID methods adopting real-value feature descriptors have achieved high accuracy, but they are low in efficiency due to the slow Euclidean distance computation. We propose a novel Sub-space Consistency Regularization (SCR) algorithm that can speed up the ReID procedure by 0.25$ times.
arXiv Detail & Related papers (2022-07-13T02:44:05Z)
Private Frequency Estimation via Projective Geometry [47.112770141205864]
We propose a new algorithm ProjectiveGeometryResponse (PGR) for locally differentially private (LDP) frequency estimation. For a universe size of $k$ and with $n$ users, our $varepsilon$-LDP algorithm has communication cost $lceillogkrceil bits in the private coin setting and $varepsilonlog e + O(1)$ in the public coin setting. In many parameter settings used in practice this is a significant improvement over the O(n+k2)$optimal cost that is achieved by the recent PI-
arXiv Detail & Related papers (2022-03-01T02:49:55Z)
Fast and Scalable Computation of the Forward and Inverse Discrete Periodic Radon Transform [2.2940141855172027]
The Discrete Periodic Radon Transform (DPRT) has been extensively used in applications that involve image reconstructions from projections. This manuscript introduces a fast and scalable approach for computing the forward and inverse DPRT.
arXiv Detail & Related papers (2021-12-24T22:33:13Z)
VersaGNN: a Versatile accelerator for Graph neural networks [81.1667080640009]
We propose textitVersaGNN, an ultra-efficient, systolic-array-based versatile hardware accelerator. textitVersaGNN achieves on average 3712$times$ speedup with 1301.25$times$ energy reduction on CPU, and 35.4$times$ speedup with 17.66$times$ energy reduction on GPU.
arXiv Detail & Related papers (2021-05-04T04:10:48Z)

This list is automatically generated from the titles and abstracts of the papers in this site.