Related papers: Fast multiplication by two's complement addition of numbers represented as a set of polynomial radix 2 indexes, stored as an integer list for massively parallel computation

Fast multiplication by two's complement addition of numbers represented as a set of polynomial radix 2 indexes, stored as an integer list for massively parallel computation

URL: http://arxiv.org/abs/2311.09922v3
Date: Sun, 28 Jul 2024 08:36:02 GMT
Title: Fast multiplication by two's complement addition of numbers represented as a set of polynomial radix 2 indexes, stored as an integer list for massively parallel computation
Authors: Mark Stocks,
Abstract summary: The 'polynomial integer index multiplication' method is a set of algorithms implemented in python code. We demonstrate the method to be faster than both the Number Theoretic Transform (NTT) and Karatsuba for multiplication within a certain bit range.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We demonstrate a multiplication method based on numbers represented as set of polynomial radix 2 indices stored as an integer list. The 'polynomial integer index multiplication' method is a set of algorithms implemented in python code. We demonstrate the method to be faster than both the Number Theoretic Transform (NTT) and Karatsuba for multiplication within a certain bit range. Also implemented in python code for comparison purposes with the polynomial radix 2 integer method. We demonstrate that it is possible to express any integer or real number as a list of integer indices, representing a finite series in base two. The finite series of integer index representation of a number can then be stored and distributed across multiple CPUs / GPUs. We show that operations of addition and multiplication can be applied as two's complement additions operating on the index integer representations and can be fully distributed across a given CPU / GPU architecture. We demonstrate fully distributed arithmetic operations such that the 'polynomial integer index multiplication' method overcomes the current limitation of parallel multiplication methods. Ie, the need to share common core memory and common disk for the calculation of results and intermediate results.

Related papers

A Summation-Based Algorithm For Integer Factorization [0.0]
This paper introduces a new method that converts an integer into a sum in base-2. It plays a crucial role in modern cryptography, notably, in the security of RSA encryption.
arXiv Detail & Related papers (2025-04-29T20:35:43Z)
A Fast Multiplication Algorithm and RLWE-PLWE Equivalence for the Maximal Real Subfield of the $2^r p^s$-th Cyclotomic Field [0.0]
We prove the RLWE-PLWE equivalence for the maximal real subfields of the cyclotomic fields with conductor $n = 2r ps$. We also describe a fast multiplication algorithm in the ring of integers of these real subfields.
arXiv Detail & Related papers (2025-04-07T15:01:48Z)
An average case efficient algorithm for solving two variable linear diophantine equations [0.0]
Solving two variable linear diophantine equations has applications in many cryptographic protocols such as RSA and Elliptic curve cryptography. We revisit two algorithms to solve two variable linear diophantine equations.
arXiv Detail & Related papers (2024-09-21T07:51:12Z)
Low-Complexity Integer Divider Architecture for Homomorphic Encryption [5.857929080874288]
Homomorphic encryption (HE) allows computations to be directly carried out on ciphertexts and enables privacy-preserving cloud computing. An algorithm is proposed to compute the quotient and vigorous mathematical proofs are provided.
arXiv Detail & Related papers (2024-01-19T23:53:59Z)
CoLA: Exploiting Compositional Structure for Automatic and Efficient Numerical Linear Algebra [62.37017125812101]
We propose a simple but general framework for large-scale linear algebra problems in machine learning, named CoLA. By combining a linear operator abstraction with compositional dispatch rules, CoLA automatically constructs memory and runtime efficient numerical algorithms. We showcase its efficacy across a broad range of applications, including partial differential equations, Gaussian processes, equivariant model construction, and unsupervised learning.
arXiv Detail & Related papers (2023-09-06T14:59:38Z)
Randomized Polar Codes for Anytime Distributed Machine Learning [66.46612460837147]
We present a novel distributed computing framework that is robust to slow compute nodes, and is capable of both approximate and exact computation of linear operations. We propose a sequential decoding algorithm designed to handle real valued data while maintaining low computational complexity for recovery. We demonstrate the potential applications of this framework in various contexts, such as large-scale matrix multiplication and black-box optimization.
arXiv Detail & Related papers (2023-09-01T18:02:04Z)
Scaling Integer Arithmetic in Probabilistic Programs [21.26857534769979]
We present a binary encoding strategy for discrete distributions that exploits the rich logical structure of integer operations. We show that this approach scales to much larger integer distributions with arithmetic.
arXiv Detail & Related papers (2023-07-25T22:21:07Z)
Batch-efficient EigenDecomposition for Small and Medium Matrices [65.67315418971688]
EigenDecomposition (ED) is at the heart of many computer vision algorithms and applications. We propose a QR-based ED method dedicated to the application scenarios of computer vision.
arXiv Detail & Related papers (2022-07-09T09:14:12Z)
An Efficient Summation Algorithm for the Accuracy, Convergence and Reproducibility of Parallel Numerical Methods [0.0]
We have introduced a new parallel algorithm for summing a sequence of floating-point numbers. This algorithm which scales up easily with the number of processors, adds numbers of the same exponents first. In this article, our main contribution is an extensive analysis of its efficiency with respect to several properties.
arXiv Detail & Related papers (2022-05-11T08:31:48Z)
Fast Differentiable Matrix Square Root and Inverse Square Root [65.67315418971688]
We propose two more efficient variants to compute the differentiable matrix square root and the inverse square root. For the forward propagation, one method is to use Matrix Taylor Polynomial (MTP), and the other method is to use Matrix Pad'e Approximants (MPA) A series of numerical tests show that both methods yield considerable speed-up compared with the SVD or the NS iteration.
arXiv Detail & Related papers (2022-01-29T10:00:35Z)
Fast Differentiable Matrix Square Root [65.67315418971688]
We propose two more efficient variants to compute the differentiable matrix square root. For the forward propagation, one method is to use Matrix Taylor Polynomial (MTP) The other method is to use Matrix Pad'e Approximants (MPA)
arXiv Detail & Related papers (2022-01-21T12:18:06Z)
Efficient Floating Point Arithmetic for Quantum Computers [1.189955933770711]
One of the major promises of quantum computing is the realization of SIMD (single instruction - multiple data) operations using the phenomenon of superposition. We introduce the formalism of encoding so called semi-booleans, which allows convenient generation of unsigned integer arithmetic quantum circuits. We extend this type of evaluation with additional features, such as ancilla-free in-place multiplication and integer coefficient evaluation.
arXiv Detail & Related papers (2021-12-20T14:00:36Z)
Kernel methods through the roof: handling billions of points efficiently [94.31450736250918]
Kernel methods provide an elegant and principled approach to nonparametric learning, but so far could hardly be used in large scale problems. Recent advances have shown the benefits of a number of algorithmic ideas, for example combining optimization, numerical linear algebra and random projections. Here, we push these efforts further to develop and test a solver that takes full advantage of GPU hardware.
arXiv Detail & Related papers (2020-06-18T08:16:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.