Related papers: Highly Versatile FPGA-Implemented Cyber Coherent Ising Machine

Highly Versatile FPGA-Implemented Cyber Coherent Ising Machine

URL: http://arxiv.org/abs/2406.05377v1
Date: Sat, 8 Jun 2024 07:09:27 GMT
Title: Highly Versatile FPGA-Implemented Cyber Coherent Ising Machine
Authors: Toru Aonishi, Tatsuya Nagasawa, Toshiyuki Koizumi, Mastiyage Don Sudeera Hasaranga Gunathilaka, Kazushi Mimura, Masato Okada, Satoshi Kako, Yoshihisa Yamamoto,
Abstract summary: We have developed an FPGA implemented cyber coherent Ising machine (cyber CIM) that is much more versatile than previous implementations using FPGAs. Our architecture is versatile since it can be applied to the open-loop CIM, which was proposed when CIM research began, to the closed-loop CIM. The cyber CIM enables applications such as CDMA multi-user detector and L0 compressed sensing which were not possible with earlier FPGA systems.
Score: 0.7950056272504447
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In recent years, quantum Ising machines have drawn a lot of attention, but due to physical implementation constraints, it has been difficult to achieve dense coupling, such as full coupling with sufficient spins to handle practical large-scale applications. Consequently, classically computable equations have been derived from quantum master equations for these quantum Ising machines. Parallel implementations of these algorithms using FPGAs have been used to rapidly find solutions to these problems on a scale that is difficult to achieve in physical systems. We have developed an FPGA implemented cyber coherent Ising machine (cyber CIM) that is much more versatile than previous implementations using FPGAs. Our architecture is versatile since it can be applied to the open-loop CIM, which was proposed when CIM research began, to the closed-loop CIM, which has been used recently, as well as to Jacobi successive over-relaxation method. By modifying the sequence control code for the calculation control module, other algorithms such as Simulated Bifurcation (SB) can also be implemented. Earlier research on large-scale FPGA implementations of SB and CIM used binary or ternary discrete values for connections, whereas the cyber CIM used FP32 values. Also, the cyber CIM utilized Zeeman terms that were represented as FP32, which were not present in other large-scale FPGA systems. Our implementation with continuous interaction realizes N=4096 on a single FPGA, comparable to the single-FPGA implementation of SB with binary interactions, with N=4096. The cyber CIM enables applications such as CDMA multi-user detector and L0 compressed sensing which were not possible with earlier FPGA systems, while enabling superior calculation speeds, more than ten times faster than a GPU implementation. The calculation speed can be further improved by increasing parallelism, such as through clustering.

Related papers

if-ZKP: Intel FPGA-Based Acceleration of Zero Knowledge Proofs [3.0009885036586725]
We present a novel scalable architecture that is suitable for accelerating the zk-SNARK prover compute on FPGAs. We focus on the multi-scalar multiplication (MSM) that accounts for the majority of time spent in zk-SNARK systems. Our implementation runs 110x-150x faster compared to reference software library.
arXiv Detail & Related papers (2024-12-17T02:35:32Z)
Fast Algorithms and Implementations for Computing the Minimum Distance of Quantum Codes [43.96687298077534]
The distance of a stabilizer quantum code determines the number of errors that can be detected and corrected. We present three new fast algorithms and implementations for computing the symplectic distance of the associated classical code.
arXiv Detail & Related papers (2024-08-20T11:24:30Z)
Enhancing Dropout-based Bayesian Neural Networks with Multi-Exit on FPGA [20.629635991749808]
This paper proposes an algorithm and hardware co-design framework that can generate field-programmable gate array (FPGA)-based accelerators for efficient BayesNNs. At the algorithm level, we propose novel multi-exit dropout-based BayesNNs with reduced computational and memory overheads. At the hardware level, this paper introduces a transformation framework that can generate FPGA-based accelerators for the proposed efficient BayesNNs.
arXiv Detail & Related papers (2024-06-20T17:08:42Z)
Many-body computing on Field Programmable Gate Arrays [5.612626580467746]
We leverage the capabilities of Field Programmable Gate Arrays (FPGAs) for conducting quantum many-body calculations. This has resulted in a remarkable tenfold speedup compared to CPU-based computation.
arXiv Detail & Related papers (2024-02-09T14:01:02Z)
Reconfigurable Distributed FPGA Cluster Design for Deep Learning Accelerators [59.11160990637615]
We propose a distributed system based on lowpower embedded FPGAs designed for edge computing applications. The proposed system can simultaneously execute diverse Neural Network (NN) models, arrange the graph in a pipeline structure, and manually allocate greater resources to the most computationally intensive layers of the NN graph.
arXiv Detail & Related papers (2023-05-24T16:08:55Z)
End-to-end codesign of Hessian-aware quantized neural networks for FPGAs and ASICs [49.358119307844035]
We develop an end-to-end workflow for the training and implementation of co-designed neural networks (NNs) This makes efficient NN implementations in hardware accessible to nonexperts, in a single open-sourced workflow. We demonstrate the workflow in a particle physics application involving trigger decisions that must operate at the 40 MHz collision rate of the Large Hadron Collider (LHC) We implement an optimized mixed-precision NN for high-momentum particle jets in simulated LHC proton-proton collisions.
arXiv Detail & Related papers (2023-04-13T18:00:01Z)
LL-GNN: Low Latency Graph Neural Networks on FPGAs for High Energy Physics [45.666822327616046]
This work presents a novel reconfigurable architecture for Low Graph Neural Network (LL-GNN) designs for particle detectors. The LL-GNN design advances the next generation of trigger systems by enabling sophisticated algorithms to process experimental data efficiently.
arXiv Detail & Related papers (2022-09-28T12:55:35Z)
Optimization of FPGA-based CNN Accelerators Using Metaheuristics [1.854931308524932]
convolutional neural networks (CNNs) have demonstrated their ability to solve problems in many fields. FPGAs have seen a surge in interest for accelerating CNN inference. Current trend in FPGA-based CNN accelerators is to implement multiple convolutional layer processors (CLPs)
arXiv Detail & Related papers (2022-09-22T18:57:49Z)
Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design [66.39546326221176]
Attention-based neural networks have become pervasive in many AI tasks. The use of the attention mechanism and feed-forward network (FFN) demands excessive computational and memory resources. This paper proposes a hardware-friendly variant that adopts a unified butterfly sparsity pattern to approximate both the attention mechanism and the FFNs.
arXiv Detail & Related papers (2022-09-20T09:28:26Z)
Decomposition of Matrix Product States into Shallow Quantum Circuits [62.5210028594015]
tensor network (TN) algorithms can be mapped to parametrized quantum circuits (PQCs) We propose a new protocol for approximating TN states using realistic quantum circuits. Our results reveal one particular protocol, involving sequential growth and optimization of the quantum circuit, to outperform all other methods.
arXiv Detail & Related papers (2022-09-01T17:08:41Z)
Towards real-time and energy efficient Siamese tracking -- a hardware-software approach [0.0]
We propose a hardware-software implementation of the well-known fully connected Siamese tracker (SiamFC) We have developed a quantised Siamese network for the FINN accelerator, using algorithm-accelerator co-design, and performed design space exploration. For our network, running in the programmable logic part of the Zynq UltraScale+ MPSoC ZCU104, we achieved the processing of almost 50 frames-per-second with tracker accuracy on par with its floating point counterpart.
arXiv Detail & Related papers (2022-05-21T18:31:07Z)
Systolic-CNN: An OpenCL-defined Scalable Run-time-flexible FPGA Accelerator Architecture for Accelerating Convolutional Neural Network Inference in Cloud/Edge Computing [8.826181951806928]
Systolic-CNN is an OpenCL-defined scalable, run-time-flexible FPGA accelerator architecture. Systolic-CNN is optimized for accelerating the inference of various convolutional neural networks (CNNs) in multi-tenancy cloud/edge computing.
arXiv Detail & Related papers (2020-12-06T03:53:11Z)
Electronic structure with direct diagonalization on a D-Wave quantum annealer [62.997667081978825]
This work implements the general Quantum Annealer Eigensolver (QAE) algorithm to solve the molecular electronic Hamiltonian eigenvalue-eigenvector problem on a D-Wave 2000Q quantum annealer. We demonstrate the use of D-Wave hardware for obtaining ground and electronically excited states across a variety of small molecular systems.
arXiv Detail & Related papers (2020-09-02T22:46:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.