Physics-inspired Ising Computing with Ring Oscillator Activated p-bits
- URL: http://arxiv.org/abs/2205.07402v1
- Date: Sun, 15 May 2022 23:46:58 GMT
- Title: Physics-inspired Ising Computing with Ring Oscillator Activated p-bits
- Authors: Navid Anjum Aadit, Andrea Grimaldi, Giovanni Finocchio, and Kerem Y.
Camsari
- Abstract summary: We design and implement a truly asynchronous and medium-scale p-computer with $\approx$ 800 p-bits.
We evaluate the performance of the asynchronous architecture against an ideal, synchronous design.
Our results highlight the promise of massively scaled p-computers with millions of free-running p-bits.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The nearing end of Moore's Law has been driving the development of
domain-specific hardware tailored to solve a special set of problems. Along
these lines, probabilistic computing with inherently stochastic building blocks
(p-bits) has shown significant promise, particularly in the context of hard
optimization and statistical sampling problems. p-bits have been proposed and
demonstrated in different hardware substrates ranging from small-scale
stochastic magnetic tunnel junctions (sMTJs) in asynchronous architectures to
large-scale CMOS in synchronous architectures. Here, we design and implement a
truly asynchronous and medium-scale p-computer (with $\approx$ 800 p-bits) that
closely emulates the asynchronous dynamics of sMTJs in Field Programmable Gate
Arrays (FPGAs). Using hard instances of the planted Ising glass problem on the
Chimera lattice, we evaluate the performance of the asynchronous architecture
against an ideal, synchronous design that performs parallelized (chromatic)
exact Gibbs sampling. We find that despite the lack of any careful
synchronization, the asynchronous design achieves parallelism with algorithmic
scaling comparable to that of the ideal, carefully tuned and parallelized synchronous
design. Our results highlight the promise of massively scaled p-computers with
millions of free-running p-bits made out of nanoscale building blocks such as
stochastic magnetic tunnel junctions.
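To make the update rule concrete, here is a minimal software sketch of asynchronous p-bit sampling on a toy Ising instance. It assumes the standard p-bit equation from the literature, m_i = sgn(tanh(beta * I_i) + r) with r drawn uniformly from (-1, 1) and I_i = sum_j J_ij m_j + h_i; the random-order sweep is only a software stand-in for free-running FPGA p-bits, and none of this is the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def pbit_update(m, J, h, i, beta):
    """Sample p-bit i: m_i = sgn(tanh(beta * I_i) + r), r ~ U(-1, 1)."""
    I_i = J[i] @ m + h[i]              # local input to p-bit i
    return np.sign(np.tanh(beta * I_i) + rng.uniform(-1.0, 1.0))

def asynchronous_sweep(m, J, h, beta):
    """One sweep of updates in random order: a software stand-in for
    free-running p-bits that each update on their own clock."""
    for i in rng.permutation(len(m)):
        m[i] = pbit_update(m, J, h, i, beta)
    return m

# Toy Ising instance: random symmetric couplings on 16 spins
n = 16
J = rng.normal(size=(n, n))
J = (J + J.T) / 2
np.fill_diagonal(J, 0.0)
h = np.zeros(n)
m = rng.choice([-1.0, 1.0], size=n)

for _ in range(1000):
    m = asynchronous_sweep(m, J, h, beta=1.0)

print("energy:", -0.5 * m @ J @ m - h @ m)
```

The synchronous baseline in the paper instead performs chromatic Gibbs sampling: the p-bits are graph-colored so that no two same-colored p-bits share a coupling, and each color group is then updated exactly and in parallel.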
Related papers
- MindFlayer: Efficient Asynchronous Parallel SGD in the Presence of Heterogeneous and Random Worker Compute Times [49.1574468325115]
We study the problem of minimizing the expectation of smooth nonconvex functions with the help of several parallel workers.
We propose a new asynchronous SGD method, MindFlayer SGD, designed for the setting in which the workers' compute times are random and possibly heavy tailed.
Our theoretical and empirical results demonstrate the superiority of MindFlayer SGD in cases where the compute-time noise is heavy tailed.
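The abstract gives the setting but not the update rule, so the following is only a generic asynchronous-SGD sketch with heavy-tailed worker compute times (simulated here with a Pareto distribution), not MindFlayer SGD itself; all names and constants are illustrative.

```python
import threading, time
import numpy as np

x = np.zeros(10)                       # shared parameters
lock = threading.Lock()
grad_f = lambda v: 2 * v               # toy objective f(x) = ||x||^2

def worker(seed, steps=50):
    global x
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        with lock:
            x_stale = x.copy()                       # read possibly stale parameters
        time.sleep(min(rng.pareto(2.0), 10) * 1e-3)  # heavy-tailed compute time
        g = grad_f(x_stale)                          # gradient of the stale iterate
        with lock:
            x -= 0.01 * g                            # apply delayed update

x += 5.0                                             # start away from the optimum
threads = [threading.Thread(target=worker, args=(s,)) for s in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print("||x|| after async training:", np.linalg.norm(x))
```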
arXiv Detail & Related papers (2024-10-05T21:11:32Z)
- Freya PAGE: First Optimal Time Complexity for Large-Scale Nonconvex Finite-Sum Optimization with Heterogeneous Asynchronous Computations [92.1840862558718]
In practical distributed systems, workers are typically not homogeneous and can have highly varying processing times.
We introduce a new parallel method, Freya PAGE, to handle arbitrarily slow computations.
We show that Freya PAGE offers significantly improved complexity guarantees compared to all previous methods.
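Judging by its name, Freya PAGE builds on the PAGE gradient estimator; below is a minimal single-machine sketch of plain PAGE on a least-squares toy problem. The asynchronous scheduling of arbitrarily slow workers, which is the paper's actual contribution, is omitted.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 100, 5
A = rng.normal(size=(n, d)); b = rng.normal(size=n)

def grad_i(x, i):                     # gradient of f_i(x) = (a_i^T x - b_i)^2
    return 2 * A[i] * (A[i] @ x - b[i])

def full_grad(x):                     # gradient of f = (1/n) sum_i f_i
    return 2 * A.T @ (A @ x - b) / n

x = np.zeros(d)
g = full_grad(x)
p, lr = 0.1, 0.01
for t in range(500):
    x_prev, x = x, x - lr * g
    if rng.random() < p:              # occasional full-gradient refresh
        g = full_grad(x)
    else:                             # cheap incremental correction
        i = rng.integers(n)
        g = g + grad_i(x, i) - grad_i(x_prev, i)
print("loss:", np.mean((A @ x - b) ** 2))
```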
arXiv Detail & Related papers (2024-05-24T13:33:30Z)
- Scalable Parity Architecture With a Shuttling-Based Spin Qubit Processor [0.32985979395737786]
We present sequences of spin shuttling and quantum gates that implement the Parity Quantum Approximate Optimization Algorithm (QAOA).
We develop a detailed error model for a hardware-specific analysis of the Parity Architecture.
We find that with high-fidelity spin shuttling, the performance of the spin qubits is competitive with, or even exceeds, that of the transmons.
arXiv Detail & Related papers (2024-03-14T17:06:50Z)
- Parallelized Spatiotemporal Binding [47.67393266882402]
We introduce Parallelizable Spatiotemporal Binder or PSB, the first temporally-parallelizable slot learning architecture for sequential inputs.
Unlike conventional RNN-based approaches, PSB produces object-centric representations, known as slots, for all time-steps in parallel.
Compared to the state of the art, our architecture demonstrates stable training on longer sequences, achieves parallelization that results in a 60% increase in training speed, and yields performance that matches or exceeds it on unsupervised 2D and 3D object-centric scene decomposition and understanding.
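To make the sequential-versus-parallel contrast concrete, here is a toy sketch: an RNN-style binder whose slots at step t depend on step t-1, versus a batched attention update computed for all time-steps at once. The single dot-product attention step and all shapes are illustrative assumptions, not PSB's architecture.

```python
import numpy as np

rng = np.random.default_rng(3)
T, N, S, D = 8, 32, 4, 16        # time-steps, inputs per step, slots, feature dim
inputs = rng.normal(size=(T, N, D))
slots = rng.normal(size=(S, D))  # shared initial slots

def softmax(z, axis):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Sequential (RNN-like): slots at step t depend on slots at step t-1.
s, seq_slots = slots, []
for t in range(T):
    attn = softmax(s @ inputs[t].T, axis=-1)      # (S, N) attention weights
    s = attn @ inputs[t]                          # update slots from frame t
    seq_slots.append(s)

# Parallel (PSB-like): every step attends from the same initial slots,
# so all T updates reduce to one batched einsum with no time recurrence.
attn = softmax(np.einsum('sd,tnd->tsn', slots, inputs), axis=-1)
par_slots = np.einsum('tsn,tnd->tsd', attn, inputs)  # (T, S, D) at once
print(par_slots.shape)
```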
arXiv Detail & Related papers (2024-02-26T23:16:34Z)
- Robust Fully-Asynchronous Methods for Distributed Training over General Architecture [11.480605289411807]
Perfect synchronization in distributed machine learning problems is inefficient and even impossible due to the existence of latency, packet losses, and stragglers.
We propose a Robust Fully-Asynchronous Stochastic Gradient Tracking method (R-FAST), where each device performs local computation and communication at its own pace, without any form of synchronization.
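For context, this is a minimal sketch of classical synchronous gradient tracking, the scheme that R-FAST (per the summary) makes robust and fully asynchronous; the asynchrony itself is the paper's contribution and is not modeled here. The ring topology and quadratic objectives are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n_agents, d = 4, 3
targets = rng.normal(size=(n_agents, d))           # f_i(x) = ||x - t_i||^2
grad = lambda i, x: 2 * (x - targets[i])

# Doubly stochastic mixing matrix for a ring of 4 agents
W = np.array([[0.5, 0.25, 0.0, 0.25],
              [0.25, 0.5, 0.25, 0.0],
              [0.0, 0.25, 0.5, 0.25],
              [0.25, 0.0, 0.25, 0.5]])

X = np.zeros((n_agents, d))                        # local iterates
Y = np.array([grad(i, X[i]) for i in range(n_agents)])  # gradient trackers
lr = 0.1
for _ in range(200):
    X_new = W @ X - lr * Y                         # consensus step + descent
    # tracker update: mix neighbors' trackers, add local gradient change
    Y = W @ Y + np.array([grad(i, X_new[i]) - grad(i, X[i])
                          for i in range(n_agents)])
    X = X_new

print("consensus point:", X.mean(axis=0))          # approaches mean of targets
print("true minimizer: ", targets.mean(axis=0))
```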
arXiv Detail & Related papers (2023-07-21T14:36:40Z)
- Biologically Plausible Learning on Neuromorphic Hardware Architectures [27.138481022472]
Neuromorphic computing is an emerging paradigm that confronts the imbalance between computation and memory by performing computations directly in analog memories.
This work is the first to compare the impact of different learning algorithms on Compute-In-Memory-based hardware and vice versa.
arXiv Detail & Related papers (2022-12-29T15:10:59Z)
- Model-Architecture Co-Design for High Performance Temporal GNN Inference on FPGA [5.575293536755127]
Real-world applications require high performance inference on real-time streaming dynamic graphs.
We present a novel model-architecture co-design for inference in memory-based TGNNs on FPGAs.
We train our simplified models using knowledge distillation to ensure similar accuracy vis-à-vis the original model.
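The distillation step is standard; here is a minimal sketch of a Hinton-style knowledge-distillation loss that blends soft teacher targets with hard labels. The TGNN models and the FPGA co-design are not represented, and all hyperparameters are illustrative.

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """alpha * soft-target cross-entropy at temperature T
    + (1 - alpha) * hard-label cross-entropy."""
    p_teacher = softmax(teacher_logits, T)
    log_p_student_T = np.log(softmax(student_logits, T) + 1e-12)
    soft = -(p_teacher * log_p_student_T).sum(axis=-1).mean() * T * T
    log_p_student = np.log(softmax(student_logits) + 1e-12)
    hard = -log_p_student[np.arange(len(labels)), labels].mean()
    return alpha * soft + (1 - alpha) * hard

# Toy batch: 4 examples, 3 classes
rng = np.random.default_rng(5)
t_logits = rng.normal(size=(4, 3))
s_logits = rng.normal(size=(4, 3))
labels = np.array([0, 2, 1, 0])
print(distillation_loss(s_logits, t_logits, labels))
```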
arXiv Detail & Related papers (2022-03-10T00:24:47Z)
- Scaling Quantum Approximate Optimization on Near-term Hardware [49.94954584453379]
We quantify the scaling of the expected resource requirements of optimized circuits for hardware architectures with varying levels of connectivity.
We show that the number of measurements, and hence the total time to solution, grows exponentially with problem size and problem graph degree.
These problems may be alleviated by increasing hardware connectivity or by recently proposed modifications to the QAOA that achieve higher performance with fewer circuit layers.
arXiv Detail & Related papers (2022-01-06T21:02:30Z)
- Adaptive Fourier Neural Operators: Efficient Token Mixers for Transformers [55.90468016961356]
We propose an efficient token mixer that learns to mix in the Fourier domain.
AFNO is based on a principled foundation of operator learning.
It can handle a sequence size of 65k and outperforms other efficient self-attention mechanisms.
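The core mechanism is easy to sketch: FFT along the token dimension, a learned per-frequency transform, inverse FFT. The version below simplifies away AFNO's block-diagonal weights and soft thresholding, so treat it as the general Fourier-mixing idea rather than the paper's exact operator.

```python
import numpy as np

rng = np.random.default_rng(6)
n_tokens, d = 64, 16
x = rng.normal(size=(n_tokens, d))

# Learned complex per-frequency, per-channel weights (toy initialization)
n_freq = n_tokens // 2 + 1
w = rng.normal(size=(n_freq, d)) + 1j * rng.normal(size=(n_freq, d))

def fourier_mix(x, w):
    X = np.fft.rfft(x, axis=0)         # mix information across all tokens
    X = X * w                          # cheap O(n log n) global mixing
    return np.fft.irfft(X, n=x.shape[0], axis=0)

y = fourier_mix(x, w)
print(y.shape)                         # (64, 16): same shape, globally mixed
```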
arXiv Detail & Related papers (2021-11-24T05:44:31Z)
- Distributed stochastic optimization with large delays [59.95552973784946]
One of the most widely used methods for solving large-scale optimization problems is distributed asynchronous stochastic gradient descent (DASGD).
We show that DASGD converges to a global optimum of the model under mild assumptions on the delays.
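The delayed-update dynamics are simple to illustrate: each step applies a gradient evaluated at an iterate from tau steps earlier. The fixed delay and toy quadratic below are assumptions for illustration only.

```python
import numpy as np

grad = lambda x: 2 * x                 # f(x) = ||x||^2
lr, tau, T = 0.05, 5, 200
history = [np.full(3, 10.0)]           # iterate history, x_0 = (10, 10, 10)

for t in range(T):
    stale = history[max(0, t - tau)]   # gradient computed on a stale iterate
    history.append(history[-1] - lr * grad(stale))

print("final ||x||:", np.linalg.norm(history[-1]))  # near 0 despite the delay
```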
arXiv Detail & Related papers (2021-07-06T21:59:49Z)