Related papers: Scalable Thermodynamic Second-order Optimization

URL: http://arxiv.org/abs/2502.08603v1
Date: Wed, 12 Feb 2025 17:44:40 GMT
Title: Scalable Thermodynamic Second-order Optimization
Authors: Kaelan Donatella, Samuel Duffield, Denis Melanson, Maxwell Aifer, Phoebe Klett, Rajath Salegame, Zach Belateche, Gavin Crooks, Antonio J. Martinez, Patrick J. Coles
Abstract summary: We propose a scalable algorithm for employing thermodynamic computers to accelerate a popular second-order optimizer called Kronecker-factored approximate curvature (K-FAC). Numerical experiments show that even under significant quantization noise, the benefits of second-order optimization can be preserved. We predict substantial speedups for large-scale vision and graph problems based on realistic hardware characteristics.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Many hardware proposals have aimed to accelerate inference in AI workloads. Less attention has been paid to hardware acceleration of training, despite the enormous societal impact of rapid training of AI models. Physics-based computers, such as thermodynamic computers, offer an efficient means to solve key primitives in AI training algorithms. Optimizers that normally would be computationally out-of-reach (e.g., due to expensive matrix inversions) on digital hardware could be unlocked with physics-based hardware. In this work, we propose a scalable algorithm for employing thermodynamic computers to accelerate a popular second-order optimizer called Kronecker-factored approximate curvature (K-FAC). Our asymptotic complexity analysis predicts increasing advantage with our algorithm as $n$, the number of neurons per layer, increases. Numerical experiments show that even under significant quantization noise, the benefits of second-order optimization can be preserved. Finally, we predict substantial speedups for large-scale vision and graph problems based on realistic hardware characteristics.
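For context on why the matrix inversions mentioned in the abstract are costly, the standard K-FAC approximation for a single fully connected layer can be sketched as follows. The notation is an assumption, since the abstract does not define any: $F$ is the layer's Fisher (curvature) block, $A$ the covariance of the layer inputs, $G$ the covariance of the back-propagated pre-activation gradients, $W$ the weight matrix, $\mathcal{L}$ the loss, $\eta$ a learning rate, and $\lambda$ a simple damping term. This is the textbook form, not necessarily the exact variant used in the paper.

```latex
% Kronecker-factored approximation of the layer's curvature block
F \approx A \otimes G

% Inverting the Kronecker product only requires inverting each factor
% (A and G are symmetric):
(A \otimes G)^{-1} \operatorname{vec}\!\left(\nabla_W \mathcal{L}\right)
  = \operatorname{vec}\!\left( G^{-1} \, \nabla_W \mathcal{L} \, A^{-1} \right)

% Resulting damped, preconditioned update (illustrative damping choice)
W \leftarrow W - \eta \, (G + \lambda I)^{-1} \, \nabla_W \mathcal{L} \, (A + \lambda I)^{-1}
```

With $n$ neurons per layer, each factor is roughly $n \times n$, so a dense digital inversion costs on the order of $n^3$ operations per layer. This is the kind of step the abstract suggests could be offloaded to physics-based hardware.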

Related papers

Perturbation-efficient Zeroth-order Optimization for Hardware-friendly On-device Training [48.91359197313493]
Zeroth-order (ZO) optimization is an emerging deep neural network (DNN) training paradigm that offers computational simplicity and memory savings. ZO requires generating a substantial number of Gaussian random numbers, which poses significant difficulties and even makes it infeasible for hardware platforms such as FPGAs and ASICs. We propose PeZO, a perturbation-efficient ZO framework that significantly reduces the demand for random number generation. Our experiments show that PeZO reduces the LUTs and FFs required for random number generation by 48.6% and 12.7%, respectively, and saves up to 86% of power consumption.
arXiv Detail & Related papers (2025-04-28T23:58:07Z)
Pushing the Boundary of Quantum Advantage in Hard Combinatorial Optimization with Probabilistic Computers [0.4969640751053581]
We show that probabilistic computers (p-computers) provide a compelling and scalable classical pathway for solving hard optimization problems. We focus on two key algorithms applied to 3D spin glasses: discrete-time simulated quantum annealing (DT-SQA) and adaptive parallel tempering (APT). We show that APT, when supported by non-local isoenergetic cluster moves, exhibits a more favorable scaling and ultimately outperforms DT-SQA.
arXiv Detail & Related papers (2025-03-13T12:24:13Z)
Dynamic Range Reduction via Branch-and-Bound [1.533133219129073]
A key strategy for enhancing hardware accelerators is reducing the precision of arithmetic operations. This paper introduces a fully principled Branch-and-Bound algorithm for reducing precision needs in QUBO problems. Experiments validate our algorithm's effectiveness on an actual quantum annealer.
arXiv Detail & Related papers (2024-09-17T03:07:56Z)
Sparks of Quantum Advantage and Rapid Retraining in Machine Learning [0.0]
In this study, we optimize a powerful neural network architecture for representing complex functions with minimal parameters. We introduce rapid retraining capability, enabling the network to be retrained with new data without reprocessing old samples. Our findings suggest that with further advancements in quantum hardware and algorithm optimization, quantum-optimized machine learning models could have broad applications.
arXiv Detail & Related papers (2024-07-22T19:55:44Z)
Fast, Scalable, Warm-Start Semidefinite Programming with Spectral Bundling and Sketching [53.91395791840179]
We present Unified Spectral Bundling with Sketching (USBS), a provably correct, fast and scalable algorithm for solving massive SDPs. USBS provides a 500x speed-up over the state-of-the-art scalable SDP solver on an instance with over 2 billion decision variables.
arXiv Detail & Related papers (2023-12-19T02:27:22Z)
Fast Numerical Solver of Ising Optimization Problems via Pruning and Domain Selection [4.460518115427853]
We propose a fast and efficient solver for the Ising optimization problems. Our solver can be an order of magnitude faster than the classical solver, and at least two times faster than the quantum-inspired annealers.
arXiv Detail & Related papers (2023-12-10T09:43:15Z)
Thermodynamic Computing System for AI Applications [0.0]
Physics-based hardware, such as thermodynamic computing, has the potential to provide a fast, low-power means to accelerate AI primitives. We present the first continuous-variable thermodynamic computer, which we call the stochastic processing unit (SPU).
arXiv Detail & Related papers (2023-12-08T05:22:04Z)
Pruning random resistive memory for optimizing analogue AI [54.21621702814583]
AI models present unprecedented challenges to energy consumption and environmental sustainability. One promising solution is to revisit analogue computing, a technique that predates digital computing. Here, we report a universal solution, software-hardware co-design using structural plasticity-inspired edge pruning.
arXiv Detail & Related papers (2023-11-13T08:59:01Z)
Towards provably efficient quantum algorithms for large-scale machine-learning models [11.440134080370811]
We show that fault-tolerant quantum computing could possibly provide provably efficient resolutions for generic (stochastic) gradient descent algorithms. We benchmark instances of large machine learning models from 7 million to 103 million parameters.
arXiv Detail & Related papers (2023-03-06T19:00:27Z)
NeuralStagger: Accelerating Physics-constrained Neural PDE Solver with Spatial-temporal Decomposition [67.46012350241969]
This paper proposes a general acceleration methodology called NeuralStagger. It decomposes the original learning task into several coarser-resolution subtasks. We demonstrate the successful application of NeuralStagger on 2D and 3D fluid dynamics simulations.
arXiv Detail & Related papers (2023-02-20T19:36:52Z)
Thermodynamic AI and the fluctuation frontier [0.0]
Many Artificial Intelligence (AI) algorithms are inspired by physics and employ stochastic fluctuations. We propose a novel computing paradigm, where software and hardware become inseparable. We identify stochastic bits (s-bits) and stochastic modes (s-modes) as the respective building blocks for discrete and continuous Thermodynamic AI hardware.
arXiv Detail & Related papers (2023-02-09T17:18:36Z)
Variational Quantum Optimization with Multi-Basis Encodings [62.72309460291971]
We introduce a new variational quantum algorithm that benefits from two innovations: multi-basis graph encodings and nonlinear activation functions. Our technique results in increased optimization performance, a factor-of-two increase in effective quantum resources, and a reduction in measurement complexity.
arXiv Detail & Related papers (2021-06-24T20:16:02Z)
Kernel methods through the roof: handling billions of points efficiently [94.31450736250918]
Kernel methods provide an elegant and principled approach to nonparametric learning, but so far could hardly be used in large scale problems. Recent advances have shown the benefits of a number of algorithmic ideas, for example combining optimization, numerical linear algebra and random projections. Here, we push these efforts further to develop and test a solver that takes full advantage of GPU hardware.
arXiv Detail & Related papers (2020-06-18T08:16:25Z)
One-step regression and classification with crosspoint resistive memory arrays [62.997667081978825]
High speed, low energy computing machines are in demand to enable real-time artificial intelligence at the edge. One-step learning is supported by simulations of the prediction of the cost of a house in Boston and the training of a 2-layer neural network for MNIST digit recognition. Results are all obtained in one computational step, thanks to the physical, parallel, and analog computing within the crosspoint array.
arXiv Detail & Related papers (2020-05-05T08:00:07Z)

This list is automatically generated from the titles and abstracts of the papers on this site.