Scalable Thermodynamic Second-order Optimization
- URL: http://arxiv.org/abs/2502.08603v1
- Date: Wed, 12 Feb 2025 17:44:40 GMT
- Title: Scalable Thermodynamic Second-order Optimization
- Authors: Kaelan Donatella, Samuel Duffield, Denis Melanson, Maxwell Aifer, Phoebe Klett, Rajath Salegame, Zach Belateche, Gavin Crooks, Antonio J. Martinez, Patrick J. Coles
- Abstract summary: We propose a scalable algorithm for employing thermodynamic computers to accelerate a popular second-order optimizer called Kronecker-factored approximate curvature (K-FAC).
Numerical experiments show that even under significant quantization noise, the benefits of second-order optimization can be preserved.
We predict substantial speedups for large-scale vision and graph problems based on realistic hardware characteristics.
- Score: 0.0
- License:
- Abstract: Many hardware proposals have aimed to accelerate inference in AI workloads. Less attention has been paid to hardware acceleration of training, despite the enormous societal impact of rapid training of AI models. Physics-based computers, such as thermodynamic computers, offer an efficient means to solve key primitives in AI training algorithms. Optimizers that normally would be computationally out-of-reach (e.g., due to expensive matrix inversions) on digital hardware could be unlocked with physics-based hardware. In this work, we propose a scalable algorithm for employing thermodynamic computers to accelerate a popular second-order optimizer called Kronecker-factored approximate curvature (K-FAC). Our asymptotic complexity analysis predicts increasing advantage with our algorithm as $n$, the number of neurons per layer, increases. Numerical experiments show that even under significant quantization noise, the benefits of second-order optimization can be preserved. Finally, we predict substantial speedups for large-scale vision and graph problems based on realistic hardware characteristics.
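To make the "expensive matrix inversions" mentioned in the abstract concrete, here is a minimal NumPy sketch of a textbook K-FAC step for a single dense layer. It is an illustration under stated assumptions, not the authors' algorithm: the function name `kfac_update`, the `damping` parameter, and the simple additive (Tikhonov) damping of each factor are hypothetical choices made for this example. The two linear solves are the kind of operation whose cost grows roughly cubically with the layer width $n$ on digital hardware.

```python
import numpy as np

def kfac_update(acts, grads, damping=1e-3):
    """Return a K-FAC preconditioned gradient for one dense layer.

    acts:  (batch, n_in)  inputs to the layer
    grads: (batch, n_out) loss gradients w.r.t. the layer's pre-activations
    damping: hypothetical Tikhonov term added to each Kronecker factor
    """
    batch = acts.shape[0]
    grad_w = grads.T @ acts / batch        # (n_out, n_in) ordinary gradient
    a_factor = acts.T @ acts / batch       # (n_in, n_in) input second moment
    g_factor = grads.T @ grads / batch     # (n_out, n_out) gradient second moment

    a_damped = a_factor + damping * np.eye(a_factor.shape[0])
    g_damped = g_factor + damping * np.eye(g_factor.shape[0])

    # Compute G^{-1} grad_w A^{-1} with two linear solves instead of
    # forming the (n_in * n_out)-dimensional Kronecker product.
    left = np.linalg.solve(g_damped, grad_w)
    # a_damped is symmetric, so solving against the transpose applies A^{-1}
    # on the right.
    return np.linalg.solve(a_damped, left.T).T

rng = np.random.default_rng(0)
acts = rng.normal(size=(32, 4))
grads = rng.normal(size=(32, 3))
step = kfac_update(acts, grads, damping=1e-2)
print(step.shape)
```

The factored form relies on the identity $(A \otimes G)^{-1}\,\mathrm{vec}(\nabla W) = \mathrm{vec}(G^{-1}\,\nabla W\,A^{-1})$ for symmetric $A$ and column-major $\mathrm{vec}$, which is what keeps the solve at the size of a single layer's factors.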
Related papers
- Dynamic Range Reduction via Branch-and-Bound [1.533133219129073]
A key strategy for enhancing hardware accelerators is to reduce the precision of arithmetic operations.
This paper introduces a fully principled Branch-and-Bound algorithm for reducing precision needs in QUBO problems.
Experiments validate our algorithm's effectiveness on an actual quantum annealer.
arXiv Detail & Related papers (2024-09-17T03:07:56Z)
- Fast, Scalable, Warm-Start Semidefinite Programming with Spectral Bundling and Sketching [53.91395791840179]
We present Unified Spectral Bundling with Sketching (USBS), a provably correct, fast and scalable algorithm for solving massive SDPs.
USBS provides a 500x speed-up over the state-of-the-art scalable SDP solver on an instance with over 2 billion decision variables.
arXiv Detail & Related papers (2023-12-19T02:27:22Z)
- Fast Numerical Solver of Ising Optimization Problems via Pruning and Domain Selection [4.460518115427853]
We propose a fast and efficient solver for Ising optimization problems.
Our solver can be an order of magnitude faster than the classical solver, and at least two times faster than the quantum-inspired annealers.
arXiv Detail & Related papers (2023-12-10T09:43:15Z)
- Thermodynamic Computing System for AI Applications [0.0]
Physics-based hardware, such as thermodynamic computing, has the potential to provide a fast, low-power means to accelerate AI primitives.
We present the first continuous-variable thermodynamic computer, which we call the stochastic processing unit (SPU).
arXiv Detail & Related papers (2023-12-08T05:22:04Z)
- Pruning random resistive memory for optimizing analogue AI [54.21621702814583]
AI models present unprecedented challenges to energy consumption and environmental sustainability.
One promising solution is to revisit analogue computing, a technique that predates digital computing.
Here, we report a universal solution, software-hardware co-design using structural plasticity-inspired edge pruning.
arXiv Detail & Related papers (2023-11-13T08:59:01Z)
- Towards provably efficient quantum algorithms for large-scale machine-learning models [11.440134080370811]
We show that fault-tolerant quantum computing could possibly provide provably efficient resolutions for generic (stochastic) gradient descent algorithms.
We benchmark instances of large machine learning models from 7 million to 103 million parameters.
arXiv Detail & Related papers (2023-03-06T19:00:27Z)
- NeuralStagger: Accelerating Physics-constrained Neural PDE Solver with Spatial-temporal Decomposition [67.46012350241969]
This paper proposes a general acceleration methodology called NeuralStagger.
It decomposes the original learning task into several coarser-resolution subtasks.
We demonstrate the successful application of NeuralStagger on 2D and 3D fluid dynamics simulations.
arXiv Detail & Related papers (2023-02-20T19:36:52Z)
- Thermodynamic AI and the fluctuation frontier [0.0]
Many Artificial Intelligence (AI) algorithms are inspired by physics and employ fluctuations.
We propose a novel computing paradigm, where software and hardware become inseparable.
We identify stochastic bits (s-bits) and stochastic modes (s-modes) as the respective building blocks for discrete and continuous Thermodynamic AI hardware.
arXiv Detail & Related papers (2023-02-09T17:18:36Z)
- Variational Quantum Optimization with Multi-Basis Encodings [62.72309460291971]
We introduce a new variational quantum algorithm that benefits from two innovations: multi-basis graph encodings and nonlinear activation functions.
Our technique results in increased optimization performance, a factor-of-two increase in effective quantum resources, and a reduction in measurement complexity.
arXiv Detail & Related papers (2021-06-24T20:16:02Z)
- Kernel methods through the roof: handling billions of points efficiently [94.31450736250918]
Kernel methods provide an elegant and principled approach to nonparametric learning, but so far could hardly be used in large scale problems.
Recent advances have shown the benefits of a number of algorithmic ideas, for example combining optimization, numerical linear algebra and random projections.
Here, we push these efforts further to develop and test a solver that takes full advantage of GPU hardware.
arXiv Detail & Related papers (2020-06-18T08:16:25Z)
- One-step regression and classification with crosspoint resistive memory arrays [62.997667081978825]
High speed, low energy computing machines are in demand to enable real-time artificial intelligence at the edge.
One-step learning is supported by simulations of the prediction of the cost of a house in Boston and the training of a 2-layer neural network for MNIST digit recognition.
Results are all obtained in one computational step, thanks to the physical, parallel, and analog computing within the crosspoint array.
arXiv Detail & Related papers (2020-05-05T08:00:07Z)