Convergence and scaling of Boolean-weight optimization for hardware
reservoirs
- URL: http://arxiv.org/abs/2305.07908v1
- Date: Sat, 13 May 2023 12:15:25 GMT
- Title: Convergence and scaling of Boolean-weight optimization for hardware
reservoirs
- Authors: Louis Andreoli, Stéphane Chrétien, Xavier Porte, Daniel Brunner
- Abstract summary: We analytically derive the scaling laws for highly efficient Coordinate Descent applied to optimizing the readout layer of a randomly connected recurrent neural network.
Our results perfectly reproduce the convergence and scaling of a large-scale photonic reservoir implemented in a proof-of-concept experiment.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hardware implementations of neural networks are an essential step toward
next-generation efficient and powerful artificial intelligence solutions.
Besides the realization of a parallel, efficient and scalable hardware
architecture, the optimization of the system's extremely large parameter space
with sampling-efficient approaches is essential.
Here, we analytically derive the scaling laws for highly efficient Coordinate
Descent applied to optimizing the readout layer of a randomly connected
recurrent neural network, a reservoir.
We demonstrate that the convergence is exponential and scales linearly with the
network's number of neurons.
Our results perfectly reproduce the convergence and scaling of a large-scale
photonic reservoir implemented in a proof-of-concept experiment.
Our work therefore provides a solid foundation for such optimization in
hardware networks, and identifies promising future directions for accelerating
convergence during learning by leveraging measures of a neural network's
amplitude statistics and the weight-update rule.
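The Boolean-weight Coordinate Descent the abstract describes can be sketched on a toy readout layer. This is a minimal illustration under assumed names and sizes (the paper's experimental reservoir, targets, and update schedule differ): one readout weight is flipped at a time, and the flip is kept only if the output error decreases.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a hardware reservoir readout: X holds the reservoir's
# neuron states over T time steps, y is a scalar target trace.
N, T = 100, 400
X = rng.standard_normal((T, N))
w_star = rng.integers(0, 2, N)        # a Boolean weight vector that fits y
y = X @ w_star

w = rng.integers(0, 2, N)             # initial random Boolean readout weights
err0 = np.mean((X @ w - y) ** 2)      # mean-squared output error before learning
err = err0

# Boolean coordinate descent: visit each weight in random order, flip it,
# and keep the flip only if the output error goes down.
for epoch in range(5):
    for i in rng.permutation(N):
        w[i] ^= 1
        trial = np.mean((X @ w - y) ** 2)
        if trial < err:
            err = trial               # keep the improving flip
        else:
            w[i] ^= 1                 # revert the flip
```

Each sweep costs N error evaluations, which is consistent with the abstract's point that convergence scales linearly with the number of neurons.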
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - Logarithmically Quantized Distributed Optimization over Dynamic Multi-Agent Networks [0.951494089949975]
We propose distributed optimization dynamics over multi-agent networks subject to logarithmically quantized data transmission.
As compared to uniform quantization, this allows for higher precision in representing near-optimal values and more accuracy of the distributed optimization algorithm.
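The precision advantage of logarithmic quantization can be seen in a small sketch. The power-of-base rounding below is an illustrative assumption, not necessarily the paper's exact quantizer: because values are rounded to the nearest power of the base, the *relative* error is bounded by a constant, so absolute error shrinks for near-zero (near-optimal) values, unlike uniform quantization.

```python
import numpy as np

def log_quantize(x, base=2.0):
    """Round each value's magnitude to the nearest power of `base`,
    preserving sign; zero maps to zero. Illustrative quantizer."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    nz = x != 0
    exp = np.round(np.log(np.abs(x[nz])) / np.log(base))
    out[nz] = np.sign(x[nz]) * base ** exp
    return out
```

For base 2, the quantized-to-true ratio lies in [2^{-1/2}, 2^{1/2}], so the relative error never exceeds 2^{1/2} - 1 ≈ 0.41 at any magnitude.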
arXiv Detail & Related papers (2024-10-27T06:01:01Z) - Geometry Aware Meta-Learning Neural Network for Joint Phase and Precoder Optimization in RIS [9.20186865054847]
We propose a complex-valued, geometry aware meta-learning neural network that maximizes the weighted sum rate in a multi-user multiple input single output system.
We use a complex-valued neural network for phase shifts and an Euler inspired update for the precoder network.
Our approach outperforms existing neural network-based algorithms, offering higher weighted sum rates, lower power consumption, and significantly faster convergence.
arXiv Detail & Related papers (2024-09-17T15:20:23Z) - Sparks of Quantum Advantage and Rapid Retraining in Machine Learning [0.0]
In this study, we optimize a powerful neural network architecture for representing complex functions with minimal parameters.
We introduce rapid retraining capability, enabling the network to be retrained with new data without reprocessing old samples.
Our findings suggest that with further advancements in quantum hardware and algorithm optimization, quantum-optimized machine learning models could have broad applications.
arXiv Detail & Related papers (2024-07-22T19:55:44Z) - An Adaptive Device-Edge Co-Inference Framework Based on Soft
Actor-Critic [72.35307086274912]
High-dimension parameter model and large-scale mathematical calculation restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) method, Soft Actor-Critic for discrete (SAC-d), which generates the exit point and compressing bits by soft policy iterations.
Based on the latency- and accuracy-aware reward design, such a computation can adapt well to complex environments such as dynamic wireless channels and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z) - Acceleration techniques for optimization over trained neural network
ensembles [1.0323063834827415]
We study optimization problems where the objective function is modeled through feedforward neural networks with rectified linear unit activation.
We present a mixed-integer linear program based on existing popular big-$M$ formulations for optimizing over a single neural network.
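The big-$M$ formulation that entry refers to is the standard linearization of a ReLU unit. For a unit $y = \max(0, a)$ with pre-activation $a = w^\top x + b$ and a bound $M \ge |a|$, a binary indicator $z$ encodes which branch of the max is active:

```latex
\begin{align}
y &\ge a, & y &\ge 0, \\
y &\le a + M(1 - z), & y &\le M z, & z &\in \{0, 1\}.
\end{align}
```

With $z = 1$ the constraints force $y = a$ (requiring $a \ge 0$); with $z = 0$ they force $y = 0$ (requiring $a \le 0$). Tighter choices of $M$ give stronger LP relaxations, which is where much of the acceleration work in such papers lies.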
arXiv Detail & Related papers (2021-12-13T20:50:54Z) - Non-Gradient Manifold Neural Network [79.44066256794187]
Deep neural network (DNN) generally takes thousands of iterations to optimize via gradient descent.
We propose a novel manifold neural network based on non-gradient optimization.
arXiv Detail & Related papers (2021-06-15T06:39:13Z) - Optimisation of a Siamese Neural Network for Real-Time Energy Efficient
Object Tracking [0.0]
Optimisation of visual object tracking using a Siamese neural network for embedded vision systems is presented.
It was assumed that the solution must operate in real time, preferably on a high-resolution video stream.
arXiv Detail & Related papers (2020-07-01T13:49:56Z) - Understanding the Effects of Data Parallelism and Sparsity on Neural
Network Training [126.49572353148262]
We study two factors in neural network training: data parallelism and sparsity.
Despite their promising benefits, understanding of their effects on neural network training remains elusive.
arXiv Detail & Related papers (2020-03-25T10:49:22Z) - Large-Scale Gradient-Free Deep Learning with Recursive Local
Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z) - Large Batch Training Does Not Need Warmup [111.07680619360528]
Training deep neural networks using a large batch size has shown promising results and benefits many real-world applications.
In this paper, we propose a novel Complete Layer-wise Adaptive Rate Scaling (CLARS) algorithm for large-batch training.
Based on our analysis, we bridge the gap and illustrate the theoretical insights for three popular large-batch training techniques.
arXiv Detail & Related papers (2020-02-04T23:03:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.