Energy-based learning algorithms for analog computing: a comparative
study
- URL: http://arxiv.org/abs/2312.15103v1
- Date: Fri, 22 Dec 2023 22:49:58 GMT
- Title: Energy-based learning algorithms for analog computing: a comparative
study
- Authors: Benjamin Scellier, Maxence Ernoult, Jack Kendall, Suhas Kumar
- Abstract summary: Energy-based learning algorithms have recently gained a surge of interest due to their compatibility with analog hardware.
We compare seven learning algorithms, namely contrastive learning (CL) and different variants of equilibrium propagation (EP) and coupled learning (CpL).
We find that negative perturbations are better than positive ones, and highlight the centered variant of EP as the best-performing algorithm.
- Score: 2.0937431058291933
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Energy-based learning algorithms have recently gained a surge of interest due
to their compatibility with analog (post-digital) hardware. Existing algorithms
include contrastive learning (CL), equilibrium propagation (EP) and coupled
learning (CpL), all of which contrast two states and differ in the type of
perturbation used to obtain the second state from the first.
However, these algorithms have never been explicitly compared on equal footing
with the same models and datasets, making it difficult to assess their scalability
and decide which one to select in practice. In this work, we carry out a
comparison of seven learning algorithms, namely CL and different variants of EP
and CpL depending on the signs of the perturbations. Specifically, using these
learning algorithms, we train deep convolutional Hopfield networks (DCHNs) on
five vision tasks (MNIST, F-MNIST, SVHN, CIFAR-10 and CIFAR-100). We find that,
while all algorithms yield comparable performance on MNIST, important
differences in performance arise as the difficulty of the task increases. Our
key findings reveal that negative perturbations are better than positive ones,
and highlight the centered variant of EP (which uses two perturbations of
opposite sign) as the best-performing algorithm. We also endorse these findings
with theoretical arguments. Additionally, we establish new SOTA results with
DCHNs on all five datasets, both in performance and speed. In particular, our
DCHN simulations run 13.5 times faster than those of Laborieux et al. (2021),
which we achieve thanks to the use of a novel energy minimisation algorithm
based on asynchronous updates, combined with reduced precision (16 bits).
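To make the shared contrastive structure of these algorithms concrete, below is a minimal sketch of the centered variant of EP on a toy one-layer Hopfield-style energy model. This is an illustration under simplifying assumptions (a quadratic energy, a squared-error cost, plain gradient-descent relaxation in NumPy), not the paper's DCHN implementation: the network is relaxed to equilibrium twice, with nudges of opposite sign (+beta and -beta) toward the target, and the weight gradient is estimated from the contrast between the two perturbed equilibria.

```python
# Minimal sketch of centered Equilibrium Propagation (EP) on a toy
# one-layer Hopfield-style energy model. Illustrative only: the energy,
# cost, and relaxation procedure are simplified assumptions, not the
# paper's DCHN code or training setup.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out, beta, lr = 5, 3, 0.1, 0.05

W = rng.normal(scale=0.1, size=(n_in, n_out))   # trainable weights
x = rng.normal(size=n_in)                        # input (clamped)
y = rng.normal(size=n_out)                       # target

def energy(W, s, x):
    # Hopfield-style energy: quadratic containment minus input drive.
    return 0.5 * s @ s - x @ W @ s

def cost(s, y):
    # Squared-error cost whose gradient nudges the state toward the target.
    return 0.5 * np.sum((s - y) ** 2)

def relax(W, x, y, nudge):
    # Gradient descent on the total energy E + nudge * C until the state
    # settles to an equilibrium (the "free" phase corresponds to nudge = 0).
    s = np.zeros(n_out)
    for _ in range(500):
        grad_s = (s - W.T @ x) + nudge * (s - y)
        s -= 0.1 * grad_s
    return s

def dE_dW(s, x):
    # Partial derivative of the energy with respect to the weights.
    return -np.outer(x, s)

# Free phase and the analytic loss gradient of this linear toy model,
# kept only as a sanity check for the EP estimate below.
s_free = relax(W, x, y, 0.0)
grad_analytic = np.outer(x, s_free - y)

# Two nudged phases with perturbations of opposite sign (+beta, -beta).
s_pos = relax(W, x, y, +beta)
s_neg = relax(W, x, y, -beta)

# Centered EP gradient estimate: contrast the two perturbed equilibria.
grad_ep = (dE_dW(s_pos, x) - dE_dW(s_neg, x)) / (2 * beta)
print("max |EP - analytic| =", np.abs(grad_ep - grad_analytic).max())

# One gradient step on the weights using the EP estimate.
W -= lr * grad_ep
```

For small beta the centered estimate approximately matches the true loss gradient (for this quadratic toy model the mismatch is of order beta squared), which is one way to see why two perturbations of opposite sign can outperform a single positive or negative one.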
Related papers
- Coverage Analysis of Multi-Environment Q-Learning Algorithms for Wireless Network Optimization [18.035417008213077]
Recent advancements include ensemble multi-environment hybrid Q-learning algorithms.
We show that our algorithm can achieve 50% less policy error and 40% less runtime complexity than state-of-the-art reinforcement learning algorithms.
arXiv Detail & Related papers (2024-08-29T20:09:20Z) - Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, based on minimizing the population loss, that are more suitable for active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z) - Federated Learning via Inexact ADMM [46.99210047518554]
In this paper, we develop an inexact alternating direction method of multipliers (ADMM) for federated learning.
It is both computation- and communication-efficient, capable of combating the stragglers' effect, and convergent under mild conditions.
It shows high numerical performance compared with several state-of-the-art algorithms for federated learning.
arXiv Detail & Related papers (2022-04-22T09:55:33Z) - Recursive Least Squares Advantage Actor-Critic Algorithms [20.792917267835247]
We propose two novel RLS-based advantage actor critic (A2C) algorithms.
The two algorithms, RLSSA2C and RLSNA2C, use the RLS method to train the critic network and the hidden layers of the actor network.
The experimental results show that both of our algorithms have better sample efficiency than the vanilla A2C on most games or tasks.
arXiv Detail & Related papers (2022-01-15T20:00:26Z) - A Pragmatic Look at Deep Imitation Learning [0.3626013617212666]
We re-implement 6 different adversarial imitation learning algorithms.
We evaluate them on a widely-used expert trajectory dataset.
GAIL consistently performs well across a range of sample sizes.
arXiv Detail & Related papers (2021-08-04T06:33:10Z) - Provably Faster Algorithms for Bilevel Optimization [54.83583213812667]
Bilevel optimization has been widely applied in many important machine learning applications.
We propose two new algorithms for bilevel optimization.
We show that both algorithms achieve a complexity of $\mathcal{O}(\epsilon^{-1.5})$, which outperforms all existing algorithms by an order of magnitude.
arXiv Detail & Related papers (2021-06-08T21:05:30Z) - Waypoint Planning Networks [66.72790309889432]
We propose a hybrid algorithm based on LSTMs that combines a local kernel (a classic algorithm such as A*) with a global kernel using a learned algorithm.
We compare WPN against A*, as well as related works including motion planning networks (MPNet) and value iteration networks (VIN).
It is shown that WPN's search space is considerably smaller than that of A*, while it is able to generate near-optimal results.
arXiv Detail & Related papers (2021-05-01T18:02:01Z) - Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms.
The learned algorithms are domain-agnostic and can generalize to new environments not seen during training.
We highlight two learned algorithms which obtain good generalization performance over other classical control tasks, gridworld type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z) - Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
Our method requires a much smaller number of communication rounds in theory.
Our experiments on several datasets demonstrate the effectiveness of our method and confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.