Prime and Modulate Learning: Generation of forward models with signed
back-propagation and environmental cues
- URL: http://arxiv.org/abs/2309.03825v1
- Date: Thu, 7 Sep 2023 16:34:30 GMT
- Title: Prime and Modulate Learning: Generation of forward models with signed
back-propagation and environmental cues
- Authors: Sama Daryanavard, Bernd Porr
- Abstract summary: Deep neural networks employing error back-propagation for learning can suffer from exploding and vanishing gradient problems.
In this work we follow a different approach where back-propagation makes exclusive use of the sign of the error signal to prime the learning.
We present a mathematical derivation of the learning rule in z-space and demonstrate the real-time performance with a robotic platform.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks employing error back-propagation for learning can suffer
from exploding and vanishing gradient problems. Numerous solutions have been
proposed such as normalisation techniques or limiting activation functions to
linear rectifying units. In this work we follow a different approach which is
particularly applicable to closed-loop learning of forward models where
back-propagation makes exclusive use of the sign of the error signal to prime
the learning, whilst a global relevance signal modulates the rate of learning.
This is inspired by the interaction between local plasticity and global
neuromodulation. For example, whilst driving on an empty road, one can allow
for slow step-wise optimisation of actions, whereas, at a busy junction, an
error must be corrected at once. Hence, the error is the priming signal and the
intensity of the experience is a modulating factor in the weight change. The
advantages of this Prime and Modulate paradigm are twofold: it is free from
normalisation and it makes use of relevant cues from the environment to enrich
the learning. We present a mathematical derivation of the learning rule in
z-space and demonstrate the real-time performance with a robotic platform. The
results show a significant improvement in the speed of convergence compared to
that of the conventional back-propagation.
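To make the idea concrete, here is a minimal numpy sketch of what a sign-primed, globally modulated update could look like for a single linear layer; the variable names, the outer-product form and the way the relevance cue enters are illustrative assumptions, not the exact rule derived in the paper's z-space formulation.

```python
import numpy as np

def prime_and_modulate_update(w, x, error, relevance, lr=0.01):
    """Sign-primed, globally modulated update for one linear layer (illustrative).

    w         : (n_out, n_in) weight matrix
    x         : (n_in,) presynaptic input
    error     : (n_out,) error at this layer; only its sign is used (the "prime")
    relevance : scalar environmental cue in [0, 1] scaling the step (the "modulate")
    """
    primed = np.sign(error)                    # priming: sign only, so magnitudes cannot explode or vanish
    dw = lr * relevance * np.outer(primed, x)  # modulation: one global factor scales the whole step
    return w + dw

rng = np.random.default_rng(0)
w = rng.normal(size=(2, 3))
x, err = rng.normal(size=3), rng.normal(size=2)
w_empty_road = prime_and_modulate_update(w, x, err, relevance=0.1)     # slow, step-wise optimisation
w_busy_junction = prime_and_modulate_update(w, x, err, relevance=1.0)  # error corrected at once
```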
Related papers
- Error Sensitivity Modulation based Experience Replay: Mitigating Abrupt Representation Drift in Continual Learning [13.041607703862724]
We propose ESMER, which employs a principled mechanism to modulate error sensitivity in a dual-memory rehearsal-based system.
ESMER effectively reduces forgetting and abrupt drift in representations at the task boundary by gradually adapting to the new task while consolidating knowledge.
Remarkably, it also enables the model to learn under high levels of label noise, which is ubiquitous in real-world data streams.
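As a rough, hypothetical sketch of what modulating error sensitivity can mean in practice (the running-average heuristic and threshold below are assumptions for illustration, not ESMER's actual mechanism):

```python
import numpy as np

class ErrorSensitivityModulator:
    """Down-weight samples whose loss is far above a running estimate of the
    typical loss, so abrupt large errors (task boundary, noisy labels) are
    learned from only gradually. Hypothetical sketch, not ESMER's actual rule."""

    def __init__(self, momentum=0.99, threshold=2.0):
        self.avg_loss = None
        self.momentum = momentum
        self.threshold = threshold

    def weights(self, losses):
        losses = np.asarray(losses, dtype=float)
        if self.avg_loss is None:
            self.avg_loss = float(losses.mean())
        limit = self.threshold * self.avg_loss
        w = np.where(losses <= limit, 1.0, limit / np.maximum(losses, 1e-12))
        # update the running average using the modulated losses only
        self.avg_loss = self.momentum * self.avg_loss + (1 - self.momentum) * float((w * losses).mean())
        return w

mod = ErrorSensitivityModulator()
batch_losses = np.array([0.4, 0.5, 3.0])    # the last sample looks like noise or drift
weighted_loss = float((mod.weights(batch_losses) * batch_losses).mean())
```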
arXiv Detail & Related papers (2023-02-14T16:35:54Z)
- Random Weight Factorization Improves the Training of Continuous Neural Representations [1.911678487931003]
Continuous neural representations have emerged as a powerful and flexible alternative to classical discretized representations of signals.
We propose random weight factorization as a simple drop-in replacement for parameterizing and initializing conventional linear layers.
We show how this factorization alters the underlying loss landscape and effectively enables each neuron in the network to learn using its own self-adaptive learning rate.
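A minimal sketch of the factorization idea, assuming each dense layer is parameterized as W = diag(exp(s)) V with a per-neuron scale s and direction part V both trainable; the initialization statistics used here are assumptions:

```python
import numpy as np

class FactorizedLinear:
    """Dense layer parameterized as W = diag(exp(s)) @ V, with the per-neuron
    log-scale s and the direction part V both trainable. A conventional init
    is drawn first and then factored; exact statistics are assumptions."""

    def __init__(self, n_in, n_out, mean=1.0, std=0.1, rng=None):
        rng = rng if rng is not None else np.random.default_rng(0)
        w = rng.normal(scale=1.0 / np.sqrt(n_in), size=(n_out, n_in))  # standard init
        self.s = rng.normal(loc=mean, scale=std, size=(n_out, 1))      # random per-neuron scale
        self.v = w / np.exp(self.s)     # so that exp(s) * v reproduces w at initialization
        self.b = np.zeros(n_out)

    def __call__(self, x):
        # Training s and v separately lets each neuron's effective step size adapt,
        # which is the "self-adaptive learning rate" effect described above.
        return (np.exp(self.s) * self.v) @ x + self.b

layer = FactorizedLinear(4, 3)
y = layer(np.ones(4))
```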
arXiv Detail & Related papers (2022-10-03T23:48:48Z)
- Characterizing and overcoming the greedy nature of learning in multi-modal deep neural networks [62.48782506095565]
We show that due to the greedy nature of learning in deep neural networks, models tend to rely on just one modality while under-fitting the other modalities.
We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning.
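Purely as an illustration of balancing learning speeds between modalities (the speed measure and the inverse scaling are assumptions, not the paper's algorithm):

```python
import numpy as np

def balanced_modality_lrs(base_lr, learning_speeds, eps=1e-8):
    """Give currently fast-learning modalities a smaller learning rate and slow
    ones a larger one, so no single modality dominates while the others under-fit.
    Illustrative heuristic only."""
    speeds = np.asarray(learning_speeds, dtype=float) + eps
    scale = (1.0 / speeds) / np.mean(1.0 / speeds)   # keep the average rate at base_lr
    return base_lr * scale

lrs = balanced_modality_lrs(1e-3, [0.4, 0.1])   # e.g. audio branch learning 4x faster than video
```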
arXiv Detail & Related papers (2022-02-10T20:11:21Z)
- Error-driven Input Modulation: Solving the Credit Assignment Problem without a Backward Pass [8.39059551023011]
Supervised learning in artificial neural networks typically relies on backpropagation.
We show that this approach lacks biological plausibility in many regards.
We propose to replace the backward pass with a second forward pass in which the input signal is modulated based on the error of the network.
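A sketch in the spirit of this idea for a one-hidden-layer network; the fixed random projection F, the ReLU nonlinearity and the exact form of the local updates are illustrative assumptions:

```python
import numpy as np

def error_modulated_update(params, x, target, lr=0.01):
    """Two forward passes, no backward pass: the second pass sees the input
    modulated by the output error projected through a fixed random matrix F,
    and weights are updated from the difference between the two passes."""
    W1, W2, F = params["W1"], params["W2"], params["F"]
    relu = lambda z: np.maximum(z, 0.0)

    h1 = relu(W1 @ x)                 # first (clean) forward pass
    e = W2 @ h1 - target              # output error

    x_mod = x + F @ e                 # error-driven input modulation
    h2 = relu(W1 @ x_mod)             # second forward pass

    W1 -= lr * np.outer(h1 - h2, x_mod)   # local update from the activation difference
    W2 -= lr * np.outer(e, h2)            # output layer trained on the error directly
    return params

rng = np.random.default_rng(0)
params = {"W1": rng.normal(scale=0.1, size=(8, 4)),
          "W2": rng.normal(scale=0.1, size=(3, 8)),
          "F":  rng.normal(scale=0.1, size=(4, 3))}   # fixed random projection
params = error_modulated_update(params, x=rng.normal(size=4), target=np.zeros(3))
```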
arXiv Detail & Related papers (2022-01-27T17:12:05Z)
- Learning in Sinusoidal Spaces with Physics-Informed Neural Networks [22.47355575565345]
A physics-informed neural network (PINN) uses physics-augmented loss functions to ensure its output is consistent with fundamental physics laws.
It turns out to be difficult to train an accurate PINN model for many problems in practice.
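For readers unfamiliar with PINNs, a toy physics-augmented loss (the ODE, the finite-difference derivative and the equal weighting of the two terms are illustrative choices; real PINNs use automatic differentiation):

```python
import numpy as np

def pinn_loss(u, xs, h=1e-4):
    """Physics-augmented loss for the toy ODE du/dx = -u with u(0) = 1: penalise
    the residual of the law at collocation points plus the boundary condition.
    Finite differences keep the sketch dependency-free."""
    du = (u(xs + h) - u(xs - h)) / (2 * h)     # du/dx at the collocation points
    physics = du + u(xs)                       # = 0 wherever the ODE is satisfied
    boundary = u(np.array([0.0])) - 1.0        # = 0 if the boundary condition holds
    return float(np.mean(physics**2) + np.mean(boundary**2))

xs = np.linspace(0.0, 1.0, 32)
print(pinn_loss(lambda x: np.exp(-x), xs))   # exact solution: loss ~ 0
print(pinn_loss(lambda x: 1.0 - x, xs))      # poor surrogate: clearly larger loss
```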
arXiv Detail & Related papers (2021-09-20T07:42:41Z)
- Characterizing possible failure modes in physics-informed neural networks [55.83255669840384]
Recent work in scientific machine learning has developed so-called physics-informed neural network (PINN) models.
We demonstrate that, while existing PINN methodologies can learn good models for relatively trivial problems, they can easily fail to learn relevant physical phenomena even for simple PDEs.
We show that these possible failure modes are not due to the lack of expressivity in the NN architecture, but that the PINN's setup makes the loss landscape very hard to optimize.
arXiv Detail & Related papers (2021-09-02T16:06:45Z)
- Learning to Learn to Demodulate with Uncertainty Quantification via Bayesian Meta-Learning [59.014197664747165]
We introduce the use of Bayesian meta-learning via variational inference for the purpose of obtaining well-calibrated few-pilot demodulators.
The resulting Bayesian ensembles offer better calibrated soft decisions, at the computational cost of running multiple instances of the neural network for demodulation.
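A minimal sketch of the ensembling step only (the toy linear demodulator and the way posterior weight samples are supplied are assumptions; the paper obtains them via variational meta-learning):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def ensemble_soft_decisions(demod_fn, weight_samples, rx):
    """Run the demodulator once per weight sample drawn from the posterior and
    average the per-symbol class probabilities; the averaged soft decisions are
    the better-calibrated output, at the cost of one forward pass per sample."""
    probs = [softmax(demod_fn(w, rx)) for w in weight_samples]
    return np.mean(probs, axis=0)          # (n_symbols, n_constellation_points)

rng = np.random.default_rng(0)
demod_fn = lambda w, x: x @ w                                 # toy linear demodulator
weight_samples = [rng.normal(size=(2, 4)) for _ in range(3)]  # stand-in posterior samples
rx = rng.normal(size=(5, 2))                                  # 5 received symbols, 2 features each
soft = ensemble_soft_decisions(demod_fn, weight_samples, rx)
```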
arXiv Detail & Related papers (2021-08-02T11:07:46Z)
- Learning Frequency Domain Approximation for Binary Neural Networks [68.79904499480025]
We propose to estimate the gradient of sign function in the Fourier frequency domain using the combination of sine functions for training BNNs.
The experiments on several benchmark datasets and neural architectures illustrate that the binary network learned using our method achieves the state-of-the-art accuracy.
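A small sketch of approximating sign(x) with a truncated series of sine functions and using its derivative as the backward-pass surrogate; the number of terms and the base frequency are illustrative choices, not the paper's settings:

```python
import numpy as np

def sign_fourier(x, n_terms=8, omega=np.pi):
    """Truncated Fourier series of the square wave:
    sign(x) ~ (4/pi) * sum_k sin((2k+1)*omega*x) / (2k+1)."""
    k = np.arange(n_terms)[:, None]
    return (4 / np.pi) * np.sum(np.sin((2 * k + 1) * omega * x) / (2 * k + 1), axis=0)

def sign_fourier_grad(x, n_terms=8, omega=np.pi):
    """Derivative of the truncated series, used in the backward pass instead of
    the usual straight-through estimator."""
    k = np.arange(n_terms)[:, None]
    return (4 * omega / np.pi) * np.sum(np.cos((2 * k + 1) * omega * x), axis=0)

x = np.linspace(-1, 1, 5)
w_binary = np.sign(x)                    # forward pass keeps the exact binarisation
grad_surrogate = sign_fourier_grad(x)    # backward pass uses the smooth approximation
```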
arXiv Detail & Related papers (2021-03-01T08:25:26Z)
- GradInit: Learning to Initialize Neural Networks for Stable and Efficient Training [59.160154997555956]
We present GradInit, an automated and architecture-agnostic method for initializing neural networks.
It is based on a simple heuristic: the variance of each network layer is adjusted so that a single step of SGD or Adam results in the smallest possible loss value.
It also enables training the original Post-LN Transformer for machine translation without learning rate warmup.
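An illustrative, much-simplified take on that heuristic (grid search over per-layer scales instead of the paper's gradient-based optimisation, with a hypothetical loss/gradient interface):

```python
import numpy as np

def gradinit_style_scales(weights, loss_fn, grad_fn, batch, lr=0.1,
                          candidates=(0.25, 0.5, 1.0, 2.0, 4.0)):
    """Pick a multiplicative scale for each layer's initial weights so that the
    loss after ONE simulated SGD step is as small as possible."""
    scales = [1.0] * len(weights)
    for i in range(len(weights)):
        best_scale, best_loss = 1.0, np.inf
        for c in candidates:
            trial = [w * (c if j == i else scales[j]) for j, w in enumerate(weights)]
            stepped = [w - lr * g for w, g in zip(trial, grad_fn(trial, batch))]
            loss_after_step = loss_fn(stepped, batch)
            if loss_after_step < best_loss:
                best_scale, best_loss = c, loss_after_step
        scales[i] = best_scale
    return scales

# Toy usage: a single "layer" (linear regression) with an analytic gradient
rng = np.random.default_rng(0)
X, y = rng.normal(size=(16, 3)), rng.normal(size=16)
loss_fn = lambda ws, b: float(np.mean((b[0] @ ws[0] - b[1]) ** 2))
grad_fn = lambda ws, b: [2 * b[0].T @ (b[0] @ ws[0] - b[1]) / len(b[1])]
scales = gradinit_style_scales([rng.normal(size=3)], loss_fn, grad_fn, (X, y))
```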
arXiv Detail & Related papers (2021-02-16T11:45:35Z)
- Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, a truncated max-product Belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
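For context, a minimal, non-trainable min-sum (max-product) sweep on a 1-D scanline, the kind of inference the BP-Layer builds on; the truncated-linear pairwise cost and the single forward/backward sweep are simplifying assumptions:

```python
import numpy as np

def bp_scanline(unary, lam=1.0, tau=2.0):
    """One forward and one backward min-sum message sweep along a 1-D scanline
    with a truncated-linear pairwise cost; returns the per-pixel label with the
    lowest belief. The real BP-Layer makes this trainable inside a CNN."""
    n, k = unary.shape
    labels = np.arange(k)
    pairwise = lam * np.minimum(np.abs(labels[:, None] - labels[None, :]), tau)

    fwd = np.zeros((n, k))   # messages passed left -> right
    bwd = np.zeros((n, k))   # messages passed right -> left
    for i in range(1, n):
        fwd[i] = np.min(unary[i - 1] + fwd[i - 1] + pairwise, axis=1)
        j = n - 1 - i
        bwd[j] = np.min(unary[j + 1] + bwd[j + 1] + pairwise, axis=1)

    beliefs = unary + fwd + bwd
    return beliefs.argmin(axis=1)

costs = np.random.default_rng(0).random((8, 4))   # e.g. 8 pixels, 4 disparity labels
disparity = bp_scanline(costs)
```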
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
- A data-driven choice of misfit function for FWI using reinforcement learning [0.0]
We use a deep-Q network (DQN) to learn an optimal policy to determine the proper timing to switch between different misfit functions.
Specifically, we train the state-action value function (Q) to predict when to use the conventional L2-norm misfit function or the more advanced optimal-transport matching-filter (OTMF) misfit.
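A toy stand-in for the idea (tabular Q-learning instead of a deep Q-network, with a made-up environment interface), just to show how a learned state-action value can decide which misfit to use:

```python
import numpy as np

MISFITS = ["L2", "OTMF"]   # the two misfit functions the agent chooses between

def learn_misfit_policy(env_step, n_states, episodes=50, lr=0.1, gamma=0.9, eps=0.2):
    """Learn Q(s, a) deciding which misfit to use at each stage of the inversion.
    Tabular Q-learning and the toy environment below stand in for the paper's
    deep Q-network and the actual FWI workflow."""
    rng = np.random.default_rng(0)
    Q = np.zeros((n_states, len(MISFITS)))
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = rng.integers(len(MISFITS)) if rng.random() < eps else int(Q[s].argmax())
            s_next, reward, done = env_step(s, MISFITS[a])
            Q[s, a] += lr * (reward + gamma * Q[s_next].max() * (not done) - Q[s, a])
            s = s_next
    return Q

def toy_env(state, misfit):
    # Reward OTMF during early iterations (to avoid cycle skipping) and L2 later.
    reward = 1.0 if (misfit == "OTMF") == (state < 3) else 0.0
    nxt = state + 1
    return min(nxt, 5), reward, nxt > 5

Q = learn_misfit_policy(toy_env, n_states=6)
policy = [MISFITS[int(a)] for a in Q.argmax(axis=1)]   # tends toward OTMF early, L2 later
```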
arXiv Detail & Related papers (2020-02-08T12:31:33Z)