Effective Learning with Node Perturbation in Multi-Layer Neural Networks
- URL: http://arxiv.org/abs/2310.00965v4
- Date: Mon, 27 May 2024 14:15:45 GMT
- Title: Effective Learning with Node Perturbation in Multi-Layer Neural Networks
- Authors: Sander Dalm, Marcel van Gerven, Nasir Ahmad
- Abstract summary: Node perturbation (NP) proposes learning by the injection of noise into network activations.
NP is highly data inefficient and unstable due to its unguided noise-based search process.
We find that a closer alignment with directional derivatives together with input decorrelation at every layer strongly enhances performance of NP learning.
- Score: 2.1168858935852013
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Backpropagation (BP) remains the dominant and most successful method for training the parameters of deep neural network models. However, BP relies on two computationally distinct phases, does not provide a satisfactory explanation of biological learning, and can be challenging to apply to the training of networks with discontinuities or noisy node dynamics. By comparison, node perturbation (NP) proposes learning by the injection of noise into network activations and subsequent measurement of the induced loss change. NP relies on two forward (inference) passes, does not make use of network derivatives, and has been proposed as a model for learning in biological systems. However, standard NP is highly data inefficient and unstable due to its unguided noise-based search process. In this work, we investigate different formulations of NP, relate them to the concept of directional derivatives, and combine them with a decorrelating mechanism for layer-wise inputs. We find that a closer alignment with directional derivatives, together with input decorrelation at every layer, strongly enhances the performance of NP learning, with large improvements in parameter convergence and much higher performance on the test data, approaching that of BP. Furthermore, our novel formulation allows for application to noisy systems in which the noise process itself is inaccessible.
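As a rough illustration of the mechanism described in the abstract, below is a minimal NumPy sketch of node perturbation with a layer-wise input-decorrelation step. The network shape, loss, function names (forward, decorrelate, np_step), and the exact scaling of the updates are illustrative assumptions, not the authors' implementation; the paper's directional-derivative formulation and decorrelation rule may differ in detail.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, W1, W2):
    """Clean forward pass through a small two-layer network."""
    h = np.tanh(W1 @ x)
    y = W2 @ h
    return h, y

def decorrelate(x, R, lr_r=1e-3):
    """Layer-wise input decorrelation (an assumed rule in the spirit of the
    paper): transform the input with R and nudge R so that the transformed
    inputs lose their pairwise correlations."""
    z = R @ x
    C = np.outer(z, z) - np.diag(z * z)   # keep only off-diagonal correlations
    R -= lr_r * C @ R                     # in-place update of the decorrelator
    return z

def np_step(x, target, W1, W2, sigma=1e-3, lr=1e-2):
    """One node-perturbation update: a clean pass, a noise-perturbed pass,
    and weight changes scaled by the induced loss difference (a generic NP
    sketch; the paper's exact formulation and normalisation may differ)."""
    # Clean pass and its loss
    h, y = forward(x, W1, W2)
    loss_clean = 0.5 * np.sum((y - target) ** 2)

    # Perturbed pass: inject Gaussian noise into every layer's pre-activations
    eps1 = sigma * rng.standard_normal(W1.shape[0])
    eps2 = sigma * rng.standard_normal(W2.shape[0])
    h_p = np.tanh(W1 @ x + eps1)
    y_p = W2 @ h_p + eps2
    loss_noisy = 0.5 * np.sum((y_p - target) ** 2)

    # The loss difference per unit noise variance approximates a directional
    # derivative of the loss along the injected noise direction
    delta = (loss_noisy - loss_clean) / sigma ** 2

    # Each weight moves against (loss change) x (its layer's noise) x (its input)
    W1 -= lr * delta * np.outer(eps1, x)
    W2 -= lr * delta * np.outer(eps2, h)
    return loss_clean
```

In the paper's setting, decorrelation is applied to the inputs of every layer before the weight update, and the update is aligned more closely with a true directional derivative; the sketch above only conveys the two-pass, noise-driven structure of NP learning.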
Related papers
- DeepDFA: Automata Learning through Neural Probabilistic Relaxations [2.3326951882644553]
We introduce DeepDFA, a novel approach to identifying Deterministic Finite Automata (DFAs) from traces.
Inspired by both the probabilistic relaxation of DFAs and Recurrent Neural Networks (RNNs), our model offers interpretability post-training, alongside reduced complexity and enhanced training efficiency.
arXiv Detail & Related papers (2024-08-16T09:30:36Z) - Online Pseudo-Zeroth-Order Training of Neuromorphic Spiking Neural Networks [69.2642802272367]
Brain-inspired neuromorphic computing with spiking neural networks (SNNs) is a promising energy-efficient computational approach.
Most recent methods leverage spatial and temporal backpropagation (BP), not adhering to neuromorphic properties.
We propose a novel method, online pseudo-zeroth-order (OPZO) training.
arXiv Detail & Related papers (2024-07-17T12:09:00Z) - Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have effectively been demonstrated in solving forward and inverse differential equation problems.
However, PINNs can be trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z) - FFNB: Forgetting-Free Neural Blocks for Deep Continual Visual Learning [14.924672048447338]
We devise a dynamic network architecture for continual learning based on a novel forgetting-free neural block (FFNB).
Training FFNB features on new tasks is achieved using a novel procedure that constrains the underlying parameters in the null-space of the previous tasks.
arXiv Detail & Related papers (2021-11-22T17:23:34Z) - CAN-PINN: A Fast Physics-Informed Neural Network Based on Coupled-Automatic-Numerical Differentiation Method [17.04611875126544]
Novel physics-informed neural network (PINN) methods that couple neighboring support points and automatic differentiation (AD) through a Taylor series expansion are proposed.
The proposed coupled-automatic-numerical differentiation framework, labeled can-PINN, unifies the advantages of AD and numerical differentiation (ND), providing more robust and efficient training than AD-based PINNs.
arXiv Detail & Related papers (2021-10-29T14:52:46Z) - Solving Sparse Linear Inverse Problems in Communication Systems: A Deep Learning Approach With Adaptive Depth [51.40441097625201]
We propose an end-to-end trainable deep learning architecture for sparse signal recovery problems.
The proposed method learns how many layers to execute to emit an output, and the network depth is dynamically adjusted for each task in the inference phase.
arXiv Detail & Related papers (2020-10-29T06:32:53Z) - Belief Propagation Neural Networks [103.97004780313105]
We introduce belief propagation neural networks (BPNNs).
BPNNs operate on factor graphs and generalize belief propagation (BP).
We show that BPNNs converge 1.7x faster on Ising models while providing tighter bounds.
On challenging model counting problems, BPNNs compute estimates hundreds of times faster than state-of-the-art handcrafted methods.
arXiv Detail & Related papers (2020-07-01T07:39:51Z) - Rectified Linear Postsynaptic Potential Function for Backpropagation in Deep Spiking Neural Networks [55.0627904986664]
Spiking Neural Networks (SNNs) use temporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power, event-driven neuromorphic implementation.
This paper investigates the contribution of spike timing dynamics to information encoding, synaptic plasticity and decision making, providing a new perspective on the design of future deep SNNs and neuromorphic hardware systems.
arXiv Detail & Related papers (2020-03-26T11:13:07Z) - Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, a truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z) - Semi-Implicit Back Propagation [1.5533842336139065]
We propose a semi-implicit back propagation method for neural network training.
The differences at the neurons are propagated in a backward fashion and the parameters are updated via a proximal mapping.
Experiments on both MNIST and CIFAR-10 demonstrate that the proposed algorithm leads to better performance in terms of both loss decreasing and training/validation accuracy.
arXiv Detail & Related papers (2020-02-10T03:26:09Z)