Structural Credit Assignment with Coordinated Exploration
- URL: http://arxiv.org/abs/2307.13256v1
- Date: Tue, 25 Jul 2023 04:55:45 GMT
- Title: Structural Credit Assignment with Coordinated Exploration
- Authors: Stephen Chung
- Abstract summary: Methods aimed at improving structural credit assignment can generally be classified into two categories.
We propose the use of Boltzmann machines or a recurrent network for coordinated exploration.
Experimental results demonstrate that coordinated exploration significantly outperforms independent exploration in training speed.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A biologically plausible method for training an Artificial Neural Network
(ANN) involves treating each unit as a stochastic Reinforcement Learning (RL)
agent, thereby considering the network as a team of agents. Consequently, all
units can learn via REINFORCE, a local learning rule modulated by a global
reward signal, which aligns more closely with biologically observed forms of
synaptic plasticity. However, this learning method tends to be slow and does
not scale well with the size of the network. This inefficiency arises from two
factors impeding effective structural credit assignment: (i) all units
independently explore the network, and (ii) a single reward is used to evaluate
the actions of all units. Accordingly, methods aimed at improving structural
credit assignment can generally be classified into two categories. The first
category includes algorithms that enable coordinated exploration among units,
such as MAP propagation. The second category encompasses algorithms that
compute a more specific reward signal for each unit within the network, like
Weight Maximization and its variants. In this research report, our focus is on
the first category. We propose the use of Boltzmann machines or a recurrent
network for coordinated exploration. We show that the negative phase, which is
typically necessary to train Boltzmann machines, can be removed. The resulting
learning rules are similar to the reward-modulated Hebbian learning rule.
Experimental results demonstrate that coordinated exploration significantly
outperforms independent exploration in training speed for networks of multiple
stochastic and discrete units trained with REINFORCE, even surpassing
straight-through estimator (STE) backpropagation.
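As a concrete illustration of the team-of-agents setup, below is a minimal, hypothetical sketch (not the paper's code) of the independent-exploration baseline: a small network of Bernoulli-logistic units in which every unit samples a binary action and updates its incoming weights with the REINFORCE rule, modulated by a single global reward. The XOR task, network sizes, learning rate, and running baseline are illustrative assumptions.

```python
# Minimal sketch (assumptions: toy XOR task, 8 hidden units, fixed hyperparameters)
# of a team of Bernoulli-logistic RL agents trained with REINFORCE and one global
# reward, i.e., the independent-exploration baseline described in the abstract.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy task (assumption): 2-bit XOR, reward +1 for a correct output and -1 otherwise.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([0, 1, 1, 0], dtype=float)

n_in, n_hid = 2, 8
W1 = rng.normal(0.0, 0.5, (n_hid, n_in))  # input -> hidden (stochastic units)
W2 = rng.normal(0.0, 0.5, (1, n_hid))     # hidden -> output (stochastic unit)
lr, baseline = 0.1, 0.0

for step in range(20000):
    i = rng.integers(len(X))
    x, y = X[i], Y[i]

    # Each unit independently samples a binary action from its Bernoulli-logistic policy.
    p1 = sigmoid(W1 @ x)
    h = (rng.random(n_hid) < p1).astype(float)
    p2 = sigmoid(W2 @ h)
    o = float(rng.random() < p2[0])

    # A single global reward evaluates the actions of all units.
    r = 1.0 if o == y else -1.0
    baseline += 0.01 * (r - baseline)   # running baseline to reduce variance
    adv = r - baseline

    # REINFORCE: local, reward-modulated updates for every unit,
    # delta_w = lr * advantage * (action - firing_prob) * presynaptic_input.
    W1 += lr * adv * np.outer(h - p1, x)
    W2 += lr * adv * np.outer(np.array([o]) - p2, h)

# Evaluate the greedy (deterministic) policy after training.
acc = np.mean([float((sigmoid(W2 @ (sigmoid(W1 @ xi) > 0.5)) > 0.5)[0] == yi)
               for xi, yi in zip(X, Y)])
print("XOR accuracy:", acc)
```

The coordinated exploration proposed in the paper would instead sample the hidden units jointly (e.g., from a Boltzmann-machine or recurrent-network distribution over hidden configurations) rather than independently; per the abstract, the resulting learning rules resemble the reward-modulated Hebbian rule. That joint-sampling step is not shown in this sketch.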
Related papers
- Unbiased Weight Maximization [0.0]
We propose a new learning rule for a network of Bernoulli-logistic units that is unbiased and whose learning speed scales well with the number of units in the network.
Notably, to our knowledge, it is the first learning rule for such a network with both of these properties.
arXiv Detail & Related papers (2023-07-25T05:45:52Z)
- Class Anchor Margin Loss for Content-Based Image Retrieval [97.81742911657497]
We propose a novel repeller-attractor loss that falls in the metric learning paradigm, yet directly optimizes the L2 metric without the need to generate pairs.
We evaluate the proposed objective in the context of few-shot and full-set training on the CBIR task, by using both convolutional and transformer architectures.
arXiv Detail & Related papers (2023-06-01T12:53:10Z)
- USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval [115.28586222748478]
Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the given query from the other modality.
Existing approaches typically suffer from two major limitations.
arXiv Detail & Related papers (2023-01-17T12:42:58Z)
- Learning Modular Structures That Generalize Out-of-Distribution [1.7034813545878589]
We describe a method for O.O.D. generalization that, through training, encourages models to only preserve features in the network that are well reused across multiple training domains.
Our method combines two complementary neuron-level regularizers with a probabilistic differentiable binary mask over the network, to extract a modular sub-network that achieves better O.O.D. performance than the original network.
arXiv Detail & Related papers (2022-08-07T15:54:19Z)
- Self-Ensembling GAN for Cross-Domain Semantic Segmentation [107.27377745720243]
This paper proposes a self-ensembling generative adversarial network (SE-GAN) exploiting cross-domain data for semantic segmentation.
In SE-GAN, a teacher network and a student network constitute a self-ensembling model for generating semantic segmentation maps, which, together with a discriminator, forms a GAN.
Despite its simplicity, we find SE-GAN can significantly boost the performance of adversarial training and enhance the stability of the model.
arXiv Detail & Related papers (2021-12-15T09:50:25Z)
- Training Generative Adversarial Networks in One Stage [58.983325666852856]
We introduce a general training scheme that enables training GANs efficiently in only one stage.
We show that the proposed method is readily applicable to other adversarial-training scenarios, such as data-free knowledge distillation.
arXiv Detail & Related papers (2021-02-28T09:03:39Z)
- Learning by Competition of Self-Interested Reinforcement Learning Agents [0.0]
An artificial neural network can be trained by uniformly broadcasting a reward signal to units that implement a REINFORCE learning rule.
We propose replacing the reward signal sent to hidden units with the change in the $L^2$ norm of the unit's outgoing weights.
Our experiments show that a network trained with Weight Maximization can learn significantly faster than REINFORCE and slightly slower than backpropagation.
arXiv Detail & Related papers (2020-10-19T18:18:53Z)
- MAP Propagation Algorithm: Faster Learning with a Team of Reinforcement Learning Agents [0.0]
An alternative way of training an artificial neural network is to treat each unit in the network as a reinforcement learning agent.
We propose a novel algorithm called MAP propagation to significantly reduce the variance of this form of training.
Our work thus allows for the broader application of teams of agents in deep reinforcement learning.
arXiv Detail & Related papers (2020-10-15T17:17:39Z)
- Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks.
We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a covariance operator.
To further study what role the covariance operator plays and which features are learned in such a process, we model data generation and augmentation processes through a hierarchical latent tree model (HLTM).
arXiv Detail & Related papers (2020-10-01T17:51:49Z)
- Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
arXiv Detail & Related papers (2020-04-17T19:12:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.