Structural Credit Assignment with Coordinated Exploration
- URL: http://arxiv.org/abs/2307.13256v1
- Date: Tue, 25 Jul 2023 04:55:45 GMT
- Title: Structural Credit Assignment with Coordinated Exploration
- Authors: Stephen Chung
- Abstract summary: Methods aimed at improving structural credit assignment can generally be classified into two categories.
We propose the use of Boltzmann machines or a recurrent network for coordinated exploration.
Experimental results demonstrate that coordinated exploration significantly outperforms independent exploration in training speed.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A biologically plausible method for training an Artificial Neural Network
(ANN) involves treating each unit as a stochastic Reinforcement Learning (RL)
agent, thereby considering the network as a team of agents. Consequently, all
units can learn via REINFORCE, a local learning rule modulated by a global
reward signal, which aligns more closely with biologically observed forms of
synaptic plasticity. However, this learning method tends to be slow and does
not scale well with the size of the network. This inefficiency arises from two
factors impeding effective structural credit assignment: (i) all units
independently explore the network, and (ii) a single reward is used to evaluate
the actions of all units. Accordingly, methods aimed at improving structural
credit assignment can generally be classified into two categories. The first
category includes algorithms that enable coordinated exploration among units,
such as MAP propagation. The second category encompasses algorithms that
compute a more specific reward signal for each unit within the network, like
Weight Maximization and its variants. In this research report, our focus is on
the first category. We propose the use of Boltzmann machines or a recurrent
network for coordinated exploration. We show that the negative phase, which is
typically necessary to train Boltzmann machines, can be removed. The resulting
learning rules are similar to the reward-modulated Hebbian learning rule.
Experimental results demonstrate that coordinated exploration significantly
outperforms independent exploration in training speed for networks of multiple
stochastic and discrete units trained with REINFORCE, even surpassing
straight-through estimator (STE) backpropagation.
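As a concrete illustration of the team-of-agents setup, below is a minimal, hypothetical sketch (not the paper's code) of the independent-exploration baseline: a small network of Bernoulli-logistic units in which every unit samples a binary action and updates its incoming weights with the REINFORCE rule, modulated by a single global reward. The XOR task, network sizes, learning rate, and running baseline are illustrative assumptions.

```python
# Minimal sketch (assumptions: toy XOR task, 8 hidden units, fixed hyperparameters)
# of a team of Bernoulli-logistic RL agents trained with REINFORCE and one global
# reward, i.e., the independent-exploration baseline described in the abstract.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy task (assumption): 2-bit XOR, reward +1 for a correct output and -1 otherwise.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([0, 1, 1, 0], dtype=float)

n_in, n_hid = 2, 8
W1 = rng.normal(0.0, 0.5, (n_hid, n_in))  # input -> hidden (stochastic units)
W2 = rng.normal(0.0, 0.5, (1, n_hid))     # hidden -> output (stochastic unit)
lr, baseline = 0.1, 0.0

for step in range(20000):
    i = rng.integers(len(X))
    x, y = X[i], Y[i]

    # Each unit independently samples a binary action from its Bernoulli-logistic policy.
    p1 = sigmoid(W1 @ x)
    h = (rng.random(n_hid) < p1).astype(float)
    p2 = sigmoid(W2 @ h)
    o = float(rng.random() < p2[0])

    # A single global reward evaluates the actions of all units.
    r = 1.0 if o == y else -1.0
    baseline += 0.01 * (r - baseline)   # running baseline to reduce variance
    adv = r - baseline

    # REINFORCE: local, reward-modulated updates for every unit,
    # delta_w = lr * advantage * (action - firing_prob) * presynaptic_input.
    W1 += lr * adv * np.outer(h - p1, x)
    W2 += lr * adv * np.outer(np.array([o]) - p2, h)

# Evaluate the greedy (deterministic) policy after training.
acc = np.mean([float((sigmoid(W2 @ (sigmoid(W1 @ xi) > 0.5)) > 0.5)[0] == yi)
               for xi, yi in zip(X, Y)])
print("XOR accuracy:", acc)
```

The coordinated exploration proposed in the paper would instead sample the hidden units jointly (e.g., from a Boltzmann-machine or recurrent-network distribution over hidden configurations) rather than independently; per the abstract, the resulting learning rules resemble the reward-modulated Hebbian rule. That joint-sampling step is not shown in this sketch.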
Related papers
- Unbiased Weight Maximization [0.0]
We propose a new learning rule for a network of Bernoulli-logistic units that is unbiased and whose learning speed scales well with the number of units in the network.
Notably, to our knowledge, it is the first learning rule for such a network with both of these properties.
arXiv Detail & Related papers (2023-07-25T05:45:52Z)
- Class Anchor Margin Loss for Content-Based Image Retrieval [97.81742911657497]
We propose a novel repeller-attractor loss that falls in the metric learning paradigm, yet directly optimizes the L2 metric without the need to generate pairs.
We evaluate the proposed objective in the context of few-shot and full-set training on the CBIR task, by using both convolutional and transformer architectures.
arXiv Detail & Related papers (2023-06-01T12:53:10Z)
- USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval [115.28586222748478]
Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the given query from the other modality.
Existing approaches typically suffer from two major limitations.
arXiv Detail & Related papers (2023-01-17T12:42:58Z)
- Learning Modular Structures That Generalize Out-of-Distribution [1.7034813545878589]
We describe a method for O.O.D. generalization that, through training, encourages models to only preserve features in the network that are well reused across multiple training domains.
Our method combines two complementary neuron-level regularizers with a probabilistic differentiable binary mask over the network, to extract a modular sub-network that achieves better O.O.D. performance than the original network.
arXiv Detail & Related papers (2022-08-07T15:54:19Z)
- Self-Ensembling GAN for Cross-Domain Semantic Segmentation [107.27377745720243]
This paper proposes a self-ensembling generative adversarial network (SE-GAN) exploiting cross-domain data for semantic segmentation.
In SE-GAN, a teacher network and a student network constitute a self-ensembling model for generating semantic segmentation maps, which, together with a discriminator, forms a GAN.
Despite its simplicity, we find SE-GAN can significantly boost the performance of adversarial training and enhance the stability of the model.
arXiv Detail & Related papers (2021-12-15T09:50:25Z)
- Training Generative Adversarial Networks in One Stage [58.983325666852856]
We introduce a general training scheme that enables training GANs efficiently in only one stage.
We show that the proposed method is readily applicable to other adversarial-training scenarios, such as data-free knowledge distillation.
arXiv Detail & Related papers (2021-02-28T09:03:39Z)
- Learning by Competition of Self-Interested Reinforcement Learning Agents [0.0]
An artificial neural network can be trained by uniformly broadcasting a reward signal to units that implement a REINFORCE learning rule.
We propose replacing the reward signal sent to hidden units with the change in the $L^2$ norm of the unit's outgoing weights.
Our experiments show that a network trained with Weight Maximization can learn significantly faster than REINFORCE and slightly slower than backpropagation.
arXiv Detail & Related papers (2020-10-19T18:18:53Z)
- MAP Propagation Algorithm: Faster Learning with a Team of Reinforcement Learning Agents [0.0]
An alternative way of training an artificial neural network is to treat each unit in the network as a reinforcement learning agent.
We propose a novel algorithm called MAP propagation to significantly reduce the variance of this form of training.
Our work thus allows for the broader application of teams of agents in deep reinforcement learning.
arXiv Detail & Related papers (2020-10-15T17:17:39Z)
- Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks.
We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a covariance operator.
To further study what role the covariance operator plays and which features are learned in such a process, we model data generation and augmentation processes through a hierarchical latent tree model (HLTM).
arXiv Detail & Related papers (2020-10-01T17:51:49Z)
- Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
arXiv Detail & Related papers (2020-04-17T19:12:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.