Layer Collaboration in the Forward-Forward Algorithm
- URL: http://arxiv.org/abs/2305.12393v1
- Date: Sun, 21 May 2023 08:12:54 GMT
- Title: Layer Collaboration in the Forward-Forward Algorithm
- Authors: Guy Lorberbom, Itai Gat, Yossi Adi, Alex Schwing, Tamir Hazan
- Abstract summary: We study layer collaboration in the forward-forward algorithm.
We show that the current version of the forward-forward algorithm is suboptimal when considering information flow in the network.
We propose an improved version that supports layer collaboration to better utilize the network structure.
- Score: 28.856139738073626
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Backpropagation, which uses the chain rule, is today's de facto
standard algorithm for optimizing neural networks. Recently, Hinton (2022)
proposed the forward-forward algorithm, a promising alternative that optimizes
neural nets layer-by-layer, without propagating gradients throughout the
network. Although such an approach has several advantages over backpropagation
and shows promising results, the fact that each layer is trained
independently limits the optimization process. Specifically, it prevents the
network's layers from collaborating to learn complex and rich features. In this
work, we study layer collaboration in the forward-forward algorithm. We show
that the current version of the forward-forward algorithm is suboptimal when
considering information flow in the network, resulting in a lack of
collaboration between layers of the network. We propose an improved version
that supports layer collaboration to better utilize the network structure,
while not requiring any additional assumptions or computations. We empirically
demonstrate the efficacy of the proposed version when considering both
information flow and objective metrics. Additionally, we provide a theoretical
motivation for the proposed method, inspired by functional entropy theory.
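To make the layer-local training concrete, below is a minimal sketch of forward-forward-style optimization using Hinton's "goodness" objective (the sum of squared activations per layer); the layer sizes, threshold value, and optimizer settings are illustrative assumptions, and the paper's proposed collaborative variant is not shown.

```python
import torch
import torch.nn.functional as F

# Each layer owns its optimizer and a purely local loss; the detach below
# ensures no gradient ever flows between layers.
THRESHOLD = 2.0  # illustrative goodness threshold, not the paper's value

layers = [torch.nn.Linear(784, 500), torch.nn.Linear(500, 500)]
optims = [torch.optim.Adam(layer.parameters(), lr=1e-3) for layer in layers]

def goodness(h: torch.Tensor) -> torch.Tensor:
    return h.pow(2).sum(dim=1)  # sum of squared activations, per sample

def ff_train_step(x_pos: torch.Tensor, x_neg: torch.Tensor) -> None:
    h_pos, h_neg = x_pos, x_neg
    for layer, opt in zip(layers, optims):
        z_pos = F.relu(layer(h_pos))
        z_neg = F.relu(layer(h_neg))
        # Push goodness above the threshold for positive data, below for negative.
        loss = F.softplus(torch.cat([
            THRESHOLD - goodness(z_pos),
            goodness(z_neg) - THRESHOLD,
        ])).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        # Normalize and detach: the next layer sees only the direction of the
        # activity vector and receives no gradient from deeper layers.
        h_pos = F.normalize(z_pos.detach(), dim=1)
        h_neg = F.normalize(z_neg.detach(), dim=1)
```

The detach between layers is what separates this from backpropagation: each layer's loss can only influence that layer's own weights, which is exactly the independence the paper identifies as limiting collaboration.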
Related papers
- WLD-Reg: A Data-dependent Within-layer Diversity Regularizer [98.78384185493624]
Neural networks are composed of multiple layers arranged in a hierarchical structure and jointly trained with gradient-based optimization.
We propose to complement this traditional 'between-layer' feedback with additional 'within-layer' feedback to encourage the diversity of the activations within the same layer.
We present an extensive empirical study confirming that the proposed approach enhances the performance of several state-of-the-art neural network models in multiple tasks.
arXiv Detail & Related papers (2023-01-03T20:57:22Z)
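The summary above does not specify WLD-Reg's exact form, so as a generic illustration of a "within-layer" diversity term, the sketch below penalizes pairwise correlations between units of a single layer; the function name and the weighting scheme are hypothetical.

```python
import torch

def within_layer_diversity_penalty(h: torch.Tensor) -> torch.Tensor:
    """Penalize correlated units within one layer; h has shape (batch, units).

    A generic stand-in for a within-layer diversity term: drive the
    off-diagonal entries of the unit correlation matrix toward zero.
    """
    h = h - h.mean(dim=0, keepdim=True)            # center each unit
    h = h / (h.norm(dim=0, keepdim=True) + 1e-8)   # unit-normalize each column
    corr = h.T @ h                                  # (units, units) correlations
    off_diag = corr - torch.diag(torch.diag(corr))
    return off_diag.pow(2).sum()

# Usage: total_loss = task_loss + lam * within_layer_diversity_penalty(hidden)
```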
- Optimisation & Generalisation in Networks of Neurons [8.078758339149822]
The goal of this thesis is to develop the optimisation and generalisation theoretic foundations of learning in artificial neural networks.
A new theoretical framework is proposed for deriving architecture-dependent first-order optimisation algorithms.
A new correspondence is proposed between ensembles of networks and individual networks.
arXiv Detail & Related papers (2022-10-18T18:58:40Z)
- CCasGNN: Collaborative Cascade Prediction Based on Graph Neural Networks [0.49269463638915806]
Cascade prediction aims at modeling information diffusion in the network.
Recent efforts have been devoted to combining network structure and sequence features via graph neural networks and recurrent neural networks.
We propose a novel method CCasGNN considering the individual profile, structural features, and sequence information.
arXiv Detail & Related papers (2021-12-07T11:37:36Z)
- Non-Gradient Manifold Neural Network [79.44066256794187]
Deep neural networks (DNNs) generally take thousands of iterations to optimize via gradient descent.
We propose a novel manifold neural network based on non-gradient optimization.
arXiv Detail & Related papers (2021-06-15T06:39:13Z)
- Recurrent Graph Neural Network Algorithm for Unsupervised Network Community Detection [0.0]
This paper proposes a new variant of the recurrent graph neural network algorithm for unsupervised network community detection through modularity optimization.
The new algorithm's performance is compared against a popular and fast Louvain method and a more efficient but slower Combo algorithm recently proposed by the author.
arXiv Detail & Related papers (2021-03-03T16:50:50Z)
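For reference, modularity optimization maximizes Newman's modularity score over partitions of the graph. A small NumPy sketch of the quantity being optimized (not the recurrent GNN itself):

```python
import numpy as np

def modularity(A: np.ndarray, labels: np.ndarray) -> float:
    """Newman modularity Q of a partition of an undirected graph.

    A: symmetric adjacency matrix; labels[i]: community of node i.
    """
    k = A.sum(axis=1)                 # node degrees
    two_m = A.sum()                   # twice the edge count (2m)
    null = np.outer(k, k) / two_m     # expected weight under the null model
    same = labels[:, None] == labels[None, :]  # same-community indicator
    return float(((A - null) * same).sum() / two_m)
```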
- A Deep-Unfolded Reference-Based RPCA Network For Video Foreground-Background Separation [86.35434065681925]
This paper proposes a new deep-unfolding-based network design for the problem of Robust Principal Component Analysis (RPCA).
Unlike existing designs, our approach focuses on modeling the temporal correlation between the sparse representations of consecutive video frames.
Experimentation using the moving MNIST dataset shows that the proposed network outperforms a recently proposed state-of-the-art RPCA network in the task of video foreground-background separation.
arXiv Detail & Related papers (2020-10-02T11:40:09Z)
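Deep unfolding maps the iterations of an optimization algorithm onto network layers with learned parameters. As a generic illustration (not this paper's reference-based, temporally-correlated design), one unfolded RPCA step typically updates the sparse component with a learnable soft-threshold:

```python
import torch

class SoftThresholdLayer(torch.nn.Module):
    """One unfolded RPCA-style iteration: S <- soft_threshold(M - L, theta).

    theta is a learned shrinkage level, playing the role of a tuned step
    of an iterative solver; a generic sketch, not this paper's architecture.
    """
    def __init__(self) -> None:
        super().__init__()
        self.theta = torch.nn.Parameter(torch.tensor(0.1))

    def forward(self, M: torch.Tensor, L: torch.Tensor) -> torch.Tensor:
        residual = M - L  # what the current low-rank estimate fails to explain
        # Elementwise shrinkage: small residuals are zeroed, large ones kept.
        return torch.sign(residual) * torch.clamp(residual.abs() - self.theta, min=0.0)
```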
- A Differential Game Theoretic Neural Optimizer for Training Residual Networks [29.82841891919951]
We propose a generalized Differential Dynamic Programming (DDP) neural architecture that accepts both residual connections and convolution layers.
The resulting optimal control representation admits a game-theoretic perspective, in which training residual networks can be interpreted as cooperative trajectory optimization on state-augmented systems.
arXiv Detail & Related papers (2020-07-17T10:19:17Z)
- Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed stochastic AUC maximization at large scale with a deep neural network as the predictive model.
Our algorithm requires far fewer communication rounds than naive parallelization while retaining comparable theoretical guarantees.
Experiments on several benchmark datasets demonstrate its effectiveness and confirm the theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
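AUC maximization is typically trained through a pairwise surrogate, since AUC counts correctly ordered positive/negative pairs. A minimal single-machine sketch of such a surrogate follows; the paper's distributed min-max formulation is more involved.

```python
import torch

def pairwise_auc_surrogate(scores: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Squared-hinge surrogate for 1 - AUC over a mini-batch.

    scores: model outputs of shape (batch,); y: binary labels in {0, 1}.
    Assumes the batch contains at least one sample of each class.
    """
    pos = scores[y == 1]
    neg = scores[y == 0]
    diff = pos[:, None] - neg[None, :]  # all positive-minus-negative gaps
    # Penalize any positive score that fails to beat a negative by margin 1.
    return torch.clamp(1.0 - diff, min=0.0).pow(2).mean()
```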
- Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
arXiv Detail & Related papers (2020-04-17T19:12:39Z)
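The quality of such a performance predictor is usually judged by rank correlation, since NAS only needs candidates ordered correctly. A tiny sketch with made-up accuracy values, using SciPy's spearmanr:

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical predicted vs. measured accuracies of sampled sub-networks.
predicted = np.array([0.71, 0.64, 0.80, 0.58, 0.75])
measured = np.array([0.72, 0.61, 0.79, 0.60, 0.74])

# Rank correlation is what matters for architecture selection:
# the predictor only needs to order candidates correctly.
rho, _ = spearmanr(predicted, measured)
print(f"Spearman rank correlation: {rho:.3f}")
```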
- Backprojection for Training Feedforward Neural Networks in the Input and Feature Spaces [12.323996999894002]
We propose a new algorithm for training feedforward neural networks that is considerably faster than backpropagation.
The proposed algorithm can operate in both the input and feature spaces, where it is named backprojection and kernel backprojection, respectively.
arXiv Detail & Related papers (2020-04-05T20:53:11Z)
- Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
arXiv Detail & Related papers (2020-03-24T09:56:13Z)
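Forking side branches from intermediate layers follows the deeply-supervised (DSN) pattern: auxiliary classifiers are attached to intermediate features and their losses added to the main objective. The sketch below is a generic version of that pattern, with illustrative layer sizes and a single branch rather than the paper's actual branch design.

```python
import torch
import torch.nn.functional as F

class DeeplySupervisedNet(torch.nn.Module):
    """Backbone with an auxiliary head on an intermediate layer (DSN-style)."""

    def __init__(self, num_classes: int = 10) -> None:
        super().__init__()
        self.block1 = torch.nn.Linear(784, 256)
        self.block2 = torch.nn.Linear(256, 128)
        self.head = torch.nn.Linear(128, num_classes)  # main classifier
        self.aux = torch.nn.Linear(256, num_classes)   # side branch off block1

    def forward(self, x: torch.Tensor):
        h1 = F.relu(self.block1(x))
        h2 = F.relu(self.block2(h1))
        return self.head(h2), self.aux(h1)

def dsn_loss(main_logits, aux_logits, y, aux_weight: float = 0.3):
    # The side branch pushes intermediate features toward the same objective,
    # keeping optimization targets consistent across depths.
    return F.cross_entropy(main_logits, y) + aux_weight * F.cross_entropy(aux_logits, y)
```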