Stimulative Training of Residual Networks: A Social Psychology
Perspective of Loafing
- URL: http://arxiv.org/abs/2210.04153v1
- Date: Sun, 9 Oct 2022 03:15:51 GMT
- Title: Stimulative Training of Residual Networks: A Social Psychology
Perspective of Loafing
- Authors: Peng Ye, Shengji Tang, Baopu Li, Tao Chen, Wanli Ouyang
- Abstract summary: Residual networks have shown great success and become indispensable in today's deep models.
We aim to re-investigate the training process of residual networks from a novel social psychology perspective of loafing.
We propose a new training strategy to strengthen the performance of residual networks.
- Score: 86.69698062642055
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Residual networks have shown great success and become indispensable in
today's deep models. In this work, we aim to re-investigate the training
process of residual networks from a novel social psychology perspective of
loafing, and further propose a new training strategy to strengthen the
performance of residual networks. As residual networks can be viewed as
ensembles of relatively shallow networks (i.e., the \textit{unraveled view}) in
prior works, we also start from this view and consider that the final
performance of a residual network is co-determined by a group of sub-networks.
Inspired by the social loafing problem in social psychology, we find that
residual networks invariably suffer from a similar problem, where sub-networks
in a residual network are prone to exert less effort when working as part of a
group than when working alone. We define this previously overlooked problem as
\textit{network loafing}. Just as social loafing ultimately lowers individual
productivity and overall group performance, network loafing hinders the
performance of a given residual network and its sub-networks. Drawing on the
remedies studied in social psychology, we propose
\textit{stimulative training}, which randomly samples a residual sub-network
and calculates the KL-divergence loss between the sampled sub-network and the
given residual network, to act as extra supervision for sub-networks and make
the overall goal consistent. Comprehensive empirical results and theoretical
analyses verify that stimulative training can effectively address the loafing
problem and improve the performance of a residual network by improving the
performance of its sub-networks. The code is available at
https://github.com/Sunshine-Ye/NIPS22-ST.
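As a rough illustration of the strategy described in the abstract, the sketch below samples a shallower sub-network by randomly skipping residual blocks and adds a KL-divergence term that pulls the sub-network's prediction toward the full network's prediction. This is a minimal PyTorch-style sketch under assumed details: the block-skipping probability, the `alpha` weighting, and the function names are illustrative, not the authors' exact procedure (see the linked repository for that).

```python
import random
import torch
import torch.nn.functional as F

def forward_trunk(x, blocks, head, keep_mask=None):
    """Run the residual trunk; blocks whose mask entry is False are skipped."""
    for i, block in enumerate(blocks):
        if keep_mask is None or keep_mask[i]:
            x = x + block(x)                     # standard residual connection
    return head(x)

def stimulative_loss(x, y, blocks, head, alpha=1.0):
    """Cross-entropy on the full network plus KL supervision for a sampled sub-network."""
    full_logits = forward_trunk(x, blocks, head)
    keep_mask = [random.random() < 0.5 for _ in blocks]   # sample a sub-network (assumed scheme)
    sub_logits = forward_trunk(x, blocks, head, keep_mask)
    task_loss = F.cross_entropy(full_logits, y)
    # Extra supervision: the sampled sub-network is asked to match the full network's output.
    kl_loss = F.kl_div(F.log_softmax(sub_logits, dim=1),
                       F.softmax(full_logits.detach(), dim=1),
                       reduction="batchmean")
    return task_loss + alpha * kl_loss

# Hypothetical usage with a toy trunk of linear residual blocks:
# blocks = torch.nn.ModuleList([torch.nn.Linear(64, 64) for _ in range(8)])
# head = torch.nn.Linear(64, 10)
# loss = stimulative_loss(torch.randn(32, 64), torch.randint(0, 10, (32,)), blocks, head)
```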
Related papers
- Stimulative Training++: Go Beyond The Performance Limits of Residual
Networks [91.5381301894899]
Residual networks have shown great success and become indispensable in recent deep neural network models.
Previous research has suggested that residual networks can be considered as ensembles of shallow networks.
We identify a problem that is analogous to social loafing, where sub-networks within a residual network are prone to exert less effort when working as part of a group compared to working alone.
arXiv Detail & Related papers (2023-05-04T02:38:11Z) - Rank Diminishing in Deep Neural Networks [71.03777954670323]
The rank of neural networks measures information flowing across layers.
It is an instance of a key structural condition that applies across broad domains of machine learning.
For neural networks, however, the intrinsic mechanism that yields low-rank structures remains unclear.
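To make the notion of rank concrete, here is a small NumPy sketch (an illustration written for this summary, not code from the paper) that tracks the numerical rank of a feature matrix as it passes through a few random ReLU layers:

```python
import numpy as np

def numerical_rank(features, tol=1e-3):
    """Number of singular values above tol times the largest singular value."""
    s = np.linalg.svd(features, compute_uv=False)
    return int(np.sum(s > tol * s[0]))

rng = np.random.default_rng(0)
x = rng.standard_normal((256, 128))              # 256 samples, 128 features
for layer in range(5):                           # a few random ReLU layers (illustrative)
    w = rng.standard_normal((128, 128)) / np.sqrt(128)
    x = np.maximum(x @ w, 0.0)
    print(f"layer {layer}: numerical rank = {numerical_rank(x)}")
```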
arXiv Detail & Related papers (2022-06-13T12:03:32Z) - Residual networks classify inputs based on their neural transient
dynamics [0.0]
We show analytically that there are cooperation and competition dynamics between the residuals corresponding to each input dimension.
In cases where residuals do not converge to an attractor state, their internal dynamics are separable for each input class, and the network can reliably approximate the output.
arXiv Detail & Related papers (2021-01-08T13:54:37Z) - Provably Training Neural Network Classifiers under Fairness Constraints [70.64045590577318]
We show that overparametrized neural networks could meet the constraints.
A key ingredient in building a fair neural network classifier is establishing a no-regret analysis for neural networks.
arXiv Detail & Related papers (2020-12-30T18:46:50Z) - NetReAct: Interactive Learning for Network Summarization [60.18513812680714]
We present NetReAct, a novel interactive network summarization algorithm which supports the visualization of networks induced by text corpora to perform sensemaking.
We show how NetReAct is successful in generating high-quality summaries and visualizations that reveal hidden patterns better than other non-trivial baselines.
arXiv Detail & Related papers (2020-12-22T03:56:26Z) - Activation function impact on Sparse Neural Networks [0.0]
Sparse Evolutionary Training allows for significantly lower computational complexity when compared to fully connected models.
This research provides insights into the relationship between the activation function used and the network performance at various sparsity levels.
arXiv Detail & Related papers (2020-10-12T18:05:04Z) - A Principle of Least Action for the Training of Neural Networks [10.342408668490975]
We show the presence of a low kinetic energy displacement bias in the transport map of the network, and link this bias with generalization performance.
We propose a new learning algorithm, which automatically adapts to the complexity of the given task, and leads to networks with a high generalization ability even in low data regimes.
arXiv Detail & Related papers (2020-09-17T15:37:34Z) - Fitting the Search Space of Weight-sharing NAS with Graph Convolutional
Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
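The rank correlation mentioned here can be illustrated with a short sketch; the GCN predictor itself is not reproduced, and the accuracies and scores below are hypothetical stand-ins for measured sub-network performance and predictor outputs:

```python
# Kendall's tau measures how well a predictor *ranks* candidates,
# which is what matters when selecting sub-networks in weight-sharing NAS.
from scipy.stats import kendalltau

measured_accuracy = [72.1, 74.3, 70.8, 75.0, 73.2]   # hypothetical measurements
predictor_score   = [0.62, 0.71, 0.55, 0.74, 0.66]   # hypothetical predictor outputs

tau, p_value = kendalltau(measured_accuracy, predictor_score)
print(f"Kendall rank correlation: {tau:.3f} (p = {p_value:.3f})")
```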
arXiv Detail & Related papers (2020-04-17T19:12:39Z) - A Mean-field Analysis of Deep ResNet and Beyond: Towards Provable
Optimization Via Overparameterization From Depth [19.866928507243617]
Training deep neural networks with stochastic gradient descent (SGD) can often achieve zero training loss on real-world tasks, although the optimization landscape is highly non-convex.
We propose a new continuum limit of infinitely deep residual networks, which enjoys a good landscape in the sense that every local minimizer is global.
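One common way to read a continuum limit of deep residual networks is as the forward-Euler discretization of an ODE, with each residual update scaled by the step size. The toy NumPy sketch below illustrates that reading; it is an assumed interpretation for illustration, not the paper's exact formulation:

```python
import numpy as np

def resnet_forward(x, weights):
    """x_{l+1} = x_l + (1/L) * tanh(x_l @ W_l): forward Euler with step 1/L."""
    h = 1.0 / len(weights)                    # step size shrinks as depth grows
    for w in weights:
        x = x + h * np.tanh(x @ w)            # one residual block
    return x

rng = np.random.default_rng(0)
dim, depth = 8, 100
weights = [rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(depth)]
print(resnet_forward(rng.standard_normal(dim), weights))
```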
arXiv Detail & Related papers (2020-03-11T20:14:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.