Data-driven Regularization via Racecar Training for Generalizing Neural
Networks
- URL: http://arxiv.org/abs/2007.00024v1
- Date: Tue, 30 Jun 2020 18:00:41 GMT
- Title: Data-driven Regularization via Racecar Training for Generalizing Neural
Networks
- Authors: You Xie, Nils Thuerey
- Abstract summary: We propose a novel training approach for improving the generalization in neural networks.
We show how our formulation is easy to realize in practical network architectures via a reverse pass.
Networks trained with our approach show more balanced mutual information between input and output throughout all layers, yield improved explainability, and exhibit improved performance for a variety of tasks and task transfers.
- Score: 28.08782668165276
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel training approach for improving the generalization in
neural networks. We show that, in contrast to regular constraints for
orthogonality, our approach represents a data-dependent orthogonality
constraint, and is closely related to singular value decompositions of the
weight matrices. We also show how our formulation is easy to realize in
practical network architectures via a reverse pass, which aims to reconstruct
the full sequence of internal states of the network. Despite being a
surprisingly simple change, we demonstrate that this forward-backward training
approach, which we refer to as racecar training, leads to significantly more
generic features being extracted from a given data set. Networks trained with
our approach show more balanced mutual information between input and output
throughout all layers, yield improved explainability, and exhibit improved
performance for a variety of tasks and task transfers.
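A minimal sketch of the reverse pass described in the abstract, assuming a plain fully connected classifier: the transposed layer weights are reused to walk back from the output and reconstruct the sequence of internal states, and the reconstruction error is added to the task loss. The layer sizes, ReLU activations, MSE reconstruction term, and the weighting factor lam are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RacecarMLP(nn.Module):
    def __init__(self, sizes=(784, 256, 64, 10)):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Linear(a, b) for a, b in zip(sizes[:-1], sizes[1:])
        )

    def forward(self, x):
        # Forward pass: keep every internal state so the reverse pass can target it.
        states = [x]
        for i, layer in enumerate(self.layers):
            x = layer(x)
            if i < len(self.layers) - 1:
                x = torch.relu(x)
            states.append(x)
        return states

    def reverse(self, states):
        # Reverse pass: walk back through the layers with transposed weights and
        # penalize the distance to each earlier internal state (simplified here:
        # no nonlinearities on the way back).
        recon_loss = 0.0
        h = states[-1]
        for layer, target in zip(reversed(self.layers), reversed(states[:-1])):
            h = F.linear(h, layer.weight.t())
            recon_loss = recon_loss + F.mse_loss(h, target)
        return recon_loss

model = RacecarMLP()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
lam = 0.1  # assumed weight of the reverse-pass reconstruction term

x = torch.randn(32, 784)          # dummy batch
y = torch.randint(0, 10, (32,))   # dummy labels
states = model(x)
loss = F.cross_entropy(states[-1], y) + lam * model.reverse(states)
opt.zero_grad()
loss.backward()
opt.step()
```

Reusing the transposed forward weights in the reverse pass, rather than training a separate decoder, is the assumption that ties the reconstruction term directly to the weight matrices; a dedicated decoder would instead give ordinary autoencoder-style regularization.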
Related papers
- Generalization emerges from local optimization in a self-organized learning network [0.0]
We design and analyze a new paradigm for building supervised learning networks, driven only by local optimization rules without relying on a global error function.
Our network stores new knowledge in the nodes accurately and instantaneously, in the form of a lookup table.
We show on numerous examples of classification tasks that the networks generated by our algorithm systematically reach a state of perfect generalization when the number of learned examples becomes sufficiently large.
We report on the dynamics of the change of state and show that it is abrupt and has the distinctive characteristics of a first order phase transition, a phenomenon already observed for traditional learning networks and known as grokking.
arXiv Detail & Related papers (2024-10-03T15:32:08Z)
- Neural networks trained with SGD learn distributions of increasing complexity [78.30235086565388]
We show that neural networks trained using gradient descent initially classify their inputs using lower-order input statistics, and exploit higher-order statistics only later during training.
We discuss the relation of this distributional simplicity bias (DSB) to other simplicity biases and consider its implications for the principle of universality in learning.
arXiv Detail & Related papers (2022-11-21T15:27:22Z)
- Beyond Transfer Learning: Co-finetuning for Action Localisation [64.07196901012153]
We propose co-finetuning -- simultaneously training a single model on multiple "upstream" and "downstream" tasks.
We demonstrate that co-finetuning outperforms traditional transfer learning when using the same total amount of data.
We also show how we can easily extend our approach to multiple "upstream" datasets to further improve performance (a minimal sketch of this joint-training setup appears after the related-papers list).
arXiv Detail & Related papers (2022-07-08T10:25:47Z)
- Continual Learning with Invertible Generative Models [15.705568893476947]
Catastrophic forgetting (CF) happens whenever a neural network overwrites past knowledge while being trained on new tasks.
We propose a novel method that combines the strengths of regularization and generative-based rehearsal approaches.
arXiv Detail & Related papers (2022-02-11T15:28:30Z)
- Being Friends Instead of Adversaries: Deep Networks Learn from Data Simplified by Other Networks [23.886422706697882]
A different idea, named Friendly Training, has recently been proposed: it alters the input data by adding an automatically estimated perturbation.
We revisit and extend this idea, inspired by the effectiveness of neural generators in the context of Adversarial Machine Learning.
We propose an auxiliary multi-layer network that is responsible for altering the input data to make it easier for the classifier to handle.
arXiv Detail & Related papers (2021-12-18T16:59:35Z)
- Dense Unsupervised Learning for Video Segmentation [49.46930315961636]
We present a novel approach to unsupervised learning for video object segmentation (VOS).
Unlike previous work, our formulation allows dense feature representations to be learned directly in a fully convolutional regime.
Our approach exceeds the segmentation accuracy of previous work despite using significantly less training data and compute power.
arXiv Detail & Related papers (2021-11-11T15:15:11Z)
- Transfer Learning for Node Regression Applied to Spreading Prediction [0.0]
We explore the utility of state-of-the-art node representation learners when used to assess the effects of spreading from a given node.
As many real-life networks are topologically similar, we systematically investigate whether the learned models generalize to previously unseen networks.
This is one of the first attempts to evaluate the utility of zero-shot transfer for the task of node regression.
arXiv Detail & Related papers (2021-03-31T20:09:09Z)
- Mixed-Privacy Forgetting in Deep Networks [114.3840147070712]
We show that the influence of a subset of the training samples can be removed from the weights of a network trained on large-scale image classification tasks.
Inspired by real-world applications of forgetting techniques, we introduce a novel notion of forgetting in a mixed-privacy setting.
We show that our method allows forgetting without having to trade off the model accuracy.
arXiv Detail & Related papers (2020-12-24T19:34:56Z)
- Graph-Based Neural Network Models with Multiple Self-Supervised Auxiliary Tasks [79.28094304325116]
Graph Convolutional Networks are among the most promising approaches for capturing relationships among structured data points.
We propose three novel self-supervised auxiliary tasks to train graph-based neural network models in a multi-task fashion.
arXiv Detail & Related papers (2020-11-14T11:09:51Z)
- Adversarial Training Reduces Information and Improves Transferability [81.59364510580738]
Recent results show that features of adversarially trained networks for classification, in addition to being robust, enable desirable properties such as invertibility.
We show that adversarial training can improve linear transferability to new tasks, which gives rise to a new trade-off between transferability of representations and accuracy on the source task.
arXiv Detail & Related papers (2020-07-22T08:30:16Z)
- Unbiased Deep Reinforcement Learning: A General Training Framework for Existing and Future Algorithms [3.7050607140679026]
We propose a novel training framework that is conceptually comprehensible and potentially easy to generalize to all feasible algorithms for reinforcement learning.
We employ Monte Carlo sampling to obtain raw data inputs, and train them in batches to form Markov decision process sequences.
We propose several algorithms built on our new framework to deal with typical discrete and continuous scenarios.
arXiv Detail & Related papers (2020-05-12T01:51:08Z)
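Returning to the co-finetuning entry above ("Beyond Transfer Learning: Co-finetuning for Action Localisation"), the sketch below shows the basic pattern of optimizing one shared model on batches from several tasks inside a single training loop, instead of pretraining and then finetuning. The backbone, per-task heads, dummy loaders, and round-robin batch schedule are illustrative assumptions; the paper itself targets video action localisation, not this toy setup.

```python
# Hedged sketch of co-finetuning: one shared backbone plus per-task heads,
# trained on "upstream" and "downstream" batches within the same loop.
import itertools
import torch
import torch.nn as nn
import torch.nn.functional as F

backbone = nn.Sequential(nn.Linear(784, 128), nn.ReLU())  # shared representation
heads = nn.ModuleDict({
    "upstream": nn.Linear(128, 100),   # e.g. a large generic task
    "downstream": nn.Linear(128, 10),  # e.g. the smaller target task
})
opt = torch.optim.Adam(
    list(backbone.parameters()) + list(heads.parameters()), lr=1e-3
)

def dummy_loader(num_classes):
    # Stand-in for a real DataLoader; yields random (input, label) batches.
    while True:
        yield torch.randn(16, 784), torch.randint(0, num_classes, (16,))

loaders = {"upstream": dummy_loader(100), "downstream": dummy_loader(10)}

# Round-robin over tasks: every task updates the shared backbone via its own head.
for step, task in zip(range(200), itertools.cycle(loaders)):
    x, y = next(loaders[task])
    loss = F.cross_entropy(heads[task](backbone(x)), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```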