Data-driven Regularization via Racecar Training for Generalizing Neural
Networks
- URL: http://arxiv.org/abs/2007.00024v1
- Date: Tue, 30 Jun 2020 18:00:41 GMT
- Title: Data-driven Regularization via Racecar Training for Generalizing Neural
Networks
- Authors: You Xie, Nils Thuerey
- Abstract summary: We propose a novel training approach for improving the generalization in neural networks.
We show how our formulation is easy to realize in practical network architectures via a reverse pass.
Networks trained with our approach show more balanced mutual information between input and output throughout all layers, yield improved explainability, and exhibit improved performance for a variety of tasks and task transfers.
- Score: 28.08782668165276
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel training approach for improving the generalization in
neural networks. We show that in contrast to regular constraints for
orthogonality, our approach represents a {\em data-dependent} orthogonality
constraint, and is closely related to singular value decompositions of the
weight matrices. We also show how our formulation is easy to realize in
practical network architectures via a reverse pass, which aims for
reconstructing the full sequence of internal states of the network. Despite
being a surprisingly simple change, we demonstrate that this forward-backward
training approach, which we refer to as {\em racecar} training, leads to
significantly more generic features being extracted from a given data set.
Networks trained with our approach show more balanced mutual information
between input and output throughout all layers, yield improved explainability,
and exhibit improved performance for a variety of tasks and task transfers.
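The reverse pass described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the layer sizes, the ReLU nonlinearity, and the weighting factor `lambda_rec` are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Two-layer network; the reverse pass reuses the transposed weights.
W1 = rng.standard_normal((8, 4)) * 0.5   # input dim 8 -> hidden dim 4
W2 = rng.standard_normal((4, 2)) * 0.5   # hidden dim 4 -> output dim 2

x = rng.standard_normal((16, 8))         # a batch of 16 inputs

# Forward pass, keeping the full sequence of internal states d0, d1, d2.
d0 = x
d1 = relu(d0 @ W1)
d2 = relu(d1 @ W2)

# Reverse pass: run the network backwards with the transposed weights,
# reconstructing each internal state from the one after it.
r1 = relu(d2 @ W2.T)   # reconstruction of d1
r0 = r1 @ W1.T         # reconstruction of d0 (the input)

# Racecar-style regularizer: mean squared reconstruction error over all
# internal states, added to the usual task loss during training.
lambda_rec = 0.1       # assumed weighting of the regularizer
loss_rec = lambda_rec * (np.mean((r1 - d1) ** 2) + np.mean((r0 - d0) ** 2))
print(float(loss_rec))
```

In a real training loop this reconstruction term would simply be added to the task loss, so the same weights are shaped by both objectives.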
Related papers
- Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Cross-Regularization [78.61621802973262]
We introduce an Orthogonal finetuning method for efficiently updating pretrained weights.
A cross-regularization strategy is also exploited to maintain the stability in terms of zero-shot generalization.
We conduct extensive experiments to demonstrate that our method explicitly steers pretrained weight space to represent the task-specific knowledge.
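As a rough illustration of updating pretrained weights with an orthogonal transform (not this paper's actual method), an orthogonal matrix can be built from a learnable skew-symmetric parameter via the Cayley transform; rotating the frozen weight preserves the norms of its columns. All names and sizes here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 6
W_pre = rng.standard_normal((d, d))   # frozen pretrained weight

A = rng.standard_normal((d, d)) * 0.1
S = A - A.T                           # skew-symmetric: S.T == -S
I = np.eye(d)
R = np.linalg.solve(I + S, I - S)     # Cayley transform: R is orthogonal

W_new = R @ W_pre                     # orthogonally updated weight

# Orthogonality check: R.T @ R should be the identity.
print(np.allclose(R.T @ R, I))
```

During finetuning only the skew-symmetric parameter would be learned, which keeps the update orthogonal by construction.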
arXiv Detail & Related papers (2024-07-11T10:35:53Z)
- Network Alignment with Transferable Graph Autoencoders [79.89704126746204]
We propose a novel graph autoencoder architecture designed to extract powerful and robust node embeddings.
We prove that the generated embeddings are associated with the eigenvalues and eigenvectors of the graphs.
Our proposed framework also leverages transfer learning and data augmentation to achieve efficient network alignment at a very large scale without retraining.
arXiv Detail & Related papers (2023-10-05T02:58:29Z)
- Neural networks trained with SGD learn distributions of increasing complexity [78.30235086565388]
We show that neural networks trained using gradient descent initially classify their inputs using lower-order input statistics.
They exploit higher-order input statistics only later in training.
We discuss the relation of this distributional simplicity bias (DSB) to other simplicity biases and consider its implications for the principle of universality in learning.
arXiv Detail & Related papers (2022-11-21T15:27:22Z)
- Beyond Transfer Learning: Co-finetuning for Action Localisation [64.07196901012153]
We propose co-finetuning -- simultaneously training a single model on multiple "upstream" and "downstream" tasks.
We demonstrate that co-finetuning outperforms traditional transfer learning when using the same total amount of data.
We also show how we can easily extend our approach to multiple "upstream" datasets to further improve performance.
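The core of co-finetuning can be sketched with a toy model: one shared weight matrix receives batches from an "upstream" and a "downstream" task in every step, and the gradient of the summed loss updates the same parameters. The data, sizes, and learning rate below are toy assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
W = np.zeros((5, 1))                  # shared weights for both tasks

def mse_grad(X, y, W):
    """Mean-squared-error loss and its gradient for a linear model."""
    err = X @ W - y
    return float(np.mean(err ** 2)), 2.0 * X.T @ err / len(X)

# Two toy regression tasks sharing the same underlying signal.
w_true = rng.standard_normal((5, 1))
X_up, X_down = rng.standard_normal((64, 5)), rng.standard_normal((32, 5))
y_up, y_down = X_up @ w_true, X_down @ w_true

for _ in range(300):                  # joint training loop
    l_up, g_up = mse_grad(X_up, y_up, W)
    l_down, g_down = mse_grad(X_down, y_down, W)
    W -= 0.05 * (g_up + g_down)       # one step on the combined loss

print(l_up + l_down)
```

Because every update sees both tasks, neither loss is optimized at the expense of the other, in contrast to sequential pretrain-then-finetune transfer.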
arXiv Detail & Related papers (2022-07-08T10:25:47Z)
- Continual Learning with Invertible Generative Models [15.705568893476947]
Catastrophic forgetting (CF) happens whenever a neural network overwrites past knowledge while being trained on new tasks.
We propose a novel method that combines the strengths of regularization and generative-based rehearsal approaches.
arXiv Detail & Related papers (2022-02-11T15:28:30Z)
- Being Friends Instead of Adversaries: Deep Networks Learn from Data Simplified by Other Networks [23.886422706697882]
A different idea, named Friendly Training, has recently been proposed: it alters the input data by adding an automatically estimated perturbation.
We revisit and extend this idea inspired by the effectiveness of neural generators in the context of Adversarial Machine Learning.
We propose an auxiliary multi-layer network that is responsible for altering the input data so that the classifier can handle them more easily.
arXiv Detail & Related papers (2021-12-18T16:59:35Z)
- Transfer Learning for Node Regression Applied to Spreading Prediction [0.0]
We explore the utility of the state-of-the-art node representation learners when used to assess the effects of spreading from a given node.
As many real-life networks are topologically similar, we systematically investigate whether the learned models generalize to previously unseen networks.
This is one of the first attempts to evaluate the utility of zero-shot transfer for the task of node regression.
arXiv Detail & Related papers (2021-03-31T20:09:09Z)
- Mixed-Privacy Forgetting in Deep Networks [114.3840147070712]
We show that the influence of a subset of the training samples can be removed from the weights of a network trained on large-scale image classification tasks.
Inspired by real-world applications of forgetting techniques, we introduce a novel notion of forgetting in mixed-privacy setting.
We show that our method allows forgetting without having to trade off the model accuracy.
arXiv Detail & Related papers (2020-12-24T19:34:56Z)
- Adversarial Training Reduces Information and Improves Transferability [81.59364510580738]
Recent results show that features of adversarially trained networks for classification, in addition to being robust, enable desirable properties such as invertibility.
We show that adversarial training can improve linear transferability to new tasks, giving rise to a new trade-off between the transferability of representations and accuracy on the source task.
arXiv Detail & Related papers (2020-07-22T08:30:16Z)
- Unbiased Deep Reinforcement Learning: A General Training Framework for Existing and Future Algorithms [3.7050607140679026]
We propose a novel training framework that is conceptually comprehensible and potentially easy to generalize to all feasible reinforcement learning algorithms.
We employ Monte Carlo sampling to obtain raw data inputs, and train them in batches to obtain Markov decision process sequences.
We propose several algorithms embedded with our new framework to deal with typical discrete and continuous scenarios.
arXiv Detail & Related papers (2020-05-12T01:51:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.