Training Deep Networks from Zero to Hero: avoiding pitfalls and going
beyond
- URL: http://arxiv.org/abs/2109.02752v1
- Date: Mon, 6 Sep 2021 21:31:42 GMT
- Title: Training Deep Networks from Zero to Hero: avoiding pitfalls and going
beyond
- Authors: Moacir Antonelli Ponti, Fernando Pereira dos Santos, Leo Sampaio
Ferraz Ribeiro, and Gabriel Biscaro Cavallari
- Abstract summary: This tutorial covers the basic steps as well as more recent options to improve models.
It can be particularly useful in datasets that are not as well-prepared as those in challenges.
- Score: 59.94347858883343
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training deep neural networks may be challenging in real world data. Using
models as black-boxes, even with transfer learning, can result in poor
generalization or inconclusive results when it comes to small datasets or
specific applications. This tutorial covers the basic steps as well as more
recent options to improve models, in particular, but not restricted to,
supervised learning. It can be particularly useful in datasets that are not as
well-prepared as those in challenges, and also under scarce annotation and/or
small data. We describe basic procedures: as data preparation, optimization and
transfer learning, but also recent architectural choices such as use of
transformer modules, alternative convolutional layers, activation functions,
wide and deep networks, as well as training procedures including as curriculum,
contrastive and self-supervised learning.
Related papers
- Robust Machine Learning by Transforming and Augmenting Imperfect
Training Data [6.928276018602774]
This thesis explores several data sensitivities of modern machine learning.
We first discuss how to prevent ML from codifying prior human discrimination measured in the training data.
We then discuss the problem of learning from data containing spurious features, which provide predictive fidelity during training but are unreliable upon deployment.
arXiv Detail & Related papers (2023-12-19T20:49:28Z) - Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning
Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
Challenge is to discard information about the forget'' data without altering knowledge about remaining dataset.
We adopt a projected-gradient based learning method, named as Projected-Gradient Unlearning (PGU)
We provide empirically evidence to demonstrate that our unlearning method can produce models that behave similar to models retrained from scratch across various metrics even when the training dataset is no longer accessible.
arXiv Detail & Related papers (2023-12-07T07:17:24Z) - netFound: Foundation Model for Network Security [11.38388749887112]
This paper introduces a novel transformer-based network foundation model, netFound.
We employ self-supervised learning techniques on abundant, unlabeled network telemetry data for pre-training.
Our results demonstrate that netFound effectively captures the hidden networking context in production settings.
arXiv Detail & Related papers (2023-10-25T22:04:57Z) - Learn, Unlearn and Relearn: An Online Learning Paradigm for Deep Neural
Networks [12.525959293825318]
We introduce Learn, Unlearn, and Relearn (LURE) an online learning paradigm for deep neural networks (DNNs)
LURE interchanges between the unlearning phase, which selectively forgets the undesirable information in the model, and the relearning phase, which emphasizes learning on generalizable features.
We show that our training paradigm provides consistent performance gains across datasets in both classification and few-shot settings.
arXiv Detail & Related papers (2023-03-18T16:45:54Z) - PIVOT: Prompting for Video Continual Learning [50.80141083993668]
We introduce PIVOT, a novel method that leverages extensive knowledge in pre-trained models from the image domain.
Our experiments show that PIVOT improves state-of-the-art methods by a significant 27% on the 20-task ActivityNet setup.
arXiv Detail & Related papers (2022-12-09T13:22:27Z) - Deep invariant networks with differentiable augmentation layers [87.22033101185201]
Methods for learning data augmentation policies require held-out data and are based on bilevel optimization problems.
We show that our approach is easier and faster to train than modern automatic data augmentation techniques.
arXiv Detail & Related papers (2022-02-04T14:12:31Z) - Machine Unlearning of Features and Labels [72.81914952849334]
We propose first scenarios for unlearning and labels in machine learning models.
Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters.
arXiv Detail & Related papers (2021-08-26T04:42:24Z) - Reasoning-Modulated Representations [85.08205744191078]
We study a common setting where our task is not purely opaque.
Our approach paves the way for a new class of data-efficient representation learning.
arXiv Detail & Related papers (2021-07-19T13:57:13Z) - Friendly Training: Neural Networks Can Adapt Data To Make Learning
Easier [23.886422706697882]
We propose a novel training procedure named Friendly Training.
We show that Friendly Training yields improvements with respect to informed data sub-selection and random selection.
Results suggest that adapting the input data is a feasible way to stabilize learning and improve the skills generalization of the network.
arXiv Detail & Related papers (2021-06-21T10:50:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.