Multi-Stage Transfer Learning with an Application to Selection Process
- URL: http://arxiv.org/abs/2006.01276v1
- Date: Mon, 1 Jun 2020 21:27:04 GMT
- Title: Multi-Stage Transfer Learning with an Application to Selection Process
- Authors: Andre Mendes, Julian Togelius, Leandro dos Santos Coelho
- Abstract summary: In multi-stage processes, decisions happen in an ordered sequence of stages.
In this work, we propose a Multi-StaGe Transfer Learning (MSGTL) approach that uses knowledge from simple classifiers trained in early stages.
We show that it is possible to control the trade-off between conserving knowledge and fine-tuning using a simple probabilistic map.
- Score: 5.933303832684138
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In multi-stage processes, decisions happen in an ordered sequence of stages.
Many of them have the structure of a dual funnel problem: as the sample size
decreases from one stage to the next, the information increases. A related
example is a selection process, where applicants apply for a position, prize,
or grant. In each stage, more applicants are evaluated and filtered out, and
from the remaining ones, more information is collected. In the last stage,
decision-makers use all available information to make their final decision.
Training a separate classifier for each stage becomes impractical: such classifiers
can underfit due to the low dimensionality in the early stages or overfit due to
the small sample size in the later stages. In this work, we propose a
Multi-StaGe Transfer Learning (MSGTL) approach that uses knowledge from simple
classifiers trained in early stages to improve the performance of classifiers in
the later stages. By transferring weights from simpler neural networks trained on
larger datasets, we are able to fine-tune more complex neural networks in the
later stages without overfitting despite the small sample size.
We show that it is possible to control the trade-off between conserving
knowledge and fine-tuning using a simple probabilistic map. Experiments using
real-world data demonstrate the efficacy of our approach as it outperforms
other state-of-the-art methods for transfer learning and regularization.
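As a rough illustration of the idea (a minimal sketch with assumed layer sizes and probabilities, not the authors' implementation), the transfer step can be pictured as copying the weights of a small early-stage network into the matching layer of a larger later-stage network, with a per-layer probability deciding how often the transferred layer stays frozen (knowledge conserved) rather than fine-tuned:

```python
import torch
import torch.nn as nn

# Early-stage model: few features, large sample size (already trained in practice).
early_net = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
# Later-stage model: in practice richer features are added; here only the first
# block matches the early-stage network in shape.
late_net = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 64),
                         nn.ReLU(), nn.Linear(64, 2))

# 1) Transfer: initialize the matching layer with the early-stage weights.
late_net[0].load_state_dict(early_net[0].state_dict())

# 2) Hypothetical probabilistic map: layer index -> probability that the transferred
#    layer is frozen in a given epoch (higher = conserve more knowledge).
keep_prob = {0: 0.8}

def set_trainable(epoch_seed: int) -> None:
    """Randomly freeze or release transferred layers according to keep_prob."""
    g = torch.Generator().manual_seed(epoch_seed)
    for idx, prob in keep_prob.items():
        freeze = torch.rand(1, generator=g).item() < prob
        for p in late_net[idx].parameters():
            p.requires_grad = not freeze

# 3) Fine-tune on the small later-stage dataset (toy data here).
opt = torch.optim.Adam(late_net.parameters(), lr=1e-3)
x, y = torch.randn(16, 10), torch.randint(0, 2, (16,))
for epoch in range(3):
    set_trainable(epoch)
    opt.zero_grad()
    loss = nn.functional.cross_entropy(late_net(x), y)
    loss.backward()
    opt.step()
```

In this sketch, raising keep_prob pushes the later-stage model toward conserving the transferred knowledge, while lowering it favors fine-tuning.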
Related papers
- RanDumb: A Simple Approach that Questions the Efficacy of Continual Representation Learning [68.42776779425978]
We show that existing online continually trained deep networks produce inferior representations compared to a simple pre-defined random transform.
We then train a simple linear classifier on top without storing any exemplars, processing one sample at a time in an online continual learning setting.
Our study reveals the significant limitations of representation learning, particularly in low-exemplar and online continual learning scenarios.
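A toy sketch of that setup (assumed dimensions and components, not the paper's code): a fixed random transform followed by a linear classifier updated one sample at a time, with no stored exemplars.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
d_in, d_proj, n_classes = 64, 512, 10
W = rng.standard_normal((d_in, d_proj))      # fixed random transform, never trained

clf = SGDClassifier()                        # simple linear classifier, updated online
classes = np.arange(n_classes)

def observe(x: np.ndarray, y: int) -> None:
    """Process a single sample in an online continual-learning fashion."""
    z = np.maximum(x @ W, 0.0)               # random projection + simple nonlinearity
    clf.partial_fit(z.reshape(1, -1), [y], classes=classes)

# toy data stream
for _ in range(100):
    observe(rng.standard_normal(d_in), int(rng.integers(n_classes)))
```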
arXiv Detail & Related papers (2024-02-13T22:07:29Z)
- Fast Detection of Phase Transitions with Multi-Task Learning-by-Confusion [0.0]
One of the most popular approaches to identifying critical points from data without prior knowledge of the underlying phases is the learning-by-confusion scheme.
Up to now, the scheme required training a distinct binary classifier for each possible splitting of the grid into two sides, resulting in a computational cost that scales linearly with the number of grid points.
In this work, we propose and showcase an alternative implementation that only requires the training of a single multi-class classifier.
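A toy sketch of the single-classifier variant (assumed data and network, not the paper's code): one multi-class network predicts the grid point a sample came from, and the per-splitting confusion signal is then read off from its class probabilities.

```python
import torch
import torch.nn as nn

# Hypothetical setup: samples measured at n_grid values of a control parameter.
n_grid, d = 20, 8
x = torch.randn(1000, d)
labels = torch.randint(0, n_grid, (1000,))

# Train a single multi-class classifier to predict the grid point of each sample,
# instead of one binary classifier per candidate splitting of the grid.
net = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, n_grid))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(50):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(net(x), labels)
    loss.backward()
    opt.step()

# For every candidate splitting k, collapse the class probabilities into the
# probability of lying left of the split; the accuracy-vs-k curve plays the role
# of the per-splitting binary classifiers in plain learning-by-confusion.
with torch.no_grad():
    probs = net(x).softmax(dim=1)                  # (N, n_grid)
    for k in range(1, n_grid):
        pred_left = probs[:, :k].sum(dim=1) > 0.5
        true_left = labels < k
        accuracy_k = (pred_left == true_left).float().mean().item()
```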
arXiv Detail & Related papers (2023-11-15T17:17:49Z)
- Automated Imbalanced Classification via Layered Learning [0.734084539365505]
Applying resampling strategies to balance the class distribution of training instances is a common approach to tackle these problems.
Many state-of-the-art methods find instances of interest close to the decision boundary to drive the resampling process.
Over-sampling may increase the chance of overfitting by propagating the information contained in instances from the minority class.
arXiv Detail & Related papers (2022-05-05T10:32:24Z)
- BatchFormer: Learning to Explore Sample Relationships for Robust Representation Learning [93.38239238988719]
We propose to enable deep neural networks with the ability to learn the sample relationships from each mini-batch.
BatchFormer is applied to the batch dimension of each mini-batch to implicitly explore sample relationships during training.
We perform extensive experiments on over ten datasets and the proposed method achieves significant improvements on different data scarcity applications.
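A rough sketch of the mechanism (assumed feature sizes and a plain linear backbone, not the paper's code): a transformer layer runs across the batch dimension of the extracted features so that samples in a mini-batch can attend to each other during training.

```python
import torch
import torch.nn as nn

class BatchAttentionSketch(nn.Module):
    def __init__(self, dim: int = 256, n_classes: int = 10):
        super().__init__()
        self.backbone = nn.Linear(32, dim)    # stand-in for a real feature extractor
        self.batch_attn = nn.TransformerEncoderLayer(d_model=dim, nhead=4)
        self.classifier = nn.Linear(dim, n_classes)

    def forward(self, x: torch.Tensor, training: bool = True) -> torch.Tensor:
        feats = self.backbone(x)              # (B, dim)
        if training:
            # Treat the batch as a sequence of length B so attention mixes samples.
            feats = self.batch_attn(feats.unsqueeze(1)).squeeze(1)
        return self.classifier(feats)

model = BatchAttentionSketch()
logits = model(torch.randn(16, 32))           # (16, 10)
```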
arXiv Detail & Related papers (2022-03-03T05:31:33Z)
- CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
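A stripped-down sketch of the weighting idea (the class-aware grouping and the meta-level update on clean data, which are central to the actual method, are omitted here): a small network maps each sample's loss to a weight used in the training objective.

```python
import torch
import torch.nn as nn

# Tiny weighting network: per-sample loss value -> weight in [0, 1].
weight_net = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())
model = nn.Linear(20, 5)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x, y = torch.randn(64, 20), torch.randint(0, 5, (64,))
per_sample_loss = nn.functional.cross_entropy(model(x), y, reduction="none")   # (64,)
weights = weight_net(per_sample_loss.detach().unsqueeze(1)).squeeze(1)         # (64,)
loss = (weights * per_sample_loss).mean()
opt.zero_grad()
loss.backward()
opt.step()
```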
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
- Towards General and Efficient Active Learning [20.888364610175987]
Active learning aims to select the most informative samples to exploit limited annotation budgets.
We propose a novel general and efficient active learning (GEAL) method in this paper.
Our method can conduct data selection processes on different datasets with a single-pass inference of the same model.
arXiv Detail & Related papers (2021-12-15T08:35:28Z)
- A Flexible Selection Scheme for Minimum-Effort Transfer Learning [27.920304852537534]
Fine-tuning is a popular way of exploiting knowledge contained in a pre-trained convolutional network for a new visual recognition task.
We introduce a new form of fine-tuning, called flex-tuning, in which any individual unit of a network can be tuned.
We show that fine-tuning individual units, despite its simplicity, yields very good results as an adaptation technique.
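A rough illustration of tuning a single unit (assumed toy network and unit choice, not the paper's code): freeze every parameter of a pre-trained model and release only the one unit selected for adaptation.

```python
import torch
import torch.nn as nn

pretrained = nn.Sequential(                     # stand-in for a pre-trained ConvNet
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10))

for p in pretrained.parameters():
    p.requires_grad = False                     # freeze the whole network

unit = pretrained[2]                            # the single unit chosen for tuning
for p in unit.parameters():
    p.requires_grad = True                      # only this unit adapts to the new task

opt = torch.optim.SGD([p for p in pretrained.parameters() if p.requires_grad], lr=1e-3)
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(pretrained(x), y)
opt.zero_grad()
loss.backward()
opt.step()
```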
arXiv Detail & Related papers (2020-08-27T08:57:30Z)
- Learning to Sample with Local and Global Contexts in Experience Replay Buffer [135.94190624087355]
We propose a new learning-based sampling method that can compute the relative importance of each transition.
We show that our framework can significantly improve the performance of various off-policy reinforcement learning methods.
arXiv Detail & Related papers (2020-07-14T21:12:56Z)
- Learning to Count in the Crowd from Limited Labeled Data [109.2954525909007]
We focus on reducing the annotation effort by learning to count in the crowd from a limited number of labeled samples.
Specifically, we propose a Gaussian Process-based iterative learning mechanism that involves estimation of pseudo-ground truth for the unlabeled data.
arXiv Detail & Related papers (2020-07-07T04:17:01Z)
- Self-supervised Knowledge Distillation for Few-shot Learning [123.10294801296926]
Few-shot learning is a promising learning paradigm due to its ability to learn quickly from only a few samples.
We propose a simple approach to improve the representation capacity of deep neural networks for few-shot learning tasks.
Our experiments show that, even in the first stage, self-supervision can outperform current state-of-the-art methods.
arXiv Detail & Related papers (2020-06-17T11:27:00Z)
- Adversarial Encoder-Multi-Task-Decoder for Multi-Stage Processes [5.933303832684138]
In multi-stage processes, decisions occur in an ordered sequence of stages.
We introduce a framework that combines adversarial autoencoders (AAE), multi-task learning (MTL), and multi-label semi-supervised learning (MLSSL).
Using real-world data from different domains, we show that our approach outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2020-03-15T19:30:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.