Fast and Informative Model Selection using Learning Curve
Cross-Validation
- URL: http://arxiv.org/abs/2111.13914v1
- Date: Sat, 27 Nov 2021 14:48:52 GMT
- Title: Fast and Informative Model Selection using Learning Curve
Cross-Validation
- Authors: Felix Mohr, Jan N. van Rijn
- Abstract summary: Cross-validation methods can be unnecessarily slow on large datasets.
We present a new approach for validation based on learning curves (LCCV).
LCCV iteratively increases the number of instances used for training.
- Score: 2.28438857884398
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Common cross-validation (CV) methods like k-fold cross-validation or
Monte-Carlo cross-validation estimate the predictive performance of a learner
by repeatedly training it on a large portion of the given data and testing on
the remaining data. These techniques have two major drawbacks. First, they can
be unnecessarily slow on large datasets. Second, beyond an estimation of the
final performance, they give almost no insights into the learning process of
the validated algorithm. In this paper, we present a new approach for
validation based on learning curves (LCCV). Instead of creating train-test
splits with a large portion of training data, LCCV iteratively increases the
number of instances used for training. In the context of model selection, it
discards models that are very unlikely to become competitive. We run a large-scale
experiment on the 67 datasets from the AutoML benchmark and empirically
show that in over 90% of the cases using LCCV leads to similar performance (at
most 1.5% difference) as using 5/10-fold CV, while yielding substantial
runtime reductions of over 20% on average. Additionally, it provides important
insights, which for example allow assessing the benefits of acquiring more
data. These results are orthogonal to other advances in the field of AutoML.
Related papers
- Efficient Grammatical Error Correction Via Multi-Task Training and
Optimized Training Schedule [55.08778142798106]
We propose auxiliary tasks that exploit the alignment between the original and corrected sentences.
We formulate each task as a sequence-to-sequence problem and perform multi-task training.
We find that the order of datasets used for training and even individual instances within a dataset may have important effects on the final performance.
arXiv Detail & Related papers (2023-11-20T14:50:12Z) - RanPAC: Random Projections and Pre-trained Models for Continual Learning [59.07316955610658]
Continual learning (CL) aims to learn different tasks (such as classification) in a non-stationary data stream without forgetting old ones.
We propose a concise and effective approach for CL with pre-trained models.
arXiv Detail & Related papers (2023-07-05T12:49:02Z) - TRAK: Attributing Model Behavior at Scale [79.56020040993947]
We present TRAK (Tracing with the Randomly-projected After Kernel), a data attribution method that is both effective and computationally tractable for large-scale, differentiable models.
arXiv Detail & Related papers (2023-03-24T17:56:22Z) - Efficient Classification with Counterfactual Reasoning and Active
Learning [4.708737212700907]
The proposed method, CCRAL, combines causal reasoning, which learns counterfactual samples for the original training samples, with active learning, which selects the useful counterfactual samples based on a region of uncertainty.
Experiments show that CCRAL achieves significantly better performance than the baselines in terms of accuracy and AUC.
arXiv Detail & Related papers (2022-07-25T12:03:40Z) - Learn by Challenging Yourself: Contrastive Visual Representation
Learning with Hard Sample Generation [16.3860181959878]
We propose a framework with two approaches to improve the data efficiency of Contrastive Learning (CL) training.
The first approach generates hard samples for the main model.
The generator is jointly learned with the main model to dynamically customize hard samples.
In joint learning, the hardness of a positive pair is progressively increased by decreasing their similarity.
arXiv Detail & Related papers (2022-02-14T02:41:43Z) - Improving Calibration for Long-Tailed Recognition [68.32848696795519]
We propose two methods to improve calibration and performance in such scenarios.
For dataset bias due to different samplers, we propose shifted batch normalization.
Our proposed methods set new records on multiple popular long-tailed recognition benchmark datasets.
arXiv Detail & Related papers (2021-04-01T13:55:21Z) - Jigsaw Clustering for Unsupervised Visual Representation Learning [68.09280490213399]
We propose a new jigsaw clustering pretext task in this paper.
Our method makes use of both intra-image and inter-image information.
It is even comparable to contrastive learning methods when only half of the training batches are used.
arXiv Detail & Related papers (2021-04-01T08:09:26Z) - Leave Zero Out: Towards a No-Cross-Validation Approach for Model
Selection [21.06860861548758]
Cross Validation (CV) is the main workhorse for model selection.
CV suffers from a conservatively biased estimate, since part of the limited data has to be held out for validation.
CV also tends to be extremely cumbersome, e.g., intolerably time-consuming, due to the repeated training procedures.
arXiv Detail & Related papers (2020-12-24T16:11:53Z) - Approximate Cross-Validation for Structured Models [20.79997929155929]
The gold-standard evaluation technique is structured cross-validation (CV).
But CV here can be prohibitively slow due to the need to re-run already-expensive learning algorithms many times.
Previous work has shown approximate cross-validation (ACV) methods provide a fast and provably accurate alternative.
arXiv Detail & Related papers (2020-06-23T00:06:03Z) - Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
To reduce the computational cost of the enlarged training set, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.