Fast and Informative Model Selection using Learning Curve
Cross-Validation
- URL: http://arxiv.org/abs/2111.13914v1
- Date: Sat, 27 Nov 2021 14:48:52 GMT
- Title: Fast and Informative Model Selection using Learning Curve
Cross-Validation
- Authors: Felix Mohr, Jan N. van Rijn
- Abstract summary: Cross-validation methods can be unnecessarily slow on large datasets.
We present a new approach for validation based on learning curves (LCCV).
LCCV iteratively increases the number of instances used for training.
- Score: 2.28438857884398
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Common cross-validation (CV) methods like k-fold cross-validation or
Monte-Carlo cross-validation estimate the predictive performance of a learner
by repeatedly training it on a large portion of the given data and testing on
the remaining data. These techniques have two major drawbacks. First, they can
be unnecessarily slow on large datasets. Second, beyond an estimation of the
final performance, they give almost no insights into the learning process of
the validated algorithm. In this paper, we present a new approach for
validation based on learning curves (LCCV). Instead of creating train-test
splits with a large portion of training data, LCCV iteratively increases the
number of instances used for training. In the context of model selection, it
discards models that are very unlikely to become competitive. We run a large-scale
experiment on the 67 datasets from the AutoML benchmark and empirically
show that in over 90% of the cases using LCCV leads to similar performance (at
most 1.5% difference) as using 5/10-fold CV, while yielding substantial
runtime reductions of over 20% on average. Additionally, it provides important
insights, which for example allow assessing the benefits of acquiring more
data. These results are orthogonal to other advances in the field of AutoML.
Related papers
- Efficient Grammatical Error Correction Via Multi-Task Training and
Optimized Training Schedule [55.08778142798106]
We propose auxiliary tasks that exploit the alignment between the original and corrected sentences.
We formulate each task as a sequence-to-sequence problem and perform multi-task training.
We find that the order of datasets used for training and even individual instances within a dataset may have important effects on the final performance.
arXiv Detail & Related papers (2023-11-20T14:50:12Z) - RanPAC: Random Projections and Pre-trained Models for Continual Learning [59.07316955610658]
Continual learning (CL) aims to learn different tasks (such as classification) in a non-stationary data stream without forgetting old ones.
We propose a concise and effective approach for CL with pre-trained models.
arXiv Detail & Related papers (2023-07-05T12:49:02Z) - TRAK: Attributing Model Behavior at Scale [79.56020040993947]
We present TRAK (Tracing with the Randomly-projected After Kernel), a data attribution method that is both effective and computationally tractable for large-scale, differentiable models.
arXiv Detail & Related papers (2023-03-24T17:56:22Z) - Efficient Classification with Counterfactual Reasoning and Active
Learning [4.708737212700907]
The proposed method, CCRAL, combines causal reasoning, which learns counterfactual samples for the original training samples, with active learning, which selects the useful counterfactual samples based on a region of uncertainty.
Experiments show that CCRAL achieves significantly better performance than the baselines in terms of accuracy and AUC.
arXiv Detail & Related papers (2022-07-25T12:03:40Z) - Learn by Challenging Yourself: Contrastive Visual Representation
Learning with Hard Sample Generation [16.3860181959878]
We propose a framework with two approaches to improve the data efficiency of Contrastive Learning (CL) training.
The first approach generates hard samples for the main model.
The generator is jointly learned with the main model to dynamically customize hard samples.
In joint learning, the hardness of a positive pair is progressively increased by decreasing their similarity.
arXiv Detail & Related papers (2022-02-14T02:41:43Z) - Improving Calibration for Long-Tailed Recognition [68.32848696795519]
We propose two methods to improve calibration and performance in such scenarios.
For dataset bias due to different samplers, we propose shifted batch normalization.
Our proposed methods set new records on multiple popular long-tailed recognition benchmark datasets.
arXiv Detail & Related papers (2021-04-01T13:55:21Z) - Jigsaw Clustering for Unsupervised Visual Representation Learning [68.09280490213399]
We propose a new jigsaw clustering pretext task in this paper.
Our method makes use of both intra-image and inter-image information.
It is even comparable to contrastive learning methods when only half of the training batches are used.
arXiv Detail & Related papers (2021-04-01T08:09:26Z) - Leave Zero Out: Towards a No-Cross-Validation Approach for Model
Selection [21.06860861548758]
Cross Validation (CV) is the main workhorse for model selection.
CV suffers from a conservatively biased estimate, since part of the limited data has to be held out for validation.
CV also tends to be extremely cumbersome, e.g., intolerably time-consuming, due to the repeated training procedures.
arXiv Detail & Related papers (2020-12-24T16:11:53Z) - Approximate Cross-Validation for Structured Models [20.79997929155929]
The gold-standard evaluation technique is structured cross-validation (CV).
But CV here can be prohibitively slow due to the need to re-run already-expensive learning algorithms many times.
Previous work has shown approximate cross-validation (ACV) methods provide a fast and provably accurate alternative.
arXiv Detail & Related papers (2020-06-23T00:06:03Z) - Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
To reduce the computational cost of the enlarged training set, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.