Zero-Shot AutoML with Pretrained Models
- URL: http://arxiv.org/abs/2206.08476v1
- Date: Thu, 16 Jun 2022 22:52:08 GMT
- Title: Zero-Shot AutoML with Pretrained Models
- Authors: Ekrem \"Ozt\"urk and Fabio Ferreira and Hadi S. Jomaa and Lars
Schmidt-Thieme and Josif Grabocka and Frank Hutter
- Abstract summary: domain-independent meta-learning approach learns a zero-shot surrogate model.
We evaluate our approach under the strict time limit of the ChaLearn AutoDL challenge benchmark.
- Score: 39.928531675761135
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Given a new dataset D and a low compute budget, how should we choose a
pre-trained model to fine-tune to D, and set the fine-tuning hyperparameters
without risking overfitting, particularly if D is small? Here, we extend
automated machine learning (AutoML) to best make these choices. Our
domain-independent meta-learning approach learns a zero-shot surrogate model
which, at test time, allows to select the right deep learning (DL) pipeline
(including the pre-trained model and fine-tuning hyperparameters) for a new
dataset D given only trivial meta-features describing D such as image
resolution or the number of classes. To train this zero-shot model, we collect
performance data for many DL pipelines on a large collection of datasets and
meta-train on this data to minimize a pairwise ranking objective. We evaluate
our approach under the strict time limit of the vision track of the ChaLearn
AutoDL challenge benchmark, clearly outperforming all challenge contenders.
Related papers
- Machine Unlearning on Pre-trained Models by Residual Feature Alignment Using LoRA [15.542668474378633]
We propose a novel and efficient machine unlearning method on pre-trained models.
We leverage LoRA to decompose the model's intermediate features into pre-trained features and residual features.
The method aims to learn the zero residuals on the retained set and shifted residuals on the unlearning set.
arXiv Detail & Related papers (2024-11-13T08:56:35Z) - Attribute-to-Delete: Machine Unlearning via Datamodel Matching [65.13151619119782]
Machine unlearning -- efficiently removing a small "forget set" training data on a pre-divertrained machine learning model -- has recently attracted interest.
Recent research shows that machine unlearning techniques do not hold up in such a challenging setting.
arXiv Detail & Related papers (2024-10-30T17:20:10Z) - Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z) - AutoXPCR: Automated Multi-Objective Model Selection for Time Series
Forecasting [1.0515439489916734]
We propose AutoXPCR - a novel method for automated and explainable multi-objective model selection.
Our approach leverages meta-learning to estimate any model's performance along PCR criteria, which encompass (P)redictive error, (C)omplexity, and (R)esource demand.
Our method clearly outperforms other model selection approaches - on average, it only requires 20% of computation costs for recommending models with 90% of the best-possible quality.
arXiv Detail & Related papers (2023-12-20T14:04:57Z) - On minimizing the training set fill distance in machine learning regression [0.552480439325792]
We study a data selection approach that aims to minimize the fill distance of the selected set.
We show that selecting training sets with the FPS can also increase model stability for the specific case of Gaussian kernel regression approaches.
arXiv Detail & Related papers (2023-07-20T16:18:33Z) - Quick-Tune: Quickly Learning Which Pretrained Model to Finetune and How [62.467716468917224]
We propose a methodology that jointly searches for the optimal pretrained model and the hyperparameters for finetuning it.
Our method transfers knowledge about the performance of many pretrained models on a series of datasets.
We empirically demonstrate that our resulting approach can quickly select an accurate pretrained model for a new dataset.
arXiv Detail & Related papers (2023-06-06T16:15:26Z) - MILO: Model-Agnostic Subset Selection Framework for Efficient Model
Training and Tuning [68.12870241637636]
We propose MILO, a model-agnostic subset selection framework that decouples the subset selection from model training.
Our empirical results indicate that MILO can train models $3times - 10 times$ faster and tune hyperparameters $20times - 75 times$ faster than full-dataset training or tuning without performance.
arXiv Detail & Related papers (2023-01-30T20:59:30Z) - Training Data Subset Selection for Regression with Controlled
Generalization Error [19.21682938684508]
We develop an efficient majorization-minimization algorithm for data subset selection.
SELCON trades off accuracy and efficiency more effectively than the current state-of-the-art.
arXiv Detail & Related papers (2021-06-23T16:03:55Z) - Few-Shot Lifelong Learning [35.05196800623617]
Few-Shot Lifelong Learning enables deep learning models to perform lifelong/continual learning on few-shot data.
Our method selects very few parameters from the model for training every new set of classes instead of training the full model.
We experimentally show that our method significantly outperforms existing methods on the miniImageNet, CIFAR-100, and CUB-200 datasets.
arXiv Detail & Related papers (2021-03-01T13:26:57Z) - Stance Detection Benchmark: How Robust Is Your Stance Detection? [65.91772010586605]
Stance Detection (StD) aims to detect an author's stance towards a certain topic or claim.
We introduce a StD benchmark that learns from ten StD datasets of various domains in a multi-dataset learning setting.
Within this benchmark setup, we are able to present new state-of-the-art results on five of the datasets.
arXiv Detail & Related papers (2020-01-06T13:37:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.