Related papers: Zero-Shot AutoML with Pretrained Models

URL: http://arxiv.org/abs/2206.08476v1
Date: Thu, 16 Jun 2022 22:52:08 GMT
Title: Zero-Shot AutoML with Pretrained Models
Authors: Ekrem Öztürk and Fabio Ferreira and Hadi S. Jomaa and Lars Schmidt-Thieme and Josif Grabocka and Frank Hutter
Abstract summary: A domain-independent meta-learning approach learns a zero-shot surrogate model. We evaluate our approach under the strict time limit of the ChaLearn AutoDL challenge benchmark.
Score: 39.928531675761135
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Given a new dataset D and a low compute budget, how should we choose a pre-trained model to fine-tune to D, and set the fine-tuning hyperparameters without risking overfitting, particularly if D is small? Here, we extend automated machine learning (AutoML) to best make these choices. Our domain-independent meta-learning approach learns a zero-shot surrogate model which, at test time, allows us to select the right deep learning (DL) pipeline (including the pre-trained model and fine-tuning hyperparameters) for a new dataset D given only trivial meta-features describing D, such as image resolution or the number of classes. To train this zero-shot model, we collect performance data for many DL pipelines on a large collection of datasets and meta-train on this data to minimize a pairwise ranking objective. We evaluate our approach under the strict time limit of the vision track of the ChaLearn AutoDL challenge benchmark, clearly outperforming all challenge contenders.
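Illustrative sketch: the abstract names a pairwise ranking objective and zero-shot selection from meta-features but does not spell out either. The Python below is a minimal sketch under stated assumptions, not the authors' implementation. The logistic form of the pairwise loss, the function names pairwise_ranking_loss and select_pipeline, and the toy surrogate are all hypothetical choices made for illustration.

```python
import math
from typing import Callable, Dict, Sequence


def _softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def pairwise_ranking_loss(predicted: Sequence[float],
                          observed: Sequence[float]) -> float:
    """Mean logistic loss over ordered pipeline pairs on one dataset.

    For every pair (i, j) where pipeline i was observed to beat pipeline j,
    the term log(1 + exp(-(s_i - s_j))) is small when the surrogate also
    scores i above j. The logistic form is an assumption; the abstract only
    says "pairwise ranking objective".
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    total, pairs = 0.0, 0
    for i in range(len(observed)):
        for j in range(len(observed)):
            if observed[i] > observed[j]:
                total += _softplus(-(predicted[i] - predicted[j]))
                pairs += 1
    return total / pairs if pairs else 0.0


def select_pipeline(surrogate: Callable[[Dict[str, float], Dict[str, float]], float],
                    meta_features: Dict[str, float],
                    pipelines: Sequence[Dict[str, float]]) -> int:
    """Zero-shot selection: score each candidate pipeline for the new
    dataset's meta-features and return the index of the top-scoring one.
    No pipeline is trained on the new dataset at this stage."""
    scores = [surrogate(meta_features, p) for p in pipelines]
    return max(range(len(pipelines)), key=scores.__getitem__)


# Hypothetical stand-in for a meta-trained surrogate: it prefers smaller
# learning rates when the dataset has many classes.
def toy_surrogate(meta: Dict[str, float], pipeline: Dict[str, float]) -> float:
    return -pipeline["lr"] * meta["num_classes"]


candidates = [{"lr": 0.1}, {"lr": 0.001}, {"lr": 0.01}]
best = select_pipeline(toy_surrogate, {"num_classes": 100.0}, candidates)
```

In a meta-training loop, this loss would be averaged over many datasets and minimized with respect to the surrogate's parameters; at test time only select_pipeline is needed, which is what keeps the choice cheap under a strict time limit.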

Related papers

Meta-Learning and Synthetic Data for Automated Pretraining and Finetuning [2.657867981416885]
The growing number of pretrained models in machine learning (ML) presents significant challenges for practitioners. As models grow in scale, the increasing reliance on real-world data poses a bottleneck for training and requires leveraging data more effectively. This dissertation adopts meta-learning to extend automated machine learning to the deep learning domain.
arXiv Detail & Related papers (2025-06-11T12:48:45Z)
Sharpness-Aware Parameter Selection for Machine Unlearning [6.397490580631141]
It often happens that sensitive personal information, such as credit card numbers or passwords, is mistakenly incorporated in the training of machine learning models and needs to be removed afterwards. Various machine unlearning techniques have been proposed in the literature to address this problem. Most of the proposed methods revolve around removing individual data samples from a trained model. While the existing methods perform the unlearning task by updating either the whole set of model parameters or only the last layer of the model, we show that there is a subset of model parameters that contributes most to unlearning the target features.
arXiv Detail & Related papers (2025-04-08T19:41:07Z)
Active Learning from Scene Embeddings for End-to-End Autonomous Driving [30.667451458189902]
Training end-to-end deep learning models requires large amounts of labeled data. We propose an active learning framework that relies on vectorized scene-level features, called SEAD. Experiments show that we only need 30% of the nuScenes training data to achieve performance close to what can be achieved with the full dataset.
arXiv Detail & Related papers (2025-03-14T03:56:22Z)
Machine Unlearning on Pre-trained Models by Residual Feature Alignment Using LoRA [15.542668474378633]
We propose a novel and efficient machine unlearning method on pre-trained models. We leverage LoRA to decompose the model's intermediate features into pre-trained features and residual features. The method aims to learn the zero residuals on the retained set and shifted residuals on the unlearning set.
arXiv Detail & Related papers (2024-11-13T08:56:35Z)
Attribute-to-Delete: Machine Unlearning via Datamodel Matching [65.13151619119782]
Machine unlearning -- efficiently removing a small "forget set" of training data from a pre-trained machine learning model -- has recently attracted interest. Recent research shows that machine unlearning techniques do not hold up in such a challenging setting.
arXiv Detail & Related papers (2024-10-30T17:20:10Z)
Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data. We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z)
AutoXPCR: Automated Multi-Objective Model Selection for Time Series Forecasting [1.0515439489916734]
We propose AutoXPCR - a novel method for automated and explainable multi-objective model selection. Our approach leverages meta-learning to estimate any model's performance along PCR criteria, which encompass (P)redictive error, (C)omplexity, and (R)esource demand. Our method clearly outperforms other model selection approaches - on average, it only requires 20% of computation costs for recommending models with 90% of the best-possible quality.
arXiv Detail & Related papers (2023-12-20T14:04:57Z)
On minimizing the training set fill distance in machine learning regression [0.552480439325792]
We study a data selection approach that aims to minimize the fill distance of the selected set. We show that selecting training sets with farthest point sampling (FPS) can also increase model stability for the specific case of Gaussian kernel regression approaches.
arXiv Detail & Related papers (2023-07-20T16:18:33Z)
Quick-Tune: Quickly Learning Which Pretrained Model to Finetune and How [62.467716468917224]
We propose a methodology that jointly searches for the optimal pretrained model and the hyperparameters for finetuning it. Our method transfers knowledge about the performance of many pretrained models on a series of datasets. We empirically demonstrate that our resulting approach can quickly select an accurate pretrained model for a new dataset.
arXiv Detail & Related papers (2023-06-06T16:15:26Z)
MILO: Model-Agnostic Subset Selection Framework for Efficient Model Training and Tuning [68.12870241637636]
We propose MILO, a model-agnostic subset selection framework that decouples the subset selection from model training. Our empirical results indicate that MILO can train models $3\times - 10\times$ faster and tune hyperparameters $20\times - 75\times$ faster than full-dataset training or tuning without compromising performance.
arXiv Detail & Related papers (2023-01-30T20:59:30Z)
Training Data Subset Selection for Regression with Controlled Generalization Error [19.21682938684508]
We develop an efficient majorization-minimization algorithm for data subset selection. SELCON trades off accuracy and efficiency more effectively than the current state-of-the-art.
arXiv Detail & Related papers (2021-06-23T16:03:55Z)
Few-Shot Lifelong Learning [35.05196800623617]
Few-Shot Lifelong Learning enables deep learning models to perform lifelong/continual learning on few-shot data. Our method selects very few parameters from the model for training every new set of classes instead of training the full model. We experimentally show that our method significantly outperforms existing methods on the miniImageNet, CIFAR-100, and CUB-200 datasets.
arXiv Detail & Related papers (2021-03-01T13:26:57Z)
Stance Detection Benchmark: How Robust Is Your Stance Detection? [65.91772010586605]
Stance Detection (StD) aims to detect an author's stance towards a certain topic or claim. We introduce a StD benchmark that learns from ten StD datasets of various domains in a multi-dataset learning setting. Within this benchmark setup, we are able to present new state-of-the-art results on five of the datasets.
arXiv Detail & Related papers (2020-01-06T13:37:51Z)

This list is automatically generated from the titles and abstracts of the papers on this site.